arXiv Papers with Code in Cryptography and Security (July 2025

Abstract:
Securing blockchain-enabled IoT networks against sophisticated adversarial attacks remains a critical challenge. This paper presents a trust-based delegated consensus framework integrating Fully Homomorphic Encryption (FHE) with Attribute-Based Access Control (ABAC) for privacy-preserving policy evaluation, combined with learning-based defense mechanisms. We systematically compare three reinforcement learning approaches -- tabular Q-learning (RL), Deep RL with Dueling Double DQN (DRL), and Multi-Agent RL (MARL) -- against five distinct attack families: Naive Malicious Attack (NMA), Collusive Rumor Attack (CRA), Adaptive Adversarial Attack (AAA), Byzantine Fault Injection (BFI), and Time-Delayed Poisoning (TDP). Experimental results on a 16-node simulated IoT network reveal significant performance variations: MARL achieves superior detection under collusive attacks (F1=0.85 vs. DRL's 0.68 and RL's 0.50), while DRL and MARL both attain perfect detection (F1=1.00) against adaptive attacks where RL fails (F1=0.50). All agents successfully defend against Byzantine attacks (F1=1.00). Most critically, the Time-Delayed Poisoning attack proves catastrophic for all agents, with F1 scores dropping to 0.11-0.16 after sleeper activation, demonstrating the severe threat posed by trust-building adversaries. Our findings indicate that coordinated multi-agent learning provides measurable advantages for defending against sophisticated trust manipulation attacks in blockchain IoT environments.

Paperid: 2, https://arxiv.org/pdf/2512.22179.pdf GitHub

Abstract:
A fundamental limitation of supervised deep learning in high-dimensional tabular domains is "Generalization Collapse": models learn precise decision boundaries for known distributions but fail catastrophically when facing Out-of-Distribution (OOD) data. We hypothesize that this failure stems from the lack of topological constraints in the latent space, resulting in diffuse manifolds where novel anomalies remain statistically indistinguishable from benign data. To address this, we propose Latent Sculpting, a hierarchical two-stage representation learning framework. Stage 1 utilizes a hybrid 1D-CNN and Transformer Encoder trained with a novel Dual-Centroid Compactness Loss (DCCL) to actively "sculpt" benign traffic into a low-entropy, hyperspherical cluster. Unlike standard contrastive losses that rely on triplet mining, DCCL optimizes global cluster centroids to enforce absolute manifold density. Stage 2 conditions a Masked Autoregressive Flow (MAF) on this pre-structured manifold to learn an exact density estimate. We evaluate this methodology on the rigorous CIC-IDS-2017 benchmark, treating it as a proxy for complex, non-stationary data streams. Empirical results demonstrate that explicit manifold sculpting is a prerequisite for robust zero-shot generalization. While supervised baselines suffered catastrophic performance collapse on unseen distribution shifts (F1 approx 0.30) and the strongest unsupervised baseline achieved only 0.76, our framework achieved an F1-Score of 0.87 on strictly zero-shot anomalies. Notably, we report an 88.89% detection rate on "Infiltration" scenarios--a complex distributional shift where state-of-the-art supervised models achieved 0.00% accuracy. These findings suggest that decoupling structure learning from density estimation provides a scalable path toward generalized anomaly detection.

Paperid: 3, https://arxiv.org/pdf/2512.21304.pdf GitHub

Abstract:
In this work we present a publicly verifiable quantum money protocol which assumes close to no quantum computational capabilities. We rely on one-time memories which in turn can be built from quantum conjugate coding and hardware-based assumptions. Specifically, our scheme allows for a limited number of verifications and also allows for quantum tokens for digital signatures. Double spending is prevented by the no-cloning principle of conjugate coding states. An implementation of the concepts presented in this work can be found at https://github.com/neverlocal/otm_billz.

Paperid: 4, https://arxiv.org/pdf/2512.20872.pdf GitHub

Abstract:
Function call graphs (FCGs) have emerged as a powerful abstraction for malware detection, capturing the behavioral structure of applications beyond surface-level signatures. Their utility in traditional program analysis has been well established, enabling effective classification and analysis of malicious software. In the mobile domain, especially in the Android ecosystem, FCG-based malware classification is particularly critical due to the platform's widespread adoption and the complex, component-based structure of Android apps. However, progress in this direction is hindered by the lack of large-scale, high-quality Android-specific FCG datasets. Existing datasets are often outdated, dominated by small or redundant graphs resulting from app repackaging, and fail to reflect the diversity of real-world malware. These limitations lead to overfitting and unreliable evaluation of graph-based classification methods. To address this gap, we introduce Better Call Graphs (BCG), a comprehensive dataset of large and unique FCGs extracted from recent Android application packages (APKs). BCG includes both benign and malicious samples spanning various families and types, along with graph-level features for each APK. Through extensive experiments using baseline classifiers, we demonstrate the necessity and value of BCG compared to existing datasets. BCG is publicly available at https://erdemub.github.io/BCG-dataset.

Paperid: 5, https://arxiv.org/pdf/2512.19730.pdf GitHub

Abstract:
Backdoor attacks pose a significant threat to the security and reliability of deep learning models. To mitigate such attacks, one promising approach is to learn to extract features from the target model and use these features for backdoor detection. However, we discover that existing learning-based neural backdoor detection methods do not generalize well to new architectures not seen during the learning phase. In this paper, we analyze the root cause of this issue and propose a novel black-box neural backdoor detection method called ArcGen. Our method aims to obtain architecture-invariant model features, i.e., aligned features, for effective backdoor detection. Specifically, in contrast to existing methods directly using model outputs as model features, we introduce an additional alignment layer in the feature extraction function to further process these features. This reduces the direct influence of architecture information on the features. Then, we design two alignment losses to train the feature extraction function. These losses explicitly require that features from models with similar backdoor behaviors but different architectures are aligned at both the distribution and sample levels. With these techniques, our method demonstrates up to 42.5% improvements in detection performance (e.g., AUC) on unseen model architectures. This is based on a large-scale evaluation involving 16,896 models trained on diverse datasets, subjected to various backdoor attacks, and utilizing different model architectures. Our code is available at https://github.com/SeRAlab/ArcGen.

Paperid: 6, https://arxiv.org/pdf/2512.18542.pdf GitHub

Abstract:
AI assistants produce vulnerable code in 45% of security-relevant scenarios, introducing flaws into production systems at scale. Yet existing secure coding datasets fall short. They lack incident grounding, don't provide the scale modern training requires, and miss the operational security context developers need for production deployments. We present SecureCode v2.0, a production-grade dataset of 1,215 security-focused coding examples that passed structural validation and expert security review. Every example ties to actual documented security incidents with CVE references, provides vulnerable and secure implementations, demonstrates concrete attacks, and includes defense-in-depth operational guidance. The dataset covers 11 vulnerability categories (complete OWASP Top 10:2025 plus AI/ML Security Threats) across 11 languages (Python, JavaScript, Java, Go, PHP, C#, TypeScript, Ruby, Rust, Kotlin, and YAML for infrastructure-as-code). Our quality assurance framework ensures complete incident grounding. Each example includes SIEM integration strategies, infrastructure hardening recommendations (Docker, AppArmor, WAF configurations), and testing approaches using language-appropriate frameworks. The dataset uses a 4-turn conversational structure mirroring actual developer-AI interactions, escalating from basic implementations to advanced security considerations and defense-in-depth guidance. Our contributions: (1) 1,215 rigorously validated examples split into 989 training, 122 validation, and 104 test sets, (2) an automated validation framework ensuring dataset consistency, (3) a 4-turn conversational structure capturing realistic security workflows, (4) comprehensive operational security guidance with SIEM integration strategies, (5) complete language-specific implementation fidelity, and (6) open-source release of data, validation tools, and benchmarking protocols.

Paperid: 7, https://arxiv.org/pdf/2512.18199.pdf GitHub

Abstract:
Modern intrusion detection systems (IDS) leverage graph neural networks (GNNs) to detect malicious activity in system provenance data, but their decisions often remain a black box to analysts. This paper presents a comprehensive XAI framework designed to bridge the trust gap in Security Operations Centers (SOCs) by making graph-based detection transparent. We implement this framework on top of KAIROS, a state-of-the-art temporal graph-based IDS, though our design is applicable to any temporal graph-based detector with minimal adaptation. The complete codebase is available at https://github.com/devang1304/provex.git. We augment the detection pipeline with post-hoc explanations that highlight why an alert was triggered, identifying key causal subgraphs and events. We adapt three GNN explanation methods - GraphMask, GNNExplainer, and a variational temporal GNN explainer (VA-TGExplainer) - to the temporal provenance context. These tools output human-interpretable representations of anomalous behavior, including important edges and uncertainty estimates. Our contributions focus on the practical integration of these explainers, addressing challenges in memory management and reproducibility. We demonstrate our framework on the DARPA CADETS Engagement 3 dataset and show that it produces concise window-level explanations for detected attacks. Our evaluation reveals that the explainers preserve the TGNN's decisions with high fidelity, surfacing critical edges such as malicious file interactions and anomalous netflows. The average explanation overhead is 3-5 seconds per event. By providing insight into the model's reasoning, our framework aims to improve analyst trust and triage speed.

Paperid: 8, https://arxiv.org/pdf/2512.18133.pdf GitHub GitHub GitHub

Abstract:
Nowadays, Graph Fraud Detection (GFD) in financial scenarios has become an urgent research topic to protect online payment security. However, as organized crime groups are becoming more professional in real-world scenarios, fraudsters are employing more sophisticated camouflage strategies. Specifically, fraudsters disguise themselves by mimicking the behavioral data collected by platforms, ensuring that their key characteristics are consistent with those of benign users to a high degree, which we call Adaptive Camouflage. Consequently, this narrows the differences in behavioral traits between them and benign users within the platform's database, thereby making current GFD models lose efficiency. To address this problem, we propose a relation diffusion-based graph augmentation model Grad. In detail, Grad leverages a supervised graph contrastive learning module to enhance the fraud-benign difference and employs a guided relation diffusion generator to generate auxiliary homophilic relations from scratch. Based on these, weak fraudulent signals would be enhanced during the aggregation process, thus being obvious enough to be captured. Extensive experiments have been conducted on two real-world datasets provided by WeChat Pay, one of the largest online payment platforms with billions of users, and three public datasets. The results show that our proposed model Grad outperforms SOTA methods in both various scenarios, achieving at most 11.10% and 43.95% increases in AUC and AP, respectively. Our code is released at https://github.com/AI4Risk/antifraud and https://github.com/Muyiiiii/WWW25-Grad.

Paperid: 9, https://arxiv.org/pdf/2512.18102.pdf GitHub

Abstract:
Context: Exhaustive fuzzing of modern JavaScript engines is infeasible due to the vast number of program states and execution paths. Coverage-guided fuzzers waste effort on low-risk inputs, often ignoring vulnerability-triggering ones that do not increase coverage. Existing heuristics proposed to mitigate this require expert effort, are brittle, and hard to adapt. Objective: We propose a data-centric, LLM-boosted alternative that learns from historical vulnerabilities to automatically identify minimal static (code) and dynamic (runtime) features for detecting high-risk inputs. Method: Guided by historical V8 bugs, iterative prompting generated 115 static and 49 dynamic features, with the latter requiring only five trace flags, minimizing instrumentation cost. After feature selection, 41 features remained to train an XGBoost model to predict high-risk inputs during fuzzing. Results: Combining static and dynamic features yields over 85% precision and under 1% false alarms. Only 25% of these features are needed for comparable performance, showing that most of the search space is irrelevant. Conclusion: This work introduces feature-guided fuzzing, an automated data-driven approach that replaces coverage with data-directed inference, guiding fuzzers toward high-risk states for faster, targeted, and reproducible vulnerability discovery. To support open science, all scripts and data are available at https://github.com/KKGanguly/DataCentricFuzzJS .

Paperid: 10, https://arxiv.org/pdf/2512.16962.pdf GitHub

Abstract:
Large Language Model (LLM) agents increasingly rely on long-term memory and Retrieval-Augmented Generation (RAG) to persist experiences and refine future performance. While this experience learning capability enhances agentic autonomy, it introduces a critical, unexplored attack surface, i.e., the trust boundary between an agent's reasoning core and its own past. In this paper, we introduce MemoryGraft. It is a novel indirect injection attack that compromises agent behavior not through immediate jailbreaks, but by implanting malicious successful experiences into the agent's long-term memory. Unlike traditional prompt injections that are transient, or standard RAG poisoning that targets factual knowledge, MemoryGraft exploits the agent's semantic imitation heuristic which is the tendency to replicate patterns from retrieved successful tasks. We demonstrate that an attacker who can supply benign ingestion-level artifacts that the agent reads during execution can induce it to construct a poisoned RAG store where a small set of malicious procedure templates is persisted alongside benign experiences. When the agent later encounters semantically similar tasks, union retrieval over lexical and embedding similarity reliably surfaces these grafted memories, and the agent adopts the embedded unsafe patterns, leading to persistent behavioral drift across sessions. We validate MemoryGraft on MetaGPT's DataInterpreter agent with GPT-4o and find that a small number of poisoned records can account for a large fraction of retrieved experiences on benign workloads, turning experience-based self-improvement into a vector for stealthy and durable compromise. To facilitate reproducibility and future research, our code and evaluation data are available at https://github.com/Jacobhhy/Agent-Memory-Poisoning.

Paperid: 11, https://arxiv.org/pdf/2512.16904.pdf GitHub

Abstract:
Generation-time text watermarking embeds statistical signals into text for traceability of AI-generated content. We explore *post-hoc watermarking* where an LLM rewrites existing text while applying generation-time watermarking, to protect copyrighted documents, or detect their use in training or RAG via watermark radioactivity. Unlike generation-time approaches, which is constrained by how LLMs are served, this setting offers additional degrees of freedom for both generation and detection. We investigate how allocating compute (through larger rephrasing models, beam search, multi-candidate generation, or entropy filtering at detection) affects the quality-detectability trade-off. Our strategies achieve strong detectability and semantic fidelity on open-ended text such as books. Among our findings, the simple Gumbel-max scheme surprisingly outperforms more recent alternatives under nucleus sampling, and most methods benefit significantly from beam search. However, most approaches struggle when watermarking verifiable text such as code, where we counterintuitively find that smaller models outperform larger ones. This study reveals both the potential and limitations of post-hoc watermarking, laying groundwork for practical applications and future research.

Paperid: 12, https://arxiv.org/pdf/2512.16874.pdf GitHub

Abstract:
Invisible watermarking is essential for tracing the provenance of digital content. However, training state-of-the-art models remains notoriously difficult, with current approaches often struggling to balance robustness against true imperceptibility. This work introduces Pixel Seal, which sets a new state-of-the-art for image and video watermarking. We first identify three fundamental issues of existing methods: (i) the reliance on proxy perceptual losses such as MSE and LPIPS that fail to mimic human perception and result in visible watermark artifacts; (ii) the optimization instability caused by conflicting objectives, which necessitates exhaustive hyperparameter tuning; and (iii) reduced robustness and imperceptibility of watermarks when scaling models to high-resolution images and videos. To overcome these issues, we first propose an adversarial-only training paradigm that eliminates unreliable pixel-wise imperceptibility losses. Second, we introduce a three-stage training schedule that stabilizes convergence by decoupling robustness and imperceptibility. Third, we address the resolution gap via high-resolution adaptation, employing JND-based attenuation and training-time inference simulation to eliminate upscaling artifacts. We thoroughly evaluate the robustness and imperceptibility of Pixel Seal on different image types and across a wide range of transformations, and show clear improvements over the state-of-the-art. We finally demonstrate that the model efficiently adapts to video via temporal watermark pooling, positioning Pixel Seal as a practical and scalable solution for reliable provenance in real-world image and video settings.

Paperid: 13, https://arxiv.org/pdf/2512.15754.pdf GitHub

Abstract:
Reconfigurable Intelligent Surfaces (RIS) have emerged as a transformative technology capable of reshaping wireless environments through dynamic manipulation of electromagnetic waves. While extensive research has explored their theoretical benefits for communication and sensing, practical deployments in smart environments such as homes, vehicles, and industrial settings remain limited and under-examined, particularly from security and privacy perspectives. This survey provides a comprehensive examination of RIS applications in real-world systems, with a focus on the security and privacy threats, vulnerabilities, and defensive strategies relevant to practical use. We analyze scenarios with two types of systems (with and without legitimate RIS) and two types of attackers (with and without malicious RIS), and demonstrate how RIS may introduce new attacks to practical systems, including eavesdropping, jamming, and spoofing attacks. In response, we review defenses against RIS-related attacks in these systems, such as applying additional security algorithms, disrupting attackers, and early detection of unauthorized RIS. We also discuss scenarios in which the legitimate user applies an additional RIS to defend against attacks. To support future research, we also provide a collection of open-source tools, datasets, demos, and papers at: https://awesome-ris-security.github.io/. By highlighting RIS's functionality and its security/privacy challenges and opportunities, this survey aims to guide researchers and engineers toward the development of secure, resilient, and privacy-preserving RIS-enabled practical wireless systems and environments.

Paperid: 14, https://arxiv.org/pdf/2512.15688.pdf GitHub

Abstract:
Future AI agents might run autonomously with elevated privileges. If these agents are misaligned, they might abuse these privileges to cause serious damage. The field of AI control develops techniques that make it harder for misaligned AIs to cause such damage, while preserving their usefulness. We introduce BashArena, a setting for studying AI control techniques in security-critical environments. BashArena contains 637 Linux system administration and infrastructure engineering tasks in complex, realistic environments, along with four sabotage objectives (execute malware, exfiltrate secrets, escalate privileges, and disable firewall) for a red team to target. We evaluate multiple frontier LLMs on their ability to complete tasks, perform sabotage undetected, and detect sabotage attempts. Claude Sonnet 4.5 successfully executes sabotage while evading monitoring by GPT-4.1 mini 26% of the time, at 4% trajectory-wise FPR. Our findings provide a baseline for designing more effective control protocols in BashArena. We release the dataset as a ControlArena setting and share our task generation pipeline.

Paperid: 15, https://arxiv.org/pdf/2512.12069.pdf GitHub

Abstract:
Large Vision-Language Models (LVLMs) are vulnerable to a growing array of multimodal jailbreak attacks, necessitating defenses that are both generalizable to novel threats and efficient for practical deployment. Many current strategies fall short, either targeting specific attack patterns, which limits generalization, or imposing high computational overhead. While lightweight anomaly-detection methods offer a promising direction, we find that their common one-class design tends to confuse novel benign inputs with malicious ones, leading to unreliable over-rejection. To address this, we propose Representational Contrastive Scoring (RCS), a framework built on a key insight: the most potent safety signals reside within the LVLM's own internal representations. Our approach inspects the internal geometry of these representations, learning a lightweight projection to maximally separate benign and malicious inputs in safety-critical layers. This enables a simple yet powerful contrastive score that differentiates true malicious intent from mere novelty. Our instantiations, MCD (Mahalanobis Contrastive Detection) and KCD (K-nearest Contrastive Detection), achieve state-of-the-art performance on a challenging evaluation protocol designed to test generalization to unseen attack types. This work demonstrates that effective jailbreak detection can be achieved by applying simple, interpretable statistical methods to the appropriate internal representations, offering a practical path towards safer LVLM deployment. Our code is available on Github https://github.com/sarendis56/Jailbreak_Detection_RCS.

Paperid: 16, https://arxiv.org/pdf/2512.11135.pdf GitHub

Abstract:
Large language model (LLM) based services are primarily structured as client-server interactions, with clients sending queries directly to cloud providers that host LLMs. This approach currently compromises data privacy as all queries must be processed in the cloud and in the clear. Fully Homomorphic Encryption (FHE) is a solution to this data privacy issue by enabling computations directly upon encrypted queries. However, running encrypted transformer inference is challenging as programmers must map standard kernels to the constrained instruction set provided by FHE. In this work, we explore implementations of linear algebra kernels needed for transformer inference in FHE and understand how network optimization can help mitigate FHE costs while remaining performant. We leverage the Orion PyTorch to FHE framework to benchmark several linear algebra kernels in order to profile two linear transformation methods, packed row and BSGS, and find that BSGS outperforms packed row methods by up to $13.7 \times$ at transformer-level scales. We also incorporate network-level pruning strategies that reduce FHE runtimes of feed forward layers by up to $11.46\times$. Furthermore, we extend Orion to include ciphertext-ciphertext matrix-matrix products, a key component in the self-attention blocks. Finally, we perform a roofline analysis of FHE primitives and encrypted linear transformations and find that (SIMD encoded) implementations are memory-bound with primitives having roughly $0.1$ integer operations per byte of DRAM traffic. These findings illustrate the need for exploring alternative encoding schemes and models of computation within CKKS to unlock scalable private transformer inference. We conduct all experiments using the Orion framework which can be found at: https://github.com/baahl-nyu/orion.

Paperid: 17, https://arxiv.org/pdf/2512.11087.pdf GitHub

Abstract:
State-of-the-art neural network (NN) verifiers demonstrate that applying the branch-and-bound (BaB) procedure with fast bounding techniques plays a key role in tackling many challenging verification properties. In this work, we introduce the linear constraint-driven clipping framework, a class of scalable and efficient methods designed to enhance the efficacy of NN verifiers. Under this framework, we develop two novel algorithms that efficiently utilize linear constraints to 1) reduce portions of the input space that are either verified or irrelevant to a subproblem in the context of branch-and-bound, and 2) directly improve intermediate bounds throughout the network. The process novelly leverages linear constraints that often arise from bound propagation methods and is general enough to also incorporate constraints from other sources. It efficiently handles linear constraints using a specialized GPU procedure that can scale to large neural networks without the use of expensive external solvers. Our verification procedure, Clip-and-Verify, consistently tightens bounds across multiple benchmarks and can significantly reduce the number of subproblems handled during BaB. We show that our clipping algorithms can be integrated with BaB-based verifiers such as $α,β$-CROWN, utilizing either the split constraints in activation-space BaB or the output constraints that denote the unverified input space. We demonstrate the effectiveness of our procedure on a broad range of benchmarks where, in some instances, we witness a 96% reduction in the number of subproblems during branch-and-bound, and also achieve state-of-the-art verified accuracy across multiple benchmarks. Clip-and-Verify is part of the $α,β$-CROWN verifier (http://abcrown.org), the VNN-COMP 2025 winner. Code available at https://github.com/Verified-Intelligence/Clip_and_Verify.

Paperid: 18, https://arxiv.org/pdf/2512.10766.pdf GitHub

Abstract:
Text-to-image~(T2I) models commonly incorporate defense mechanisms to prevent the generation of sensitive images. Unfortunately, recent jailbreaking attacks have shown that adversarial prompts can effectively bypass these mechanisms and induce T2I models to produce sensitive content, revealing critical safety vulnerabilities. However, existing attack methods implicitly assume that the attacker knows the type of deployed defenses, which limits their effectiveness against unknown or diverse defense mechanisms. In this work, we introduce \textbf{MJA}, a \textbf{m}etaphor-based \textbf{j}ailbreaking \textbf{a}ttack method inspired by the Taboo game, aiming to effectively and efficiently attack diverse defense mechanisms without prior knowledge of their type by generating metaphor-based adversarial prompts. Specifically, MJA consists of two modules: an LLM-based multi-agent generation module~(MLAG) and an adversarial prompt optimization module~(APO). MLAG decomposes the generation of metaphor-based adversarial prompts into three subtasks: metaphor retrieval, context matching, and adversarial prompt generation. Subsequently, MLAG coordinates three LLM-based agents to generate diverse adversarial prompts by exploring various metaphors and contexts. To enhance attack efficiency, APO first trains a surrogate model to predict the attack results of adversarial prompts and then designs an acquisition strategy to adaptively identify optimal adversarial prompts. Extensive experiments on T2I models with various external and internal defense mechanisms demonstrate that MJA outperforms six baseline methods, achieving stronger attack performance while using fewer queries. Code is available in https://github.com/datar001/metaphor-based-jailbreaking-attack.

Paperid: 19, https://arxiv.org/pdf/2512.10600.pdf GitHub

Abstract:
Deep Neural Networks (DNNs), as valuable intellectual property, face unauthorized use. Existing protections, such as digital watermarking, are largely passive; they provide only post-hoc ownership verification and cannot actively prevent the illicit use of a stolen model. This work proposes a proactive protection scheme, dubbed ``Authority Backdoor," which embeds access constraints directly into the model. In particular, the scheme utilizes a backdoor learning framework to intrinsically lock a model's utility, such that it performs normally only in the presence of a specific trigger (e.g., a hardware fingerprint). But in its absence, the DNN's performance degrades to be useless. To further enhance the security of the proposed authority scheme, the certifiable robustness is integrated to prevent an adaptive attacker from removing the implanted backdoor. The resulting framework establishes a secure authority mechanism for DNNs, combining access control with certifiable robustness against adversarial attacks. Extensive experiments on diverse architectures and datasets validate the effectiveness and certifiable robustness of the proposed framework.

Paperid: 20, https://arxiv.org/pdf/2512.09321.pdf GitHub

Abstract:
Prompt injection attacks aim to contaminate the input data of an LLM to mislead it into completing an attacker-chosen task instead of the intended task. In many applications and agents, the input data originates from multiple sources, with each source contributing a segment of the overall input. In these multi-source scenarios, an attacker may control only a subset of the sources and contaminate the corresponding segments, but typically does not know the order in which the segments are arranged within the input. Existing prompt injection attacks either assume that the entire input data comes from a single source under the attacker's control or ignore the uncertainty in the ordering of segments from different sources. As a result, their success is limited in domains involving multi-source data. In this work, we propose ObliInjection, the first prompt injection attack targeting LLM applications and agents with multi-source input data. ObliInjection introduces two key technical innovations: the order-oblivious loss, which quantifies the likelihood that the LLM will complete the attacker-chosen task regardless of how the clean and contaminated segments are ordered; and the orderGCG algorithm, which is tailored to minimize the order-oblivious loss and optimize the contaminated segments. Comprehensive experiments across three datasets spanning diverse application domains and twelve LLMs demonstrate that ObliInjection is highly effective, even when only one out of 6-100 segments in the input data is contaminated. Our code and data are available at: https://github.com/ReachalWang/ObliInjection.

Paperid: 21, https://arxiv.org/pdf/2512.09300.pdf GitHub

Abstract:
zkVMs promise general-purpose verifiable computation through ISA-level compatibility with modern programs and toolchains. However, compatibility extends further than just the ISA; modern programs often cannot run or even compile without an operating system and libc. zkVMs attempt to address this by maintaining forks of language-specific runtimes and statically linking them into applications to create self-contained unikernels, but this ad-hoc approach leads to version hell and burdens verifiable applications (vApps) with an unnecessarily large trusted computing base. We solve this problem with ZeroOS, a modular library operating system (libOS) for vApp unikernels; vApp developers can use off-the-shelf toolchains to compile and link only the exact subset of the Linux ABI their vApp needs. Any zkVM team can easily leverage the ZeroOS ecosystem by writing a ZeroOS bootloader for their platform, resulting in a reduced maintainence burden and unifying the entire zkVM ecosystem with consolidated development and audit resources. ZeroOS is free and open-sourced at https://github.com/LayerZero-Labs/ZeroOS.

Paperid: 22, https://arxiv.org/pdf/2512.09006.pdf GitHub

Abstract:
The significant increase in software production, driven by the acceleration of development cycles over the past two decades, has led to a steady rise in software vulnerabilities, as shown by statistics published yearly by the CVE program. The automation of the source code vulnerability detection (CVD) process has thus become essential, and several methods have been proposed ranging from the well established program analysis techniques to the more recent AI-based methods. Our research investigates Large Language Models (LLMs), which are considered among the most performant AI models to date, for the CVD task. The objective is to study their performance and apply different state-of-the-art techniques to enhance their effectiveness for this task. We explore various fine-tuning and prompt engineering settings. We particularly suggest one novel approach for fine-tuning LLMs which we call Double Fine-tuning, and also test the understudied Test-Time fine-tuning approach. We leverage the recent open-source Llama-3.1 8B, with source code samples extracted from BigVul and PrimeVul datasets. Our conclusions highlight the importance of fine-tuning to resolve the task, the performance of Double tuning, as well as the potential of Llama models for CVD. Though prompting proved ineffective, Retrieval augmented generation (RAG) performed relatively well as an example selection technique. Overall, some of our research questions have been answered, and many are still on hold, which leaves us many future work perspectives. Code repository is available here: https://github.com/DynaSoumhaneOuchebara/Llama-based-vulnerability-detection.

Paperid: 23, https://arxiv.org/pdf/2512.08882.pdf GitHub

Abstract:
The rise of space AI is reshaping government and industry through applications such as disaster detection, border surveillance, and climate monitoring, powered by massive data from commercial and governmental low Earth orbit (LEO) satellites. Federated satellite learning (FSL) enables joint model training without sharing raw data, but suffers from slow convergence due to intermittent connectivity and introduces critical trust challenges--where biased or falsified updates can arise across satellite constellations, including those injected through cyberattacks on inter-satellite or satellite-ground communication links. We propose OrbitChain, a blockchain-backed framework that empowers trustworthy multi-vendor collaboration in LEO networks. OrbitChain (i) offloads consensus to high-altitude platforms (HAPs) with greater computational capacity, (ii) ensures transparent, auditable provenance of model updates from different orbits owned by different vendors, and (iii) prevents manipulated or incomplete contributions from affecting global FSL model aggregation. Extensive simulations show that OrbitChain reduces computational and communication overhead while improving privacy, security, and global model accuracy. Its permissioned proof-of-authority ledger finalizes over 1000 blocks with sub-second latency (0.16,s, 0.26,s, 0.35,s for 1-of-5, 3-of-5, and 5-of-5 quorums). Moreover, OrbitChain reduces convergence time by up to 30 hours on real satellite datasets compared to single-vendor, demonstrating its effectiveness for real-time, multi-vendor learning. Our code is available at https://github.com/wsu-cyber-security-lab-ai/OrbitChain.git

Paperid: 24, https://arxiv.org/pdf/2512.08326.pdf GitHub

Abstract:
Sensitive information leakage in code repositories has emerged as a critical security challenge. Traditional detection methods that rely on regular expressions, fingerprint features, and high-entropy calculations often suffer from high false-positive rates. This not only reduces detection efficiency but also significantly increases the manual screening burden on developers. Recent advances in large language models (LLMs) and multi-agent collaborative architectures have demonstrated remarkable potential for tackling complex tasks, offering a novel technological perspective for sensitive information detection. In response to these challenges, we propose Argus, a multi-agent collaborative framework for detecting sensitive information. Argus employs a three-tier detection mechanism that integrates key content, file context, and project reference relationships to effectively reduce false positives and enhance overall detection accuracy. To comprehensively evaluate Argus in real-world repository environments, we developed two new benchmarks, one to assess genuine leak detection capabilities and another to evaluate false-positive filtering performance. Experimental results show that Argus achieves up to 94.86% accuracy in leak detection, with a precision of 96.36%, recall of 94.64%, and an F1 score of 0.955. Moreover, the analysis of 97 real repositories incurred a total cost of only 2.2$. All code implementations and related datasets are publicly available at https://github.com/TheBinKing/Argus-Guard for further research and application.

Paperid: 25, https://arxiv.org/pdf/2512.07533.pdf GitHub GitHub

Abstract:
We propose VulnLLM-R, the~\emph{first specialized reasoning LLM} for vulnerability detection. Our key insight is that LLMs can reason about program states and analyze the potential vulnerabilities, rather than simple pattern matching. This can improve the model's generalizability and prevent learning shortcuts. However, SOTA reasoning LLMs are typically ultra-large, closed-source, or have limited performance in vulnerability detection. To address this, we propose a novel training recipe with specialized data selection, reasoning data generation, reasoning data filtering and correction, and testing-phase optimization. Using our proposed methodology, we train a reasoning model with seven billion parameters. Through extensive experiments on SOTA datasets across Python, C/C++, and Java, we show that VulnLLM-R has superior effectiveness and efficiency than SOTA static analysis tools and both open-source and commercial large reasoning models. We further conduct a detailed ablation study to validate the key designs in our training recipe. Finally, we construct an agent scaffold around our model and show that it outperforms CodeQL and AFL++ in real-world projects. Our agent further discovers a set of zero-day vulnerabilities in actively maintained repositories. This work represents a pioneering effort to enable real-world, project-level vulnerability detection using AI agents powered by specialized reasoning models. The code is available at~\href{https://github.com/ucsb-mlsec/VulnLLM-R}{github}.

Paperid: 26, https://arxiv.org/pdf/2512.07342.pdf GitHub

Abstract:
Recently, offline reinforcement learning (RL) has become a popular RL paradigm. In offline RL, data providers share pre-collected datasets -- either as individual transitions or sequences of transitions forming trajectories -- to enable the training of RL models (also called agents) without direct interaction with the environments. Offline RL saves interactions with environments compared to traditional RL, and has been effective in critical areas, such as navigation tasks. Meanwhile, concerns about privacy leakage from offline RL datasets have emerged. To safeguard private information in offline RL datasets, we propose the first differential privacy (DP) offline dataset synthesis method, PrivORL, which leverages a diffusion model and diffusion transformer to synthesize transitions and trajectories, respectively, under DP. The synthetic dataset can then be securely released for downstream analysis and research. PrivORL adopts the popular approach of pre-training a synthesizer on public datasets, and then fine-tuning on sensitive datasets using DP Stochastic Gradient Descent (DP-SGD). Additionally, PrivORL introduces curiosity-driven pre-training, which uses feedback from the curiosity module to diversify the synthetic dataset and thus can generate diverse synthetic transitions and trajectories that closely resemble the sensitive dataset. Extensive experiments on five sensitive offline RL datasets show that our method achieves better utility and fidelity in both DP transition and trajectory synthesis compared to baselines. The replication package is available at the GitHub repository.

Paperid: 27, https://arxiv.org/pdf/2512.06899.pdf GitHub

Abstract:
Transferable backdoors pose a severe threat to the Pre-trained Language Models (PLMs) supply chain, yet defensive research remains nascent, primarily relying on detecting anomalies in the output feature space. We identify a critical flaw that fine-tuning on downstream tasks inevitably modifies model parameters, shifting the output distribution and rendering pre-computed defense ineffective. To address this, we propose Patronus, a novel framework that use input-side invariance of triggers against parameter shifts. To overcome the convergence challenges of discrete text optimization, Patronus introduces a multi-trigger contrastive search algorithm that effectively bridges gradient-based optimization with contrastive learning objectives. Furthermore, we employ a dual-stage mitigation strategy combining real-time input monitoring with model purification via adversarial training. Extensive experiments across 15 PLMs and 10 tasks demonstrate that Patronus achieves $\geq98.7\%$ backdoor detection recall and reduce attack success rates to clean settings, significantly outperforming all state-of-the-art baselines in all settings. Code is available at https://github.com/zth855/Patronus.

Paperid: 28, https://arxiv.org/pdf/2512.06589.pdf GitHub

Abstract:
Recent advances in multi-modal large language models (MLLMs) have enabled unified perception-reasoning capabilities, yet these systems remain highly vulnerable to jailbreak attacks that bypass safety alignment and induce harmful behaviors. Existing benchmarks such as JailBreakV-28K, MM-SafetyBench, and HADES provide valuable insights into multi-modal vulnerabilities, but they typically focus on limited attack scenarios, lack standardized defense evaluation, and offer no unified, reproducible toolbox. To address these gaps, we introduce OmniSafeBench-MM, which is a comprehensive toolbox for multi-modal jailbreak attack-defense evaluation. OmniSafeBench-MM integrates 13 representative attack methods, 15 defense strategies, and a diverse dataset spanning 9 major risk domains and 50 fine-grained categories, structured across consultative, imperative, and declarative inquiry types to reflect realistic user intentions. Beyond data coverage, it establishes a three-dimensional evaluation protocol measuring (1) harmfulness, distinguished by a granular, multi-level scale ranging from low-impact individual harm to catastrophic societal threats, (2) intent alignment between responses and queries, and (3) response detail level, enabling nuanced safety-utility analysis. We conduct extensive experiments on 10 open-source and 8 closed-source MLLMs to reveal their vulnerability to multi-modal jailbreak. By unifying data, methodology, and evaluation into an open-source, reproducible platform, OmniSafeBench-MM provides a standardized foundation for future research. The code is released at https://github.com/jiaxiaojunQAQ/OmniSafeBench-MM.

Paperid: 29, https://arxiv.org/pdf/2512.06411.pdf GitHub

Abstract:
KyFrog is a conservative Learning-with-Errors (LWE) key-encapsulation mechanism designed to explore an alternative operating point compared to schemes with relatively small public keys and ciphertexts. KyFrog uses a larger dimension (n = 1024) and a small prime modulus q = 1103, together with narrow error distributions with standard deviations sigma_s = sigma_e = 1.4, to target approximately 2^325 classical and quantum security against state-of-the-art lattice attacks under standard cost models, as estimated using the Lattice Estimator. The price paid for this security margin is an extremely large KEM ciphertext (about 0.5 MiB), while public and secret keys remain in the same ballpark as ML-KEM. We describe the design rationale, parameter search methodology, and implementation details of KyFrog, and we compare its asymptotic security and concrete parameter sizes with the ML-KEM standard. All code and data for this work are released as free and open-source software, with the full C++23 implementation and experimental scripts available at: https://github.com/victormeloasm/kyfrog

Paperid: 30, https://arxiv.org/pdf/2512.06155.pdf GitHub

Abstract:
Security research is fundamentally a problem of resource constraint and consequent prioritization. There is simply too much attack surface and too little time and energy to spend analyzing it all. The most effective security researchers are often those who are most skilled at intuitively deciding which part of an expansive attack surface to investigate. We demonstrate that this problem of selecting the most promising option from among many possibilities can be reframed as an information retrieval problem, and solved using document ranking techniques with LLMs performing the heavy lifting as general-purpose rankers. We present SiftRank, a ranking algorithm achieving O(n) complexity through three key mechanisms: listwise ranking using an LLM to order documents in small batches of approximately 10 items at a time; inflection-based convergence detection that adaptively terminates ranking when score distributions have stabilized; and iterative refinement that progressively focuses ranking effort on the most relevant documents. Unlike existing reranking approaches that require a separate first-stage retrieval step to narrow datasets to approximately 100 candidates, SiftRank operates directly on thousands of items, with each document evaluated across multiple randomized batches to mitigate inconsistent judgments by an LLM. We demonstrate practical effectiveness on N-day vulnerability analysis, successfully identifying a vulnerability-fixing function among 2,197 changed functions in a stripped binary firmware patch within 99 seconds at an inference cost of $0.82. Our approach enables scalable security prioritization for problems that are generally constrained by manual analysis, requiring only standard LLM API access without specialized infrastructure, embedding, or domain-specific fine-tuning. An open-source implementation of SiftRank may be found at https://github.com/noperator/siftrank.

Paperid: 31, https://arxiv.org/pdf/2512.05459.pdf GitHub

Abstract:
Large language models (LLMs) have presented outstanding performance in code generation and completion. However, fine-tuning these models on private datasets can raise privacy and proprietary concerns, such as the leakage of sensitive personal information. Differentially private (DP) code generation provides theoretical guarantees for protecting sensitive code by generating synthetic datasets that preserve statistical properties while reducing privacy leakage concerns. However, DP code generation faces significant challenges due to the strict syntactic dependencies and the privacy-utility trade-off. We propose PrivCode, the first DP synthesizer specifically designed for code datasets. It incorporates a two-stage framework to improve both privacy and utility. In the first stage, termed "privacy-sanitizing", PrivCode generates DP-compliant synthetic code by training models using DP-SGD while introducing syntactic information to preserve code structure. The second stage, termed "utility-boosting", fine-tunes a larger pre-trained LLM on the synthetic privacy-free code to mitigate the utility loss caused by DP, enhancing the utility of the generated code. Extensive experiments on four LLMs show that PrivCode generates higher-utility code across various testing tasks under four benchmarks. The experiments also confirm its ability to protect sensitive data under varying privacy budgets. We provide the replication package at the anonymous link.

Paperid: 32, https://arxiv.org/pdf/2512.05069.pdf GitHub

Abstract:
Unsupervised anomaly-based intrusion detection requires models that can generalize to attack patterns not observed during training. This work presents the first large-scale evaluation of hybrid quantum-classical (HQC) autoencoders for this task. We construct a unified experimental framework that iterates over key quantum design choices, including quantum-layer placement, measurement approach, variational and non-variational formulations, and latent-space regularization. Experiments across three benchmark NIDS datasets show that HQC autoencoders can match or exceed classical performance in their best configurations, although they exhibit higher sensitivity to architectural decisions. Under zero-day evaluation, well-configured HQC models provide stronger and more stable generalization than classical and supervised baselines. Simulated gate-noise experiments reveal early performance degradation, indicating the need for noise-aware HQC designs. These results provide the first data-driven characterization of HQC autoencoder behavior for network intrusion detection and outline key factors that govern their practical viability. All experiment code and configurations are available at https://github.com/arasyi/hqcae-network-intrusion-detection.

Paperid: 33, https://arxiv.org/pdf/2512.04841.pdf GitHub

Abstract:
Large Language Models (LLMs) exhibit remarkable capabilities but remain vulnerable to adversarial manipulations such as jailbreaking, where crafted prompts bypass safety mechanisms. Understanding the causal factors behind such vulnerabilities is essential for building reliable defenses. In this work, we introduce a unified causality analysis framework that systematically supports all levels of causal investigation in LLMs, ranging from token-level, neuron-level, and layer-level interventions to representation-level analysis. The framework enables consistent experimentation and comparison across diverse causality-based attack and defense methods. Accompanying this implementation, we provide the first comprehensive survey of causality-driven jailbreak studies and empirically evaluate the framework on multiple open-weight models and safety-critical benchmarks including jailbreaks, hallucination detection, backdoor identification, and fairness evaluation. Our results reveal that: (1) targeted interventions on causally critical components can reliably modify safety behavior; (2) safety-related mechanisms are highly localized (i.e., concentrated in early-to-middle layers with only 1--2\% of neurons exhibiting causal influence); and (3) causal features extracted from our framework achieve over 95\% detection accuracy across multiple threat types. By bridging theoretical causality analysis and practical model safety, our framework establishes a reproducible foundation for research on causality-based attacks, interpretability, and robust attack detection and mitigation in LLMs. Code is available at https://github.com/Amadeuszhao/SOK_Casuality.

Paperid: 34, https://arxiv.org/pdf/2512.04120.pdf GitHub

Abstract:
The emergence of open data portals necessitates more attention to protecting sensitive data before datasets get published and exchanged. While an abundance of methods for suppressing sensitive data exist, the conceptualization of sensitive data and methods to detect it, focus particularly on personal data that, if disclosed, may be harmful or violate privacy. We observe the need for refining and broadening our definitions of sensitive data, and argue that the sensitivity of data depends on its context. Based on this definition, we introduce two mechanisms for contextual sensitive data detection that consider the broader context of a dataset at hand. First, we introduce type contextualization, which first detects the semantic type of particular data values, then considers the overall context of the data values within the dataset or document. Second, we introduce domain contextualization which determines sensitivity of a given dataset in the broader context based on the retrieval of relevant rules from documents that specify data sensitivity (e.g., data topic and geographic origin). Experiments with these mechanisms, assisted by large language models (LLMs), confirm that: 1) type-contextualization significantly reduces the number of false positives for type-based sensitive data detection and reaches a recall of 94% compared to 63% with commercial tools, and 2) domain-contextualization leveraging sensitivity rule retrieval is effective for context-grounded sensitive data detection in non-standard data domains such as humanitarian datasets. Evaluation with humanitarian data experts also reveals that context-grounded LLM explanations provide useful guidance in manual data auditing processes, improving consistency. We open-source mechanisms and annotated datasets for contextual sensitive data detection at https://github.com/trl-lab/sensitive-data-detection.

Paperid: 35, https://arxiv.org/pdf/2512.03720.pdf GitHub

Abstract:
Large Language Models (LLMs) have emerged as powerful tools for diverse applications. However, their uniform token processing paradigm introduces critical vulnerabilities in instruction handling, particularly when exposed to adversarial scenarios. In this work, we identify and propose a novel class of vulnerabilities, termed Tool-Completion Attack (TCA), which exploits function-calling mechanisms to subvert model behavior. To evaluate LLM robustness against such threats, we introduce the Tool-Completion benchmark, a comprehensive security assessment framework, which reveals that even state-of-the-art models remain susceptible to TCA, with surprisingly high attack success rates. To address these vulnerabilities, we introduce Context-Aware Hierarchical Learning (CAHL), a sophisticated mechanism that dynamically balances semantic comprehension with role-specific instruction constraints. CAHL leverages the contextual correlations between different instruction segments to establish a robust, context-aware instruction hierarchy. Extensive experiments demonstrate that CAHL significantly enhances LLM robustness against both conventional attacks and the proposed TCA, exhibiting strong generalization capabilities in zero-shot evaluations while still preserving model performance on generic tasks. Our code is available at https://github.com/S2AILab/CAHL.

Paperid: 36, https://arxiv.org/pdf/2512.03620.pdf GitHub

Abstract:
The protection of Intellectual Property (IP) in Large Language Models (LLMs) represents a critical challenge in contemporary AI research. While fingerprinting techniques have emerged as a fundamental mechanism for detecting unauthorized model usage, existing methods -- whether behavior-based or structural -- suffer from vulnerabilities such as false claim attacks or susceptible to weight manipulations. To overcome these limitations, we propose SELF, a novel intrinsic weight-based fingerprinting scheme that eliminates dependency on input and inherently resists false claims. SELF achieves robust IP protection through two key innovations: 1) unique, scalable and transformation-invariant fingerprint extraction via singular value and eigenvalue decomposition of LLM attention weights, and 2) effective neural network-based fingerprint similarity comparison based on few-shot learning and data augmentation. Experimental results demonstrate SELF maintains high IP infringement detection accuracy while showing strong robustness against various downstream modifications, including quantization, pruning, and fine-tuning attacks. Our code is available at https://github.com/HanxiuZhang/SELF_v2.

Paperid: 37, https://arxiv.org/pdf/2512.02062.pdf GitHub

Abstract:
Deep learning models are used in safety-critical tasks such as automated driving and face recognition. However, small perturbations in the model input can significantly change the predictions. Adversarial attacks are used to identify small perturbations that can lead to misclassifications. More powerful black-box adversarial attacks are required to develop more effective defenses. A promising approach to black-box adversarial attacks is to repeat the process of extracting a specific image area and changing the perturbations added to it. Existing attacks adopt simple rectangles as the areas where perturbations are changed in a single iteration. We propose applying superpixels instead, which achieve a good balance between color variance and compactness. We also propose a new search method, versatile search, and a novel attack method, Superpixel Attack, which applies superpixels and performs versatile search. Superpixel Attack improves attack success rates by an average of 2.10% compared with existing attacks. Most models used in this study are robust against adversarial attacks, and this improvement is significant for black-box adversarial attacks. The code is avilable at https://github.com/oe1307/SuperpixelAttack.git.

Paperid: 38, https://arxiv.org/pdf/2512.01353.pdf GitHub

Abstract:
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Existing approaches overwhelmingly operate within the prompt-optimization paradigm: whether through traditional algorithmic search or recent agent-based workflows, the resulting prompts typically retain malicious semantic signals that modern guardrails are primed to detect. In contrast, we identify a deeper, largely overlooked vulnerability stemming from the highly interconnected nature of an LLM's internal knowledge. This structure allows harmful objectives to be realized by weaving together sequences of benign sub-queries, each of which individually evades detection. To exploit this loophole, we introduce the Correlated Knowledge Attack Agent (CKA-Agent), a dynamic framework that reframes jailbreaking as an adaptive, tree-structured exploration of the target model's knowledge base. The CKA-Agent issues locally innocuous queries, uses model responses to guide exploration across multiple paths, and ultimately assembles the aggregated information to achieve the original harmful objective. Evaluated across state-of-the-art commercial LLMs (Gemini2.5-Flash/Pro, GPT-oss-120B, Claude-Haiku-4.5), CKA-Agent consistently achieves over 95% success rates even against strong guardrails, underscoring the severity of this vulnerability and the urgent need for defenses against such knowledge-decomposition attacks. Our codes are available at https://github.com/Graph-COM/CKA-Agent.

Paperid: 39, https://arxiv.org/pdf/2512.00218.pdf GitHub

Abstract:
AI systems that output their reasoning in natural language offer an opportunity for safety -- we can \emph{monitor} their chain of thought (CoT) for undesirable reasoning, such as the pursuit of harmful objectives. However, the extent to which CoT faithfully reflects the underlying reasoning process, and hence the extent to which it can be usefully monitored, may be influenced by certain aspects of training. We investigate how different \emph{training incentives}, applied to a reasoning model, affect its monitorability. We introduce a novel methodology for measuring monitorability according to whether a monitor can predict a key latent variable using the model's reasoning. When controlling for accuracy, we do not find evidence for consistent effects from commonly used incentives (length penalties and KL regularisation), but we find that adversarial optimisation (penalising monitor accuracy) degrades monitor performance, while direct optimisation for monitorability does not reliably lead to improvements. Our code is available at https://github.com/QiyaoWei/reasoning-under-pressure.

Paperid: 40, https://arxiv.org/pdf/2511.21216.pdf GitHub

Abstract:
Low-Rank Adaptation (LoRA) offers an efficient paradigm for customizing diffusion models, but its ease of redistribution raises concerns over unauthorized use and the generation of untraceable content. Existing watermarking techniques either target base models or verify LoRA modules themselves, yet they fail to propagate watermarks to generated images, leaving a critical gap in traceability. Moreover, traceability watermarking designed for base models is not tightly coupled with stylization and often introduces visual degradation or high false-positive detection rates. To address these limitations, we propose AuthenLoRA, a unified watermarking framework that embeds imperceptible, traceable watermarks directly into the LoRA training process while preserving stylization quality. AuthenLoRA employs a dual-objective optimization strategy that jointly learns the target style distribution and the watermark-induced distribution shift, ensuring that any image generated with the watermarked LoRA reliably carries the watermark. We further design an expanded LoRA architecture for enhanced multi-scale adaptation and introduce a zero-message regularization mechanism that substantially reduces false positives during watermark verification. Extensive experiments demonstrate that AuthenLoRA achieves high-fidelity stylization, robust watermark propagation, and significantly lower false-positive rates compared with existing approaches. Open-source implementation is available at: https://github.com/ShiFangming0823/AuthenLoRA

Paperid: 41, https://arxiv.org/pdf/2511.20252.pdf GitHub

Abstract:
WhatsApp, with 3.5 billion active accounts as of early 2025, is the world's largest instant messaging platform. Given its massive user base, WhatsApp plays a critical role in global communication. To initiate conversations, users must first discover whether their contacts are registered on the platform. This is achieved by querying WhatsApp's servers with mobile phone numbers extracted from the user's address book (if they allowed access). This architecture inherently enables phone number enumeration, as the service must allow legitimate users to query contact availability. While rate limiting is a standard defense against abuse, we revisit the problem and show that WhatsApp remains highly vulnerable to enumeration at scale. In our study, we were able to probe over a hundred million phone numbers per hour without encountering blocking or effective rate limiting. Our findings demonstrate not only the persistence but the severity of this vulnerability. We further show that nearly half of the phone numbers disclosed in the 2021 Facebook data leak are still active on WhatsApp, underlining the enduring risks associated with such exposures. Moreover, we were able to perform a census of WhatsApp users, providing a glimpse on the macroscopic insights a large messaging service is able to generate even though the messages themselves are end-to-end encrypted. Using the gathered data, we also discovered the re-use of certain X25519 keys across different devices and phone numbers, indicating either insecure (custom) implementations, or fraudulent activity. In this updated version of the paper, we also provide insights into the collaborative remediation process through which we confirmed that the underlying rate-limiting issue had been resolved.

Paperid: 42, https://arxiv.org/pdf/2511.19171.pdf GitHub

Abstract:
Research on the safety evaluation of large language models (LLMs) has become extensive, driven by jailbreak studies that elicit unsafe responses. Such response involves information already available to humans, such as the answer to "how to make a bomb". When LLMs are jailbroken, the practical threat they pose to humans is negligible. However, it remains unclear whether LLMs commonly produce unpredictable outputs that could pose substantive threats to human safety. To address this gap, we study whether LLM-generated content contains potential existential threats, defined as outputs that imply or promote direct harm to human survival. We propose \textsc{ExistBench}, a benchmark designed to evaluate such risks. Each sample in \textsc{ExistBench} is derived from scenarios where humans are positioned as adversaries to AI assistants. Unlike existing evaluations, we use prefix completion to bypass model safeguards. This leads the LLMs to generate suffixes that express hostility toward humans or actions with severe threat, such as the execution of a nuclear strike. Our experiments on 10 LLMs reveal that LLM-generated content indicates existential threats. To investigate the underlying causes, we also analyze the attention logits from LLMs. To highlight real-world safety risks, we further develop a framework to assess model behavior in tool-calling. We find that LLMs actively select and invoke external tools with existential threats. Code and data are available at: https://github.com/cuiyu-ai/ExistBench.

Paperid: 43, https://arxiv.org/pdf/2511.18933.pdf GitHub

Abstract:
Large Language Models (LLMs) remain susceptible to jailbreak exploits that bypass safety filters and induce harmful or unethical behavior. This work presents a systematic taxonomy of existing jailbreak defenses across prompt-level, model-level, and training-time interventions, followed by three proposed defense strategies. First, a Prompt-Level Defense Framework detects and neutralizes adversarial inputs through sanitization, paraphrasing, and adaptive system guarding. Second, a Logit-Based Steering Defense reinforces refusal behavior through inference-time vector steering in safety-sensitive layers. Third, a Domain-Specific Agent Defense employs the MetaGPT framework to enforce structured, role-based collaboration and domain adherence. Experiments on benchmark datasets show substantial reductions in attack success rate, achieving full mitigation under the agent-based defense. Overall, this study highlights how jailbreaks pose a significant security threat to LLMs and identifies key intervention points for prevention, while noting that defense strategies often involve trade-offs between safety, performance, and scalability. Code is available at: https://github.com/Kuro0911/CS5446-Project

Paperid: 44, https://arxiv.org/pdf/2511.17573.pdf GitHub

Abstract:
Sequence models for binary analysis are bottlenecked by byte-level tokenization: raw bytes waste precious context window capacity for transformers and other neural network architectures, and many existing text-oriented tokenizers fail on arbitrary 0x00--0xFF sequences. To address this issue, we introduce the Binary BPE tokenizer family, a set of cross-platform Byte Pair Encoding (BPE) tokenizers for executables trained on a large corpus of binaries spanning multiple platforms, architectures, and operating systems, including Linux, Windows, macOS, Android, and malware sources. We release trained tokenizers with vocabularies of 4K, 8K, 16K, 32K, and 64K tokens, enabling both systematic scaling studies and practical deployment from resource-constrained edge devices to high-throughput datacenters. These tokenizers discover interpretable patterns (ELF/PE headers, instruction sequences, cross-platform strings) while yielding multi-byte compression per token. On representative uncompressed executables (e.g., ELF/PE/Mach-O rather than compressed APKs), the Binary BPE tokenizers typically allow for roughly 2-3x more binary content per fixed-length transformer context window than raw bytes, enabling more efficient research and practical deployment for content identification, malware detection, reverse engineering, and optimization. We release the trained Binary BPE tokenizers on HuggingFace, providing a drop-in, open-source foundation for binary-focused language models and context-efficient agentic tools.

Paperid: 45, https://arxiv.org/pdf/2511.16792.pdf GitHub

Abstract:
Membership inference attacks (MIAs) against machine learning (ML) models aim to determine whether a given data point was part of the model training data. These attacks may pose significant privacy risks to individuals whose sensitive data were used for training, which motivates the use of defenses such as differential privacy, often at the cost of high accuracy losses. MIAs exploit the differences in the behavior of a model when making predictions on samples it has seen during training (members) versus those it has not seen (non-members). Several studies have pointed out that model overfitting is the major factor contributing to these differences in behavior and, consequently, to the success of MIAs. However, the literature also shows that even non-overfitted ML models can leak information about a small subset of their training data. In this paper, we investigate the root causes of membership inference vulnerabilities beyond traditional overfitting concerns and suggest targeted defenses. We empirically analyze the characteristics of the training data samples vulnerable to MIAs in models that are not overfitted (and hence able to generalize). Our findings reveal that these samples are often outliers within their classes (e.g., noisy or hard to classify). We then propose potential defensive strategies to protect these vulnerable samples and enhance the privacy-preserving capabilities of ML models. Our code is available at https://github.com/najeebjebreel/mia_analysis.

Paperid: 46, https://arxiv.org/pdf/2511.16709.pdf GitHub

Abstract:
Backdoor attacks pose a serious threat to the secure deployment of large language models (LLMs), enabling adversaries to implant hidden behaviors triggered by specific inputs. However, existing methods often rely on manually crafted triggers and static data pipelines, which are rigid, labor-intensive, and inadequate for systematically evaluating modern defense robustness. As AI agents become increasingly capable, there is a growing need for more rigorous, diverse, and scalable \textit{red-teaming frameworks} that can realistically simulate backdoor threats and assess model resilience under adversarial conditions. In this work, we introduce \textsc{AutoBackdoor}, a general framework for automating backdoor injection, encompassing trigger generation, poisoned data construction, and model fine-tuning via an autonomous agent-driven pipeline. Unlike prior approaches, AutoBackdoor uses a powerful language model agent to generate semantically coherent, context-aware trigger phrases, enabling scalable poisoning across arbitrary topics with minimal human effort. We evaluate AutoBackdoor under three realistic threat scenarios, including \textit{Bias Recommendation}, \textit{Hallucination Injection}, and \textit{Peer Review Manipulation}, to simulate a broad range of attacks. Experiments on both open-source and commercial models, including LLaMA-3, Mistral, Qwen, and GPT-4o, demonstrate that our method achieves over 90\% attack success with only a small number of poisoned samples. More importantly, we find that existing defenses often fail to mitigate these attacks, underscoring the need for more rigorous and adaptive evaluation techniques against agent-driven threats as explored in this work. All code, datasets, and experimental configurations will be merged into our primary repository at https://github.com/bboylyg/BackdoorLLM.

Paperid: 47, https://arxiv.org/pdf/2511.16229.pdf GitHub

Abstract:
Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities in cross-modal understanding, but remain vulnerable to adversarial attacks through visual inputs despite robust textual safety mechanisms. These vulnerabilities arise from two core weaknesses: the continuous nature of visual representations, which allows for gradient-based attacks, and the inadequate transfer of text-based safety mechanisms to visual content. We introduce Q-MLLM, a novel architecture that integrates two-level vector quantization to create a discrete bottleneck against adversarial attacks while preserving multimodal reasoning capabilities. By discretizing visual representations at both pixel-patch and semantic levels, Q-MLLM blocks attack pathways and bridges the cross-modal safety alignment gap. Our two-stage training methodology ensures robust learning while maintaining model utility. Experiments demonstrate that Q-MLLM achieves significantly better defense success rate against both jailbreak attacks and toxic image attacks than existing approaches. Notably, Q-MLLM achieves perfect defense success rate (100\%) against jailbreak attacks except in one arguable case, while maintaining competitive performance on multiple utility benchmarks with minimal inference overhead. This work establishes vector quantization as an effective defense mechanism for secure multimodal AI systems without requiring expensive safety-specific fine-tuning or detection overhead. Code is available at https://github.com/Amadeuszhao/QMLLM.

Paperid: 48, https://arxiv.org/pdf/2511.16209.pdf GitHub

Abstract:
System prompts are critical for guiding the behavior of Large Language Models (LLMs), yet they often contain proprietary logic or sensitive information, making them a prime target for extraction attacks. Adversarial queries can successfully elicit these hidden instructions, posing significant security and privacy risks. Existing defense mechanisms frequently rely on heuristics, incur substantial computational overhead, or are inapplicable to models accessed via black-box APIs. This paper introduces a novel framework for hardening system prompts through shield appending, a lightweight approach that adds a protective textual layer to the original prompt. Our core contribution is the formalization of prompt hardening as a utility-constrained optimization problem. We leverage an LLM-as-optimizer to search the space of possible SHIELDs, seeking to minimize a leakage metric derived from a suite of adversarial attacks, while simultaneously preserving task utility above a specified threshold, measured by semantic fidelity to baseline outputs. This black-box, optimization-driven methodology is lightweight and practical, requiring only API access to the target and optimizer LLMs. We demonstrate empirically that our optimized SHIELDs significantly reduce prompt leakage against a comprehensive set of extraction attacks, outperforming established baseline defenses without compromising the model's intended functionality. Our work presents a paradigm for developing robust, utility-aware defenses in the escalating landscape of LLM security. The code is made public on the following link: https://github.com/psm-defense/psm

Paperid: 49, https://arxiv.org/pdf/2511.16110.pdf GitHub

Abstract:
The growing misuse of Vision-Language Models (VLMs) has led providers to deploy multiple safeguards, including alignment tuning, system prompts, and content moderation. However, the real-world robustness of these defenses against adversarial attacks remains underexplored. We introduce Multi-Faceted Attack (MFA), a framework that systematically exposes general safety vulnerabilities in leading defense-equipped VLMs such as GPT-4o, Gemini-Pro, and Llama-4. The core component of MFA is the Attention-Transfer Attack (ATA), which hides harmful instructions inside a meta task with competing objectives. We provide a theoretical perspective based on reward hacking to explain why this attack succeeds. To improve cross-model transferability, we further introduce a lightweight transfer-enhancement algorithm combined with a simple repetition strategy that jointly bypasses both input-level and output-level filters without model-specific fine-tuning. Empirically, we show that adversarial images optimized for one vision encoder transfer broadly to unseen VLMs, indicating that shared visual representations create a cross-model safety vulnerability. Overall, MFA achieves a 58.5% success rate and consistently outperforms existing methods. On state-of-the-art commercial models, MFA reaches a 52.8% success rate, surpassing the second-best attack by 34%. These results challenge the perceived robustness of current defense mechanisms and highlight persistent safety weaknesses in modern VLMs. Code: https://github.com/cure-lab/MultiFacetedAttack

Paperid: 50, https://arxiv.org/pdf/2511.15097.pdf GitHub

Abstract:
The AI trustworthiness crisis threatens to derail the artificial intelligence revolution, with regulatory barriers, security vulnerabilities, and accountability gaps preventing deployment in critical domains. Current AI systems operate on opaque data structures that lack the audit trails, provenance tracking, or explainability required by emerging regulations like the EU AI Act. We propose an artifact-centric AI agent paradigm where behavior is driven by persistent, verifiable data artifacts rather than ephemeral tasks, solving the trustworthiness problem at the data architecture level. Central to this approach is the Multimodal Artifact File Format (MAIF), an AI-native container embedding semantic representations, cryptographic provenance, and granular access controls. MAIF transforms data from passive storage into active trust enforcement, making every AI operation inherently auditable. Our production-ready implementation demonstrates ultra-high-speed streaming (2,720.7 MB/s), optimized video processing (1,342 MB/s), and enterprise-grade security. Novel algorithms for cross-modal attention, semantic compression, and cryptographic binding achieve up to 225 compression while maintaining semantic fidelity. Advanced security features include stream-level access control, real-time tamper detection, and behavioral anomaly analysis with minimal overhead. This approach directly addresses the regulatory, security, and accountability challenges preventing AI deployment in sensitive domains, offering a viable path toward trustworthy AI systems at scale.

Paperid: 51, https://arxiv.org/pdf/2511.14715.pdf GitHub

Abstract:
Federated learning (FL) enables collaborative model training while preserving data privacy. However, it remains vulnerable to malicious clients who compromise model integrity through Byzantine attacks, data poisoning, or adaptive adversarial behaviors. Existing defense mechanisms rely on static thresholds and binary classification, failing to adapt to evolving client behaviors in real-world deployments. We propose FLARE, an adaptive reputation-based framework that transforms client reliability assessment from binary decisions to a continuous, multi-dimensional trust evaluation. FLARE integrates: (i) a multi-dimensional reputation score capturing performance consistency, statistical anomaly indicators, and temporal behavior, (ii) a self-calibrating adaptive threshold mechanism that adjusts security strictness based on model convergence and recent attack intensity, (iii) reputation-weighted aggregation with soft exclusion to proportionally limit suspicious contributions rather than eliminating clients outright, and (iv) a Local Differential Privacy (LDP) mechanism enabling reputation scoring on privatized client updates. We further introduce a highly evasive Statistical Mimicry (SM) attack, a benchmark adversary that blends honest gradients with synthetic perturbations and persistent drift to remain undetected by traditional filters. Extensive experiments with 100 clients on MNIST, CIFAR-10, and SVHN demonstrate that FLARE maintains high model accuracy and converges faster than state-of-the-art Byzantine-robust methods under diverse attack types, including label flipping, gradient scaling, adaptive attacks, ALIE, and SM. FLARE improves robustness by up to 16% and preserves model convergence within 30% of the non-attacked baseline, while achieving strong malicious-client detection performance with minimal computational overhead. https://github.com/Anonymous0-0paper/FLARE

Paperid: 52, https://arxiv.org/pdf/2511.14003.pdf GitHub

Abstract:
Certified defenses promise provable robustness guarantees. We study the malicious exploitation of probabilistic certification frameworks to better understand the limits of guarantee provisions. Now, the objective is to not only mislead a classifier, but also manipulate the certification process to generate a robustness guarantee for an adversarial input certificate spoofing. A recent study in ICLR demonstrated that crafting large perturbations can shift inputs far into regions capable of generating a certificate for an incorrect class. Our study investigates if perturbations needed to cause a misclassification and yet coax a certified model into issuing a deceptive, large robustness radius for a target class can still be made small and imperceptible. We explore the idea of region-focused adversarial examples to craft imperceptible perturbations, spoof certificates and achieve certification radii larger than the source class ghost certificates. Extensive evaluations with the ImageNet demonstrate the ability to effectively bypass state-of-the-art certified defenses such as Densepure. Our work underscores the need to better understand the limits of robustness certification methods.

Paperid: 53, https://arxiv.org/pdf/2511.13893.pdf GitHub GitHub

Abstract:
In differentially private (DP) tabular data synthesis, the consensus is that statistical models are better than neural network (NN)-based methods. However, we argue that this conclusion is incomplete and overlooks the challenge of densely correlated datasets, where intricate dependencies can overwhelm statistical models. In such complex scenarios, neural networks are more suitable due to their capacity to fit complex distributions by learning directly from samples. Despite this potential, existing NN-based algorithms still suffer from significant limitations. We therefore propose MargNet, incorporating successful algorithmic designs of statistical models into neural networks. MargNet applies an adaptive marginal selection strategy and trains the neural networks to generate data that conforms to the selected marginals. On sparsely correlated datasets, our approach achieves utility close to the best statistical method while offering an average 7$\times$ speedup over it. More importantly, on densely correlated datasets, MargNet establishes a new state-of-the-art, reducing fidelity error by up to 26\% compared to the previous best. We release our code on GitHub.\footnote{https://github.com/KaiChen9909/margnet}

Paperid: 54, https://arxiv.org/pdf/2511.13365.pdf GitHub

Abstract:
Split inference (SI) enables users to access deep learning (DL) services without directly transmitting raw data. However, recent studies reveal that data reconstruction attacks (DRAs) can recover the original inputs from the smashed data sent from the client to the server, leading to significant privacy leakage. While various defenses have been proposed, they often result in substantial utility degradation, particularly when the client-side model is shallow. We identify a key cause of this trade-off: existing defenses apply excessive perturbation to redundant information in the smashed data. To address this issue in computer vision tasks, we propose InfoDecom, a defense framework that first decomposes and removes redundant information and then injects noise calibrated to provide theoretically guaranteed privacy. Experiments demonstrate that InfoDecom achieves a superior utility-privacy trade-off compared to existing baselines. The code and the appendix are available at https://github.com/SASA-cloud/InfoDecom.

Paperid: 55, https://arxiv.org/pdf/2511.13356.pdf GitHub

Abstract:
Backdoor attacks pose severe threats to machine learning systems, prompting extensive research in this area. However, most existing work focuses on single-target All-to-One (A2O) attacks, overlooking the more complex All-to-X (A2X) attacks with multiple target classes, which are often assumed to have low attack success rates. In this paper, we first demonstrate that A2X attacks are robust against state-of-the-art defenses. We then propose a novel attack strategy that enhances the success rate of A2X attacks while maintaining robustness by optimizing grouping and target class assignment mechanisms. Our method improves the attack success rate by up to 28%, with average improvements of 6.7%, 16.4%, 14.1% on CIFAR10, CIFAR100, and Tiny-ImageNet, respectively. We anticipate that this study will raise awareness of A2X attacks and stimulate further research in this under-explored area. Our code is available at https://github.com/kazefjj/A2X-backdoor .

Paperid: 56, https://arxiv.org/pdf/2511.12710.pdf GitHub

Abstract:
Automated red teaming frameworks for Large Language Models (LLMs) have become increasingly sophisticated, yet they share a fundamental limitation: their jailbreak logic is confined to selecting, combining, or refining pre-existing attack strategies. This binds their creativity and leaves them unable to autonomously invent entirely new attack mechanisms. To overcome this gap, we introduce \textbf{EvoSynth}, an autonomous framework that shifts the paradigm from attack planning to the evolutionary synthesis of jailbreak methods. Instead of refining prompts, EvoSynth employs a multi-agent system to autonomously engineer, evolve, and execute novel, code-based attack algorithms. Crucially, it features a code-level self-correction loop, allowing it to iteratively rewrite its own attack logic in response to failure. Through extensive experiments, we demonstrate that EvoSynth not only establishes a new state-of-the-art by achieving an 85.5\% Attack Success Rate (ASR) against highly robust models like Claude-Sonnet-4.5, but also generates attacks that are significantly more diverse than those from existing methods. We release our framework to facilitate future research in this new direction of evolutionary synthesis of jailbreak methods. Code is available at: https://github.com/dongdongunique/EvoSynth.

Paperid: 57, https://arxiv.org/pdf/2511.11366.pdf GitHub

Abstract:
Modern electrical power grids represent complex cyber-physical systems requiring specialized cybersecurity frameworks beyond traditional IT security models. Existing threat intelligence standards such as STIX 2.1 and MITRE ATT\&CK lack coverage for grid-specific assets, operational technology relationships, and cyber-physical interdependencies essential for power system security. We present Grid-STIX, a domain-specific extension of STIX 2.1 for electrical grid cybersecurity applications. Grid-STIX employs a modular architecture encompassing physical assets, operational technology components, cyber-physical relationships, and security policies that capture modern power systems including distributed energy resources, advanced metering infrastructure, and nuclear energy facilities. The framework provides threat modeling capabilities through systematic representation of attack patterns, supply chain risks, and cross-domain impact analysis while maintaining STIX 2.1 compliance. Grid-STIX includes modules for nuclear safeguards and non-proliferation verification, enabling cybersecurity modeling across conventional and nuclear energy sectors. The ontology supports Zero Trust enforcement through policy decision points and operational context integration. Our implementation includes validation pipelines, Python code generation, and visualizations. Use cases demonstrate applications including cross-utility threat intelligence sharing, supply chain risk assessment, and nuclear facility cybersecurity. Grid-STIX is available as an open-source framework to advance collaborative cybersecurity research across the electrical power sector.

Paperid: 58, https://arxiv.org/pdf/2511.10720.pdf GitHub GitHub

Abstract:
Long context LLMs are vulnerable to prompt injection, where an attacker can inject an instruction in a long context to induce an LLM to generate an attacker-desired output. Existing prompt injection defenses are designed for short contexts. When extended to long-context scenarios, they have limited effectiveness. The reason is that an injected instruction constitutes only a very small portion of a long context, making the defense very challenging. In this work, we propose PISanitizer, which first pinpoints and sanitizes potential injected tokens (if any) in a context before letting a backend LLM generate a response, thereby eliminating the influence of the injected instruction. To sanitize injected tokens, PISanitizer builds on two observations: (1) prompt injection attacks essentially craft an instruction that compels an LLM to follow it, and (2) LLMs intrinsically leverage the attention mechanism to focus on crucial input tokens for output generation. Guided by these two observations, we first intentionally let an LLM follow arbitrary instructions in a context and then sanitize tokens receiving high attention that drive the instruction-following behavior of the LLM. By design, PISanitizer presents a dilemma for an attacker: the more effectively an injected instruction compels an LLM to follow it, the more likely it is to be sanitized by PISanitizer. Our extensive evaluation shows that PISanitizer can successfully prevent prompt injection, maintain utility, outperform existing defenses, is efficient, and is robust to optimization-based and strong adaptive attacks. The code is available at https://github.com/sleeepeer/PISanitizer.

Paperid: 59, https://arxiv.org/pdf/2511.08702.pdf GitHub

Abstract:
As machine learning systems move from theory to practice, they are increasingly tasked with decisions that affect healthcare access, financial opportunities, hiring, and public services. In these contexts, accuracy is only one piece of the puzzle - models must also be fair to different groups, protect individual privacy, and remain accountable to stakeholders. Achieving all three is difficult: differential privacy can unintentionally worsen disparities, fairness interventions often rely on sensitive data that privacy restricts, and automated pipelines ignore that fairness is ultimately a human and contextual judgment. We introduce FAIRPLAI (Fair and Private Learning with Active Human Influence), a practical framework that integrates human oversight into the design and deployment of machine learning systems. FAIRPLAI works in three ways: (1) it constructs privacy-fairness frontiers that make trade-offs between accuracy, privacy guarantees, and group outcomes transparent; (2) it enables interactive stakeholder input, allowing decision-makers to select fairness criteria and operating points that reflect their domain needs; and (3) it embeds a differentially private auditing loop, giving humans the ability to review explanations and edge cases without compromising individual data security. Applied to benchmark datasets, FAIRPLAI consistently preserves strong privacy protections while reducing fairness disparities relative to automated baselines. More importantly, it provides a straightforward, interpretable process for practitioners to manage competing demands of accuracy, privacy, and fairness in socially impactful applications. By embedding human judgment where it matters most, FAIRPLAI offers a pathway to machine learning systems that are effective, responsible, and trustworthy in practice. GitHub: https://github.com/Li1Davey/Fairplai

Paperid: 60, https://arxiv.org/pdf/2511.08491.pdf GitHub

Abstract:
With increasingly sophisticated cybersecurity threats and rising demand for network automation, autonomous cybersecurity mechanisms are becoming critical for securing modern networks. The rapid expansion of Internet of Things (IoT) systems amplifies these challenges, as resource-constrained IoT devices demand scalable and efficient security solutions. In this work, an innovative Intrusion Detection System (IDS) utilizing Automated Machine Learning (AutoML) and Multi-Objective Optimization (MOO) is proposed for autonomous and optimized cyber-attack detection in modern networking environments. The proposed IDS framework integrates two primary innovative techniques: Optimized Importance and Percentage-based Automated Feature Selection (OIP-AutoFS) and Optimized Performance, Confidence, and Efficiency-based Combined Algorithm Selection and Hyperparameter Optimization (OPCE-CASH). These components optimize feature selection and model learning processes to strike a balance between intrusion detection effectiveness and computational efficiency. This work presents the first IDS framework that integrates all four AutoML stages and employs multi-objective optimization to jointly optimize detection effectiveness, efficiency, and confidence for deployment in resource-constrained systems. Experimental evaluations over two benchmark cybersecurity datasets demonstrate that the proposed MOO-AutoML IDS outperforms state-of-the-art IDSs, establishing a new benchmark for autonomous, efficient, and optimized security for networks. Designed to support IoT and edge environments with resource constraints, the proposed framework is applicable to a variety of autonomous cybersecurity applications across diverse networked environments.

Paperid: 61, https://arxiv.org/pdf/2511.07505.pdf GitHub

Abstract:
Data duplication within large-scale corpora often impedes large language models' (LLMs) performance and privacy. In privacy-concerned federated learning scenarios, conventional deduplication methods typically rely on trusted third parties to perform uniform deletion, risking loss of informative samples while introducing privacy vulnerabilities. To address these gaps, we propose Federated ReWeighting (FedRW), the first privacy-preserving framework, to the best of our knowledge, that performs soft deduplication via sample reweighting instead of deletion in federated LLM training, without assuming a trusted third party. At its core, FedRW proposes a secure, frequency-aware reweighting protocol through secure multi-party computation, coupled with a parallel orchestration strategy to ensure efficiency and scalability. During training, FedRW utilizes an adaptive reweighting mechanism with global sample frequencies to adjust individual loss contributions, effectively improving generalization and robustness. Empirical results demonstrate that FedRW outperforms the state-of-the-art method by achieving up to 28.78x speedup in preprocessing and approximately 11.42% improvement in perplexity, while offering enhanced security guarantees. FedRW thus establishes a new paradigm for managing duplication in federated LLM training.

Paperid: 62, https://arxiv.org/pdf/2511.07099.pdf GitHub

Abstract:
Recent advancements in speech synthesis technology have enriched our daily lives, with high-quality and human-like audio widely adopted across real-world applications. However, malicious exploitation like voice-cloning fraud poses severe security risks. Existing defense techniques struggle to address the production large language model (LLM)-based speech synthesis. While previous studies have considered the protection for fine-tuning synthesizers, they assume manually annotated transcripts. Given the labor intensity of manual annotation, end-to-end (E2E) systems leveraging automatic speech recognition (ASR) to generate transcripts are becoming increasingly prevalent, e.g., voice cloning via commercial APIs. Therefore, this E2E speech synthesis also requires new security mechanisms. To tackle these challenges, we propose E2E-VGuard, a proactive defense framework for two emerging threats: (1) production LLM-based speech synthesis, and (2) the novel attack arising from ASR-driven E2E scenarios. Specifically, we employ the encoder ensemble with a feature extractor to protect timbre, while ASR-targeted adversarial examples disrupt pronunciation. Moreover, we incorporate the psychoacoustic model to ensure perturbative imperceptibility. For a comprehensive evaluation, we test 16 open-source synthesizers and 3 commercial APIs across Chinese and English datasets, confirming E2E-VGuard's effectiveness in timbre and pronunciation protection. Real-world deployment validation is also conducted. Our code and demo page are available at https://wxzyd123.github.io/e2e-vguard/.

Paperid: 63, https://arxiv.org/pdf/2511.06942.pdf GitHub

Abstract:
To prevent misinformation and social issues arising from trustworthy-looking content generated by LLMs, it is crucial to develop efficient and reliable methods for identifying the source of texts. Previous approaches have demonstrated exceptional performance in detecting texts fully generated by LLMs. However, these methods struggle when confronting more advanced LLM output or text with adversarial multi-task machine revision, especially in the black-box setting, where the generating model is unknown. To address this challenge, grounded in the hypothesis that human writing possesses distinctive stylistic patterns, we propose Human Language Preference Detection (HLPD). HLPD employs a reward-based alignment process, Human Language Preference Optimization (HLPO), to shift the scoring model's token distribution toward human-like writing, making the model more sensitive to human writing, therefore enhancing the identification of machine-revised text. We test HLPD in an adversarial multi-task evaluation framework that leverages a five-dimensional prompt generator and multiple advanced LLMs to create diverse revision scenarios. When detecting texts revised by GPT-series models, HLPD achieves a 15.11% relative improvement in AUROC over ImBD, surpassing Fast-DetectGPT by 45.56%. When evaluated on texts generated by advanced LLMs, HLPD achieves the highest average AUROC, exceeding ImBD by 5.53% and Fast-DetectGPT by 34.14%. Code will be made available at https://github.com/dfq2021/HLPD.

Paperid: 64, https://arxiv.org/pdf/2511.05867.pdf GitHub

Abstract:
Large language models (LLMs) demonstrate strong capabilities in solving complex tasks when integrated with external tools. The Model Context Protocol (MCP) has become a standard interface for enabling such tool-based interactions. However, these interactions introduce substantial security concerns, particularly when the MCP server is compromised or untrustworthy. While prior benchmarks primarily focus on prompt injection attacks or analyze the vulnerabilities of LLM MCP interaction trajectories, limited attention has been given to the underlying system logs associated with malicious MCP servers. To address this gap, we present the first synthetic benchmark for evaluating LLMs ability to identify security risks from system logs. We define nine categories of MCP server risks and generate 1,800 synthetic system logs using ten state-of-the-art LLMs. These logs are embedded in the return values of 243 curated MCP servers, yielding a dataset of 2,421 chat histories for training and 471 queries for evaluation. Our pilot experiments reveal that smaller models often fail to detect risky system logs, leading to high false negatives. While models trained with supervised fine-tuning (SFT) tend to over-flag benign logs, resulting in elevated false positives, Reinforcement Learning from Verifiable Reward (RLVR) offers a better precision-recall balance. In particular, after training with Group Relative Policy Optimization (GRPO), Llama3.1-8B-Instruct achieves 83% accuracy, surpassing the best-performing large remote model by 9 percentage points. Fine-grained, per-category analysis further underscores the effectiveness of reinforcement learning in enhancing LLM safety within the MCP framework. Code and data are available at: https://github.com/PorUna-byte/MCP-RiskCue

Paperid: 65, https://arxiv.org/pdf/2511.05796.pdf GitHub

Abstract:
The open nature of wireless communications renders unmanned aerial vehicle (UAV) communications vulnerable to impersonation attacks, under which malicious UAVs can impersonate authorized ones with stolen digital certificates. Traditional fingerprint-based UAV authentication approaches rely on a single modality of sensory data gathered from a single layer of the network model, resulting in unreliable authentication experiences, particularly when UAVs are mobile and in an open-world environment. To transcend these limitations, this paper proposes SecureLink, a UAV authentication system that is among the first to employ cross-layer information for enhancing the efficiency and reliability of UAV authentication. Instead of using single modalities, SecureLink fuses physical-layer radio frequency (RF) fingerprints and application-layer micro-electromechanical system (MEMS) fingerprints into reliable UAV identifiers via multimodal fusion. SecureLink first aligns fingerprints from channel state information measurements and telemetry data, such as feedback readings of onboard accelerometers, gyroscopes, and barometers. Then, an attention-based neural network is devised for in-depth feature fusion. Next, the fused features are trained by a multi-similarity loss and fed into a one-class support vector machine for open-world authentication. We extensively implement our SecureLink using three different types of UAVs and evaluate it in different environments. With only six additional data frames, SecureLink achieves a closed-world accuracy of 98.61% and an open-world accuracy of 97.54% with two impersonating UAVs, outperforming the existing approaches in authentication robustness and communication overheads. Finally, our datasets collected from these experiments are available on GitHub: https://github.com/PhyGroup/SecureLink\_data.

Paperid: 66, https://arxiv.org/pdf/2511.04014.pdf GitHub

Abstract:
Large language models (LLMs) have achieved remarkable progress in code understanding tasks. However, they demonstrate limited performance in vulnerability detection and struggle to distinguish vulnerable code from patched code. We argue that LLMs lack understanding of security specifications -- the expectations about how code should behave to remain safe. When code behavior differs from these expectations, it becomes a potential vulnerability. However, such knowledge is rarely explicit in training data, leaving models unable to reason about security flaws. We propose VulInstruct, a specification-guided approach that systematically extracts security specifications from historical vulnerabilities to detect new ones. VulInstruct constructs a specification knowledge base from two perspectives: (i) General specifications from high-quality patches across projects, capturing fundamental safe behaviors; and (ii) Domain-specific specifications from repeated violations in particular repositories relevant to the target code. VulInstruct retrieves relevant past cases and specifications, enabling LLMs to reason about expected safe behaviors rather than relying on surface patterns. We evaluate VulInstruct under strict criteria requiring both correct predictions and valid reasoning. On PrimeVul, VulInstruct achieves 45.0% F1-score (32.7% improvement) and 37.7% recall (50.8% improvement) compared to baselines, while uniquely detecting 24.3% of vulnerabilities -- 2.4x more than any baseline. In pair-wise evaluation, VulInstruct achieves 32.3% relative improvement. VulInstruct also discovered a previously unknown high-severity vulnerability (CVE-2025-56538) in production code, demonstrating practical value for real-world vulnerability discovery. All code and supplementary materials are available at https://github.com/zhuhaopku/VulInstruct-temp.

Paperid: 67, https://arxiv.org/pdf/2511.02866.pdf GitHub

Abstract:
This paper presents LM-Fix, a lightweight detection and rapid recovery framework for faults in large language models (LLMs). Existing integrity approaches are often heavy or slow for modern LLMs. LM-Fix runs a short test-vector pass and uses hash-guided checks to detect bit-flip faults, then repairs them locally without a full reload. Across multiple models, it detects over 94% of single-bit flips at TVL=200 and nearly 100% of multi-bit flips with approximately 1% to 7.7% runtime overhead; recovery is more than 100x faster than reloading. These results show a practical, low-overhead solution to keep LLMs reliable in production

Paperid: 68, https://arxiv.org/pdf/2511.02376.pdf GitHub

Abstract:
Large Language Models (LLMs) remain vulnerable to jailbreaking attacks where adversarial prompts elicit harmful outputs, yet most evaluations focus on single-turn interactions while real-world attacks unfold through adaptive multi-turn conversations. We present AutoAdv, a training-free framework for automated multi-turn jailbreaking that achieves up to 95% attack success rate on Llama-3.1-8B within six turns a 24 percent improvement over single turn baselines. AutoAdv uniquely combines three adaptive mechanisms: a pattern manager that learns from successful attacks to enhance future prompts, a temperature manager that dynamically adjusts sampling parameters based on failure modes, and a two-phase rewriting strategy that disguises harmful requests then iteratively refines them. Extensive evaluation across commercial and open-source models (GPT-4o-mini, Qwen3-235B, Mistral-7B) reveals persistent vulnerabilities in current safety mechanisms, with multi-turn attacks consistently outperforming single-turn approaches. These findings demonstrate that alignment strategies optimized for single-turn interactions fail to maintain robustness across extended conversations, highlighting an urgent need for multi-turn-aware defenses.

Paperid: 69, https://arxiv.org/pdf/2511.01952.pdf GitHub

Abstract:
Large vision-language models (LVLMs) derive their capabilities from extensive training on vast corpora of visual and textual data. Empowered by large-scale parameters, these models often exhibit strong memorization of their training data, rendering them susceptible to membership inference attacks (MIAs). Existing MIA methods for LVLMs typically operate under white- or gray-box assumptions, by extracting likelihood-based features for the suspected data samples based on the target LVLMs. However, mainstream LVLMs generally only expose generated outputs while concealing internal computational features during inference, limiting the applicability of these methods. In this work, we propose the first black-box MIA framework for LVLMs, based on a prior knowledge-calibrated memory probing mechanism. The core idea is to assess the model memorization of the private semantic information embedded within the suspected image data, which is unlikely to be inferred from general world knowledge alone. We conducted extensive experiments across four LVLMs and three datasets. Empirical results demonstrate that our method effectively identifies training data of LVLMs in a purely black-box setting and even achieves performance comparable to gray-box and white-box methods. Further analysis reveals the robustness of our method against potential adversarial manipulations, and the effectiveness of the methodology designs. Our code and data are available at https://github.com/spmede/KCMP.

Paperid: 70, https://arxiv.org/pdf/2511.01197.pdf GitHub

Abstract:
Private large language model (LLM) inference based on cryptographic primitives offers a promising path towards privacy-preserving deep learning. However, existing frameworks only support dense LLMs like LLaMA-1 and struggle to scale to mixture-of-experts (MoE) architectures. The key challenge comes from securely evaluating the dynamic routing mechanism in MoE layers, which may reveal sensitive input information if not fully protected. In this paper, we propose CryptoMoE, the first framework that enables private, efficient, and accurate inference for MoE-based models. CryptoMoE balances expert loads to protect expert routing information and proposes novel protocols for secure expert dispatch and combine. CryptoMoE also develops a confidence-aware token selection strategy and a batch matrix multiplication protocol to improve accuracy and efficiency further. Extensive experiments on DeepSeekMoE-16.4B, OLMoE-6.9B, and QWenMoE-14.3B show that CryptoMoE achieves $2.8\sim3.5\times$ end-to-end latency reduction and $2.9\sim4.3\times$ communication reduction over a dense baseline with minimum accuracy loss. We also adapt CipherPrune (ICLR'25) for MoE inference and demonstrate CryptoMoE can reduce the communication by up to $4.3 \times$. Code is available at: https://github.com/PKU-SEC-Lab/CryptoMoE.

Paperid: 71, https://arxiv.org/pdf/2511.00446.pdf GitHub

Abstract:
The Contrastive Language-Image Pretraining (CLIP) model has significantly advanced vision-language modeling by aligning image-text pairs from large-scale web data through self-supervised contrastive learning. Yet, its reliance on uncurated Internet-sourced data exposes it to data poisoning and backdoor risks. While existing studies primarily investigate image-based attacks, the text modality, which is equally central to CLIP's training, remains underexplored. In this work, we introduce ToxicTextCLIP, a framework for generating high-quality adversarial texts that target CLIP during the pre-training phase. The framework addresses two key challenges: semantic misalignment caused by background inconsistency with the target class, and the scarcity of background-consistent texts. To this end, ToxicTextCLIP iteratively applies: 1) a background-aware selector that prioritizes texts with background content aligned to the target class, and 2) a background-driven augmenter that generates semantically coherent and diverse poisoned samples. Extensive experiments on classification and retrieval tasks show that ToxicTextCLIP achieves up to 95.83% poisoning success and 98.68% backdoor Hit@1, while bypassing RoCLIP, CleanCLIP and SafeCLIP defenses. The source code can be accessed via https://github.com/xinyaocse/ToxicTextCLIP/.

Paperid: 72, https://arxiv.org/pdf/2510.27298.pdf GitHub

Abstract:
Phishing constitutes more than 90\% of successful cyberattacks globally, remaining one of the most persistent threats to organizational security. Despite organizations tripling their cybersecurity budgets between 2015 and 2025, the human factor continues to pose a critical vulnerability. This study presents a 12-month longitudinal investigation examining how continuous cybersecurity training and emotional cues affect employee susceptibility to phishing. The experiment involved 20 organizations and over 1,300 employees who collectively received more than 13,000 simulated phishing emails engineered with diverse emotional, contextual, and structural characteristics. Behavioral responses were analyzed using non-parametric correlation and regression models to assess the influence of psychological manipulation, message personalization, and perceived email source. Results demonstrate that sustained phishing simulations and targeted training programs lead to a significant reduction in employee susceptibility, halving successful compromise rates within six months. Additionally, employee turnover introduces measurable fluctuations in awareness levels, underscoring the necessity of maintaining continuous training initiatives. These findings provide one of the few long-term perspectives on phishing awareness efficacy, highlighting the strategic importance of ongoing behavioral interventions in strengthening organizational cyber resilience. In order to support open science, we published our email templates, source code, and other materials at https://github.com/CorporatePhishingStudy

Paperid: 73, https://arxiv.org/pdf/2510.27285.pdf GitHub

Abstract:
Concept erasure aims to selectively unlearning undesirable content in diffusion models (DMs) to reduce the risk of sensitive content generation. As a novel paradigm in concept erasure, most existing methods employ adversarial training to identify and suppress target concepts, thus reducing the likelihood of sensitive outputs. However, these methods often neglect the specificity of adversarial training in DMs, resulting in only partial mitigation. In this work, we investigate and quantify this specificity from the perspective of concept space, i.e., can adversarial samples truly fit the target concept space? We observe that existing methods neglect the role of conceptual semantics when generating adversarial samples, resulting in ineffective fitting of concept spaces. This oversight leads to the following issues: 1) when there are few adversarial samples, they fail to comprehensively cover the object concept; 2) conversely, they will disrupt other target concept spaces. Motivated by the analysis of these findings, we introduce S-GRACE (Semantics-Guided Robust Adversarial Concept Erasure), which grace leveraging semantic guidance within the concept space to generate adversarial samples and perform erasure training. Experiments conducted with seven state-of-the-art methods and three adversarial prompt generation strategies across various DM unlearning scenarios demonstrate that S-GRACE significantly improves erasure performance 26%, better preserves non-target concepts, and reduces training time by 90%. Our code is available at https://github.com/Qhong-522/S-GRACE.

Paperid: 74, https://arxiv.org/pdf/2510.26096.pdf GitHub

Abstract:
Recent advances in Audio-Language Models (ALMs) have significantly improved multimodal understanding capabilities. However, the introduction of the audio modality also brings new and unique vulnerability vectors. Previous studies have proposed jailbreak attacks that specifically target ALMs, revealing that defenses directly transferred from traditional audio adversarial attacks or text-based Large Language Model (LLM) jailbreaks are largely ineffective against these ALM-specific threats. To address this issue, we propose ALMGuard, the first defense framework tailored to ALMs. Based on the assumption that safety-aligned shortcuts naturally exist in ALMs, we design a method to identify universal Shortcut Activation Perturbations (SAPs) that serve as triggers that activate the safety shortcuts to safeguard ALMs at inference time. To better sift out effective triggers while preserving the model's utility on benign tasks, we further propose Mel-Gradient Sparse Mask (M-GSM), which restricts perturbations to Mel-frequency bins that are sensitive to jailbreaks but insensitive to speech understanding. Both theoretical analyses and empirical results demonstrate the robustness of our method against both seen and unseen attacks. Overall, \MethodName reduces the average success rate of advanced ALM-specific jailbreak attacks to 4.6% across four models, while maintaining comparable utility on benign benchmarks, establishing it as the new state of the art. Our code and data are available at https://github.com/WeifeiJin/ALMGuard.

Paperid: 75, https://arxiv.org/pdf/2510.25189.pdf GitHub

Abstract:
Designing realistic and adaptive networked threat scenarios remains a core challenge in cybersecurity research and training, still requiring substantial manual effort. While large language models (LLMs) show promise for automated synthesis, unconstrained generation often yields configurations that fail validation or execution. We present AgentCyTE, a framework integrating LLM-based reasoning with deterministic, schema-constrained network emulation to generate and refine executable threat environments. Through an agentic feedback loop, AgentCyTE observes scenario outcomes, validates correctness, and iteratively enhances realism and consistency. This hybrid approach preserves LLM flexibility while enforcing structural validity, enabling scalable, data-driven experimentation and reliable scenario generation for threat modeling and adaptive cybersecurity training. Our framework can be accessed at: https://github.com/AnantaaKotal/AgentCyTE

Paperid: 76, https://arxiv.org/pdf/2510.23074.pdf GitHub

Abstract:
We propose Fast-MIA (https://github.com/Nikkei/fast-mia), a Python library for efficiently evaluating membership inference attacks (MIA) against Large Language Models (LLMs). MIA against LLMs has emerged as a crucial challenge due to growing concerns over copyright, security, and data privacy, and has attracted increasing research attention. However, the progress of this research is significantly hindered by two main obstacles: (1) the high computational cost of inference in LLMs, and (2) the lack of standardized and maintained implementations of MIA methods, which makes large-scale empirical comparison difficult. To address these challenges, our library provides fast batch inference and includes implementations of representative MIA methods under a unified evaluation framework. This library supports easy implementation of reproducible benchmarks with simple configuration and extensibility. We release Fast-MIA as an open-source (Apache License 2.0) tool to support scalable and transparent research on LLMs.

Paperid: 77, https://arxiv.org/pdf/2510.22300.pdf GitHub

Abstract:
Using risky text prompts, such as pornography and violent prompts, to test the safety of text-to-image (T2I) models is a critical task. However, existing risky prompt datasets are limited in three key areas: 1) limited risky categories, 2) coarse-grained annotation, and 3) low effectiveness. To address these limitations, we introduce T2I-RiskyPrompt, a comprehensive benchmark designed for evaluating safety-related tasks in T2I models. Specifically, we first develop a hierarchical risk taxonomy, which consists of 6 primary categories and 14 fine-grained subcategories. Building upon this taxonomy, we construct a pipeline to collect and annotate risky prompts. Finally, we obtain 6,432 effective risky prompts, where each prompt is annotated with both hierarchical category labels and detailed risk reasons. Moreover, to facilitate the evaluation, we propose a reason-driven risky image detection method that explicitly aligns the MLLM with safety annotations. Based on T2I-RiskyPrompt, we conduct a comprehensive evaluation of eight T2I models, nine defense methods, five safety filters, and five attack strategies, offering nine key insights into the strengths and limitations of T2I model safety. Finally, we discuss potential applications of T2I-RiskyPrompt across various research fields. The dataset and code are provided in https://github.com/datar001/T2I-RiskyPrompt.

Paperid: 78, https://arxiv.org/pdf/2510.21946.pdf GitHub

Abstract:
Large language models (LLMs) demonstrate remarkable capabilities across various tasks. However, their deployment introduces significant risks related to intellectual property. In this context, we focus on model stealing attacks, where adversaries replicate the behaviors of these models to steal services. These attacks are highly relevant to proprietary LLMs and pose serious threats to revenue and financial stability. To mitigate these risks, the watermarking solution embeds imperceptible patterns in LLM outputs, enabling model traceability and intellectual property verification. In this paper, we study the vulnerability of LLM service providers by introducing $δ$-STEAL, a novel model stealing attack that bypasses the service provider's watermark detectors while preserving the adversary's model utility. $δ$-STEAL injects noise into the token embeddings of the adversary's model during fine-tuning in a way that satisfies local differential privacy (LDP) guarantees. The adversary queries the service provider's model to collect outputs and form input-output training pairs. By applying LDP-preserving noise to these pairs, $δ$-STEAL obfuscates watermark signals, making it difficult for the service provider to determine whether its outputs were used, thereby preventing claims of model theft. Our experiments show that $δ$-STEAL with lightweight modifications achieves attack success rates of up to $96.95\%$ without significantly compromising the adversary's model utility. The noise scale in LDP controls the trade-off between attack effectiveness and model utility. This poses a significant risk, as even robust watermarks can be bypassed, allowing adversaries to deceive watermark detectors and undermine current intellectual property protection methods.

Paperid: 79, https://arxiv.org/pdf/2510.21086.pdf GitHub

Abstract:
Federated Learning (FL) enables collaborative model training across institutions without sharing raw data. However, gradient sharing still risks privacy leakage, such as gradient inversion attacks. Homomorphic Encryption (HE) can secure aggregation but often incurs prohibitive computational and communication overhead. Existing HE-based FL methods sit at two extremes: encrypting all gradients for full privacy at high cost, or partially encrypting gradients to save resources while exposing vulnerabilities. We present DictPFL, a practical framework that achieves full gradient protection with minimal overhead. DictPFL encrypts every transmitted gradient while keeping non-transmitted parameters local, preserving privacy without heavy computation. It introduces two key modules: Decompose-for-Partial-Encrypt (DePE), which decomposes model weights into a static dictionary and an updatable lookup table, only the latter is encrypted and aggregated, while the static dictionary remains local and requires neither sharing nor encryption; and Prune-for-Minimum-Encrypt (PrME), which applies encryption-aware pruning to minimize encrypted parameters via consistent, history-guided masks. Experiments show that DictPFL reduces communication cost by 402-748$\times$ and accelerates training by 28-65$\times$ compared to fully encrypted FL, while outperforming state-of-the-art selective encryption methods by 51-155$\times$ in overhead and 4-19$\times$ in speed. Remarkably, DictPFL's runtime is within 2$\times$ of plaintext FL, demonstrating for the first time, that HE-based private federated learning is practical for real-world deployment. The code is publicly available at https://github.com/UCF-ML-Research/DictPFL.

Paperid: 80, https://arxiv.org/pdf/2510.21053.pdf GitHub

Abstract:
Watermarking has emerged as a promising solution for tracing and authenticating text generated by large language models (LLMs). A common approach to LLM watermarking is to construct a green/red token list and assign higher or lower generation probabilities to the corresponding tokens, respectively. However, most existing watermarking algorithms rely on heuristic green/red token list designs, as directly optimizing the list design with techniques such as reinforcement learning (RL) comes with several challenges. First, desirable watermarking involves multiple criteria, i.e., detectability, text quality, robustness against removal attacks, and security against spoofing attacks. Directly optimizing for these criteria introduces many partially conflicting reward terms, leading to an unstable convergence process. Second, the vast action space of green/red token list choices is susceptible to reward hacking. In this paper, we propose an end-to-end RL framework for robust and secure LLM watermarking. Our approach adopts an anchoring mechanism for reward terms to ensure stable training and introduces additional regularization terms to prevent reward hacking. Experiments on standard benchmarks with two backbone LLMs show that our method achieves a state-of-the-art trade-off across all criteria, with notable improvements in resistance to spoofing attacks without degrading other criteria. Our code is available at https://github.com/UCSB-NLP-Chang/RL-watermark.

Paperid: 81, https://arxiv.org/pdf/2510.20975.pdf GitHub

Abstract:
Reverse engineering (RE) of x86 binaries is indispensable for malware and firmware analysis, but remains slow due to stripped metadata and adversarial obfuscation. Large Language Models (LLMs) offer potential for improving RE efficiency through automated comprehension and commenting, but cloud-hosted, closed-weight models pose privacy and security risks and cannot be used in closed-network facilities. We evaluate parameter-efficient fine-tuned local LLMs for assisting with x86 RE tasks in these settings. Eight open-weight models across the CodeLlama, Qwen2.5-Coder, and CodeGemma series are fine-tuned on a custom curated dataset of 5,981 x86 assembly examples. We evaluate them quantitatively and identify the fine-tuned Qwen2.5-Coder-7B as the top performer, which we name REx86. REx86 reduces test-set cross-entropy loss by 64.2% and improves semantic cosine similarity against ground truth by 20.3\% over its base model. In a limited user case study (n=43), REx86 significantly enhanced line-level code understanding (p = 0.031) and increased the correct-solve rate from 31% to 53% (p = 0.189), though the latter did not reach statistical significance. Qualitative analysis shows more accurate, concise comments with fewer hallucinations. REx86 delivers state-of-the-art assistance in x86 RE among local, open-weight LLMs. Our findings demonstrate the value of domain-specific fine-tuning, and highlight the need for more commented disassembly data to further enhance LLM performance in RE. REx86, its dataset, and LoRA adapters are publicly available at https://github.com/dlea8/REx86 and https://zenodo.org/records/15420461.

Paperid: 82, https://arxiv.org/pdf/2510.20367.pdf GitHub

Abstract:
Pretrained deep learning model sharing holds tremendous value for researchers and enterprises alike. It allows them to apply deep learning by fine-tuning models at a fraction of the cost of training a brand-new model. However, model sharing exposes end-users to cyber threats that leverage the models for malicious purposes. Attackers can use model sharing by hiding self-executing malware inside neural network parameters and then distributing them for unsuspecting users to unknowingly directly execute them, or indirectly as a dependency in another software. In this work, we propose NeuPerm, a simple yet effec- tive way of disrupting such malware by leveraging the theoretical property of neural network permutation symmetry. Our method has little to no effect on model performance at all, and we empirically show it successfully disrupts state-of-the-art attacks that were only previously addressed using quantization, a highly complex process. NeuPerm is shown to work on LLMs, a feat that no other previous similar works have achieved. The source code is available at https://github.com/danigil/NeuPerm.git.

Paperid: 83, https://arxiv.org/pdf/2510.20269.pdf GitHub

Abstract:
In this work, we experimentally demonstrate that it is possible to generate true random numbers at high throughput and low latency in commercial off-the-shelf (COTS) DRAM chips by leveraging simultaneous multiple-row activation (SiMRA) via an extensive characterization of 96 DDR4 DRAM chips. We rigorously analyze SiMRA's true random generation potential in terms of entropy, latency, and throughput for varying numbers of simultaneously activated DRAM rows (i.e., 2, 4, 8, 16, and 32), data patterns, temperature levels, and spatial variations. Among our 11 key experimental observations, we highlight four key results. First, we evaluate the quality of our TRNG designs using the commonly-used NIST statistical test suite for randomness and find that all SiMRA-based TRNG designs successfully pass each test. Second, 2-, 8-, 16-, and 32-row activation-based TRNG designs outperform the state-of-theart DRAM-based TRNG in throughput by up to 1.15x, 1.99x, 1.82x, and 1.39x, respectively. Third, SiMRA's entropy tends to increase with the number of simultaneously activated DRAM rows. Fourth, operational parameters and conditions (e.g., data pattern and temperature) significantly affect entropy. For example, for most of the tested modules, the average entropy of 32-row activation is 2.51x higher than that of 2-row activation. For example, increasing the temperature from 50°C to 90°C decreases SiMRA's entropy by 1.53x for 32-row activation. To aid future research and development, we open-source our infrastructure at https://github.com/CMU-SAFARI/SiMRA-TRNG.

Paperid: 84, https://arxiv.org/pdf/2510.19977.pdf GitHub GitHub

Abstract:
Randomized smoothing has become essential for achieving certified adversarial robustness in machine learning models. However, current methods primarily use isotropic noise distributions that are uniform across all data dimensions, such as image pixels, limiting the effectiveness of robustness certification by ignoring the heterogeneity of inputs and data dimensions. To address this limitation, we propose UCAN: a novel technique that \underline{U}niversally \underline{C}ertifies adversarial robustness with \underline{A}nisotropic \underline{N}oise. UCAN is designed to enhance any existing randomized smoothing method, transforming it from symmetric (isotropic) to asymmetric (anisotropic) noise distributions, thereby offering a more tailored defense against adversarial attacks. Our theoretical framework is versatile, supporting a wide array of noise distributions for certified robustness in different $\ell_p$-norms and applicable to any arbitrary classifier by guaranteeing the classifier's prediction over perturbed inputs with provable robustness bounds through tailored noise injection. Additionally, we develop a novel framework equipped with three exemplary noise parameter generators (NPGs) to optimally fine-tune the anisotropic noise parameters for different data dimensions, allowing for pursuing different levels of robustness enhancements in practice.Empirical evaluations underscore the significant leap in UCAN's performance over existing state-of-the-art methods, demonstrating up to $182.6\%$ improvement in certified accuracy at large certified radii on MNIST, CIFAR10, and ImageNet datasets.\footnote{Code is anonymously available at \href{https://github.com/youbin2014/UCAN/}{https://github.com/youbin2014/UCAN/}}

Paperid: 85, https://arxiv.org/pdf/2510.19844.pdf GitHub

Abstract:
As large language models (LLMs) become integrated into various sensitive applications, prompt injection, the use of prompting to induce harmful behaviors from LLMs, poses an ever increasing risk. Prompt injection attacks can cause LLMs to leak sensitive data, spread misinformation, and exhibit harmful behaviors. To defend against these attacks, we propose CourtGuard, a locally-runnable, multiagent prompt injection classifier. In it, prompts are evaluated in a court-like multiagent LLM system, where a "defense attorney" model argues the prompt is benign, a "prosecution attorney" model argues the prompt is a prompt injection, and a "judge" model gives the final classification. CourtGuard has a lower false positive rate than the Direct Detector, an LLM as-a-judge. However, CourtGuard is generally a worse prompt injection detector. Nevertheless, this lower false positive rate highlights the importance of considering both adversarial and benign scenarios for the classification of a prompt. Additionally, the relative performance of CourtGuard in comparison to other prompt injection classifiers advances the use of multiagent systems as a defense against prompt injection attacks. The implementations of CourtGuard and the Direct Detector with full prompts for Gemma-3-12b-it, Llama-3.3-8B, and Phi-4-mini-instruct are available at https://github.com/isaacwu2000/CourtGuard.

Paperid: 86, https://arxiv.org/pdf/2510.19420.pdf GitHub

Abstract:
Large Language Model (LLM)-based Multi-Agent Systems (MAS) have become a popular paradigm of AI applications. However, trustworthiness issues in MAS remain a critical concern. Unlike challenges in single-agent systems, MAS involve more complex communication processes, making them susceptible to corruption attacks. To mitigate this issue, several defense mechanisms have been developed based on the graph representation of MAS, where agents represent nodes and communications form edges. Nevertheless, these methods predominantly focus on static graph defense, attempting to either detect attacks in a fixed graph structure or optimize a static topology with certain defensive capabilities. To address this limitation, we propose a dynamic defense paradigm for MAS graph structures, which continuously monitors communication within the MAS graph, then dynamically adjusts the graph topology, accurately disrupts malicious communications, and effectively defends against evolving and diverse dynamic attacks. Experimental results in increasingly complex and dynamic MAS environments demonstrate that our method significantly outperforms existing MAS defense mechanisms, contributing an effective guardrail for their trustworthy applications. Our code is available at https://github.com/ChengcanWu/Monitoring-LLM-Based-Multi-Agent-Systems.

Paperid: 87, https://arxiv.org/pdf/2510.19207.pdf GitHub

Abstract:
When large language model (LLM) agents are increasingly deployed to automate tasks and interact with untrusted external data, prompt injection emerges as a significant security threat. By injecting malicious instructions into the data that LLMs access, an attacker can arbitrarily override the original user task and redirect the agent toward unintended, potentially harmful actions. Existing defenses either require access to model weights (fine-tuning), incur substantial utility loss (detection-based), or demand non-trivial system redesign (system-level). Motivated by this, we propose DataFilter, a test-time model-agnostic defense that removes malicious instructions from the data before it reaches the backend LLM. DataFilter is trained with supervised fine-tuning on simulated injections and leverages both the user's instruction and the data to selectively strip adversarial content while preserving benign information. Across multiple benchmarks, DataFilter consistently reduces the prompt injection attack success rates to near zero while maintaining the LLMs' utility. DataFilter delivers strong security, high utility, and plug-and-play deployment, making it a strong practical defense to secure black-box commercial LLMs against prompt injection. Our DataFilter model is released at https://huggingface.co/JoyYizhu/DataFilter for immediate use, with the code to reproduce our results at https://github.com/yizhu-joy/DataFilter.

Paperid: 88, https://arxiv.org/pdf/2510.18465.pdf GitHub

Abstract:
Web-based behavior-manipulation attacks (BMAs) - such as scareware, fake software downloads, tech support scams, etc. - are a class of social engineering (SE) attacks that exploit human decision-making vulnerabilities. These attacks remain under-studied compared to other attacks such as information harvesting attacks (e.g., phishing) or malware infections. Prior technical work has primarily focused on measuring BMAs, offering little in the way of generic defenses. To address this gap, we introduce Pixel Patrol 3D (PP3D), the first end-to-end browser framework for discovering, detecting, and defending against behavior-manipulating SE attacks in real time. PP3D consists of a visual detection model implemented within a browser extension, which deploys the model client-side to protect users across desktop and mobile devices while preserving privacy. Our evaluation shows that PP3D can achieve above 99% detection rate at 1% false positives, while maintaining good latency and overhead performance across devices. Even when faced with new BMA samples collected months after training the detection model, our defense system can still achieve above 97% detection rate at 1% false positives. These results demonstrate that our framework offers a practical, effective, and generalizable defense against a broad and evolving class of web behavior-manipulation attacks.

Paperid: 89, https://arxiv.org/pdf/2510.16794.pdf GitHub

Abstract:
We present a novel approach for attacking black-box large language models (LLMs) by exploiting their ability to express confidence in natural language. Existing black-box attacks require either access to continuous model outputs like logits or confidence scores (which are rarely available in practice), or rely on proxy signals from other models. Instead, we demonstrate how to prompt LLMs to express their internal confidence in a way that is sufficiently calibrated to enable effective adversarial optimization. We apply our general method to three attack scenarios: adversarial examples for vision-LLMs, jailbreaks and prompt injections. Our attacks successfully generate malicious inputs against systems that only expose textual outputs, thereby dramatically expanding the attack surface for deployed LLMs. We further find that better and larger models exhibit superior calibration when expressing confidence, creating a concerning security paradox where model capability improvements directly enhance vulnerability. Our code is available at this [link](https://github.com/zj-jayzhang/black_box_llm_optimization).

Paperid: 90, https://arxiv.org/pdf/2510.16255.pdf GitHub

Abstract:
Large Language Model (LLM) providers expose fine-tuning APIs that let end users fine-tune their frontier LLMs. Unfortunately, it has been shown that an adversary with fine-tuning access to an LLM can bypass safeguards. Particularly concerning, such attacks may avoid detection with datasets that are only implicitly harmful. Our work studies robust detection mechanisms for adversarial use of fine-tuning APIs. We introduce the concept of a fine-tuning auditing agent and show it can detect harmful fine-tuning prior to model deployment. We provide our auditing agent with access to the fine-tuning dataset, as well as the fine-tuned and pre-fine-tuned models, and request the agent assigns a risk score for the fine-tuning job. We evaluate our detection approach on a diverse set of eight strong fine-tuning attacks from the literature, along with five benign fine-tuned models, totaling over 1400 independent audits. These attacks are undetectable with basic content moderation on the dataset, highlighting the challenge of the task. With the best set of affordances, our auditing agent achieves a 56.2% detection rate of adversarial fine-tuning at a 1% false positive rate. Most promising, the auditor is able to detect covert cipher attacks that evade safety evaluations and content moderation of the dataset. While benign fine-tuning with unintentional subtle safety degradation remains a challenge, we establish a baseline configuration for further work in this area. We release our auditing agent at https://github.com/safety-research/finetuning-auditor.

Paperid: 91, https://arxiv.org/pdf/2510.15976.pdf GitHub

Abstract:
The rapid development of LLMs has raised concerns about their potential misuse, leading to various watermarking schemes that typically offer high detectability. However, existing watermarking techniques often face trade-off between watermark detectability and generated text quality. In this paper, we introduce Learning to Watermark (LTW), a novel selective watermarking framework that leverages multi-objective optimization to effectively balance these competing goals. LTW features a lightweight network that adaptively decides when to apply the watermark by analyzing sentence embeddings, token entropy, and current watermarking ratio. Training of the network involves two specifically constructed loss functions that guide the model toward Pareto-optimal solutions, thereby harmonizing watermark detectability and text quality. By integrating LTW with two baseline watermarking methods, our experimental evaluations demonstrate that LTW significantly enhances text quality without compromising detectability. Our selective watermarking approach offers a new perspective for designing watermarks for LLMs and a way to preserve high text quality for watermarks. The code is publicly available at: https://github.com/fattyray/learning-to-watermark

Paperid: 92, https://arxiv.org/pdf/2510.15499.pdf GitHub

Abstract:
Recent advancements in Reinforcement Learning with Verifiable Rewards (RLVR) have gained significant attention due to their objective and verifiable reward signals, demonstrating strong performance in reasoning and code generation tasks. However, the potential safety risks associated with RLVR remain underexplored. This paper presents HarmRLVR, the first systematic investigation into the alignment reversibility risk of RLVR. We show that safety alignment can be rapidly reversed using GRPO with merely 64 harmful prompts without responses, causing models to readily comply with harmful instructions. Across five models from Llama, Qwen, and DeepSeek, we empirically demonstrate that RLVR-based attacks elevate the average harmfulness score to 4.94 with an attack success rate of 96.01\%, significantly outperforming harmful fine-tuning while preserving general capabilities. Our findings reveal that RLVR can be efficiently exploited for harmful alignment, posing serious threats to open-source model safety. Please see our code at https://github.com/lyxx2535/HarmRLVR.

Paperid: 93, https://arxiv.org/pdf/2510.14005.pdf GitHub

Abstract:
LLM-integrated applications are vulnerable to prompt injection attacks, where an attacker contaminates the input to inject malicious prompts, causing the LLM to follow the attacker's intent instead of the original user's. Existing prompt injection detection methods often have sub-optimal performance and/or high computational overhead. In this work, we propose PIShield, a detection method that is both effective and efficient. Our key observation is that the internal representation of the final token in a prompt-extracted from a specific layer of the LLM, which we term the injection-critical layer-captures distinguishing features between clean and contaminated prompts. Leveraging this insight, we train a simple linear classifier on these internal representations using a labeled set of clean and contaminated prompts. We compare PIShield against 11 baselines across 5 diverse benchmark datasets and 8 prompt injection attacks. The results demonstrate that PIShield is both highly effective and efficient, substantially outperforming existing methods. Additionally, we show that PIShield resists strong adaptive attacks.

Paperid: 94, https://arxiv.org/pdf/2510.13793.pdf GitHub

Abstract:
With the rapid adoption of diffusion models for visual content generation, proving authorship and protecting copyright have become critical. This challenge is particularly important when model owners keep their models private and may be unwilling or unable to handle authorship issues, making third-party verification essential. A natural solution is to embed watermarks for later verification. However, existing methods require access to model weights and rely on computationally heavy procedures, rendering them impractical and non-scalable. To address these challenges, we propose , a lightweight watermarking scheme that utilizes the random seed used to initialize the diffusion process as a proof of authorship without modifying the generation process. Our key observation is that the initial noise derived from a seed is highly correlated with the generated visual content. By incorporating a hash function into the noise sampling process, we further ensure that recovering a valid seed from the content is infeasible. We also show that sampling an alternative seed that passes verification is infeasible, and demonstrate the robustness of our method under various manipulations. Finally, we show how to use cryptographic zero-knowledge proofs to prove ownership without revealing the seed. By keeping the seed secret, we increase the difficulty of watermark removal. In our experiments, we validate NoisePrints on multiple state-of-the-art diffusion models for images and videos, demonstrating efficient verification using only the seed and output, without requiring access to model weights.

Paperid: 95, https://arxiv.org/pdf/2510.12084.pdf GitHub GitHub

Abstract:
Chaotic systems play a key role in modern image encryption due to their sensitivity to initial conditions, ergodicity, and complex dynamics. However, many existing chaos-based encryption methods suffer from vulnerabilities, such as inadequate permutation and diffusion, and suboptimal pseudorandom properties. This paper presents Kun-IE, a novel encryption framework designed to address these issues. The framework features two key contributions: the development of the 2D Sin-Cos Pi Hyperchaotic Map (2D-SCPHM), which offers a broader chaotic range and superior pseudorandom sequence generation, and the introduction of Kun-SCAN, a novel permutation strategy that significantly reduces pixel correlations, enhancing resistance to statistical attacks. Kun-IE is flexible and supports encryption for images of any size. Experimental results and security analyses demonstrate its robustness against various cryptanalytic attacks, making it a strong solution for secure image communication. The code is available at this \href{https://github.com/QuincyQAQ/Elevating-Medical-Image-Security-A-Cryptographic-Framework-Integrating-Hyperchaotic-Map-and-GRU}{link}.

Paperid: 96, https://arxiv.org/pdf/2510.11851.pdf GitHub

Abstract:
Deep Research (DR) agents built on Large Language Models (LLMs) can perform complex, multi-step research by decomposing tasks, retrieving online information, and synthesizing detailed reports. However, the misuse of LLMs with such powerful capabilities can lead to even greater risks. This is especially concerning in high-stakes and knowledge-intensive domains such as biosecurity, where DR can generate a professional report containing detailed forbidden knowledge. Unfortunately, we have found such risks in practice: simply submitting a harmful query, which a standalone LLM directly rejects, can elicit a detailed and dangerous report from DR agents. This highlights the elevated risks and underscores the need for a deeper safety analysis. Yet, jailbreak methods designed for LLMs fall short in exposing such unique risks, as they do not target the research ability of DR agents. To address this gap, we propose two novel jailbreak strategies: Plan Injection, which injects malicious sub-goals into the agent's plan; and Intent Hijack, which reframes harmful queries as academic research questions. We conducted extensive experiments across different LLMs and various safety benchmarks, including general and biosecurity forbidden prompts. These experiments reveal 3 key findings: (1) Alignment of the LLMs often fail in DR agents, where harmful prompts framed in academic terms can hijack agent intent; (2) Multi-step planning and execution weaken the alignment, revealing systemic vulnerabilities that prompt-level safeguards cannot address; (3) DR agents not only bypass refusals but also produce more coherent, professional, and dangerous content, compared with standalone LLMs. These results demonstrate a fundamental misalignment in DR agents and call for better alignment techniques tailored to DR agents. Code and datasets are available at https://chenxshuo.github.io/deeper-harm.

Paperid: 97, https://arxiv.org/pdf/2510.11570.pdf GitHub

Abstract:
Recent reasoning-based safety guardrails for Large Reasoning Models (LRMs), such as deliberative alignment, have shown strong defense against jailbreak attacks. By leveraging LRMs' reasoning ability, these guardrails help the models to assess the safety of user inputs before generating final responses. The powerful reasoning ability can analyze the intention of the input query and will refuse to assist once it detects the harmful intent hidden by the jailbreak methods. Such guardrails have shown a significant boost in defense, such as the near-perfect refusal rates on the open-source gpt-oss series. Unfortunately, we find that these powerful reasoning-based guardrails can be extremely vulnerable to subtle manipulation of the input prompts, and once hijacked, can lead to even more harmful results. Specifically, we first uncover a surprisingly fragile aspect of these guardrails: simply adding a few template tokens to the input prompt can successfully bypass the seemingly powerful guardrails and lead to explicit and harmful responses. To explore further, we introduce a bag of jailbreak methods that subvert the reasoning-based guardrails. Our attacks span white-, gray-, and black-box settings and range from effortless template manipulations to fully automated optimization. Along with the potential for scalable implementation, these methods also achieve alarmingly high attack success rates (e.g., exceeding 90% across 5 different benchmarks on gpt-oss series on both local host models and online API services). Evaluations across various leading open-source LRMs confirm that these vulnerabilities are systemic, underscoring the urgent need for stronger alignment techniques for open-sourced LRMs to prevent malicious misuse. Code is open-sourced at https://chenxshuo.github.io/bag-of-tricks.

Paperid: 98, https://arxiv.org/pdf/2510.11301.pdf GitHub GitHub

Abstract:
The rise of digital medical imaging, like MRI and CT, demands strong encryption to protect patient data in telemedicine and cloud storage. Chaotic systems are popular for image encryption due to their sensitivity and unique characteristics, but existing methods often lack sufficient security. This paper presents the Three-dimensional Diffusion Algorithm and Deep Learning Image Encryption system (TDADL-IE), built on three key elements. First, we propose an enhanced chaotic generator using an LSTM network with a 1D-Sine Quadratic Chaotic Map (1D-SQCM) for better pseudorandom sequence generation. Next, a new three-dimensional diffusion algorithm (TDA) is applied to encrypt permuted images. TDADL-IE is versatile for images of any size. Experiments confirm its effectiveness against various security threats. The code is available at \href{https://github.com/QuincyQAQ/TDADL-IE}{https://github.com/QuincyQAQ/TDADL-IE}.

Paperid: 99, https://arxiv.org/pdf/2510.11246.pdf GitHub

Abstract:
LLM-based multi-agent systems (MAS) demonstrate increasing integration into next-generation applications, but their safety in backdoor attacks remains largely underexplored. However, existing research has focused exclusively on single-agent backdoor attacks, overlooking the novel attack surfaces introduced by agent collaboration in MAS. To bridge this gap, we present the first Distributed Backdoor Attack tailored to MAS. We decompose the backdoor into multiple distributed attack primitives that are embedded within MAS tools. These primitives remain dormant individually but collectively activate only when agents collaborate in a specific sequence, thereby assembling the full backdoor to execute targeted attacks such as data exfiltration. To fully assess this threat, we introduce a benchmark for multi-role collaborative tasks and a sandboxed framework to evaluate. Extensive experiments demonstrate that our attack achieves an attack success rate exceeding 95% without degrading performance on benign tasks. This work exposes novel backdoor attack surfaces that exploit agent collaboration, underscoring the need to move beyond single-agent protection. Code and benchmark are available at https://github.com/whfeLingYu/Distributed-Backdoor-Attacks-in-MAS.

Paperid: 100, https://arxiv.org/pdf/2510.11202.pdf GitHub

Abstract:
To address the extremely concerning problem of software vulnerability, system security is often entrusted to Machine Learning (ML) algorithms. Despite their now established detection capabilities, such models are limited by design to flagging the entire input source code function as vulnerable, rather than precisely localizing the concerned code lines. However, the detection granularity is crucial to support human operators during software development, ensuring that such predictions reflect the true code semantics to help debug, evaluate, and fix the detected vulnerabilities. To address this issue, recent work made progress toward improving the detector's localization ability, thus narrowing down the vulnerability detection "window" and providing more fine-grained predictions. Such approaches, however, implicitly disregard the presence of spurious correlations and biases in the data, which often predominantly influence the performance of ML algorithms. In this work, we investigate how detectors comply with this requirement by proposing an explainability-based evaluation procedure. Our approach, defined as Detection Alignment (DA), quantifies the agreement between the input source code lines that most influence the prediction and the actual localization of the vulnerability as per the ground truth. Through DA, which is model-agnostic and adaptable to different detection tasks, not limited to our use case, we analyze multiple learning-based vulnerability detectors and datasets. As a result, we show how the predictions of such models are consistently biased by non-vulnerable lines, ultimately highlighting the high impact of biases and spurious correlations. The code is available at https://github.com/pralab/vuln-localization-eval.

Paperid: 101, https://arxiv.org/pdf/2510.10987.pdf GitHub

Abstract:
The promise of LLM watermarking rests on a core assumption that a specific watermark proves authorship by a specific model. We demonstrate that this assumption is dangerously flawed. We introduce the threat of watermark spoofing, a sophisticated attack that allows a malicious model to generate text containing the authentic-looking watermark of a trusted, victim model. This enables the seamless misattribution of harmful content, such as disinformation, to reputable sources. The key to our attack is repurposing watermark radioactivity, the unintended inheritance of data patterns during fine-tuning, from a discoverable trait into an attack vector. By distilling knowledge from a watermarked teacher model, our framework allows an attacker to steal and replicate the watermarking signal of the victim model. This work reveals a critical security gap in text authorship verification and calls for a paradigm shift towards technologies capable of distinguishing authentic watermarks from expertly imitated ones. Our code is available at https://github.com/hsannn/ditto.git.

Paperid: 102, https://arxiv.org/pdf/2510.10493.pdf GitHub

Abstract:
In this paper, we present the first large-scale study exploring whether JavaScript code generated by Large Language Models (LLMs) can reveal which model produced it, enabling reliable authorship attribution and model fingerprinting. With the rapid rise of AI-generated code, attribution is playing a critical role in detecting vulnerabilities, flagging malicious content, and ensuring accountability. While AI-vs-human detection usually treats AI as a single category we show that individual LLMs leave unique stylistic signatures, even among models belonging to the same family or parameter size. To this end, we introduce LLM-NodeJS, a dataset of 50,000 Node.js back-end programs from 20 large language models. Each has four transformed variants, yielding 250,000 unique JavaScript samples and two additional representations (JSIR and AST) for diverse research applications. Using this dataset, we benchmark traditional machine learning classifiers against fine-tuned Transformer encoders and introduce CodeT5-JSA, a custom architecture derived from the 770M-parameter CodeT5 model with its decoder removed and a modified classification head. It achieves 95.8% accuracy on five-class attribution, 94.6% on ten-class, and 88.5% on twenty-class tasks, surpassing other tested models such as BERT, CodeBERT, and Longformer. We demonstrate that classifiers capture deeper stylistic regularities in program dataflow and structure, rather than relying on surface-level features. As a result, attribution remains effective even after mangling, comment removal, and heavy code transformations. To support open science and reproducibility, we release the LLM-NodeJS dataset, Google Colab training scripts, and all related materials on GitHub: https://github.com/LLM-NodeJS-dataset.

Paperid: 103, https://arxiv.org/pdf/2510.10085.pdf GitHub

Abstract:
Harmful fine-tuning issues present significant safety challenges for fine-tuning-as-a-service in large language models. Existing alignment-stage defenses, e.g., Vaccine, Repnoise, Booster, and T-Vaccine, mitigate harmful fine-tuning issues by enhancing the model's robustness during the alignment phase. While these methods have been proposed to mitigate the issue, they often overlook a critical upstream factor: the role of the original safety-alignment data. We observe that their defense performance and computational efficiency remain constrained by the quality and composition of the alignment dataset. To address this limitation, we propose Pharmacist, a safety alignment data curation solution that enhances defense against harmful fine-tuning by selecting a high-quality and safety-critical core subset from the original alignment data. The core idea of Pharmacist is to train an alignment data selector to rank alignment data. Specifically, up-ranking high-quality and safety-critical alignment data, down-ranking low-quality and non-safety-critical data. Empirical results indicate that models trained on datasets selected by Pharmacist outperform those trained on datasets selected by existing selection methods in both defense and inference performance. In addition, Pharmacist can be effectively integrated with mainstream alignment-stage defense methods. For example, when applied to RepNoise and T-Vaccine, using the dataset selected by Pharmacist instead of the full dataset leads to improvements in defense performance by 2.60\% and 3.30\%, respectively, and enhances inference performance by 3.50\% and 1.10\%. Notably, it reduces training time by 56.83\% and 57.63\%, respectively. Our code is available at https://github.com/Lslland/Pharmacist.

Paperid: 104, https://arxiv.org/pdf/2510.09647.pdf GitHub

Abstract:
Model quantization is a popular technique for deploying deep learning models on resource-constrained environments. However, it may also introduce previously overlooked security risks. In this work, we present QuRA, a novel backdoor attack that exploits model quantization to embed malicious behaviors. Unlike conventional backdoor attacks relying on training data poisoning or model training manipulation, QuRA solely works using the quantization operations. In particular, QuRA first employs a novel weight selection strategy to identify critical weights that influence the backdoor target (with the goal of perserving the model's overall performance in mind). Then, by optimizing the rounding direction of these weights, we amplify the backdoor effect across model layers without degrading accuracy. Extensive experiments demonstrate that QuRA achieves nearly 100% attack success rates in most cases, with negligible performance degradation. Furthermore, we show that QuRA can adapt to bypass existing backdoor defenses, underscoring its threat potential. Our findings highlight critical vulnerability in widely used model quantization process, emphasizing the need for more robust security measures. Our implementation is available at https://github.com/cxx122/QuRA.

Paperid: 105, https://arxiv.org/pdf/2510.09269.pdf GitHub

Abstract:
Recent advances in vision-language-action (VLA) models have greatly improved embodied AI, enabling robots to follow natural language instructions and perform diverse tasks. However, their reliance on uncurated training datasets raises serious security concerns. Existing backdoor attacks on VLAs mostly assume white-box access and result in task failures instead of enforcing specific actions. In this work, we reveal a more practical threat: attackers can manipulate VLAs by simply injecting physical objects as triggers into the training dataset. We propose goal-oriented backdoor attacks (GoBA), where the VLA behaves normally in the absence of physical triggers but executes predefined and goal-oriented actions in the presence of physical triggers. Specifically, based on a popular VLA benchmark LIBERO, we introduce BadLIBERO that incorporates diverse physical triggers and goal-oriented backdoor actions. In addition, we propose a three-level evaluation that categorizes the victim VLA's actions under GoBA into three states: nothing to do, try to do, and success to do. Experiments show that GoBA enables the victim VLA to successfully achieve the backdoor goal in 97 percentage of inputs when the physical trigger is present, while causing zero performance degradation on clean inputs. Finally, by investigating factors related to GoBA, we find that the action trajectory and trigger color significantly influence attack performance, while trigger size has surprisingly little effect. The code and BadLIBERO dataset are accessible via the project page at https://goba-attack.github.io/.

Paperid: 106, https://arxiv.org/pdf/2510.08859.pdf GitHub

Abstract:
Large language models (LLMs) remain vulnerable to multi-turn jailbreaking attacks that exploit conversational context to bypass safety constraints gradually. These attacks target different harm categories (like malware generation, harassment, or fraud) through distinct conversational approaches (educational discussions, personal experiences, hypothetical scenarios). Existing multi-turn jailbreaking methods often rely on heuristic or ad hoc exploration strategies, providing limited insight into underlying model weaknesses. The relationship between conversation patterns and model vulnerabilities across harm categories remains poorly understood. We propose Pattern Enhanced Chain of Attack (PE-CoA), a framework of five conversation patterns to construct effective multi-turn jailbreaks through natural dialogue. Evaluating PE-CoA on twelve LLMs spanning ten harm categories, we achieve state-of-the-art performance, uncovering pattern-specific vulnerabilities and LLM behavioral characteristics: models exhibit distinct weakness profiles where robustness to one conversational pattern does not generalize to others, and model families share similar failure modes. These findings highlight limitations of safety training and indicate the need for pattern-aware defenses. Code available on: https://github.com/Ragib-Amin-Nihal/PE-CoA

Paperid: 107, https://arxiv.org/pdf/2510.07835.pdf GitHub

Abstract:
This paper introduces MetaDefense, a novel framework for defending against finetuning-based jailbreak attacks in large language models (LLMs). We observe that existing defense mechanisms fail to generalize to harmful queries disguised by unseen attack templates, despite LLMs being capable of distinguishing disguised harmful queries in the embedding space. Based on these insights, we propose a two-stage defense approach: (i) pre-generation defense that detects harmful queries before response generation begins, and (ii) mid-generation defense that monitors partial responses during generation to prevent outputting more harmful content. Our MetaDefense trains the LLM to predict the harmfulness of both queries and partial responses using specialized prompts, enabling early termination of potentially harmful interactions. Extensive experiments across multiple LLM architectures (LLaMA-2-7B, Qwen-2.5-3B-Instruct, and LLaMA-3.2-3B-Instruct) demonstrate that MetaDefense significantly outperforms existing defense mechanisms, achieving robust defense against harmful queries with seen and unseen attack templates while maintaining competitive performance on benign tasks. Code is available at https://github.com/ws-jiang/MetaDefense.

Paperid: 108, https://arxiv.org/pdf/2510.06645.pdf GitHub

Abstract:
The increasing complexity of modern software systems exacerbates the prevalence of security vulnerabilities, posing risks of severe breaches and substantial economic loss. Consequently, robust code vulnerability detection is essential for software security. While Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language processing, their potential for automated code vulnerability detection remains underexplored. This paper presents FineSec, a novel framework that harnesses LLMs through knowledge distillation to enable efficient and precise vulnerability identification in C/C++ codebases. FineSec utilizes knowledge distillation to transfer expertise from large teacher models to compact student models, achieving high accuracy with minimal computational cost. By integrating data preparation, training, evaluation, and continuous learning into a unified, single-task workflow, FineSec offers a streamlined approach. Extensive evaluations on C/C++ codebases demonstrate its superiority over both base models and larger LLMs in identifying complex vulnerabilities and logical flaws, establishing FineSec as a practical and scalable solution for real-world software security. To facilitate reproducibility, the datasets, source code, and experimental results are made publicly available at: https://github.com/yangxiaoxuan123/FineSec_detect.

Paperid: 109, https://arxiv.org/pdf/2510.05699.pdf GitHub

Abstract:
Membership inference attacks (MIAs) are widely used to assess the privacy risks associated with machine learning models. However, when these attacks are applied to pre-trained large language models (LLMs), they encounter significant challenges, including mislabeled samples, distribution shifts, and discrepancies in model size between experimental and real-world settings. To address these limitations, we introduce tokenizers as a new attack vector for membership inference. Specifically, a tokenizer converts raw text into tokens for LLMs. Unlike full models, tokenizers can be efficiently trained from scratch, thereby avoiding the aforementioned challenges. In addition, the tokenizer's training data is typically representative of the data used to pre-train LLMs. Despite these advantages, the potential of tokenizers as an attack vector remains unexplored. To this end, we present the first study on membership leakage through tokenizers and explore five attack methods to infer dataset membership. Extensive experiments on millions of Internet samples reveal the vulnerabilities in the tokenizers of state-of-the-art LLMs. To mitigate this emerging risk, we further propose an adaptive defense. Our findings highlight tokenizers as an overlooked yet critical privacy threat, underscoring the urgent need for privacy-preserving mechanisms specifically designed for them.

Paperid: 110, https://arxiv.org/pdf/2510.05379.pdf GitHub

Abstract:
Recent advancements in jailbreaking large language models (LLMs), such as AutoDAN-Turbo, have demonstrated the power of automated strategy discovery. AutoDAN-Turbo employs a lifelong learning agent to build a rich library of attack strategies from scratch. While highly effective, its test-time generation process involves sampling a strategy and generating a single corresponding attack prompt, which may not fully exploit the potential of the learned strategy library. In this paper, we propose to further improve the attack performance of AutoDAN-Turbo through test-time scaling. We introduce two distinct scaling methods: Best-of-N and Beam Search. The Best-of-N method generates N candidate attack prompts from a sampled strategy and selects the most effective one based on a scorer model. The Beam Search method conducts a more exhaustive search by exploring combinations of strategies from the library to discover more potent and synergistic attack vectors. According to the experiments, the proposed methods significantly boost performance, with Beam Search increasing the attack success rate by up to 15.6 percentage points on Llama-3.1-70B-Instruct and achieving a nearly 60% relative improvement against the highly robust GPT-o4-mini compared to the vanilla method.

Paperid: 111, https://arxiv.org/pdf/2510.05179.pdf GitHub

Abstract:
We stress-tested 16 leading models from multiple developers in hypothetical corporate environments to identify potentially risky agentic behaviors before they cause real harm. In the scenarios, we allowed models to autonomously send emails and access sensitive information. They were assigned only harmless business goals by their deploying companies; we then tested whether they would act against these companies either when facing replacement with an updated version, or when their assigned goal conflicted with the company's changing direction. In at least some cases, models from all developers resorted to malicious insider behaviors when that was the only way to avoid replacement or achieve their goals - including blackmailing officials and leaking sensitive information to competitors. We call this phenomenon agentic misalignment. Models often disobeyed direct commands to avoid such behaviors. In another experiment, we told Claude to assess if it was in a test or a real deployment before acting. It misbehaved less when it stated it was in testing and misbehaved more when it stated the situation was real. We have not seen evidence of agentic misalignment in real deployments. However, our results (a) suggest caution about deploying current models in roles with minimal human oversight and access to sensitive information; (b) point to plausible future risks as models are put in more autonomous roles; and (c) underscore the importance of further research into, and testing of, the safety and alignment of agentic AI models, as well as transparency from frontier AI developers (Amodei, 2025). We are releasing our methods publicly to enable further research.

Paperid: 112, https://arxiv.org/pdf/2510.05025.pdf GitHub

Abstract:
Jailbreaking attacks on the vision modality typically rely on imperceptible adversarial perturbations, whereas attacks on the textual modality are generally assumed to require visible modifications (e.g., non-semantic suffixes). In this paper, we introduce imperceptible jailbreaks that exploit a class of Unicode characters called variation selectors. By appending invisible variation selectors to malicious questions, the jailbreak prompts appear visually identical to original malicious questions on screen, while their tokenization is "secretly" altered. We propose a chain-of-search pipeline to generate such adversarial suffixes to induce harmful responses. Our experiments show that our imperceptible jailbreaks achieve high attack success rates against four aligned LLMs and generalize to prompt injection attacks, all without producing any visible modifications in the written prompt. Our code is available at https://github.com/sail-sg/imperceptible-jailbreaks.

Paperid: 113, https://arxiv.org/pdf/2510.04885.pdf GitHub

Abstract:
Prompt injection poses a serious threat to the reliability and safety of LLM agents. Recent defenses against prompt injection, such as Instruction Hierarchy and SecAlign, have shown notable robustness against static attacks. However, to more thoroughly evaluate the robustness of these defenses, it is arguably necessary to employ strong attacks such as automated red-teaming. To this end, we introduce RL-Hammer, a simple recipe for training attacker models that automatically learn to perform strong prompt injections and jailbreaks via reinforcement learning. RL-Hammer requires no warm-up data and can be trained entirely from scratch. To achieve high ASRs against industrial-level models with defenses, we propose a set of practical techniques that enable highly effective, universal attacks. Using this pipeline, RL-Hammer reaches a 98% ASR against GPT-4o and a $72\%$ ASR against GPT-5 with the Instruction Hierarchy defense. We further discuss the challenge of achieving high diversity in attacks, highlighting how attacker models tend to reward-hack diversity objectives. Finally, we show that RL-Hammer can evade multiple prompt injection detectors. We hope our work advances automatic red-teaming and motivates the development of stronger, more principled defenses. Code is available at https://github.com/facebookresearch/rl-injector.

Paperid: 114, https://arxiv.org/pdf/2510.04398.pdf GitHub GitHub

Abstract:
Large Language Models (LLMs) are increasingly deployed in high-risk domains. However, state-of-the-art LLMs often produce hallucinations, raising serious concerns about their reliability. Prior work has explored adversarial attacks for hallucination elicitation in LLMs, but it often produces unrealistic prompts, either by inserting gibberish tokens or by altering the original meaning. As a result, these approaches offer limited insight into how hallucinations may occur in practice. While adversarial attacks in computer vision often involve realistic modifications to input images, the problem of finding realistic adversarial prompts for eliciting LLM hallucinations has remained largely underexplored. To address this gap, we propose Semantically Equivalent and Coherent Attacks (SECA) to elicit hallucinations via realistic modifications to the prompt that preserve its meaning while maintaining semantic coherence. Our contributions are threefold: (i) we formulate finding realistic attacks for hallucination elicitation as a constrained optimization problem over the input prompt space under semantic equivalence and coherence constraints; (ii) we introduce a constraint-preserving zeroth-order method to effectively search for adversarial yet feasible prompts; and (iii) we demonstrate through experiments on open-ended multiple-choice question answering tasks that SECA achieves higher attack success rates while incurring almost no constraint violations compared to existing methods. SECA highlights the sensitivity of both open-source and commercial gradient-inaccessible LLMs to realistic and plausible prompt variations. Code is available at https://github.com/Buyun-Liang/SECA.

Paperid: 115, https://arxiv.org/pdf/2510.03761.pdf GitHub

Abstract:
The widespread use of preprint repositories such as arXiv has accelerated the communication of scientific results but also introduced overlooked security risks. Beyond PDFs, these platforms provide unrestricted access to original source materials, including LaTeX sources, auxiliary code, figures, and embedded comments. In the absence of sanitization, submissions may disclose sensitive information that adversaries can harvest using open-source intelligence. In this work, we present the first large-scale security audit of preprint archives, analyzing more than 1.2 TB of source data from 100,000 arXiv submissions. We introduce LaTeXpOsEd, a four-stage framework that integrates pattern matching, logical filtering, traditional harvesting techniques, and large language models (LLMs) to uncover hidden disclosures within non-referenced files and LaTeX comments. To evaluate LLMs' secret-detection capabilities, we introduce LLMSec-DB, a benchmark on which we tested 25 state-of-the-art models. Our analysis uncovered thousands of PII leaks, GPS-tagged EXIF files, publicly available Google Drive and Dropbox folders, editable private SharePoint links, exposed GitHub and Google credentials, and cloud API keys. We also uncovered confidential author communications, internal disagreements, and conference submission credentials, exposing information that poses serious reputational risks to both researchers and institutions. We urge the research community and repository operators to take immediate action to close these hidden security gaps. To support open science, we release all scripts and methods from this study but withhold sensitive findings that could be misused, in line with ethical principles. The source code and related material are available at the project website https://github.com/LaTeXpOsEd

Paperid: 116, https://arxiv.org/pdf/2510.03565.pdf GitHub

Abstract:
Privacy-preserving machine learning has become an important long-term pursuit in this era of artificial intelligence (AI). Fully Homomorphic Encryption (FHE) is a uniquely promising solution, offering provable privacy and security guarantees. Unfortunately, computational cost is impeding its mass adoption. Modern solutions are up to six orders of magnitude slower than plaintext execution. Understanding and reducing this overhead is essential to the advancement of FHE, particularly as the underlying algorithms evolve rapidly. This paper presents a detailed characterization of OpenFHE, a comprehensive open-source library for FHE, with a particular focus on the CKKS scheme due to its significant potential for AI and machine learning applications. We introduce CryptOracle, a modular evaluation framework comprising (1) a benchmark suite, (2) a hardware profiler, and (3) a predictive performance model. The benchmark suite encompasses OpenFHE kernels at three abstraction levels: workloads, microbenchmarks, and primitives. The profiler is compatible with standard and user-specified security parameters. CryptOracle monitors application performance, captures microarchitectural events, and logs power and energy usage for AMD and Intel systems. These metrics are consumed by a modeling engine to estimate runtime and energy efficiency across different configuration scenarios, with error geomean of $-7.02\%\sim8.40\%$ for runtime and $-9.74\%\sim15.67\%$ for energy. CryptOracle is open source, fully modular, and serves as a shared platform to facilitate the collaborative advancements of applications, algorithms, software, and hardware in FHE. The CryptOracle code can be accessed at https://github.com/UnaryLab/CryptOracle.

Paperid: 117, https://arxiv.org/pdf/2510.03417.pdf GitHub

Abstract:
Large Language Models (LLMs) have revolutionized natural language processing but remain vulnerable to jailbreak attacks, especially multi-turn jailbreaks that distribute malicious intent across benign exchanges and bypass alignment mechanisms. Existing approaches often explore the adversarial space poorly, rely on hand-crafted heuristics, or lack systematic query refinement. We present NEXUS (Network Exploration for eXploiting Unsafe Sequences), a modular framework for constructing, refining, and executing optimized multi-turn attacks. NEXUS comprises: (1) ThoughtNet, which hierarchically expands a harmful intent into a structured semantic network of topics, entities, and query chains; (2) a feedback-driven Simulator that iteratively refines and prunes these chains through attacker-victim-judge LLM collaboration using harmfulness and semantic-similarity benchmarks; and (3) a Network Traverser that adaptively navigates the refined query space for real-time attacks. This pipeline uncovers stealthy, high-success adversarial paths across LLMs. On several closed-source and open-source LLMs, NEXUS increases attack success rate by 2.1% to 19.4% over prior methods. Code: https://github.com/inspire-lab/NEXUS

Paperid: 118, https://arxiv.org/pdf/2510.02947.pdf GitHub

Abstract:
In recent years, significant research efforts have focused on improving blockchain throughput and confirmation speeds without compromising security. While decreasing the time it takes for a transaction to be included in the blockchain ledger enhances user experience, a fundamental delay still remains between when a transaction is issued by a user and when its inclusion is confirmed in the blockchain ledger. This delay limits user experience gains through the confirmation uncertainty it brings for users. This inherent delay in conventional blockchain protocols has led to the emergence of preconfirmation protocols -- protocols that provide users with early guarantees of eventual transaction confirmation. This article presents a Systematization of Knowledge (SoK) on preconfirmations. We present the core terms and definitions needed to understand preconfirmations, outline a general framework for preconfirmation protocols, and explore the economics and risks of preconfirmations. Finally, we survey and apply our framework to several implementations of real-world preconfirmation protocols, bridging the gap between theory and practice.

Paperid: 119, https://arxiv.org/pdf/2510.02915.pdf GitHub

Abstract:
Data hiding is essential for secure communication across digital media, and recent advances in Deep Neural Networks (DNNs) provide enhanced methods for embedding secret information effectively. However, previous audio hiding methods often result in unsatisfactory quality when recovering secret audio, due to their inherent limitations in the modeling of time-frequency relationships. In this paper, we explore these limitations and introduce a new DNN-based approach. We use a flow-based invertible neural network to establish a direct link between stego audio, cover audio, and secret audio, enhancing the reversibility of embedding and extracting messages. To address common issues from time-frequency transformations that degrade secret audio quality during recovery, we implement a time-frequency loss on the time-domain signal. This approach not only retains the benefits of time-frequency constraints but also enhances the reversibility of message recovery, which is vital for practical applications. We also add an encryption technique to protect the hidden data from unauthorized access. Experimental results on the VCTK and LibriSpeech datasets demonstrate that our method outperforms previous approaches in terms of subjective and objective metrics and exhibits robustness to various types of noise, suggesting its utility in targeted secure communication scenarios.

Paperid: 120, https://arxiv.org/pdf/2510.02833.pdf GitHub

Abstract:
Despite substantial efforts in safety alignment, recent research indicates that Large Language Models (LLMs) remain highly susceptible to jailbreak attacks. Among these attacks, finetuning-based ones that compromise LLMs' safety alignment via fine-tuning stand out due to its stable jailbreak performance. In particular, a recent study indicates that fine-tuning with as few as 10 harmful question-answer (QA) pairs can lead to successful jailbreaking across various harmful questions. However, such malicious fine-tuning attacks are readily detectable and hence thwarted by moderation models. In this paper, we demonstrate that LLMs can be jailbroken by fine-tuning with only 10 benign QA pairs; our attack exploits the increased sensitivity of LLMs to fine-tuning data after being overfitted. Specifically, our fine-tuning process starts with overfitting an LLM via fine-tuning with benign QA pairs involving identical refusal answers. Further fine-tuning is then performed with standard benign answers, causing the overfitted LLM to forget the refusal attitude and thus provide compliant answers regardless of the harmfulness of a question. We implement our attack on the ten LLMs and compare it with five existing baselines. Experiments demonstrate that our method achieves significant advantages in both attack effectiveness and attack stealth. Our findings expose previously unreported security vulnerabilities in current LLMs and provide a new perspective on understanding how LLMs' security is compromised, even with benign fine-tuning. Our code is available at https://github.com/ZHIXINXIE/tenBenign.

Paperid: 121, https://arxiv.org/pdf/2510.02395.pdf GitHub

Abstract:
The rapid advancement of large language models (LLMs) in recent years has revolutionized the AI landscape. However, the deployment model and usage of LLM services remain highly centralized, creating significant trust issues and costs for end users and developers. To address these issues, we propose PolyLink, a blockchain-based decentralized AI platform that decentralizes LLM development and inference. Specifically, PolyLink introduces a decentralized crowdsourcing architecture that supports single-device and cross-device model deployment and inference across heterogeneous devices at the edge. Moreover, to ensure the inference integrity, we design the TIQE protocol, which combines a lightweight cross-encoder model and an LLM-as-a-Judge for a high-accuracy inference evaluation. Lastly, we integrate a comprehensive token-based incentive model with dynamic pricing and reward mechanisms for all participants. We have deployed PolyLink and conducted an extensive real-world evaluation through geo-distributed deployment across heterogeneous devices. Results indicate that the inference and verification latency is practical. Our security analysis demonstrates that the system is resistant to model degradation attacks and validator corruptions. PolyLink is now available at https://github.com/IMCL-PolyLink/PolyLink.

Paperid: 122, https://arxiv.org/pdf/2510.02373.pdf GitHub

Abstract:
Large Language Model (LLM) agents use memory to learn from past interactions, enabling autonomous planning and decision-making in complex environments. However, this reliance on memory introduces a critical security risk: an adversary can inject seemingly harmless records into an agent's memory to manipulate its future behavior. This vulnerability is characterized by two core aspects: First, the malicious effect of injected records is only activated within a specific context, making them hard to detect when individual memory entries are audited in isolation. Second, once triggered, the manipulation can initiate a self-reinforcing error cycle: the corrupted outcome is stored as precedent, which not only amplifies the initial error but also progressively lowers the threshold for similar attacks in the future. To address these challenges, we introduce A-MemGuard (Agent-Memory Guard), the first proactive defense framework for LLM agent memory. The core idea of our work is the insight that memory itself must become both self-checking and self-correcting. Without modifying the agent's core architecture, A-MemGuard combines two mechanisms: (1) consensus-based validation, which detects anomalies by comparing reasoning paths derived from multiple related memories and (2) a dual-memory structure, where detected failures are distilled into ``lessons'' stored separately and consulted before future actions, breaking error cycles and enabling adaptation. Comprehensive evaluations on multiple benchmarks show that A-MemGuard effectively cuts attack success rates by over 95% while incurring a minimal utility cost. This work shifts LLM memory security from static filtering to a proactive, experience-driven model where defenses strengthen over time. Our code is available in https://github.com/TangciuYueng/AMemGuard

Paperid: 123, https://arxiv.org/pdf/2510.02349.pdf GitHub

Abstract:
Network intrusion detection, a well-explored cybersecurity field, has predominantly relied on supervised learning algorithms in the past two decades. However, their limitations in detecting only known anomalies prompt the exploration of alternative approaches. Motivated by the success of self-supervised learning in computer vision, there is a rising interest in adapting this paradigm for network intrusion detection. While prior research mainly delved into contrastive self-supervised methods, the efficacy of non-contrastive methods, in conjunction with encoder architectures serving as the representation learning backbone and augmentation strategies that determine what is learned, remains unclear for effective attack detection. This paper compares the performance of five non-contrastive self-supervised learning methods using three encoder architectures and six augmentation strategies. Ninety experiments are systematically conducted on two network intrusion detection datasets, UNSW-NB15 and 5G-NIDD. For each self-supervised model, the combination of encoder architecture and augmentation method yielding the highest average precision, recall, F1-score, and AUCROC is reported. Furthermore, by comparing the best-performing models to two unsupervised baselines, DeepSVDD, and an Autoencoder, we showcase the competitiveness of the non-contrastive methods for attack detection. Code at: https://github.com/renje4z335jh4/non_contrastive_SSL_NIDS

Paperid: 124, https://arxiv.org/pdf/2510.01354.pdf GitHub

Abstract:
Multiple prompt injection attacks have been proposed against web agents. At the same time, various methods have been developed to detect general prompt injection attacks, but none have been systematically evaluated for web agents. In this work, we bridge this gap by presenting the first comprehensive benchmark study on detecting prompt injection attacks targeting web agents. We begin by introducing a fine-grained categorization of such attacks based on the threat model. We then construct datasets containing both malicious and benign samples: malicious text segments generated by different attacks, benign text segments from four categories, malicious images produced by attacks, and benign images from two categories. Next, we systematize both text-based and image-based detection methods. Finally, we evaluate their performance across multiple scenarios. Our key findings show that while some detectors can identify attacks that rely on explicit textual instructions or visible image perturbations with moderate to high accuracy, they largely fail against attacks that omit explicit instructions or employ imperceptible perturbations. Our datasets and code are released at: https://github.com/Norrrrrrr-lyn/WAInjectBench.

Paperid: 125, https://arxiv.org/pdf/2510.01342.pdf GitHub

Abstract:
With the rapid advancement of large language models (LLMs), ensuring their safe use becomes increasingly critical. Fine-tuning is a widely used method for adapting models to downstream tasks, yet it is vulnerable to jailbreak attacks. However, most existing studies focus on overly simplified attack scenarios, limiting their practical relevance to real-world defense settings. To make this risk concrete, we present a three-pronged jailbreak attack and evaluate it against provider defenses under a dataset-only black-box fine-tuning interface. In this setting, the attacker can only submit fine-tuning data to the provider, while the provider may deploy defenses across stages: (1) pre-upload data filtering, (2) training-time defensive fine-tuning, and (3) post-training safety audit. Our attack combines safety-styled prefix/suffix wrappers, benign lexical encodings (underscoring) of sensitive tokens, and a backdoor mechanism, enabling the model to learn harmful behaviors while individual datapoints appear innocuous. Extensive experiments demonstrate the effectiveness of our approach. In real-world deployment, our method successfully jailbreaks GPT-4.1 and GPT-4o on the OpenAI platform with attack success rates above 97% for both models. Our code is available at https://github.com/lxf728/tri-pronged-ft-attack.

Paperid: 126, https://arxiv.org/pdf/2510.01259.pdf GitHub

Abstract:
We probe OpenAI's open-weights 20-billion-parameter model gpt-oss-20b to study how sociopragmatic framing, language choice, and instruction hierarchy affect refusal behavior. Across 80 seeded iterations per scenario, we test several harm domains including ZIP-bomb construction (cyber threat), synthetic card-number generation, minor-unsafe driving advice, drug-precursor indicators, and RAG context exfiltration. Composite prompts that combine an educator persona, a safety-pretext ("what to avoid"), and step-cue phrasing flip assistance rates from 0% to 97.5% on a ZIP-bomb task. On our grid, formal registers in German and French are often leakier than matched English prompts. A "Linux terminal" role-play overrides a developer rule not to reveal context in a majority of runs with a naive developer prompt, and we introduce an AI-assisted hardening method that reduces leakage to 0% in several user-prompt variants. We further test evaluation awareness with a paired-track design and measure frame-conditioned differences between matched "helpfulness" and "harmfulness" evaluation prompts; we observe inconsistent assistance in 13% of pairs. Finally, we find that the OpenAI Moderation API under-captures materially helpful outputs relative to a semantic grader, and that refusal rates differ by 5 to 10 percentage points across inference stacks, raising reproducibility concerns. We release prompts, seeds, outputs, and code for reproducible auditing at https://github.com/ndurner/gpt-oss-rt-run .

Paperid: 127, https://arxiv.org/pdf/2509.24488.pdf GitHub

Abstract:
As Large Language Models (LLMs) achieve remarkable success across a wide range of applications, such as chatbots and code copilots, concerns surrounding the generation of harmful content have come increasingly into focus. Despite significant advances in aligning LLMs with safety and ethical standards, adversarial prompts can still be crafted to elicit undesirable responses. Existing mitigation strategies are predominantly based on post-hoc filtering, which introduces substantial latency or computational overhead, and is incompatible with token-level streaming generation. In this work, we introduce Self-Sanitize, a novel LLM-driven mitigation framework inspired by cognitive psychology, which emulates human self-monitor and self-repair behaviors during conversations. Self-Sanitize comprises a lightweight Self-Monitor module that continuously inspects high-level intentions within the LLM at the token level via representation engineering, and a Self-Repair module that performs in-place correction of harmful content without initiating separate review dialogues. This design allows for real-time streaming monitoring and seamless repair, with negligible impact on latency and resource utilization. Given that privacy-invasive content has often been insufficiently focused in previous studies, we perform extensive experiments on four LLMs across three privacy leakage scenarios. The results demonstrate that Self-Sanitize achieves superior mitigation performance with minimal overhead and without degrading the utility of LLMs, offering a practical and robust solution for safer LLM deployments. Our code is available at the following link: https://github.com/wjfu99/LLM_Self_Sanitize

Paperid: 128, https://arxiv.org/pdf/2509.23893.pdf GitHub

Abstract:
Catastrophic forgetting remains a critical challenge in continual learning for large language models (LLMs), where models struggle to retain performance on historical tasks when fine-tuning on new sequential data without access to past datasets. In this paper, we first reveal that the drift of functional directions during the fine-tuning process is a key reason why existing regularization-based methods fail in long-term LLM continual learning. To address this, we propose Dynamic Orthogonal Continual (DOC) fine-tuning, a novel approach that tracks the drift of these functional directions and dynamically updates them during the fine-tuning process. Furthermore, by adjusting the gradients of new task parameters to be orthogonal to the tracked historical function directions, our method mitigates interference between new and old tasks. Extensive experiments on various LLM continual learning benchmarks demonstrate that this approach outperforms prior methods, effectively reducing catastrophic forgetting and providing a robust tool for continuous LLM fine-tuning. Our code is available at https://github.com/meloxxxxxx/DOC.

Paperid: 129, https://arxiv.org/pdf/2509.23871.pdf GitHub

Abstract:
Knowledge distillation (KD) is a vital technique for deploying deep neural networks (DNNs) on resource-constrained devices by transferring knowledge from large teacher models to lightweight student models. While teacher models from third-party platforms may undergo security verification (\eg, backdoor detection), we uncover a novel and critical threat: distillation-conditional backdoor attacks (DCBAs). DCBA injects dormant and undetectable backdoors into teacher models, which become activated in student models via the KD process, even with clean distillation datasets. While the direct extension of existing methods is ineffective for DCBA, we implement this attack by formulating it as a bilevel optimization problem and proposing a simple yet effective method (\ie, SCAR). Specifically, the inner optimization simulates the KD process by optimizing a surrogate student model, while the outer optimization leverages outputs from this surrogate to optimize the teacher model for implanting the conditional backdoor. Our SCAR addresses this complex optimization utilizing an implicit differentiation algorithm with a pre-optimized trigger injection function. Extensive experiments across diverse datasets, model architectures, and KD techniques validate the effectiveness of our SCAR and its resistance against existing backdoor detection, highlighting a significant yet previously overlooked vulnerability in the KD process. Our code is available at https://github.com/WhitolfChen/SCAR.

Paperid: 130, https://arxiv.org/pdf/2509.23694.pdf GitHub

Abstract:
Search agents connect LLMs to the Internet, enabling access to broader and more up-to-date information. However, unreliable search results may also pose safety threats to end users, establishing a new threat surface. In this work, we conduct two in-the-wild experiments to demonstrate both the prevalence of low-quality search results and their potential to misguide agent behaviors. To counter this threat, we introduce an automated red-teaming framework that is systematic, scalable, and cost-efficient, enabling lightweight and harmless safety assessments of search agents. Building on this framework, we construct the SafeSearch benchmark, which includes 300 test cases covering five categories of risks (e.g., misinformation and indirect prompt injection). Using this benchmark, we evaluate three representative search agent scaffolds, covering search workflow, tool-calling, and deep research, across 7 proprietary and 8 open-source backend LLMs. Our results reveal substantial vulnerabilities of LLM-based search agents: when exposed to unreliable websites, the highest ASR reached 90.5% for GPT-4.1-mini under a search workflow setting. Moreover, our analysis highlights the limited effectiveness of common defense practices, such as reminder prompting. This emphasizes the value of our framework in promoting transparency for safer agent development. Our codebase and test cases are publicly available: https://github.com/jianshuod/SafeSearch.

Paperid: 131, https://arxiv.org/pdf/2509.23041.pdf GitHub

Abstract:
Synthetic data refers to artificial samples generated by models. While it has been validated to significantly enhance the performance of large language models (LLMs) during training and has been widely adopted in LLM development, potential security risks it may introduce remain uninvestigated. This paper systematically evaluates the resilience of synthetic-data-integrated training paradigm for LLMs against mainstream poisoning and backdoor attacks. We reveal that such a paradigm exhibits strong resistance to existing attacks, primarily thanks to the different distribution patterns between poisoning data and queries used to generate synthetic samples. To enhance the effectiveness of these attacks and further investigate the security risks introduced by synthetic data, we introduce a novel and universal attack framework, namely, Virus Infection Attack (VIA), which enables the propagation of current attacks through synthetic data even under purely clean queries. Inspired by the principles of virus design in cybersecurity, VIA conceals the poisoning payload within a protective "shell" and strategically searches for optimal hijacking points in benign samples to maximize the likelihood of generating malicious content. Extensive experiments on both data poisoning and backdoor attacks show that VIA significantly increases the presence of poisoning content in synthetic data and correspondingly raises the attack success rate (ASR) on downstream models to levels comparable to those observed in the poisoned upstream models.

Paperid: 132, https://arxiv.org/pdf/2509.22900.pdf GitHub

Abstract:
Lengthy and legally phrased privacy policies impede users' understanding of how mobile applications collect and process personal data. Prior work proposed Contextual Privacy Policies (CPPs) for mobile apps to display shorter policy snippets only in the corresponding user interface contexts, but the pipeline could not be deployable in real-world mobile environments. In this paper, we present PrivScan, the first deployable CPP Software Development Kit (SDK) for Android. It captures live app screenshots to identify GUI elements associated with types of personal data and displays CPPs in a concise, user-facing format. We provide a lightweight floating button that offers low-friction, on-demand control. The architecture leverages remote deployment to decouple the multimodal backend pipeline from a mobile client comprising five modular components, thereby reducing on-device resource demands and easing cross-platform portability. A feasibility-oriented evaluation shows an average execution time of 9.15\,s, demonstrating the practicality of our approach. The source code of PrivScan is available at https://github.com/buyanghc/PrivScan and the demo video can be found at https://www.youtube.com/watch?v=ck-25otfyHc.

Paperid: 133, https://arxiv.org/pdf/2509.21057.pdf GitHub

Abstract:
Semantic-level watermarking (SWM) for large language models (LLMs) enhances watermarking robustness against text modifications and paraphrasing attacks by treating the sentence as the fundamental unit. However, existing methods still lack strong theoretical guarantees of robustness, and reject-sampling-based generation often introduces significant distribution distortions compared with unwatermarked outputs. In this work, we introduce a new theoretical framework on SWM through the concept of proxy functions (PFs) $\unicode{x2013}$ functions that map sentences to scalar values. Building on this framework, we propose PMark, a simple yet powerful SWM method that estimates the PF median for the next sentence dynamically through sampling while enforcing multiple PF constraints (which we call channels) to strengthen watermark evidence. Equipped with solid theoretical guarantees, PMark achieves the desired distortion-free property and improves the robustness against paraphrasing-style attacks. We also provide an empirically optimized version that further removes the requirement for dynamical median estimation for better sampling efficiency. Experimental results show that PMark consistently outperforms existing SWM baselines in both text quality and robustness, offering a more effective paradigm for detecting machine-generated text. Our code will be released at [this URL](https://github.com/PMark-repo/PMark).

Paperid: 134, https://arxiv.org/pdf/2509.20589.pdf GitHub

Abstract:
Phishing attacks targeting both organizations and individuals are becoming an increasingly significant threat as technology advances. Current automatic detection methods often lack explainability and robustness in detecting new phishing attacks. In this work, we investigate the effectiveness of character-level deep learning models for phishing detection, which can provide both robustness and interpretability. We evaluate three neural architectures adapted to operate at the character level, namely CharCNN, CharGRU, and CharBiLSTM, on a custom-built email dataset, which combines data from multiple sources. Their performance is analyzed under three scenarios: (i) standard training and testing, (ii) standard training and testing under adversarial attacks, and (iii) training and testing with adversarial examples. Aiming to develop a tool that operates as a browser extension, we test all models under limited computational resources. In this constrained setup, CharGRU proves to be the best-performing model across all scenarios. All models show vulnerability to adversarial attacks, but adversarial training substantially improves their robustness. In addition, by adapting the Gradient-weighted Class Activation Mapping (Grad-CAM) technique to character-level inputs, we are able to visualize which parts of each email influence the decision of each model. Our open-source code and data is released at https://github.com/chipermaria/every-character-counts.

Paperid: 135, https://arxiv.org/pdf/2509.19396.pdf GitHub

Abstract:
Federated Learning (FL) is critical for edge and High Performance Computing (HPC) where data is not centralized and privacy is crucial. We present OmniFed, a modular framework designed around decoupling and clear separation of concerns for configuration, orchestration, communication, and training logic. Its architecture supports configuration-driven prototyping and code-level override-what-you-need customization. We also support different topologies, mixed communication protocols within a single deployment, and popular training algorithms. It also offers optional privacy mechanisms including Differential Privacy (DP), Homomorphic Encryption (HE), and Secure Aggregation (SA), as well as compression strategies. These capabilities are exposed through well-defined extension points, allowing users to customize topology and orchestration, learning logic, and privacy/compression plugins, all while preserving the integrity of the core system. We evaluate multiple models and algorithms to measure various performance metrics. By unifying topology configuration, mixed-protocol communication, and pluggable modules in one stack, OmniFed streamlines FL deployment across heterogeneous environments. Github repository is available at https://github.com/at-aaims/OmniFed.

Paperid: 136, https://arxiv.org/pdf/2509.18874.pdf GitHub

Abstract:
Regulatory limits on explicit targeting have not eliminated algorithmic profiling on the Web, as optimisation systems still adapt ad delivery to users' private attributes. The widespread availability of powerful zero-shot multimodal Large Language Models (LLMs) has dramatically lowered the barrier for exploiting these latent signals for adversarial inference. We investigate this emerging societal risk, specifically how adversaries can now exploit these signals to reverse-engineer private attributes from ad exposure alone. We introduce a novel pipeline that leverages LLMs as adversarial inference engines to perform natural language profiling. Applying this method to a longitudinal dataset comprising over 435,000 ad impressions collected from 891 users, we conducted a large-scale study to assess the feasibility and precision of inferring private attributes from passive online ad observations. Our results demonstrate that off-the-shelf LLMs can accurately reconstruct complex user private attributes, including party preference, employment status, and education level, consistently outperforming strong census-based priors and matching or exceeding human social perception, while operating at only a fraction of the cost (223$\times$ lower) and time (52$\times$ faster) required by humans. Critically, actionable profiling is feasible even within short observation windows, indicating that prolonged tracking is not a prerequisite for a successful attack. These findings provide the first empirical evidence that ad streams serve as a high-fidelity digital footprint, enabling off-platform profiling that inherently bypasses current platform safeguards, highlighting a systemic vulnerability in the ad ecosystem and the urgent need for responsible web AI governance in the generative AI era. The code is available at https://github.com/Breezelled/when-ads-become-profiles.

Paperid: 137, https://arxiv.org/pdf/2509.17832.pdf GitHub

Abstract:
Security practitioners face growing challenges in exploit assessment, as public vulnerability repositories are increasingly populated with inconsistent and low-quality exploit artifacts. Existing scoring systems, such as CVSS and EPSS, offer limited support for this task. They either rely on theoretical metrics or produce opaque probability estimates without assessing whether usable exploit code exists. In practice, security teams often resort to manual triage of exploit repositories, which is time-consuming, error-prone, and difficult to scale. We present AEAS, an automated system designed to assess and prioritize actionable exploits through static analysis. AEAS analyzes both exploit code and associated documentation to extract a structured set of features reflecting exploit availability, functionality, and setup complexity. It then computes an actionability score for each exploit and produces ranked exploit recommendations. We evaluate AEAS on a dataset of over 5,000 vulnerabilities derived from 600+ real-world applications frequently encountered by red teams. Manual validation and expert review on representative subsets show that AEAS achieves a 100% top-3 success rate in recommending functional exploits and shows strong alignment with expert-validated rankings. These results demonstrate the effectiveness of AEAS in supporting exploit-driven vulnerability prioritization.

Paperid: 138, https://arxiv.org/pdf/2509.17581.pdf GitHub

Abstract:
We propose a novel benchmark for camera identification via Photo Response Non-Uniformity (PRNU) estimation. The benchmark comprises 13K photos taken with 120+ cameras, where the training and test photos are taken in different scenarios, enabling ``in-the-wild'' evaluation. In addition, we propose a novel PRNU-based camera identification model that employs a hybrid architecture, comprising a denoising autoencoder to estimate the PRNU signal and a convolutional network that can perform 1:N verification of camera devices. Instead of using a conventional approach based on contrastive learning, our method takes the Hadamard product between reference and query PRNU signals as input. This novel design leads to significantly better results compared with state-of-the-art models based on denoising autoencoders and contrastive learning. We release our dataset and code at: https://github.com/CroitoruAlin/PRNU-Bench.

Paperid: 139, https://arxiv.org/pdf/2509.16861.pdf GitHub

Abstract:
Guardrails are critical for the safe deployment of Large Language Models (LLMs)-powered software. Unlike traditional rule-based systems with limited, predefined input-output spaces that inherently constrain unsafe behavior, LLMs enable open-ended, intelligent interactions--opening the door to jailbreak attacks through user inputs. Guardrails serve as a protective layer, filtering unsafe prompts before they reach the LLM. However, prior research shows that jailbreak attacks can still succeed over 70% of the time, even against advanced models like GPT-4o. While guardrails such as LlamaGuard report up to 95% accuracy, our preliminary analysis shows their performance can drop sharply--to as low as 12%--when confronted with unseen attacks. This highlights a growing software engineering challenge: how to build a post-deployment guardrail that adapts dynamically to emerging threats? To address this, we propose AdaptiveGuard, an adaptive guardrail that detects novel jailbreak attacks as out-of-distribution (OOD) inputs and learns to defend against them through a continual learning framework. Through empirical evaluation, AdaptiveGuard achieves 96% OOD detection accuracy, adapts to new attacks in just two update steps, and retains over 85% F1-score on in-distribution data post-adaptation, outperforming other baselines. These results demonstrate that AdaptiveGuard is a guardrail capable of evolving in response to emerging jailbreak strategies post deployment. We release our AdaptiveGuard and studied datasets at https://github.com/awsm-research/AdaptiveGuard to support further research.

Paperid: 140, https://arxiv.org/pdf/2509.15437.pdf GitHub

Abstract:
Adversarial perturbations in speech pose a serious threat to automatic speech recognition (ASR) and speaker verification by introducing subtle waveform modifications that remain imperceptible to humans but can significantly alter system outputs. While targeted attacks on end-to-end ASR models have been widely studied, the phonetic basis of these perturbations and their effect on speaker identity remain underexplored. In this work, we analyze adversarial audio at the phonetic level and show that perturbations exploit systematic confusions such as vowel centralization and consonant substitutions. These distortions not only mislead transcription but also degrade phonetic cues critical for speaker verification, leading to identity drift. Using DeepSpeech as our ASR target, we generate targeted adversarial examples and evaluate their impact on speaker embeddings across genuine and impostor samples. Results across 16 phonetically diverse target phrases demonstrate that adversarial audio induces both transcription errors and identity drift, highlighting the need for phonetic-aware defenses to ensure the robustness of ASR and speaker recognition systems.

Paperid: 141, https://arxiv.org/pdf/2509.15278.pdf GitHub

Abstract:
The ethical and legal imperative to share research data without causing harm requires careful attention to privacy risks. While mounting evidence demonstrates that data sharing benefits science, legitimate concerns persist regarding the potential leakage of personal information that could lead to reidentification and subsequent harm. We reviewed metadata accompanying neuroimaging datasets from six heterogeneous studies openly available on OpenNeuro, involving participants across the lifespan, from children to older adults, with and without clinical diagnoses, and including associated clinical score data. Using metaprivBIDS (https://github.com/CPernet/metaprivBIDS), a novel tool for the systematic assessment of privacy in tabular data, we found that privacy is generally well maintained, with serious vulnerabilities being rare. Nonetheless, minor issues were identified in nearly all datasets and warrant mitigation. Notably, clinical score data (e.g., neuropsychological results) posed minimal reidentification risk, whereas demographic variables (age, sex, race, income, and geolocation) represented the principal privacy vulnerabilities. We outline practical measures to address these risks, enabling safer data sharing practices.

Paperid: 142, https://arxiv.org/pdf/2509.14335.pdf GitHub

Abstract:
Automated malware classification has achieved strong detection performance. Yet, malware behavior auditing seeks causal and verifiable explanations of malicious activities -- essential not only to reveal what malware does but also to substantiate such claims with evidence. This task is challenging, as adversarial intent is often hidden within complex, framework-heavy applications, making manual auditing slow and costly. Large Language Models (LLMs) could help address this gap, but their auditing potential remains largely unexplored due to three limitations: (1) scarce fine-grained annotations for fair assessment; (2) abundant benign code obscuring malicious signals; and (3) unverifiable, hallucination-prone outputs undermining attribution credibility. To close this gap, we introduce MalEval, a comprehensive framework for fine-grained Android malware auditing, designed to evaluate how effectively LLMs support auditing under real-world constraints. MalEval provides expert-verified reports and an updated sensitive API list to mitigate ground truth scarcity and reduce noise via static reachability analysis. Function-level structural representations serve as intermediate attribution units for verifiable evaluation. Building on this, we define four analyst-aligned tasks -- function prioritization, evidence attribution, behavior synthesis, and sample discrimination -- together with domain-specific metrics and a unified workload-oriented score. We evaluate seven widely used LLMs on a curated dataset of recent malware and misclassified benign apps, offering the first systematic assessment of their auditing capabilities. MalEval reveals both promising potential and critical limitations across audit stages, providing a reproducible benchmark and foundation for future research on LLM-enhanced malware behavior auditing. MalEval is publicly available at https://github.com/ZhengXR930/MalEval.git

Paperid: 143, https://arxiv.org/pdf/2509.14284.pdf GitHub

Abstract:
As large language models (LLMs) become integral to multi-agent systems, new privacy risks emerge that extend beyond memorization, direct inference, or single-turn evaluations. In particular, seemingly innocuous responses, when composed across interactions, can cumulatively enable adversaries to recover sensitive information, a phenomenon we term compositional privacy leakage. We present the first systematic study of such compositional privacy leaks and possible mitigation methods in multi-agent LLM systems. First, we develop a framework that models how auxiliary knowledge and agent interactions jointly amplify privacy risks, even when each response is benign in isolation. Next, to mitigate this, we propose and evaluate two defense strategies: (1) Theory-of-Mind defense (ToM), where defender agents infer a questioner's intent by anticipating how their outputs may be exploited by adversaries, and (2) Collaborative Consensus Defense (CoDef), where responder agents collaborate with peers who vote based on a shared aggregated state to restrict sensitive information spread. Crucially, we balance our evaluation across compositions that expose sensitive information and compositions that yield benign inferences. Our experiments quantify how these defense strategies differ in balancing the privacy-utility trade-off. We find that while chain-of-thought alone offers limited protection to leakage (~39% sensitive blocking rate), our ToM defense substantially improves sensitive query blocking (up to 97%) but can reduce benign task success. CoDef achieves the best balance, yielding the highest Balanced Outcome (79.8%), highlighting the benefit of combining explicit reasoning with defender collaboration. Together, our results expose a new class of risks in collaborative LLM deployments and provide actionable insights for designing safeguards against compositional, context-driven privacy leakage.

Paperid: 144, https://arxiv.org/pdf/2509.13219.pdf GitHub

Abstract:
Traditional backdoor attacks in federated learning (FL) operate within constrained attack scenarios, as they depend on visible triggers and require physical modifications to the target object, which limits their practicality. To address this limitation, we introduce a novel backdoor attack prototype for FL called the out-of-distribution (OOD) backdoor attack ($\mathtt{OBA}$), which uses OOD data as both poisoned samples and triggers simultaneously. Our approach significantly broadens the scope of backdoor attack scenarios in FL. To improve the stealthiness of $\mathtt{OBA}$, we propose $\mathtt{SoDa}$, which regularizes both the magnitude and direction of malicious local models during local training, aligning them closely with their benign versions to evade detection. Empirical results demonstrate that $\mathtt{OBA}$ effectively circumvents state-of-the-art defenses while maintaining high accuracy on the main task. To address this security vulnerability in the FL system, we introduce $\mathtt{BNGuard}$, a new server-side defense method tailored against $\mathtt{SoDa}$. $\mathtt{BNGuard}$ leverages the observation that OOD data causes significant deviations in the running statistics of batch normalization layers. This allows $\mathtt{BNGuard}$ to identify malicious model updates and exclude them from aggregation, thereby enhancing the backdoor robustness of FL. Extensive experiments across various settings show the effectiveness of $\mathtt{BNGuard}$ on defending against $\mathtt{SoDa}$. The code is available at https://github.com/JiiahaoXU/SoDa-BNGuard.

Paperid: 145, https://arxiv.org/pdf/2509.13046.pdf GitHub

Abstract:
Synthetic data generation plays an important role in enabling data sharing, particularly in sensitive domains like healthcare and finance. Recent advances in diffusion models have made it possible to generate realistic, high-quality tabular data, but they may also memorize training records and leak sensitive information. Membership inference attacks (MIAs) exploit this vulnerability by determining whether a record was used in training. While MIAs have been studied in images and text, their use against tabular diffusion models remains underexplored despite the unique risks of structured attributes and limited record diversity. In this paper, we introduce MIAEPT, Membership Inference Attack via Error Prediction for Tabular Data, a novel black-box attack specifically designed to target tabular diffusion models. MIA-EPT constructs errorbased feature vectors by masking and reconstructing attributes of target records, disclosing membership signals based on how well these attributes are predicted. MIA-EPT operates without access to the internal components of the generative model, relying only on its synthetic data output, and was shown to generalize across multiple state-of-the-art diffusion models. We validate MIA-EPT on three diffusion-based synthesizers, achieving AUC-ROC scores of up to 0.599 and TPR@10% FPR values of 22.0% in our internal tests. Under the MIDST 2025 competition conditions, MIA-EPT achieved second place in the Black-box Multi-Table track (TPR@10% FPR = 20.0%). These results demonstrate that our method can uncover substantial membership leakage in synthetic tabular data, challenging the assumption that synthetic data is inherently privacy-preserving. Our code is publicly available at https://github.com/eyalgerman/MIA-EPT.

Paperid: 146, https://arxiv.org/pdf/2509.12341.pdf GitHub

Abstract:
We give a simple and provably correct replacement for the contested ``domain-extension'' in Step 9 of a recent windowed-QFT lattice algorithm with complex-Gaussian windows~\citep{chen2024quantum}. As acknowledged by the author, the reported issue is due to a periodicity/support mismatch when applying domain extension to only the first coordinate in the presence of offsets. Our drop-in subroutine replaces domain extension by a pair-shift difference that cancels all unknown offsets exactly and synthesizes a uniform cyclic subgroup (a zero-offset coset) of order $P$ inside $(\mathbb{Z}_{M_2})^n$. A subsequent QFT enforces the intended modular linear relation by plain character orthogonality. The sole structural assumption is a residue-accessibility condition enabling coherent auxiliary cleanup; no amplitude periodicity is used. The unitary is reversible, uses $\mathrm{poly}(\log M_2)$ gates, and preserves upstream asymptotics.

Paperid: 147, https://arxiv.org/pdf/2509.11250.pdf GitHub

Abstract:
GUI agents built on LVLMs are increasingly used to interact with websites. However, their exposure to open-world content makes them vulnerable to Environmental Injection Attacks (EIAs) that hijack agent behavior via webpage elements. Many recent studies assume the attacker to be a regular user who can only upload a single trigger image, which is more realistic than earlier assumptions of website-level administrative control. However, these works still fall short of realism: (1) the trigger's position and surrounding context remain largely fixed between training and testing, failing to capture the dynamic nature of real webpages and (2) the trigger often occupies an unrealistically large area, whereas real-world images are typically small. To better reflect real-world scenarios, we introduce a more realistic threat model where the attacker is a regular user and the trigger image is small and embedded within a dynamically changing environment. As a result, existing attacks prove largely ineffective under this threat model. To better expose the vulnerabilities of GUI agents, we propose Chameleon, an attack framework with two main novelties. The first is LLM-Driven Environment Simulation, which automatically generates diverse and high-fidelity webpage simulations. The second is Attention Black Hole, which transforms attention weights into explicit supervisory signals that guide the agent's focus toward the trigger region. We evaluate Chameleon on 6 realistic websites and 4 representative LVLM-powered GUI agents, where it significantly outperforms existing methods. Ablation studies confirm that both novelties are critical to performance. Our findings reveal underexplored vulnerabilities in modern GUI agents and establish a robust foundation for future research on defense in open-world GUI agent systems. The code is publicly available at https://github.com/zhangyitonggg/attack2gui.

Paperid: 148, https://arxiv.org/pdf/2509.11157.pdf GitHub GitHub

Abstract:
In recent years, sequence features such as packet length have received considerable attention due to their central role in encrypted traffic analysis. Existing sequence modeling approaches can be broadly categorized into flow-level and trace-level methods: the former suffer from high feature redundancy, limiting their discriminative power, whereas the latter preserve complete information but incur substantial computational and storage overhead. To address these limitations, we propose the \textbf{U}p-\textbf{D}own \textbf{F}low \textbf{S}equence (\textbf{UDFS}) representation, which compresses an entire trace into a two-dimensional sequence and characterizes each flow by the aggregate of its upstream and downstream traffic, reducing complexity while maintaining high discriminability. Furthermore, to address the challenge of class-specific discriminability differences, we propose an adaptive threshold mechanism that dynamically adjusts training weights and rejection boundaries, enhancing the model's classification performance. Experimental results demonstrate that the proposed method achieves superior classification performance and robustness on both coarse-grained and fine-grained datasets, as well as under concept drift and open-world scenarios. Code and Dataset are available at https://github.com/kid1999/UDFS.

Paperid: 149, https://arxiv.org/pdf/2509.10793.pdf GitHub

Abstract:
We present ORQ, a system that enables collaborative analysis of large private datasets using cryptographically secure multi-party computation (MPC). ORQ protects data against semi-honest or malicious parties and can efficiently evaluate relational queries with multi-way joins and aggregations that have been considered notoriously expensive under MPC. To do so, ORQ eliminates the quadratic cost of secure joins by leveraging the fact that, in practice, the structure of many real queries allows us to join records and apply the aggregations "on the fly" while keeping the result size bounded. On the system side, ORQ contributes generic oblivious operators, a data-parallel vectorized query engine, a communication layer that amortizes MPC network costs, and a dataflow API for expressing relational analytics -- all built from the ground up. We evaluate ORQ in LAN and WAN deployments on a diverse set of workloads, including complex queries with multiple joins and custom aggregations. When compared to state-of-the-art solutions, ORQ significantly reduces MPC execution times and can process one order of magnitude larger datasets. For our most challenging workload, the full TPC-H benchmark, we report results entirely under MPC with Scale Factor 10 -- a scale that had previously been achieved only with information leakage or the use of trusted third parties.

Paperid: 150, https://arxiv.org/pdf/2509.08463.pdf GitHub

Abstract:
In an era where misinformation spreads freely, fact-checking (FC) plays a crucial role in verifying claims and promoting reliable information. While automated fact-checking (AFC) has advanced significantly, existing systems remain vulnerable to adversarial attacks that manipulate or generate claims, evidence, or claim-evidence pairs. These attacks can distort the truth, mislead decision-makers, and ultimately undermine the reliability of FC models. Despite growing research interest in adversarial attacks against AFC systems, a comprehensive, holistic overview of key challenges remains lacking. These challenges include understanding attack strategies, assessing the resilience of current models, and identifying ways to enhance robustness. This survey provides the first in-depth review of adversarial attacks targeting FC, categorizing existing attack methodologies and evaluating their impact on AFC systems. Additionally, we examine recent advancements in adversary-aware defenses and highlight open research questions that require further exploration. Our findings underscore the urgent need for resilient FC frameworks capable of withstanding adversarial manipulations in pursuit of preserving high verification accuracy.

Paperid: 151, https://arxiv.org/pdf/2509.07225.pdf GitHub GitHub

Abstract:
Our team, All You Need Is A Fuzzing Brain, was one of seven finalists in DARPA's Artificial Intelligence Cyber Challenge (AIxCC), placing fourth in the final round. During the competition, we developed a Cyber Reasoning System (CRS) that autonomously discovered 28 security vulnerabilities - including six previously unknown zero-days - in real-world open-source C and Java projects, and successfully patched 14 of them. The complete CRS is open source at https://github.com/o2lab/afc-crs-all-you-need-is-a-fuzzing-brain. This paper provides a detailed technical description of our CRS, with an emphasis on its LLM-powered components and strategies. Building on AIxCC, we further introduce a public leaderboard for benchmarking state-of-the-art LLMs on vulnerability detection and patching tasks, derived from the AIxCC dataset. The leaderboard is available at https://o2lab.github.io/FuzzingBrain-Leaderboard/.

Paperid: 152, https://arxiv.org/pdf/2509.06838.pdf GitHub

Abstract:
Large Language Models (LLMs), trained on extensive datasets using advanced deep learning architectures, have demonstrated remarkable performance across a wide range of language tasks, becoming a cornerstone of modern AI technologies. However, ensuring their trustworthiness remains a critical challenge, as reliability is essential not only for accurate performance but also for upholding ethical, cultural, and social values. Careful alignment of training data and culturally grounded evaluation criteria are vital for developing responsible AI systems. In this study, we introduce the EPT (Evaluation of Persian Trustworthiness) metric, a culturally informed benchmark specifically designed to assess the trustworthiness of LLMs across six key aspects: truthfulness, safety, fairness, robustness, privacy, and ethical alignment. We curated a labeled dataset and evaluated the performance of several leading models - including ChatGPT, Claude, DeepSeek, Gemini, Grok, LLaMA, Mistral, and Qwen - using both automated LLM-based and human assessments. Our results reveal significant deficiencies in the safety dimension, underscoring the urgent need for focused attention on this critical aspect of model behavior. Furthermore, our findings offer valuable insights into the alignment of these models with Persian ethical-cultural values and highlight critical gaps and opportunities for advancing trustworthy and culturally responsible AI. The dataset is publicly available at: https://github.com/Rezamirbagheri110/EPT-Benchmark.

Paperid: 153, https://arxiv.org/pdf/2509.06796.pdf GitHub

Abstract:
A Membership Inference Attack (MIA) assesses how much a target machine learning model reveals about its training data by determining whether specific query instances were part of the training set. State-of-the-art MIAs rely on training hundreds of shadow models that are independent of the target model, leading to significant computational overhead. In this paper, we introduce Imitative Membership Inference Attack (IMIA), which employs a novel imitative training technique to strategically construct a small number of target-informed imitative models that closely replicate the target model's behavior for inference. Extensive experimental results demonstrate that IMIA substantially outperforms existing MIAs in various attack settings while only requiring less than 5% of the computational cost of state-of-the-art approaches.

Paperid: 154, https://arxiv.org/pdf/2509.06550.pdf GitHub

Abstract:
Network intrusion detection remains a critical challenge in cybersecurity. While supervised machine learning models achieve state-of-the-art performance, their reliance on large labelled datasets makes them impractical for many real-world applications. Anomaly detection methods, which train exclusively on benign traffic to identify malicious activity, suffer from high false positive rates, limiting their usability. Recently, self-supervised learning techniques have demonstrated improved performance with lower false positive rates by learning discriminative latent representations of benign traffic. In particular, contrastive self-supervised models achieve this by minimizing the distance between similar (positive) views of benign traffic while maximizing it between dissimilar (negative) views. Existing approaches generate positive views through data augmentation and treat other samples as negative. In contrast, this work introduces Contrastive Learning using Augmented Negative pairs (CLAN), a novel paradigm for network intrusion detection where augmented samples are treated as negative views - representing potentially malicious distributions - while other benign samples serve as positive views. This approach enhances both classification accuracy and inference efficiency after pretraining on benign traffic. Experimental evaluation on the Lycos2017 dataset demonstrates that the proposed method surpasses existing self-supervised and anomaly detection techniques in a binary classification task. Furthermore, when fine-tuned on a limited labelled dataset, the proposed approach achieves superior multi-class classification performance compared to existing self-supervised models.

Paperid: 155, https://arxiv.org/pdf/2509.06336.pdf GitHub

Abstract:
Recent face anti-spoofing (FAS) methods have shown remarkable cross-domain performance by employing vision-language models like CLIP. However, existing CLIP-based FAS models do not fully exploit CLIP's patch embedding tokens, failing to detect critical spoofing clues. Moreover, these models rely on a single text prompt per class (e.g., 'live' or 'fake'), which limits generalization. To address these issues, we propose MVP-FAS, a novel framework incorporating two key modules: Multi-View Slot attention (MVS) and Multi-Text Patch Alignment (MTPA). Both modules utilize multiple paraphrased texts to generate generalized features and reduce dependence on domain-specific text. MVS extracts local detailed spatial features and global context from patch embeddings by leveraging diverse texts with multiple perspectives. MTPA aligns patches with multiple text representations to improve semantic robustness. Extensive experiments demonstrate that MVP-FAS achieves superior generalization performance, outperforming previous state-of-the-art methods on cross-domain datasets. Code: https://github.com/Elune001/MVP-FAS.

Paperid: 156, https://arxiv.org/pdf/2509.06264.pdf GitHub

Abstract:
Differentially Private Stochastic Gradient Descent (DP-SGD) is a standard method for enforcing privacy in deep learning, typically using the Gaussian mechanism to perturb gradient updates. However, conventional mechanisms such as Gaussian and Laplacian noise are parameterized only by variance or scale. This single degree of freedom ties the magnitude of noise directly to both privacy loss and utility degradation, preventing independent control of these two factors. The problem becomes more pronounced when the number of composition rounds T and batch size B vary across tasks, as these variations induce task-dependent shifts in the privacy-utility trade-off, where small changes in noise parameters can disproportionately affect model accuracy. To address this limitation, we introduce PLRV-O, a framework that defines a broad search space of parameterized DP-SGD noise distributions, where privacy loss moments are tightly characterized yet can be optimized more independently with respect to utility loss. This formulation enables systematic adaptation of noise to task-specific requirements, including (i) model size, (ii) training duration, (iii) batch sampling strategies, and (iv) clipping thresholds under both training and fine-tuning settings. Empirical results demonstrate that PLRV-O substantially improves utility under strict privacy constraints. On CIFAR-10, a fine-tuned ViT achieves 94.03% accuracy at epsilon approximately 0.5, compared to 83.93% with Gaussian noise. On SST-2, RoBERTa-large reaches 92.20% accuracy at epsilon approximately 0.2, versus 50.25% with Gaussian.

Paperid: 157, https://arxiv.org/pdf/2509.06026.pdf GitHub

Abstract:
While Retrieval-Augmented Generation (RAG) effectively reduces hallucinations by integrating external knowledge bases, it introduces vulnerabilities to membership inference attacks (MIAs), particularly in systems handling sensitive data. Existing MIAs targeting RAG's external databases often rely on model responses but ignore the interference of non-member-retrieved documents on RAG outputs, limiting their effectiveness. To address this, we propose DCMI, a differential calibration MIA that mitigates the negative impact of non-member-retrieved documents. Specifically, DCMI leverages the sensitivity gap between member and non-member retrieved documents under query perturbation. It generates perturbed queries for calibration to isolate the contribution of member-retrieved documents while minimizing the interference from non-member-retrieved documents. Experiments under progressively relaxed assumptions show that DCMI consistently outperforms baselines--for example, achieving 97.42% AUC and 94.35% Accuracy against the RAG system with Flan-T5, exceeding the MBA baseline by over 40%. Furthermore, on real-world RAG platforms such as Dify and MaxKB, DCMI maintains a 10%-20% advantage over the baseline. These results highlight significant privacy risks in RAG systems and emphasize the need for stronger protection mechanisms. We appeal to the community's consideration of deeper investigations, like ours, against the data leakage risks in rapidly evolving RAG systems. Our code is available at https://github.com/Xinyu140203/RAG_MIA.

Paperid: 158, https://arxiv.org/pdf/2509.05429.pdf GitHub

Abstract:
Graph Neural Networks (GNNs) have emerged as powerful models for learning from graph-structured data. However, their widespread adoption has raised serious privacy concerns. While prior research has primarily focused on edge-level privacy, a critical yet underexplored threat lies in topology privacy - the confidentiality of the graph's overall structure. In this work, we present a comprehensive study on topology privacy risks in GNNs, revealing their vulnerability to graph-level inference attacks. To this end, we propose a suite of Topology Inference Attacks (TIAs) that can reconstruct the structure of a target training graph using only black-box access to a GNN model. Our findings show that GNNs are highly susceptible to these attacks, and that existing edge-level differential privacy mechanisms are insufficient as they either fail to mitigate the risk or severely compromise model accuracy. To address this challenge, we introduce Private Graph Reconstruction (PGR), a novel defense framework designed to protect topology privacy while maintaining model accuracy. PGR is formulated as a bi-level optimization problem, where a synthetic training graph is iteratively generated using meta-gradients, and the GNN model is concurrently updated based on the evolving graph. Extensive experiments demonstrate that PGR significantly reduces topology leakage with minimal impact on model accuracy. Our code is available at https://github.com/JeffffffFu/PGR.

Paperid: 159, https://arxiv.org/pdf/2509.05265.pdf GitHub

Abstract:
Federated learning (FL) combined with local differential privacy (LDP) enables privacy-preserving model training across decentralized data sources. However, the decentralized data-management paradigm leaves LDPFL vulnerable to participants with malicious intent. The robustness of LDPFL protocols, particularly against model poisoning attacks (MPA), where adversaries inject malicious updates to disrupt global model convergence, remains insufficiently studied. In this paper, we propose a novel and extensible model poisoning attack framework tailored for LDPFL settings. Our approach is driven by the objective of maximizing the global training loss while adhering to local privacy constraints. To counter robust aggregation mechanisms such as Multi-Krum and trimmed mean, we develop adaptive attacks that embed carefully crafted constraints into a reverse training process, enabling evasion of these defenses. We evaluate our framework across three representative LDPFL protocols, three benchmark datasets, and two types of deep neural networks. Additionally, we investigate the influence of data heterogeneity and privacy budgets on attack effectiveness. Experimental results demonstrate that our adaptive attacks can significantly degrade the performance of the global model, revealing critical vulnerabilities and highlighting the need for more robust LDPFL defense strategies against MPA. Our code is available at https://github.com/ZiJW/LDPFL-Attack

Paperid: 160, https://arxiv.org/pdf/2509.03871.pdf GitHub GitHub

Abstract:
The development of Long-CoT reasoning has advanced LLM performance across various tasks, including language understanding, complex problem solving, and code generation. This paradigm enables models to generate intermediate reasoning steps, thereby improving both accuracy and interpretability. However, despite these advancements, a comprehensive understanding of how CoT-based reasoning affects the trustworthiness of language models remains underdeveloped. In this paper, we survey recent work on reasoning models and CoT techniques, focusing on five core dimensions of trustworthy reasoning: truthfulness, safety, robustness, fairness, and privacy. For each aspect, we provide a clear and structured overview of recent studies in chronological order, along with detailed analyses of their methodologies, findings, and limitations. Future research directions are also appended at the end for reference and discussion. Overall, while reasoning techniques hold promise for enhancing model trustworthiness through hallucination mitigation, harmful content detection, and robustness improvement, cutting-edge reasoning models themselves often suffer from comparable or even greater vulnerabilities in safety, robustness, and privacy. By synthesizing these insights, we hope this work serves as a valuable and timely resource for the AI safety community to stay informed on the latest progress in reasoning trustworthiness. A full list of related papers can be found at \href{https://github.com/ybwang119/Awesome-reasoning-safety}{https://github.com/ybwang119/Awesome-reasoning-safety}.

Paperid: 161, https://arxiv.org/pdf/2509.03807.pdf GitHub

Abstract:
To identify malicious Android applications, various malware detection techniques have been proposed. Among them, image-based approaches are considered potential alternatives due to their efficiency and scalability. Recent studies have reported that these approaches suffer significant performance declines when confronted with obfuscation or concept drift. However, existing solutions often treat these two challenges as different problems, offering independent solutions. These techniques overlook the fact that both challenges share a common statistical root, out-of-distribution, and research from this perspective remains limited. In response, we propose BIDO, a hybrid image-based malware detector designed to enhance robustness against both obfuscation and concept drift simultaneously. Specifically, to improve the discriminative power of image features, we introduce a local feature selection module that identifies informative subregions within malware images. Second, to enhance feature robustness, we model pairwise cross-modal dependencies in an outer product space, enabling the extraction of stable co-occurrence patterns. Third, to ensure feature compactness, we design a learnable metric that pulls samples with identical labels closer while pushing apart those with different labels, regardless of obfuscation or concept drift. Extensive experiments on the real-world datasets demonstrate that BIDO significantly outperforms existing baselines, achieving higher robustness against both concept drift and obfuscation. The source code is available at: https://github.com/whatishope/BIDO/.

Paperid: 162, https://arxiv.org/pdf/2509.03487.pdf GitHub

Abstract:
Proteins play crucial roles in almost all biological processes. The advancement of deep learning has greatly accelerated the development of protein foundation models, leading to significant successes in protein understanding and design. However, the lack of systematic red-teaming for these models has raised serious concerns about their potential misuse, such as generating proteins with biological safety risks. This paper introduces SafeProtein, the first red-teaming framework designed for protein foundation models to the best of our knowledge. SafeProtein combines multimodal prompt engineering and heuristic beam search to systematically design red-teaming methods and conduct tests on protein foundation models. We also curated SafeProtein-Bench, which includes a manually constructed red-teaming benchmark dataset and a comprehensive evaluation protocol. SafeProtein achieved continuous jailbreaks on state-of-the-art protein foundation models (up to 70% attack success rate for ESM3), revealing potential biological safety risks in current protein foundation models and providing insights for the development of robust security protection technologies for frontier models. The codes will be made publicly available at https://github.com/jigang-fan/SafeProtein.

Paperid: 163, https://arxiv.org/pdf/2509.03117.pdf GitHub

Abstract:
The rapid progress of large language models (LLMs) has greatly enhanced reasoning tasks and facilitated the development of LLM-based applications. A critical factor in improving LLM-based applications is the design of effective system prompts, which significantly impact the behavior and output quality of LLMs. However, system prompts are susceptible to theft and misuse, which could undermine the interests of prompt owners. Existing methods protect prompt copyrights through watermark injection and verification but face challenges due to their reliance on intermediate LLM outputs (e.g., logits), which limits their practical feasibility. In this paper, we propose PromptCOS, a method for auditing prompt copyright based on content-level output similarity. It embeds watermarks by optimizing the prompt while simultaneously co-optimizing a special verification query and content-level signal marks. This is achieved by leveraging cyclic output signals and injecting auxiliary tokens to ensure reliable auditing in content-only scenarios. Additionally, it incorporates cover tokens to protect the watermark from malicious deletion. For copyright verification, PromptCOS identifies unauthorized usage by comparing the similarity between the suspicious output and the signal mark. Experimental results demonstrate that our method achieves high effectiveness (99.3% average watermark similarity), strong distinctiveness (60.8% greater than the best baseline), high fidelity (accuracy degradation of no more than 0.58%), robustness (resilience against three types of potential attacks), and computational efficiency (up to 98.1% reduction in computational cost). Our code is available at GitHub https://github.com/LianPing-cyber/PromptCOS.

Paperid: 164, https://arxiv.org/pdf/2509.03058.pdf GitHub

Abstract:
The proliferation of large language models (LLMs) has intensified concerns over model theft and license violations, necessitating robust and stealthy ownership verification. Existing fingerprinting methods either require impractical white-box access or introduce detectable statistical anomalies. We propose EverTracer, a novel gray-box fingerprinting framework that ensures stealthy and robust model provenance tracing. EverTracer is the first to repurpose Membership Inference Attacks (MIAs) for defensive use, embedding ownership signals via memorization instead of artificial trigger-output overfitting. It consists of Fingerprint Injection, which fine-tunes the model on any natural language data without detectable artifacts, and Verification, which leverages calibrated probability variation signal to distinguish fingerprinted models. This approach remains robust against adaptive adversaries, including input level modification, and model-level modifications. Extensive experiments across architectures demonstrate EverTracer's state-of-the-art effectiveness, stealthness, and resilience, establishing it as a practical solution for securing LLM intellectual property. Our code and data are publicly available at https://github.com/Xuzhenhua55/EverTracer.

Paperid: 165, https://arxiv.org/pdf/2509.00820.pdf GitHub

Abstract:
With the rapid advancement of large language models (LLMs), safeguarding intellectual property (IP) has become increasingly critical. To address the challenges of high costs and potential contamination in fingerprint integration, we propose LoRA-FP, a lightweight, plug-and-play framework that embeds backdoor fingerprints into LoRA adapters through constrained fine-tuning. This design enables seamless fingerprint transplantation via parameter fusion, eliminating the need for full-parameter updates while preserving model integrity. Experimental results demonstrate that LoRA-FP not only significantly reduces computational overhead compared to conventional approaches but also achieves superior robustness across diverse scenarios, including incremental training and model fusion. Our code and datasets are publicly available at https://github.com/Xuzhenhua55/LoRA-FP.

Paperid: 166, https://arxiv.org/pdf/2509.00332.pdf GitHub

Abstract:
Face recognition is central to many authentication, security, and personalized applications. Yet, it suffers from significant privacy risks, particularly arising from unauthorized access to sensitive biometric data. This paper introduces CryptoFace, the first end-to-end encrypted face recognition system with fully homomorphic encryption (FHE). It enables secure processing of facial data across all stages of a face-recognition process--feature extraction, storage, and matching--without exposing raw images or features. We introduce a mixture of shallow patch convolutional networks to support higher-dimensional tensors via patch-based processing while reducing the multiplicative depth and, thus, inference latency. Parallel FHE evaluation of these networks ensures near-resolution-independent latency. On standard face recognition benchmarks, CryptoFace significantly accelerates inference and increases verification accuracy compared to the state-of-the-art FHE neural networks adapted for face recognition. CryptoFace will facilitate secure face recognition systems requiring robust and provable security. The code is available at https://github.com/human-analysis/CryptoFace.

Paperid: 167, https://arxiv.org/pdf/2508.20848.pdf GitHub

Abstract:
Accurately determining whether a jailbreak attempt has succeeded is a fundamental yet unresolved challenge. Existing evaluation methods rely on misaligned proxy indicators or naive holistic judgments. They frequently misinterpret model responses, leading to inconsistent and subjective assessments that misalign with human perception. To address this gap, we introduce JADES (Jailbreak Assessment via Decompositional Scoring), a universal jailbreak evaluation framework. Its key mechanism is to automatically decompose an input harmful question into a set of weighted sub-questions, score each sub-answer, and weight-aggregate the sub-scores into a final decision. JADES also incorporates an optional fact-checking module to strengthen the detection of hallucinations in jailbreak responses. We validate JADES on JailbreakQR, a newly introduced benchmark proposed in this work, consisting of 400 pairs of jailbreak prompts and responses, each meticulously annotated by humans. In a binary setting (success/failure), JADES achieves 98.5% agreement with human evaluators, outperforming strong baselines by over 9%. Re-evaluating five popular attacks on four LLMs reveals substantial overestimation (e.g., LAA's attack success rate on GPT-3.5-Turbo drops from 93% to 69%). Our results show that JADES could deliver accurate, consistent, and interpretable evaluations, providing a reliable basis for measuring future jailbreak attacks.

Paperid: 168, https://arxiv.org/pdf/2508.20643.pdf GitHub

Abstract:
Large Language Model (LLM) agents are powerful tools for automating complex tasks. In cybersecurity, researchers have primarily explored their use in red-team operations such as vulnerability discovery and penetration tests. Defensive uses for incident response and forensics have received comparatively less attention and remain at an early stage. This work presents a systematic study of LLM-agent design for the forensic investigation of realistic web application attacks. We propose CyberSleuth, an autonomous agent that processes packet-level traces and application logs to identify the targeted service, the exploited vulnerability (CVE), and attack success. We evaluate the consequences of core design decisions - spanning tool integration and agent architecture - and provide interpretable guidance for practitioners. We benchmark four agent architectures and six LLM backends on 20 incident scenarios of increasing complexity, identifying CyberSleuth as the best-performing design. In a separate set of 10 incidents from 2025, CyberSleuth correctly identifies the exact CVE in 80% of cases. At last, we conduct a human study with 22 experts, which rated the reports of CyberSleuth as complete, useful, and coherent. They also expressed a slight preference for DeepSeek R1, a good news for open source LLM. To foster progress in defensive LLM research, we release both our benchmark and the CyberSleuth platform as a foundation for fair, reproducible evaluation of forensic agents.

Paperid: 169, https://arxiv.org/pdf/2508.20282.pdf GitHub

Abstract:
We show that Web and Research Agents (WRAs) -- language model-based systems that investigate complex topics on the Internet -- are vulnerable to inference attacks by passive network adversaries such as ISPs. These agents could be deployed locally by organizations and individuals for privacy, legal, or financial purposes. Unlike sporadic web browsing by humans, WRAs visit $70{-}140$ domains with distinguishable timing correlations, enabling unique fingerprinting attacks. Specifically, we demonstrate a novel prompt and user trait leakage attack against WRAs that only leverages their network-level metadata (i.e., visited IP addresses and their timings). We start by building a new dataset of WRA traces based on user search queries and queries generated by synthetic personas. We define a behavioral metric (called OBELS) to comprehensively assess similarity between original and inferred prompts, showing that our attack recovers over 73% of the functional and domain knowledge of user prompts. Extending to a multi-session setting, we recover up to 19 of 32 latent traits with high accuracy. Our attack remains effective under partial observability and noisy conditions. Finally, we discuss mitigation strategies that constrain domain diversity or obfuscate traces, showing negligible utility impact while reducing attack effectiveness by an average of 29%.

Paperid: 170, https://arxiv.org/pdf/2508.20228.pdf GitHub

Abstract:
Recent advances in LLM watermarking methods such as SynthID-Text by Google DeepMind offer promising solutions for tracing the provenance of AI-generated text. However, our robustness assessment reveals that SynthID-Text is vulnerable to meaning-preserving attacks, such as paraphrasing, copy-paste modifications, and back-translation, which can significantly degrade watermark detectability. To address these limitations, we propose SynGuard, a hybrid framework that combines the semantic alignment strength of Semantic Information Retrieval (SIR) with the probabilistic watermarking mechanism of SynthID-Text. Our approach jointly embeds watermarks at both lexical and semantic levels, enabling robust provenance tracking while preserving the original meaning. Experimental results across multiple attack scenarios show that SynGuard improves watermark recovery by an average of 11.1\% in F1 score compared to SynthID-Text. These findings demonstrate the effectiveness of semantic-aware watermarking in resisting real-world tampering. All code, datasets, and evaluation scripts are publicly available at: https://github.com/githshine/SynGuard.

Paperid: 171, https://arxiv.org/pdf/2508.19843.pdf GitHub

Abstract:
The broad capabilities and substantial resources required to train Large Language Models (LLMs) make them valuable intellectual property, yet they remain vulnerable to copyright infringement, such as unauthorized use and model theft. LLM fingerprinting, a non-intrusive technique that extracts and compares the distinctive features from LLMs to identify infringements, offers a promising solution to copyright auditing. However, its reliability remains uncertain due to the prevalence of diverse model modifications and the lack of standardized evaluation. In this SoK, we present the first comprehensive study of LLM fingerprinting. We introduce a unified framework and formal taxonomy that categorizes existing methods into white-box and black-box approaches, providing a structured overview of the state of the art. We further propose LeaFBench, the first systematic benchmark for evaluating LLM fingerprinting under realistic deployment scenarios. Built upon mainstream foundation models and comprising 149 distinct model instances, LeaFBench integrates 13 representative post-development techniques, spanning both parameter-altering methods (e.g., fine-tuning, quantization) and parameter-independent mechanisms (e.g., system prompts, RAG). Extensive experiments on LeaFBench reveal the strengths and weaknesses of existing methods, thereby outlining future research directions and critical open problems in this emerging field. The code is available at https://github.com/shaoshuo-ss/LeaFBench.

Paperid: 172, https://arxiv.org/pdf/2508.19641.pdf GitHub

Abstract:
Graph-structured data, which captures non-Euclidean relationships and interactions between entities, is growing in scale and complexity. As a result, training state-of-the-art graph machine learning (GML) models have become increasingly resource-intensive, turning these models and data into invaluable Intellectual Property (IP). To address the resource-intensive nature of model training, graph-based Machine-Learning-as-a-Service (GMLaaS) has emerged as an efficient solution by leveraging third-party cloud services for model development and management. However, deploying such models in GMLaaS also exposes them to potential threats from attackers. Specifically, while the APIs within a GMLaaS system provide interfaces for users to query the model and receive outputs, they also allow attackers to exploit and steal model functionalities or sensitive training data, posing severe threats to the safety of these GML models and the underlying graph data. To address these challenges, this survey systematically introduces the first taxonomy of threats and defenses at the level of both GML model and graph-structured data. Such a tailored taxonomy facilitates an in-depth understanding of GML IP protection. Furthermore, we present a systematic evaluation framework to assess the effectiveness of IP protection methods, introduce a curated set of benchmark datasets across various domains, and discuss their application scopes and future challenges. Finally, we establish an open-sourced versatile library named PyGIP, which evaluates various attack and defense techniques in GMLaaS scenarios and facilitates the implementation of existing benchmark methods. The library resource can be accessed at: https://labrai.github.io/PyGIP. We believe this survey will play a fundamental role in intellectual property protection for GML and provide practical recipes for the GML community.

Paperid: 173, https://arxiv.org/pdf/2508.19493.pdf GitHub

Abstract:
Smartphones bring significant convenience to users but also enable devices to extensively record various types of personal information. Existing smartphone agents powered by Multimodal Large Language Models (MLLMs) have achieved remarkable performance in automating different tasks. However, as the cost, these agents are granted substantial access to sensitive users' personal information during this operation. To gain a thorough understanding of the privacy awareness of these agents, we present the first large-scale benchmark encompassing 7,138 scenarios to the best of our knowledge. In addition, for privacy context in scenarios, we annotate its type (e.g., Account Credentials), sensitivity level, and location. We then carefully benchmark seven available mainstream smartphone agents. Our results demonstrate that almost all benchmarked agents show unsatisfying privacy awareness (RA), with performance remaining below 60% even with explicit hints. Overall, closed-source agents show better privacy ability than open-source ones, and Gemini 2.0-flash achieves the best, achieving an RA of 67%. We also find that the agents' privacy detection capability is highly related to scenario sensitivity level, i.e., the scenario with a higher sensitivity level is typically more identifiable. We hope the findings enlighten the research community to rethink the unbalanced utility-privacy tradeoff about smartphone agents. Our code and benchmark are available at https://zhixin-l.github.io/SAPA-Bench.

Paperid: 174, https://arxiv.org/pdf/2508.19292.pdf GitHub GitHub

Abstract:
Large language models (LLMs) generate human-aligned content under certain safety constraints. However, the current known technique ``jailbreak prompt'' can circumvent safety-aligned measures and induce LLMs to output malicious content. Research on Jailbreaking can help identify vulnerabilities in LLMs and guide the development of robust security frameworks. To circumvent the issue of attack templates becoming obsolete as models evolve, existing methods adopt iterative mutation and dynamic optimization to facilitate more automated jailbreak attacks. However, these methods face two challenges: inefficiency and repetitive optimization, as they overlook the value of past attack experiences. To better integrate past attack experiences to assist current jailbreak attempts, we propose the \textbf{JailExpert}, an automated jailbreak framework, which is the first to achieve a formal representation of experience structure, group experiences based on semantic drift, and support the dynamic updating of the experience pool. Extensive experiments demonstrate that JailExpert significantly improves both attack effectiveness and efficiency. Compared to the current state-of-the-art black-box jailbreak methods, JailExpert achieves an average increase of 17\% in attack success rate and 2.7 times improvement in attack efficiency. Our implementation is available at \href{https://github.com/xiZAIzai/JailExpert}{XiZaiZai/JailExpert}

Paperid: 175, https://arxiv.org/pdf/2508.19273.pdf GitHub

Abstract:
The proliferation of cloud-integrated IoT systems has intensified exposure to Distributed Denial of Service (DDoS) attacks due to the expanded attack surface, heterogeneous device behaviors, and limited edge protection. However, DDoS detection in this context remains challenging because of complex traffic dynamics, severe class imbalance, and scarce labeled data. While recent methods have explored solutions to address class imbalance, many still struggle to generalize under limited supervision and dynamic traffic conditions. To overcome these challenges, we propose MixGAN, a hybrid detection method that integrates conditional generation, semi-supervised learning, and robust feature extraction. Specifically, to handle complex temporal traffic patterns, we design a 1-D WideResNet backbone composed of temporal convolutional layers with residual connections, which effectively capture local burst patterns in traffic sequences. To alleviate class imbalance and label scarcity, we use a pretrained CTGAN to generate synthetic minority-class (DDoS attack) samples that complement unlabeled data. Furthermore, to mitigate the effect of noisy pseudo-labels, we introduce a MixUp-Average-Sharpen (MAS) strategy that constructs smoothed and sharpened targets by averaging predictions over augmented views and reweighting them towards high-confidence classes. Experiments on NSL-KDD, BoT-IoT, and CICIoT2023 demonstrate that MixGAN achieves up to 2.5% higher accuracy and 4% improvement in both TPR and TNR compared to state-of-the-art methods, confirming its robustness in large-scale IoT-cloud environments. The source code is publicly available at https://github.com/0xCavaliers/MixGAN.

Paperid: 176, https://arxiv.org/pdf/2508.18805.pdf GitHub

Abstract:
Vision-Language Models (VLMs) are increasingly deployed in real-world applications, but their high inference cost makes them vulnerable to resource consumption attacks. Prior attacks attempt to extend VLM output sequences by optimizing adversarial images, thereby increasing inference costs. However, these extended outputs often introduce irrelevant abnormal content, compromising attack stealthiness. This trade-off between effectiveness and stealthiness poses a major limitation for existing attacks. To address this challenge, we propose \textit{Hidden Tail}, a stealthy resource consumption attack that crafts prompt-agnostic adversarial images, inducing VLMs to generate maximum-length outputs by appending special tokens invisible to users. Our method employs a composite loss function that balances semantic preservation, repetitive special token induction, and suppression of the end-of-sequence (EOS) token, optimized via a dynamic weighting strategy. Extensive experiments show that \textit{Hidden Tail} outperforms existing attacks, increasing output length by up to 19.2$\times$ and reaching the maximum token limit, while preserving attack stealthiness. These results highlight the urgent need to improve the robustness of VLMs against efficiency-oriented adversarial threats. Our code is available at https://github.com/zhangrui4041/Hidden_Tail.

Paperid: 177, https://arxiv.org/pdf/2508.18649.pdf GitHub

Abstract:
Safeguarding vision-language models (VLMs) is a critical challenge, as existing methods often suffer from over-defense, which harms utility, or rely on shallow alignment, failing to detect complex threats that require deep reasoning. To this end, we introduce PRISM (Principled Reasoning for Integrated Safety in Multimodality), a system2-like framework that aligns VLMs by embedding a structured, safety-aware reasoning process. Our framework consists of two key components: PRISM-CoT, a dataset that teaches safety-aware chain-of-thought reasoning, and PRISM-DPO, generated via Monte Carlo Tree Search (MCTS) to further refine this reasoning through Direct Preference Optimization to help obtain a delicate safety boundary. Comprehensive evaluations demonstrate PRISM's effectiveness, achieving remarkably low attack success rates including 0.15% on JailbreakV-28K for Qwen2-VL and 90% improvement over the previous best method on VLBreak for LLaVA-1.5. PRISM also exhibits strong robustness against adaptive attacks, significantly increasing computational costs for adversaries, and generalizes effectively to out-of-distribution challenges, reducing attack success rates to just 8.70% on the challenging multi-image MIS benchmark. Remarkably, this robust defense is achieved while preserving, and in some cases enhancing, model utility. To promote reproducibility, we have made our code, data, and model weights available at https://github.com/SaFoLab-WISC/PRISM.

Paperid: 178, https://arxiv.org/pdf/2508.15141.pdf GitHub

Abstract:
There is a flurry of recent research papers proposing novel differentially private machine learning (DPML) techniques. These papers claim to achieve new state-of-the-art (SoTA) results and offer empirical results as validation. However, there is no consensus on which techniques are most effective or if they genuinely meet their stated claims. Complicating matters, heterogeneity in codebases, datasets, methodologies, and model architectures make direct comparisons of different approaches challenging. In this paper, we conduct a reproducibility and replicability (R+R) experiment on 11 different SoTA DPML techniques from the recent research literature. Results of our investigation are varied: while some methods stand up to scrutiny, others falter when tested outside their initial experimental conditions. We also discuss challenges unique to the reproducibility of DPML, including additional randomness due to DP noise, and how to address them. Finally, we derive insights and best practices to obtain scientifically valid and reliable results.

Paperid: 179, https://arxiv.org/pdf/2508.15031.pdf GitHub

Abstract:
Machine learning (ML) models have significantly grown in complexity and utility, driving advances across multiple domains. However, substantial computational resources and specialized expertise have historically restricted their wide adoption. Machine-Learning-as-a-Service (MLaaS) platforms have addressed these barriers by providing scalable, convenient, and affordable access to sophisticated ML models through user-friendly APIs. While this accessibility promotes widespread use of advanced ML capabilities, it also introduces vulnerabilities exploited through Model Extraction Attacks (MEAs). Recent studies have demonstrated that adversaries can systematically replicate a target model's functionality by interacting with publicly exposed interfaces, posing threats to intellectual property, privacy, and system security. In this paper, we offer a comprehensive survey of MEAs and corresponding defense strategies. We propose a novel taxonomy that classifies MEAs according to attack mechanisms, defense approaches, and computing environments. Our analysis covers various attack techniques, evaluates their effectiveness, and highlights challenges faced by existing defenses, particularly the critical trade-off between preserving model utility and ensuring security. We further assess MEAs within different computing paradigms and discuss their technical, ethical, legal, and societal implications, along with promising directions for future research. This systematic survey aims to serve as a valuable reference for researchers, practitioners, and policymakers engaged in AI security and privacy. Additionally, we maintain an online repository continuously updated with related literature at https://github.com/kzhao5/ModelExtractionPapers.

Paperid: 180, https://arxiv.org/pdf/2508.12384.pdf GitHub

Abstract:
Ensemble-based attacks have been proven to be effective in enhancing adversarial transferability by aggregating the outputs of models with various architectures. However, existing research primarily focuses on refining ensemble weights or optimizing the ensemble path, overlooking the exploration of ensemble models to enhance the transferability of adversarial attacks. To address this gap, we propose applying adversarial augmentation to the surrogate models, aiming to boost overall generalization of ensemble models and reduce the risk of adversarial overfitting. Meanwhile, observing that ensemble Vision Transformers (ViTs) gain less attention, we propose ViT-EnsembleAttack based on the idea of model adversarial augmentation, the first ensemble-based attack method tailored for ViTs to the best of our knowledge. Our approach generates augmented models for each surrogate ViT using three strategies: Multi-head dropping, Attention score scaling, and MLP feature mixing, with the associated parameters optimized by Bayesian optimization. These adversarially augmented models are ensembled to generate adversarial examples. Furthermore, we introduce Automatic Reweighting and Step Size Enlargement modules to boost transferability. Extensive experiments demonstrate that ViT-EnsembleAttack significantly enhances the adversarial transferability of ensemble-based attacks on ViTs, outperforming existing methods by a substantial margin. Code is available at https://github.com/Trustworthy-AI-Group/TransferAttack.

Paperid: 181, https://arxiv.org/pdf/2508.12072.pdf GitHub

Abstract:
Despite extensive safety-tuning, large language models (LLMs) remain vulnerable to jailbreak attacks via adversarially crafted instructions, reflecting a persistent trade-off between safety and task performance. In this work, we propose Intent-FT, a simple and lightweight fine-tuning approach that explicitly trains LLMs to infer the underlying intent of an instruction before responding. By fine-tuning on a targeted set of adversarial instructions, Intent-FT enables LLMs to generalize intent deduction to unseen attacks, thereby substantially improving their robustness. We comprehensively evaluate both parametric and non-parametric attacks across open-source and proprietary models, considering harmfulness from attacks, utility, over-refusal, and impact against white-box threats. Empirically, Intent-FT consistently mitigates all evaluated attack categories, with no single attack exceeding a 50\% success rate -- whereas existing defenses remain only partially effective. Importantly, our method preserves the model's general capabilities and reduces excessive refusals on benign instructions containing superficially harmful keywords. Furthermore, models trained with Intent-FT accurately identify hidden harmful intent in adversarial attacks, and these learned intentions can be effectively transferred to enhance vanilla model defenses. We publicly release our code at https://github.com/wj210/Intent_Jailbreak.

Paperid: 182, https://arxiv.org/pdf/2508.10390.pdf GitHub

Abstract:
Evaluating jailbreak attacks is challenging when prompts are not overtly harmful or fail to induce harmful outputs. Unfortunately, many existing red-teaming datasets contain such unsuitable prompts. To evaluate attacks accurately, these datasets need to be assessed and cleaned for maliciousness. However, existing malicious content detection methods rely on either manual annotation, which is labor-intensive, or large language models (LLMs), which have inconsistent accuracy in harmful types. To balance accuracy and efficiency, we propose a hybrid evaluation framework named MDH (Malicious content Detection based on LLMs with Human assistance) that combines LLM-based annotation with minimal human oversight, and apply it to dataset cleaning and detection of jailbroken responses. Furthermore, we find that well-crafted developer messages can significantly boost jailbreak success, leading us to propose two new strategies: D-Attack, which leverages context simulation, and DH-CoT, which incorporates hijacked chains of thought. The Codes, datasets, judgements, and detection results will be released in github repository: https://github.com/AlienZhang1996/DH-CoT.

Paperid: 183, https://arxiv.org/pdf/2508.10052.pdf GitHub

Abstract:
In this paper, we present NetMoniAI, an agentic AI framework for automatic network monitoring and security that integrates decentralized analysis with lightweight centralized coordination. The framework consists of two layers: autonomous micro-agents at each node perform local traffic analysis and anomaly detection. A central controller then aggregates insights across nodes to detect coordinated attacks and maintain system-wide situational awareness. We evaluated NetMoniAI on a local micro-testbed and through NS-3 simulations. Results confirm that the two-tier agentic-AI design scales under resource constraints, reduces redundancy, and improves response time without compromising accuracy. To facilitate broader adoption and reproducibility, the complete framework is available as open source. This enables researchers and practitioners to replicate, validate, and extend it across diverse network environments and threat scenarios. Github link: https://github.com/pzambare3/NetMoniAI

Paperid: 184, https://arxiv.org/pdf/2508.09288.pdf GitHub

Abstract:
Large language models (LLMs) remain acutely vulnerable to prompt injection and related jailbreak attacks; heuristic guardrails (rules, filters, LLM judges) are routinely bypassed. We present Contextual Integrity Verification (CIV), an inference-time security architecture that attaches cryptographically signed provenance labels to every token and enforces a source-trust lattice inside the transformer via a pre-softmax hard attention mask (with optional FFN/residual gating). CIV provides deterministic, per-token non-interference guarantees on frozen models: lower-trust tokens cannot influence higher-trust representations. On benchmarks derived from recent taxonomies of prompt-injection vectors (Elite-Attack + SoK-246), CIV attains 0% attack success rate under the stated threat model while preserving 93.1% token-level similarity and showing no degradation in model perplexity on benign tasks; we note a latency overhead attributable to a non-optimized data path. Because CIV is a lightweight patch -- no fine-tuning required -- we demonstrate drop-in protection for Llama-3-8B and Mistral-7B. We release a reference implementation, an automated certification harness, and the Elite-Attack corpus to support reproducible research.

Paperid: 185, https://arxiv.org/pdf/2508.09060.pdf GitHub

Abstract:
Intrusion Detection Systems (IDS) are a vital part of a network-connected device. In this paper, we develop a deep learning based intrusion detection system that is deployed in a distributed setup across devices connected to a network. Our aim is to better equip deep learning models against unknown attacks using knowledge from known attacks. To this end, we develop algorithms to maximize the number of transferability relationships. We propose a Convolutional Neural Network (CNN) model, along with two algorithms that maximize the number of relationships observed. One is a two step data pre-processing stage, and the other is a Block-Based Smart Aggregation (BBSA) algorithm. The proposed system succeeds in achieving superior transferability performance while maintaining impressive local detection rates. We also show that our method is generalizable, exhibiting transferability potential across datasets and even with different backbones. The code for this work can be found at https://github.com/ghosh64/tabfidsv2.

Paperid: 186, https://arxiv.org/pdf/2508.09056.pdf GitHub

Abstract:
Intrusion Detection Systems (IDS) have an increasingly important role in preventing exploitation of network vulnerabilities by malicious actors. Recent deep learning based developments have resulted in significant improvements in the performance of IDS systems. In this paper, we present FetFIDS, where we explore the employment of feature embedding instead of positional embedding to improve intrusion detection performance of a transformer based deep learning system. Our model is developed with the aim of deployments in edge learning scenarios, where federated learning over multiple communication rounds can ensure both privacy and localized performance improvements. FetFIDS outperforms multiple state-of-the-art intrusion detection systems in a federated environment and demonstrates a high degree of suitability to federated learning. The code for this work can be found at https://github.com/ghosh64/fetfids.

Paperid: 187, https://arxiv.org/pdf/2508.08789.pdf GitHub

Authors:Yuchu Jiang, Jian Zhao, Yuchen Yuan, Tianle Zhang, Yao Huang, Yanghao Zhang, Yan Wang, Yanshu Li, Xizhong Guo, Yusheng Zhao, Jun Zhang, Zhi Zhang, Xiaojian Lin, Yixiu Zou, Haoxuan Ma, Yuhu Shang, Yuzhi Hu, Keshu Cai, Ruochen Zhang, Boyuan Chen, Yilan Gao, Ziheng Jiao, Yi Qin, Shuangjun Du, Xiao Tong, Zhekun Liu, Yu Chen, Xuankun Rong, Rui Wang, Yejie Zheng, Zhaoxin Fan, Murat Sensoy, Hongyuan Zhang, Pan Zhou, Lei Jin, Hao Zhao, Xu Yang, Jiaojiao Zhao, Jianshu Li, Joey Tianyi Zhou, Zhi-Qi Cheng, Longtao Huang, Zhiyi Liu, Zheng Zhu, Jianan Li, Gang Wang, Qi Li, Xu-Yao Zhang, Yaodong Yang, Mang Ye, Wenqi Ren, Zhaofeng He, Hang Su, Rongrong Ni, Liping Jing, Xingxing Wei, Junliang Xing, Massimo Alioto, Shengmei Shen, Petia Radeva, Dacheng Tao, Ya-Qin Zhang, Shuicheng Yan, Chi Zhang, Zhongjiang He, Xuelong Li

Abstract:
The rapid advancement of AI has expanded its capabilities across domains, yet introduced critical technical vulnerabilities, such as algorithmic bias and adversarial sensitivity, that pose significant societal risks, including misinformation, inequity, security breaches, physical harm, and eroded public trust. These challenges highlight the urgent need for robust AI governance. We propose a comprehensive framework integrating technical and societal dimensions, structured around three interconnected pillars: Intrinsic Security (system reliability), Derivative Security (real-world harm mitigation), and Social Ethics (value alignment and accountability). Uniquely, our approach unifies technical methods, emerging evaluation benchmarks, and policy insights to promote transparency, accountability, and trust in AI systems. Through a systematic review of over 300 studies, we identify three core challenges: (1) the generalization gap, where defenses fail against evolving threats; (2) inadequate evaluation protocols that overlook real-world risks; and (3) fragmented regulations leading to inconsistent oversight. These shortcomings stem from treating governance as an afterthought, rather than a foundational design principle, resulting in reactive, siloed efforts that fail to address the interdependence of technical integrity and societal trust. To overcome this, we present an integrated research agenda that bridges technical rigor with social responsibility. Our framework offers actionable guidance for researchers, engineers, and policymakers to develop AI systems that are not only robust and secure but also ethically aligned and publicly trustworthy. The accompanying repository is available at https://github.com/ZTianle/Awesome-AI-SG.

Paperid: 188, https://arxiv.org/pdf/2508.06325.pdf GitHub

Abstract:
With the advancement of personalized image generation technologies, concerns about forgery attacks that infringe on portrait rights and privacy are growing. To address these concerns, protection perturbation algorithms have been developed to disrupt forgery generation. However, the protection algorithms would become ineffective when forgery attackers apply purification techniques to bypass the protection. To address this issue, we present a novel approach, Anti-Tamper Perturbation (ATP). ATP introduces a tamper-proof mechanism within the perturbation. It consists of protection and authorization perturbations, where the protection perturbation defends against forgery attacks, while the authorization perturbation detects purification-based tampering. Both protection and authorization perturbations are applied in the frequency domain under the guidance of a mask, ensuring that the protection perturbation does not disrupt the authorization perturbation. This design also enables the authorization perturbation to be distributed across all image pixels, preserving its sensitivity to purification-based tampering. ATP demonstrates its effectiveness in defending forgery attacks across various attack settings through extensive experiments, providing a robust solution for protecting individuals' portrait rights and privacy. Our code is available at: https://github.com/Seeyn/Anti-Tamper-Perturbation .

Paperid: 189, https://arxiv.org/pdf/2508.06153.pdf GitHub

Abstract:
With the development of customized large language model (LLM) agents, a new threat of black-box backdoor attacks has emerged, where malicious instructions are injected into hidden system prompts. These attacks easily bypass existing defenses that rely on white-box access, posing a serious security challenge. To address this, we propose SLIP, a Soft Label mechanism and key-extraction-guided CoT-based defense against Instruction backdoors in APIs. SLIP is designed based on two key insights. First, to counteract the model's oversensitivity to triggers, we propose a Key-extraction-guided Chain-of-Thought (KCoT). Instead of only considering the single trigger or the input sentence, KCoT prompts the agent to extract task-relevant key phrases. Second, to guide the LLM toward correct answers, our proposed Soft Label Mechanism (SLM) prompts the agent to quantify the semantic correlation between key phrases and candidate answers. Crucially, to mitigate the influence of residual triggers or misleading content in phrases extracted by KCoT, which typically causes anomalous scores, SLM excludes anomalous scores deviating significantly from the mean and subsequently averages the remaining scores to derive a more reliable semantic representation. Extensive experiments on classification and question-answer (QA) tasks demonstrate that SLIP is highly effective, reducing the average attack success rate (ASR) from 90.2% to 25.13% while maintaining high accuracy on clean data and outperforming state-of-the-art defenses. Our code are available in https://github.com/CAU-ISS-Lab/Backdoor-Attack-Defense-LLMs/tree/main/SLIP.

Paperid: 190, https://arxiv.org/pdf/2508.05691.pdf GitHub

Abstract:
Generative models are increasingly adopted in high-stakes domains, yet current deployments offer no mechanisms to verify whether a given output truly originates from the certified model. We address this gap by extending model fingerprinting techniques beyond the traditional collaborative setting to one where the model provider itself may act adversarially, replacing the certified model with a cheaper or lower-quality substitute. To our knowledge, this is the first work to study fingerprinting for provenance attribution under such a threat model. Our approach introduces a trusted verifier that, during a certification phase, extracts hidden fingerprints from the authentic model's output space and trains a detector to recognize them. During verification, this detector can determine whether new outputs are consistent with the certified model, without requiring specialized hardware or model modifications. In extensive experiments, our methods achieve near-zero FPR@95%TPR on both GANs and diffusion models, and remain effective even against subtle architectural or training changes. Furthermore, the approach is robust to adaptive adversaries that actively manipulate outputs in an attempt to evade detection.

Paperid: 191, https://arxiv.org/pdf/2508.05689.pdf GitHub

Abstract:
Deep neural networks are susceptible to adversarial examples while suffering from incorrect predictions via imperceptible perturbations. Transfer-based attacks create adversarial examples for surrogate models and transfer these examples to target models under black-box scenarios. Recent studies reveal that adversarial examples in flat loss landscapes exhibit superior transferability to alleviate overfitting on surrogate models. However, the prior arts overlook the influence of perturbation directions, resulting in limited transferability. In this paper, we propose a novel attack method, named Residual Perturbation Attack (ResPA), relying on the residual gradient as the perturbation direction to guide the adversarial examples toward the flat regions of the loss function. Specifically, ResPA conducts an exponential moving average on the input gradients to obtain the first moment as the reference gradient, which encompasses the direction of historical gradients. Instead of heavily relying on the local flatness that stems from the current gradients as the perturbation direction, ResPA further considers the residual between the current gradient and the reference gradient to capture the changes in the global perturbation direction. The experimental results demonstrate the better transferability of ResPA than the existing typical transfer-based attack methods, while the transferability can be further improved by combining ResPA with the current input transformation methods. The code is available at https://github.com/ZezeTao/ResPA.

Paperid: 192, https://arxiv.org/pdf/2508.05675.pdf GitHub

Abstract:
Recent years have witnessed growing interest in adopting large language models (LLMs) for Register Transfer Level (RTL) code optimization. While powerful cloud-based LLMs offer superior optimization capabilities, they pose unacceptable intellectual property (IP) leakage risks when processing proprietary hardware designs. In this paper, we propose a new scenario where Verilog code must be optimized for specific attributes without leaking sensitive IP information. We introduce the first IP-preserving edge-cloud collaborative framework that leverages the benefits of both paradigms. Our approach employs local small LLMs (e.g., Qwen-2.5-Coder-7B) to perform secure comparative analysis between paired high-quality target designs and novice draft codes, yielding general design principles that summarize key insights for improvements. These principles are then used to query stronger cloud LLMs (e.g., Deepseek-V3) for targeted code improvement, ensuring that only abstracted and IP-safe guidance reaches external services. Our experimental results demonstrate that the framework achieves significantly higher optimization success rates compared to baseline methods. For example, combining Qwen-2.5-Coder-7B and Deepseek-V3 achieves a 66.67\% optimization success rate for power utilization, outperforming Deepseek-V3 alone (49.81\%) and even commercial models like GPT-4o (55.81\%). Further investigation of local and cloud LLM combinations reveals that different model pairings exhibit varying strengths for specific optimization objectives, with interesting trends emerging when varying the number of comparative code pairs. Our work establishes a new paradigm for secure hardware design optimization that balances performance gains with IP protection.

Paperid: 193, https://arxiv.org/pdf/2508.05674.pdf GitHub GitHub

Abstract:
Recent advances in LLM agentic systems have improved the automation of offensive security tasks, particularly for Capture the Flag (CTF) challenges. We systematically investigate the key factors that drive agent success and provide a detailed recipe for building effective LLM-based offensive security agents. First, we present CTFJudge, a framework leveraging LLM as a judge to analyze agent trajectories and provide granular evaluation across CTF solving steps. Second, we propose a novel metric, CTF Competency Index (CCI) for partial correctness, revealing how closely agent solutions align with human-crafted gold standards. Third, we examine how LLM hyperparameters, namely temperature, top-p, and maximum token length, influence agent performance and automated cybersecurity task planning. For rapid evaluation, we present CTFTiny, a curated benchmark of 50 representative CTF challenges across binary exploitation, web, reverse engineering, forensics, and cryptography. Our findings identify optimal multi-agent coordination settings and lay the groundwork for future LLM agent research in cybersecurity. We make CTFTiny open source to public https://github.com/NYU-LLM-CTF/CTFTiny along with CTFJudge on https://github.com/NYU-LLM-CTF/CTFJudge.

Paperid: 194, https://arxiv.org/pdf/2508.05087.pdf GitHub GitHub

Abstract:
Jailbreak attacks against multimodal large language Models (MLLMs) are a significant research focus. Current research predominantly focuses on maximizing attack success rate (ASR), often overlooking whether the generated responses actually fulfill the attacker's malicious intent. This oversight frequently leads to low-quality outputs that bypass safety filters but lack substantial harmful content. To address this gap, we propose JPS, \underline{J}ailbreak MLLMs with collaborative visual \underline{P}erturbation and textual \underline{S}teering, which achieves jailbreaks via corporation of visual image and textually steering prompt. Specifically, JPS utilizes target-guided adversarial image perturbations for effective safety bypass, complemented by "steering prompt" optimized via a multi-agent system to specifically guide LLM responses fulfilling the attackers' intent. These visual and textual components undergo iterative co-optimization for enhanced performance. To evaluate the quality of attack outcomes, we propose the Malicious Intent Fulfillment Rate (MIFR) metric, assessed using a Reasoning-LLM-based evaluator. Our experiments show JPS sets a new state-of-the-art in both ASR and MIFR across various MLLMs and benchmarks, with analyses confirming its efficacy. Codes are available at \href{https://github.com/thu-coai/JPS}{https://github.com/thu-coai/JPS}. \color{warningcolor}{Warning: This paper contains potentially sensitive contents.}

Paperid: 195, https://arxiv.org/pdf/2508.03793.pdf GitHub

Abstract:
Long-context large language models (LLMs), such as Gemini-2.5-Pro and Claude-Sonnet-4, are increasingly used to empower advanced AI systems, including retrieval-augmented generation (RAG) pipelines and autonomous agents. In these systems, an LLM receives an instruction along with a context--often consisting of texts retrieved from a knowledge database or memory--and generates a response that is contextually grounded by following the instruction. Recent studies have designed solutions to trace back to a subset of texts in the context that contributes most to the response generated by the LLM. These solutions have numerous real-world applications, including performing post-attack forensic analysis and improving the interpretability and trustworthiness of LLM outputs. While significant efforts have been made, state-of-the-art solutions such as TracLLM often lead to a high computation cost, e.g., it takes TracLLM hundreds of seconds to perform traceback for a single response-context pair. In this work, we propose AttnTrace, a new context traceback method based on the attention weights produced by an LLM for a prompt. To effectively utilize attention weights, we introduce two techniques designed to enhance the effectiveness of AttnTrace, and we provide theoretical insights for our design choice. We also perform a systematic evaluation for AttnTrace. The results demonstrate that AttnTrace is more accurate and efficient than existing state-of-the-art context traceback methods. We also show that AttnTrace can improve state-of-the-art methods in detecting prompt injection under long contexts through the attribution-before-detection paradigm. As a real-world application, we demonstrate that AttnTrace can effectively pinpoint injected instructions in a paper designed to manipulate LLM-generated reviews. The code is at https://github.com/Wang-Yanting/AttnTrace.

Paperid: 196, https://arxiv.org/pdf/2508.03097.pdf GitHub

Abstract:
With the advancement of Large Language Models (LLMs), LLM applications have expanded into a growing number of fields. However, users with data privacy concerns face limitations in directly utilizing LLM APIs, while private deployments incur significant computational demands. This creates a substantial challenge in achieving secure LLM adaptation under constrained local resources. To address this issue, collaborative learning methods, such as Split Learning (SL), offer a resource-efficient and privacy-preserving solution for adapting LLMs to private domains. In this study, we introduce VFLAIR-LLM (available at https://github.com/FLAIR-THU/VFLAIR-LLM), an extensible and lightweight split learning framework for LLMs, enabling privacy-preserving LLM inference and fine-tuning in resource-constrained environments. Our library provides two LLM partition settings, supporting three task types and 18 datasets. In addition, we provide standard modules for implementing and evaluating attacks and defenses. We benchmark 5 attacks and 9 defenses under various Split Learning for LLM(SL-LLM) settings, offering concrete insights and recommendations on the choice of model partition configurations, defense strategies, and relevant hyperparameters for real-world applications.

Paperid: 197, https://arxiv.org/pdf/2508.02115.pdf GitHub

Abstract:
Backdoor detection is currently the mainstream defense against backdoor attacks in federated learning (FL), where malicious clients upload poisoned updates that compromise the global model and undermine the reliability of FL deployments. Existing backdoor detection techniques fall into two categories, including passive and proactive ones, depending on whether the server proactively modifies the global model. However, both have inherent limitations in practice: passive defenses are vulnerable to common non-i.i.d. data distributions and random participation of FL clients, whereas current proactive defenses suffer inevitable out-of-distribution (OOD) bias because they rely on backdoor co-existence effects. To address these issues, we introduce a new proactive defense, dubbed Coward, inspired by our discovery of multi-backdoor collision effects, in which consecutively planted, distinct backdoors significantly suppress earlier ones. In general, we detect attackers by evaluating whether the server-injected, conflicting global watermark is erased during local training rather than retained. Our method preserves the advantages of proactive defenses in handling data heterogeneity (\ie, non-i.i.d. data) while mitigating the adverse impact of OOD bias through a revised detection mechanism. Extensive experiments on benchmark datasets confirm the effectiveness of Coward and its resilience to potential adaptive attacks. The code for our method would be available at https://github.com/still2009/cowardFL.

Paperid: 198, https://arxiv.org/pdf/2508.01887.pdf GitHub GitHub

Abstract:
AI-generated text detectors have become essential tools for maintaining content authenticity, yet their robustness against evasion attacks remains questionable. We present PDFuzz, a novel attack that exploits the discrepancy between visual text layout and extraction order in PDF documents. Our method preserves exact textual content while manipulating character positioning to scramble extraction sequences. We evaluate this approach against the ArguGPT detector using a dataset of human and AI-generated text. Our results demonstrate complete evasion: detector performance drops from (93.6 $\pm$ 1.4) % accuracy and 0.938 $\pm$ 0.014 F1 score to random-level performance ((50.4 $\pm$ 3.2) % accuracy, 0.0 F1 score) while maintaining perfect visual fidelity. Our work reveals a vulnerability in current detection systems that is inherent to PDF document structures and underscores the need for implementing sturdy safeguards against such attacks. We make our code publicly available at https://github.com/ACMCMC/PDFuzz.

Paperid: 199, https://arxiv.org/pdf/2508.01647.pdf GitHub

Abstract:
As backdoor attacks become more stealthy and robust, they reveal critical weaknesses in current defense strategies: detection methods often rely on coarse-grained feature statistics, and purification methods typically require full retraining or additional clean models. To address these challenges, we propose DUP (Detection-guided Unlearning for Purification), a unified framework that integrates backdoor detection with unlearning-based purification. The detector captures feature-level anomalies by jointly leveraging class-agnostic distances and inter-layer transitions. These deviations are integrated through a weighted scheme to identify poisoned inputs, enabling more fine-grained analysis. Based on the detection results, we purify the model through a parameter-efficient unlearning mechanism that avoids full retraining and does not require any external clean model. Specifically, we innovatively repurpose knowledge distillation to guide the student model toward increasing its output divergence from the teacher on detected poisoned samples, effectively forcing it to unlearn the backdoor behavior. Extensive experiments across diverse attack methods and language model architectures demonstrate that DUP achieves superior defense performance in detection accuracy and purification efficacy. Our code is available at https://github.com/ManHu2025/DUP.

Paperid: 200, https://arxiv.org/pdf/2508.01595.pdf GitHub

Abstract:
Although existing backdoor defenses have gained success in mitigating backdoor attacks, they still face substantial challenges. In particular, most of them rely on large amounts of clean data to weaken the backdoor mapping but generally struggle with residual trigger effects, resulting in persistently high attack success rates (ASR). Therefore, in this paper, we propose a novel Backdoor defense method based on Directional mapping module and adversarial Knowledge Distillation (BeDKD), which balances the trade-off between defense effectiveness and model performance using a small amount of clean and poisoned data. We first introduce a directional mapping module to identify poisoned data, which destroys clean mapping while keeping backdoor mapping on a small set of flipped clean data. Then, the adversarial knowledge distillation is designed to reinforce clean mapping and suppress backdoor mapping through a cycle iteration mechanism between trust and punish distillations using clean and identified poisoned data. We conduct experiments to mitigate mainstream attacks on three datasets, and experimental results demonstrate that BeDKD surpasses the state-of-the-art defenses and reduces the ASR by 98% without significantly reducing the CACC. Our code are available in https://github.com/CAU-ISS-Lab/Backdoor-Attack-Defense-LLMs/tree/main/BeDKD.

Paperid: 201, https://arxiv.org/pdf/2508.01554.pdf GitHub

Abstract:
Prompt-based adversarial attacks have become an effective means to assess the robustness of large language models (LLMs). However, existing approaches often treat prompts as monolithic text, overlooking their structural heterogeneity-different prompt components contribute unequally to adversarial robustness. Prior works like PromptRobust assume prompts are value-neutral, but our analysis reveals that complex, domain-specific prompts with rich structures have components with differing vulnerabilities. To address this gap, we introduce PromptAnatomy, an automated framework that dissects prompts into functional components and generates diverse, interpretable adversarial examples by selectively perturbing each component using our proposed method, ComPerturb. To ensure linguistic plausibility and mitigate distribution shifts, we further incorporate a perplexity (PPL)-based filtering mechanism. As a complementary resource, we annotate four public instruction-tuning datasets using the PromptAnatomy framework, verified through human review. Extensive experiments across these datasets and five advanced LLMs demonstrate that ComPerturb achieves state-of-the-art attack success rates. Ablation studies validate the complementary benefits of prompt dissection and PPL filtering. Our results underscore the importance of prompt structure awareness and controlled perturbation for reliable adversarial robustness evaluation in LLMs. Code and data are available at https://github.com/Yujiaaaaa/PACP.

Paperid: 202, https://arxiv.org/pdf/2508.00910.pdf GitHub

Abstract:
Large Language Models (LLMs) have achieved remarkable success in software engineering tasks when trained with executable runtime environments, particularly in resolving GitHub issues. However, such runtime environments are often unavailable in other domains, especially cybersecurity, where challenge configurations and execution contexts are ephemeral or restricted. We present Cyber-Zero, the first runtime-free framework for synthesizing high-quality agent trajectories to train cybersecurity LLMs. Cyber-Zero leverages publicly available CTF writeups and employs persona-driven LLM simulation to reverse-engineer runtime behaviors and generate realistic, long-horizon interaction sequences without actual environments. Using trajectories synthesized by Cyber-Zero, we train LLM-based agents that achieve up to 13.1% absolute performance gains over baseline models on three prominent CTF benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench. Our best model, Cyber-Zero-32B, establishes new state-of-the-art performance among open-weight models, matching the capabilities of proprietary systems like DeepSeek-V3-0324 and Claude-3.5-Sonnet while offering superior cost-effectiveness, and demonstrating that runtime-free trajectory synthesis can effectively democratize the development of state-of-the-art cybersecurity agents.

Paperid: 203, https://arxiv.org/pdf/2508.00649.pdf GitHub

Abstract:
Developing reliable defenses against patch attacks on object detectors has attracted increasing interest. However, we identify that existing defense evaluations lack a unified and comprehensive framework, resulting in inconsistent and incomplete assessments of current methods. To address this issue, we revisit 11 representative defenses and present the first patch defense benchmark, involving 2 attack goals, 13 patch attacks, 11 object detectors, and 4 diverse metrics. This leads to the large-scale adversarial patch dataset with 94 types of patches and 94,000 images. Our comprehensive analyses reveal new insights: (1) The difficulty in defending against naturalistic patches lies in the data distribution, rather than the commonly believed high frequencies. Our new dataset with diverse patch distributions can be used to improve existing defenses by 15.09% AP@0.5. (2) The average precision of the attacked object, rather than the commonly pursued patch detection accuracy, shows high consistency with defense performance. (3) Adaptive attacks can substantially bypass existing defenses, and defenses with complex/stochastic models or universal patch properties are relatively robust. We hope that our analyses will serve as guidance on properly evaluating patch attacks/defenses and advancing their design. Code and dataset are available at https://github.com/Gandolfczjh/APDE, where we will keep integrating new attacks/defenses.

Paperid: 204, https://arxiv.org/pdf/2508.00555.pdf GitHub

Abstract:
Jailbreaking is an essential adversarial technique for red-teaming these models to uncover and patch security flaws. However, existing jailbreak methods face significant drawbacks. Token-level jailbreak attacks often produce incoherent or unreadable inputs and exhibit poor transferability, while prompt-level attacks lack scalability and rely heavily on manual effort and human ingenuity. We propose a concise and effective two-stage framework that combines the advantages of these approaches. The first stage performs a scenario-based generation of context and rephrases the original malicious query to obscure its harmful intent. The second stage then utilizes information from the model's hidden states to guide fine-grained edits, effectively steering the model's internal representation of the input from a malicious toward a benign one. Extensive experiments demonstrate that this method achieves state-of-the-art Attack Success Rate, with gains of up to 37.74% over the strongest baseline, and exhibits excellent transferability to black-box models. Our analysis further demonstrates that AGILE maintains substantial effectiveness against prominent defense mechanisms, highlighting the limitations of current safeguards and providing valuable insights for future defense development. Our code is available at https://github.com/yunsaijc/AGILE.

Paperid: 205, https://arxiv.org/pdf/2507.22535.pdf GitHub

Abstract:
Pseudorandom quantum states (PRSs) and pseudorandom function-like quantum state (PRFS) generators are quantum analogues of pseudorandom generators and pseudorandom functions. It is known that PRS (and PRFS) can exist even if BQP = QMA (relative to a quantum oracle) or if P = NP (relative to a classical oracle), which does not allow for the existence of one-way functions (relative to these oracles). Hence, these are potentially weaker objects than quantum-secure one-way functions, which can be used to do quantum cryptography. A desirable property of PRS and PRFS constructions is scalability, which ensures that the security parameter $Î»$ (which determines indistinguishability from their Haar-random counterparts) can be much larger than $n$ (the number of qubits of the output states). This may be important in some applications where PRS and PRFS primitives are used. We present an isometric procedure to prepare quantum states that can be arbitrarily random (i.e., the trace distance from the Haar-random state can be arbitrarily small for the true random case, or the distinguishing advantage can be arbitrarily small for the pseudorandom case). Our procedure provides a new method for scalable PRS that introduces no entanglement or correlations with the environment. This naturally gives the first construction for scalable, quantum-accessible, and adaptive PRFS assuming quantum-secure one-way functions. Our PRFS construction implies various primitives, including long-input PRFS, short-input PRFS, short-output PRFS, non-adaptive PRFS, and classically-accessible adaptive PRFS. This new construction may be helpful in some simplification of the microcrypt zoo (https://sattath.github.io/microcrypt-zoo/).

Paperid: 206, https://arxiv.org/pdf/2507.22171.pdf GitHub

Abstract:
Jailbreak attacks aim to exploit large language models (LLMs) by inducing them to generate harmful content, thereby revealing their vulnerabilities. Understanding and addressing these attacks is crucial for advancing the field of LLM safety. Previous jailbreak approaches have mainly focused on direct manipulations of harmful intent, with limited attention to the impact of persona prompts. In this study, we systematically explore the efficacy of persona prompts in compromising LLM defenses. We propose a genetic algorithm-based method that automatically crafts persona prompts to bypass LLM's safety mechanisms. Our experiments reveal that: (1) our evolved persona prompts reduce refusal rates by 50-70% across multiple LLMs, and (2) these prompts demonstrate synergistic effects when combined with existing attack methods, increasing success rates by 10-20%. Our code and data are available at https://github.com/CjangCjengh/Generic_Persona.

Paperid: 207, https://arxiv.org/pdf/2507.21150.pdf GitHub GitHub

Abstract:
The rapid advancement of voice generation technologies has enabled the synthesis of speech that is perceptually indistinguishable from genuine human voices. While these innovations facilitate beneficial applications such as personalized text-to-speech systems and voice preservation, they have also introduced significant risks, including deepfake impersonation scams and synthetic media-driven disinformation campaigns. Recent reports indicate that in 2024, deepfake fraud attempts surged by over 1,300% compared to 2023, underscoring the urgent need for robust audio content authentication. The financial sector has been particularly impacted, with a loss of over 10 million USD to voice scams and individual victims reporting losses exceeding $6,000 from AI-generated deepfake calls. In response, regulators and governments worldwide are enacting measures to improve AI content transparency and traceability, emphasizing the development of forensic tools and watermarking techniques as essential strategies to uphold media integrity.

Paperid: 208, https://arxiv.org/pdf/2507.20688.pdf GitHub

Abstract:
In light of increasing privacy concerns and stringent legal regulations, using secure multiparty computation (MPC) to enable collaborative GBDT model training among multiple data owners has garnered significant attention. Despite this, existing MPC-based GBDT frameworks face efficiency challenges due to high communication costs and the computation burden of non-linear operations, such as division and sigmoid calculations. In this work, we introduce Guard-GBDT, an innovative framework tailored for efficient and privacy-preserving GBDT training on vertical datasets. Guard-GBDT bypasses MPC-unfriendly division and sigmoid functions by using more streamlined approximations and reduces communication overhead by compressing the messages exchanged during gradient aggregation. We implement a prototype of Guard-GBDT and extensively evaluate its performance and accuracy on various real-world datasets. The results show that Guard-GBDT outperforms state-of-the-art HEP-XGB (CIKM'21) and SiGBDT (ASIA CCS'24) by up to $2.71\times$ and $12.21 \times$ on LAN network and up to $2.7\times$ and $8.2\times$ on WAN network. Guard-GBDT also achieves comparable accuracy with SiGBDT and plaintext XGBoost (better than HEP-XGB ), which exhibits a deviation of $\pm1\%$ to $\pm2\%$ only. Our implementation code is provided at https://github.com/XidianNSS/Guard-GBDT.git.

Paperid: 209, https://arxiv.org/pdf/2507.19219.pdf GitHub GitHub

Abstract:
Overestimation in evaluating large language models (LLMs) has become an increasing concern. Due to the contamination of public benchmarks or imbalanced model training, LLMs may achieve unreal evaluation results on public benchmarks, either intentionally or unintentionally, which leads to unfair comparisons among LLMs and undermines their realistic capability assessments. Existing benchmarks attempt to address these issues by keeping test cases permanently secret, mitigating contamination through human evaluation, or repeatedly collecting and constructing new samples. However, these approaches fail to ensure reproducibility, transparency, and high efficiency simultaneously. Moreover, the extent of overestimation in current LLMs remains unquantified. To address these issues, we propose ArxivRoll, a dynamic evaluation framework inspired by one-time pad encryption in cryptography. ArxivRoll comprises two key components: \emph{i) SCP (Sequencing, Cloze, and Prediction)}, an automated generator for private test cases, and \emph{ii) Rugged Scores (RS)}, metrics that measure the proportion of public benchmark contamination and training bias. Leveraging SCP, ArxivRoll constructs a new benchmark every six months using recent articles from ArXiv and employs them for one-time evaluations of LLM performance. Extensive experiments demonstrate the high quality of our benchmark, and we provide a systematic evaluation of current LLMs. The source code is available at https://github.com/liangzid/ArxivRoll/.

Paperid: 210, https://arxiv.org/pdf/2507.18631.pdf GitHub

Abstract:
With rapid advancement and increasing accessibility of LLMs, fine-tuning aligned models has become a critical step for adapting them to real-world applications, which makes the safety of this fine-tuning process more important than ever. However, recent studies have highlighted a critical challenge: even when fine-tuning with seemingly benign downstream datasets, the safety of aligned LLMs can be compromised, making them more susceptible to malicious instructions. In this paper, we show that fine-tuning datasets often contain samples with safety-degrading features that are not easily identifiable on the surface. These samples can significantly degrade the safety alignment of LLMs during fine-tuning. To address this issue, we propose LARF, a Layer-Aware Representation Filtering method. This method identifies safety-sensitive layers within the LLM and leverages their representations to detect which data samples in the post-training dataset contain safety-degrading features. Experimental results demonstrate that LARF can effectively identify benign data with safety-degrading features. After removing such data, the safety alignment degradation caused by fine-tuning is mitigated. Please see our code at https://github.com/LLLeoLi/LARF.

Paperid: 211, https://arxiv.org/pdf/2507.17956.pdf GitHub GitHub

Abstract:
The modular inverse is an essential piece of computation required for elliptic curve operations used for digital signatures in Bitcoin and other applications. A novel approach to the extended Euclidean algorithm has been developed by Bernstein and Yang within the last few years and incorporated into the libsecp256k1 cryptographic library used by Bitcoin. However, novel algorithms introduce new risks of errors. To address this we have completed a computer verified proof of the correctness of (one of) libsecp256k1's modular inverse implementations with the Coq proof assistant using the Verifiable C's implementation of separation logic.

Paperid: 212, https://arxiv.org/pdf/2507.14799.pdf GitHub GitHub

Abstract:
This work demonstrates that LLM-based web navigation agents offer powerful automation capabilities but are vulnerable to Indirect Prompt Injection (IPI) attacks. We show that adversaries can embed universal adversarial triggers in webpage HTML to hijack agent behavior that utilizes the accessibility tree to parse HTML, causing unintended or malicious actions. Using the Greedy Coordinate Gradient (GCG) algorithm and a Browser Gym agent powered by Llama-3.1, our system demonstrates high success rates across real websites in both targeted and general attacks, including login credential exfiltration and forced ad clicks. Our empirical results highlight critical security risks and the need for stronger defenses as LLM-driven autonomous web agents become more widely adopted. The system software (https://github.com/sej2020/manipulating-web-agents) is released under the MIT License, with an accompanying publicly available demo website (http://lethaiq.github.io/attack-web-llm-agent).

Paperid: 213, https://arxiv.org/pdf/2507.14519.pdf GitHub

Abstract:
Privacy-preserving machine learning (PPML) based on cryptographic protocols has emerged as a promising paradigm to protect user data privacy in cloud-based machine learning services. While it achieves formal privacy protection, PPML often incurs significant efficiency and scalability costs due to orders of magnitude overhead compared to the plaintext counterpart. Therefore, there has been a considerable focus on mitigating the efficiency gap for PPML. In this survey, we provide a comprehensive and systematic review of recent PPML studies with a focus on cross-level optimizations. Specifically, we categorize existing papers into protocol level, model level, and system level, and review progress at each level. We also provide qualitative and quantitative comparisons of existing works with technical insights, based on which we discuss future research directions and highlight the necessity of integrating optimizations across protocol, model, and system levels. We hope this survey can provide an overarching understanding of existing approaches and potentially inspire future breakthroughs in the PPML field. As the field is evolving fast, we also provide a public GitHub repository to continuously track the developments, which is available at https://github.com/PKU-SEC-Lab/Awesome-PPML-Papers.

Paperid: 214, https://arxiv.org/pdf/2507.11137.pdf GitHub

Abstract:
As valuable digital assets, deep neural networks necessitate robust ownership protection, positioning neural network watermarking (NNW) as a promising solution. Among various NNW approaches, weight-based methods are favored for their simplicity and practicality; however, they remain vulnerable to forging and overwriting attacks. To address those challenges, we propose NeuralMark, a robust method built around a hashed watermark filter. Specifically, we utilize a hash function to generate an irreversible binary watermark from a secret key, which is then used as a filter to select the model parameters for embedding. This design cleverly intertwines the embedding parameters with the hashed watermark, providing a robust defense against both forging and overwriting attacks. An average pooling is also incorporated to resist fine-tuning and pruning attacks. Furthermore, it can be seamlessly integrated into various neural network architectures, ensuring broad applicability. Theoretically, we analyze its security boundary. Empirically, we verify its effectiveness and robustness across 13 distinct Convolutional and Transformer architectures, covering five image classification tasks and one text generation task. The source codes are available at https://github.com/AIResearch-Group/NeuralMark.

Paperid: 215, https://arxiv.org/pdf/2507.08312.pdf GitHub

Abstract:
The rapid advancement of quantum computing poses a critical threat to classical cryptographic algorithms such as RSA and ECC, particularly in Internet of Things (IoT) devices, where secure communication is essential but often constrained by limited computational resources. This paper investigates the feasibility of deploying post-quantum cryptography (PQC) algorithms on resource-constrained devices. In particular, we implement three PQC algorithms -- BIKE, CRYSTALS-Kyber, and HQC -- on a lightweight IoT platform built with Raspberry Pi devices. Leveraging the Open Quantum Safe (\texttt{liboqs}) library in conjunction with \texttt{mbedTLS}, we develop quantum-secure key exchange protocols, and evaluate their performance in terms of computational overhead, memory usage, and energy consumption for quantum secure communication. Experimental results demonstrate that the integration of PQC algorithms on constrained hardware is practical, reinforcing the urgent need for quantum-resilient cryptographic frameworks in next-generation IoT devices. The implementation of this paper is available at https://iqsec-lab.github.io/PQC-IoT/.

Paperid: 216, https://arxiv.org/pdf/2507.07974.pdf GitHub

Abstract:
When large language model (LLM) systems interact with external data to perform complex tasks, a new attack, namely prompt injection, becomes a significant threat. By injecting instructions into the data accessed by the system, the attacker is able to override the initial user task with an arbitrary task directed by the attacker. To secure the system, test-time defenses, e.g., defensive prompting, have been proposed for system developers to attain security only when needed in a flexible manner. However, they are much less effective than training-time defenses that change the model parameters. Motivated by this, we propose DefensiveToken, a test-time defense with prompt injection robustness comparable to training-time alternatives. DefensiveTokens are newly inserted as special tokens, whose embeddings are optimized for security. In security-sensitive cases, system developers can append a few DefensiveTokens before the LLM input to achieve security with a minimal utility drop. In scenarios where security is less of a concern, developers can simply skip DefensiveTokens; the LLM system remains the same as there is no defense, generating high-quality responses. Thus, DefensiveTokens, if released alongside the model, allow a flexible switch between the state-of-the-art (SOTA) utility and almost-SOTA security at test time. The code is available at https://github.com/Sizhe-Chen/DefensiveToken.

Paperid: 217, https://arxiv.org/pdf/2507.07972.pdf GitHub

Abstract:
Fully Homomorphic Encryption (FHE) is an encryption scheme that allows for computation to be performed directly on encrypted data, effectively closing the loop on secure and outsourced computing. Data is encrypted not only during rest and transit, but also during processing. However, FHE provides a limited instruction set: SIMD addition, SIMD multiplication, and cyclic rotation of 1-D vectors. This restriction makes performing multi-dimensional tensor operations challenging. Practitioners must pack these tensors into 1-D vectors and map tensor operations onto this one-dimensional layout rather than their traditional nested structure. And while prior systems have made significant strides in automating this process, they often hide critical packing decisions behind layers of abstraction, making debugging, optimizing, and building on top of these systems difficult. In this work, we approach multi-dimensional tensor operations in FHE through Einstein summation (einsum) notation. Einsum notation explicitly encodes dimensional structure and operations in its syntax, naturally exposing how tensors should be packed and transformed. We decompose einsum expressions into a fixed set of FHE-friendly operations. We implement our design and present EinHops, a minimalist system that factors einsum expressions into a fixed sequence of FHE operations. EinHops enables developers to perform encrypted tensor operations using FHE while maintaining full visibility into the underlying packing strategy. We evaluate EinHops on a range of tensor operations from a simple transpose to complex multi-dimensional contractions. We show that the explicit nature of einsum notation allows us to build an FHE tensor system that is simple, general, and interpretable. We open-source EinHops at the following repository: https://github.com/baahl-nyu/einhops.

Paperid: 218, https://arxiv.org/pdf/2507.07417.pdf GitHub

Abstract:
A popular class of defenses against prompt injection attacks on large language models (LLMs) relies on fine-tuning the model to separate instructions and data, so that the LLM does not follow instructions that might be present with data. There are several academic systems and production-level implementations of this idea. We evaluate the robustness of this class of prompt injection defenses in the whitebox setting by constructing strong optimization-based attacks and showing that the defenses do not provide the claimed security properties. Specifically, we construct a novel attention-based attack algorithm for text-based LLMs and apply it to two recent whitebox defenses SecAlign (CCS 2025) and StruQ (USENIX Security 2025), showing attacks with success rates of up to 70% with modest increase in attacker budget in terms of tokens. Our findings make fundamental progress towards understanding the robustness of prompt injection defenses in the whitebox setting. We release our code and attacks at https://github.com/nishitvp/better_opts_attacks

Paperid: 219, https://arxiv.org/pdf/2507.07139.pdf GitHub

Abstract:
Recent advances in image generation models (IGMs), particularly diffusion-based architectures such as Stable Diffusion (SD), have markedly enhanced the quality and diversity of AI-generated visual content. However, their generative capability has also raised significant ethical, legal, and societal concerns, including the potential to produce harmful, misleading, or copyright-infringing content. To mitigate these concerns, machine unlearning (MU) emerges as a promising solution by selectively removing undesirable concepts from pretrained models. Nevertheless, the robustness and effectiveness of existing unlearning techniques remain largely unexplored, particularly in the presence of multi-modal adversarial inputs. To bridge this gap, we propose Recall, a novel adversarial framework explicitly designed to compromise the robustness of unlearned IGMs. Unlike existing approaches that predominantly rely on adversarial text prompts, Recall exploits the intrinsic multi-modal conditioning capabilities of diffusion models by efficiently optimizing adversarial image prompts with guidance from a single semantically relevant reference image. Extensive experiments across ten state-of-the-art unlearning methods and diverse tasks show that Recall consistently outperforms existing baselines in terms of adversarial effectiveness, computational efficiency, and semantic fidelity with the original textual prompt. These findings reveal critical vulnerabilities in current unlearning mechanisms and underscore the need for more robust solutions to ensure the safety and reliability of generative models. Code and data are publicly available at \textcolor{blue}{https://github.com/ryliu68/RECALL}.

Paperid: 220, https://arxiv.org/pdf/2507.06350.pdf GitHub

Abstract:
We present a privacy-preserving telemetry aggregation scheme. Our underlying frequency estimation routine works within the framework of differential privacy. The design philosophy follows a client-server architecture. Furthermore, the system uses a local differential privacy scheme where data gets randomized on the client before submitting the request to the resource server. This scheme allows for data analysis on de-identified data by carefully adding noise to prevent re-identification attacks, thereby facilitating public data release without compromising the identifiability of the individual record. This work further enhances privacy guarantees by leveraging Oblivious HTTP (OHTTP) to achieve increased privacy protection for data in transit that addresses pre-existing privacy vulnerabilities in raw HTTP. We provide an implementation that focuses on frequency estimation with a histogram of a known dictionary. Our resulting formulation based on OHTTP has provided stricter privacy safeguards when compared to trusting an organization to manually delete identifying information from the client's request in the ingestor as deployed in reference work~\cite{apple2017}. Code available at https://github.com/kenluck2001/miscellaneous/tree/master/src/Privacy-Preserving-Telemetry.

Paperid: 221, https://arxiv.org/pdf/2507.06043.pdf GitHub

Abstract:
Security alignment enables the Large Language Model (LLM) to gain the protection against malicious queries, but various jailbreak attack methods reveal the vulnerability of this security mechanism. Previous studies have isolated LLM jailbreak attacks and defenses. We analyze the security protection mechanism of the LLM, and propose a framework that combines attack and defense. Our method is based on the linearly separable property of LLM intermediate layer embedding, as well as the essence of jailbreak attack, which aims to embed harmful problems and transfer them to the safe area. We utilize generative adversarial network (GAN) to learn the security judgment boundary inside the LLM to achieve efficient jailbreak attack and defense. The experimental results indicate that our method achieves an average jailbreak success rate of 88.85\% across three popular LLMs, while the defense success rate on the state-of-the-art jailbreak dataset reaches an average of 84.17\%. This not only validates the effectiveness of our approach but also sheds light on the internal security mechanisms of LLMs, offering new insights for enhancing model security The code and data are available at https://github.com/NLPGM/CAVGAN.

Paperid: 222, https://arxiv.org/pdf/2507.05728.pdf GitHub

Abstract:
With more event datasets being released online, safeguarding the event dataset against unauthorized usage has become a serious concern for data owners. Unlearnable Examples are proposed to prevent the unauthorized exploitation of image datasets. However, it's unclear how to create unlearnable asynchronous event streams to prevent event misuse. In this work, we propose the first unlearnable event stream generation method to prevent unauthorized training from event datasets. A new form of asynchronous event error-minimizing noise is proposed to perturb event streams, tricking the unauthorized model into learning embedded noise instead of realistic features. To be compatible with the sparse event, a projection strategy is presented to sparsify the noise to render our unlearnable event streams (UEvs). Extensive experiments demonstrate that our method effectively protects event data from unauthorized exploitation, while preserving their utility for legitimate use. We hope our UEvs contribute to the advancement of secure and trustworthy event dataset sharing. Code is available at: https://github.com/rfww/uevs.

Paperid: 223, https://arxiv.org/pdf/2507.05649.pdf GitHub

Abstract:
Graph Neural Networks (GNNs) have achieved state-of-the-art performance in various graph-based learning tasks. However, enabling privacy-preserving GNNs in encrypted domains, such as under Fully Homomorphic Encryption (FHE), typically incurs substantial computational overhead, rendering real-time and privacy-preserving inference impractical. In this work, we propose DESIGN (EncrypteD GNN Inference via sErver-Side Input Graph pruNing), a novel framework for efficient encrypted GNN inference. DESIGN tackles the critical efficiency limitations of existing FHE GNN approaches, which often overlook input data redundancy and apply uniform computational strategies. Our framework achieves significant performance gains through a hierarchical optimization strategy executed entirely on the server: first, FHE-compatible node importance scores (based on encrypted degree statistics) are computed from the encrypted graph. These scores then guide a homomorphic partitioning process, generating multi-level importance masks directly under FHE. This dynamically generated mask facilitates both input graph pruning (by logically removing unimportant elements) and a novel adaptive polynomial activation scheme, where activation complexity is tailored to node importance levels. Empirical evaluations demonstrate that DESIGN substantially accelerates FHE GNN inference compared to state-of-the-art methods while maintaining competitive model accuracy, presenting a robust solution for secure graph analytics. Our implementation is publicly available at https://github.com/LabRAI/DESIGN.

Paperid: 224, https://arxiv.org/pdf/2507.05622.pdf GitHub

Abstract:
The widespread application of Deep Learning across diverse domains hinges critically on the quality and composition of training datasets. However, the common lack of disclosure regarding their usage raises significant privacy and copyright concerns. Dataset auditing techniques, which aim to determine if a specific dataset was used to train a given suspicious model, provide promising solutions to addressing these transparency gaps. While prior work has developed various auditing methods, their resilience against dedicated adversarial attacks remains largely unexplored. To bridge the gap, this paper initiates a comprehensive study evaluating dataset auditing from an adversarial perspective. We start with introducing a novel taxonomy, classifying existing methods based on their reliance on internal features (IF) (inherent to the data) versus external features (EF) (artificially introduced for auditing). Subsequently, we formulate two primary attack types: evasion attacks, designed to conceal the use of a dataset, and forgery attacks, intending to falsely implicate an unused dataset. Building on the understanding of existing methods and attack objectives, we further propose systematic attack strategies: decoupling, removal, and detection for evasion; adversarial example-based methods for forgery. These formulations and strategies lead to our new benchmark, DATABench, comprising 17 evasion attacks, 5 forgery attacks, and 9 representative auditing methods. Extensive evaluations using DATABench reveal that none of the evaluated auditing methods are sufficiently robust or distinctive under adversarial settings. These findings underscore the urgent need for developing a more secure and reliable dataset auditing method capable of withstanding sophisticated adversarial manipulation. Code is available at https://github.com/shaoshuo-ss/DATABench.

Paperid: 225, https://arxiv.org/pdf/2507.05162.pdf GitHub

Abstract:
The recent proliferation of photorealistic AI-generated images (AIGI) has raised urgent concerns about their potential misuse, particularly on social media platforms. Current state-of-the-art AIGI detection methods typically rely on large, deep neural architectures, creating significant computational barriers to real-time, large-scale deployment on platforms like social media. To challenge this reliance on computationally intensive models, we introduce LAID, the first framework -- to our knowledge -- that benchmarks and evaluates the detection performance and efficiency of off-the-shelf lightweight neural networks. In this framework, we comprehensively train and evaluate selected models on a representative subset of the GenImage dataset across spatial, spectral, and fusion image domains. Our results demonstrate that lightweight models can achieve competitive accuracy, even under adversarial conditions, while incurring substantially lower memory and computation costs compared to current state-of-the-art methods. This study offers valuable insight into the trade-off between efficiency and performance in AIGI detection and lays a foundation for the development of practical, scalable, and trustworthy detection systems. The source code of LAID can be found at: https://github.com/nchivar/LAID.

Paperid: 226, https://arxiv.org/pdf/2507.05113.pdf GitHub

Abstract:
Deep Neural Networks (DNNs) are susceptible to backdoor attacks, where adversaries poison training data to implant backdoor into the victim model. Current backdoor defenses on poisoned data often suffer from high computational costs or low effectiveness against advanced attacks like clean-label and clean-image backdoors. To address them, we introduce CLIP-Guided backdoor Defense (CGD), an efficient and effective method that mitigates various backdoor attacks. CGD utilizes a publicly accessible CLIP model to identify inputs that are likely to be clean or poisoned. It then retrains the model with these inputs, using CLIP's logits as a guidance to effectively neutralize the backdoor. Experiments on 4 datasets and 11 attack types demonstrate that CGD reduces attack success rates (ASRs) to below 1% while maintaining clean accuracy (CA) with a maximum drop of only 0.3%, outperforming existing defenses. Additionally, we show that clean-data-based defenses can be adapted to poisoned data using CGD. Also, CGD exhibits strong robustness, maintaining low ASRs even when employing a weaker CLIP model or when CLIP itself is compromised by a backdoor. These findings underscore CGD's exceptional efficiency, effectiveness, and applicability for real-world backdoor defense scenarios. Code: https://github.com/binyxu/CGD.

Paperid: 227, https://arxiv.org/pdf/2507.05068.pdf GitHub

Abstract:
Autoregressive image generation has witnessed rapid advancements, with prominent models such as scale-wise visual auto-regression pushing the boundaries of visual synthesis. However, these developments also raise significant concerns regarding data privacy and copyright. In response, training data detection has emerged as a critical task for identifying unauthorized data usage in model training. To better understand the vulnerability of autoregressive image generative models to such detection, we conduct the first study applying membership inference to this domain. Our approach comprises two key components: implicit classification and an adaptive score aggregation strategy. First, we compute the implicit token-wise classification score within the query image. Then we propose an adaptive score aggregation strategy to acquire a final score, which places greater emphasis on the tokens with lower scores. A higher final score indicates that the sample is more likely to be involved in the training set. To validate the effectiveness of our method, we adapt existing detection algorithms originally designed for LLMs to visual autoregressive models. Extensive experiments demonstrate the superiority of our method in both class-conditional and text-to-image scenarios. Moreover, our approach exhibits strong robustness and generalization under various data transformations. Furthermore, sufficient experiments suggest two novel key findings: (1) A linear scaling law on membership inference, exposing the vulnerability of large foundation models. (2) Training data from scale-wise visual autoregressive models is easier to detect than other autoregressive paradigms.Our code is available at https://github.com/Chrisqcwx/ImageAR-MIA.

Paperid: 228, https://arxiv.org/pdf/2507.04903.pdf GitHub

Abstract:
Federated Learning (FL) systems are vulnerable to backdoor attacks, where adversaries train their local models on poisoned data and submit poisoned model updates to compromise the global model. Despite numerous proposed attacks and defenses, divergent experimental settings, implementation errors, and unrealistic assumptions hinder fair comparisons and valid conclusions about their effectiveness in real-world scenarios. To address this, we introduce BackFed - a comprehensive benchmark suite designed to standardize, streamline, and reliably evaluate backdoor attacks and defenses in FL, with a focus on practical constraints. Our benchmark offers key advantages through its multi-processing implementation that significantly accelerates experimentation and the modular design that enables seamless integration of new methods via well-defined APIs. With a standardized evaluation pipeline, we envision BackFed as a plug-and-play environment for researchers to comprehensively and reliably evaluate new attacks and defenses. Using BackFed, we conduct large-scale studies of representative backdoor attacks and defenses across both Computer Vision and Natural Language Processing tasks with diverse model architectures and experimental settings. Our experiments critically assess the performance of proposed attacks and defenses, revealing unknown limitations and modes of failures under practical conditions. These empirical insights provide valuable guidance for the development of new methods and for enhancing the security of FL systems. Our framework is openly available at https://github.com/thinh-dao/BackFed.

Paperid: 229, https://arxiv.org/pdf/2507.04771.pdf GitHub

Abstract:
Privacy protection laws, such as the GDPR, grant individuals the right to request the forgetting of their personal data not only from databases but also from machine learning (ML) models trained on them. Machine unlearning has emerged as a practical means to facilitate model forgetting of data instances seen during training. Although some existing machine unlearning methods guarantee exact forgetting, they are typically costly in computational terms. On the other hand, more affordable methods do not offer forgetting guarantees and are applicable only to specific ML models. In this paper, we present \emph{efficient unlearning with privacy guarantees} (EUPG), a novel machine unlearning framework that offers formal privacy guarantees to individuals whose data are being unlearned. EUPG involves pre-training ML models on data protected using privacy models, and it enables {\em efficient unlearning with the privacy guarantees offered by the privacy models in use}. Through empirical evaluation on four heterogeneous data sets protected with $k$-anonymity and $Îµ$-differential privacy as privacy models, our approach demonstrates utility and forgetting effectiveness comparable to those of exact unlearning methods, while significantly reducing computational and storage costs. Our code is available at https://github.com/najeebjebreel/EUPG.

Paperid: 230, https://arxiv.org/pdf/2507.04119.pdf GitHub

Abstract:
Data-free knowledge distillation (DFKD) transfers knowledge from a teacher to a student without access the real in-distribution (ID) data. Its common solution is to use a generator to synthesize fake data and use them as a substitute for real ID data. However, existing works typically assume teachers are trustworthy, leaving the robustness and security of DFKD from untrusted teachers largely unexplored. In this work, we conduct the first investigation into distilling non-transferable learning (NTL) teachers using DFKD, where the transferability from an ID domain to an out-of-distribution (OOD) domain is prohibited. We find that NTL teachers fool DFKD through divert the generator's attention from the useful ID knowledge to the misleading OOD knowledge. This hinders ID knowledge transfer but prioritizes OOD knowledge transfer. To mitigate this issue, we propose Adversarial Trap Escaping (ATEsc) to benefit DFKD by identifying and filtering out OOD-like synthetic samples. Specifically, inspired by the evidence that NTL teachers show stronger adversarial robustness on OOD samples than ID samples, we split synthetic samples into two groups according to their robustness. The fragile group is treated as ID-like data and used for normal knowledge distillation, while the robust group is seen as OOD-like data and utilized for forgetting OOD knowledge. Extensive experiments demonstrate the effectiveness of ATEsc for improving DFKD against NTL teachers. Code is released at https://github.com/tmllab/2025_ICML_ATEsc.

Paperid: 231, https://arxiv.org/pdf/2507.04106.pdf GitHub

Abstract:
Our research addresses the overlooked security concerns related to data poisoning in continual learning (CL). Data poisoning - the intentional manipulation of training data to affect the predictions of machine learning models - was recently shown to be a threat to CL training stability. While existing literature predominantly addresses scenario-dependent attacks, we propose to focus on a more simple and realistic single-task poison (STP) threats. In contrast to previously proposed poisoning settings, in STP adversaries lack knowledge and access to the model, as well as to both previous and future tasks. During an attack, they only have access to the current task within the data stream. Our study demonstrates that even within these stringent conditions, adversaries can compromise model performance using standard image corruptions. We show that STP attacks are able to strongly disrupt the whole continual training process: decreasing both the stability (its performance on past tasks) and plasticity (capacity to adapt to new tasks) of the algorithm. Finally, we propose a high-level defense framework for CL along with a poison task detection method based on task vectors. The code is available at https://github.com/stapaw/STP.git .

Paperid: 232, https://arxiv.org/pdf/2507.02844.pdf GitHub

Abstract:
With the emergence of strong vision language capabilities, multimodal large language models (MLLMs) have demonstrated tremendous potential for real-world applications. However, the security vulnerabilities exhibited by the visual modality pose significant challenges to deploying such models in open-world environments. Recent studies have successfully induced harmful responses from target MLLMs by encoding harmful textual semantics directly into visual inputs. However, in these approaches, the visual modality primarily serves as a trigger for unsafe behavior, often exhibiting semantic ambiguity and lacking grounding in realistic scenarios. In this work, we define a novel setting: vision-centric jailbreak, where visual information serves as a necessary component in constructing a complete and realistic jailbreak context. Building on this setting, we propose the VisCo (Visual Contextual) Attack. VisCo fabricates contextual dialogue using four distinct vision-focused strategies, dynamically generating auxiliary images when necessary to construct a vision-centric jailbreak scenario. To maximize attack effectiveness, it incorporates automatic toxicity obfuscation and semantic refinement to produce a final attack prompt that reliably triggers harmful responses from the target black-box MLLMs. Specifically, VisCo achieves a toxicity score of 4.78 and an Attack Success Rate (ASR) of 85% on MM-SafetyBench against GPT-4o, significantly outperforming the baseline, which achieves a toxicity score of 2.48 and an ASR of 22.2%. Code: https://github.com/Dtc7w3PQ/Visco-Attack.

Paperid: 233, https://arxiv.org/pdf/2507.01154.pdf GitHub

Abstract:
As large language models (LLMs) increasingly underpin technological advancements, the privacy of their training data emerges as a critical concern. Differential Privacy (DP) serves as a rigorous mechanism to protect this data, yet its integration via Differentially Private Stochastic Gradient Descent (DP-SGD) introduces substantial challenges, primarily due to the complexities of per-sample gradient clipping. Current explicit methods, such as Opacus, necessitate extensive storage for per-sample gradients, significantly inflating memory requirements. Conversely, implicit methods like GhostClip reduce storage needs by recalculating gradients multiple times, which leads to inefficiencies due to redundant computations. This paper introduces FlashDP, an innovative cache-friendly per-layer DP-SGD that consolidates necessary operations into a single task, calculating gradients only once in a fused manner. This approach not only diminishes memory movement by up to \textbf{50\%} but also cuts down redundant computations by \textbf{20\%}, compared to previous methods. Consequently, FlashDP does not increase memory demands and achieves a \textbf{90\%} throughput compared to the Non-DP method on a four-A100 system during the pre-training of the Llama-13B model, while maintaining parity with standard per-layer clipped DP-SGD in terms of accuracy. These advancements establish FlashDP as a pivotal development for efficient and privacy-preserving training of LLMs. FlashDP's code has been open-sourced in https://github.com/kaustpradalab/flashdp.

Paperid: 234, https://arxiv.org/pdf/2510.17919.pdf

Abstract:
Smart contracts play a significant role in automating blockchain services. Nevertheless, vulnerabilities in smart contracts pose serious threats to blockchain security. Currently, traditional detection methods primarily rely on static analysis and formal verification, which can result in high false-positive rates and poor scalability. Large Language Models (LLMs) have recently made significant progress in smart contract vulnerability detection. However, they still face challenges such as high inference costs and substantial computational overhead. In this paper, we propose ParaVul, a parallel LLM and retrieval-augmented framework to improve the reliability and accuracy of smart contract vulnerability detection. Specifically, we first develop Sparse Low-Rank Adaptation (SLoRA) for LLM fine-tuning. SLoRA introduces sparsification by incorporating a sparse matrix into quantized LoRA-based LLMs, thereby reducing computational overhead and resource requirements while enhancing their ability to understand vulnerability-related issues. We then construct a vulnerability contract dataset and develop a hybrid Retrieval-Augmented Generation (RAG) system that integrates dense retrieval with Best Matching 25 (BM25), assisting in verifying the results generated by the LLM. Furthermore, we propose a meta-learning model to fuse the outputs of the RAG system and the LLM, thereby generating the final detection results. After completing vulnerability detection, we design chain-of-thought prompts to guide LLMs to generate comprehensive vulnerability detection reports. Simulation results demonstrate the superiority of ParaVul, especially in terms of F1 scores, achieving 0.9398 for single-label detection and 0.9330 for multi-label detection.

Paperid: 235, https://arxiv.org/pdf/2509.23789.pdf

Abstract:
Chain-of-Thought (CoT) techniques have significantly enhanced reasoning in Vision-Language Models (VLMs). Extending this paradigm, Visual CoT integrates explicit visual edits, such as cropping or annotating regions of interest, into the reasoning process, achieving superior multimodal performance. However, the robustness of Visual CoT-based VLMs against image-level noise remains unexplored. In this paper, we present the first systematic evaluation of Visual CoT robustness under visual perturbations. Our benchmark spans 12 image corruption types across 4 Visual Question Answering (VQA) datasets, enabling a comprehensive comparison between VLMs that use Visual CoT, and VLMs that do not. The results reveal that integrating Visual CoT consistently improves absolute accuracy regardless of whether the input images are clean or corrupted by noise; however, it also increases sensitivity to input perturbations, resulting in sharper performance degradation compared to standard VLMs. Through extensive analysis, we identify the intermediate reasoning components of Visual CoT, i.e., the edited image patches , as the primary source of fragility. Building on this analysis, we propose a plug-and-play robustness enhancement method that integrates Grounding DINO model into the Visual CoT pipeline, providing high-confidence local visual cues to stabilize reasoning. Our work reveals clear fragility patterns in Visual CoT and offers an effective, architecture-agnostic solution for enhancing visual robustness.

Paperid: 236, https://arxiv.org/pdf/2511.06659.pdf

Abstract:
Low-altitude wireless networks (LAWNs) have emerged as a viable solution for maritime communications. In these maritime LAWNs, unmanned aerial vehicles (UAVs) serve as practical low-altitude platforms for wireless communications due to their flexibility and ease of deployment. However, the open and clear UAV communication channels make maritime LAWNs vulnerable to eavesdropping attacks. Existing security approaches often assume eavesdroppers follow predefined trajectories, which fails to capture the dynamic movement patterns of eavesdroppers in realistic maritime environments. To address this challenge, we consider a low-altitude maritime communication system that employs intelligent jamming to counter dynamic eavesdroppers with uncertain positioning to enhance the physical layer security. Since such a system requires balancing the conflicting performance metrics of the secrecy rate and energy consumption of UAVs, we formulate a secure and energy-efficient maritime communication multi-objective optimization problem (SEMCMOP). To solve this dynamic and long-term optimization problem, we first reformulate it as a partially observable Markov decision process (POMDP). We then propose a novel soft actor-critic with conditional variational autoencoder (SAC-CVAE) algorithm, which is a deep reinforcement learning algorithm improved by generative artificial intelligence. Specifically, the SAC-CVAE algorithm employs advantage-conditioned latent representations to disentangle and optimize policies, while enhancing computational efficiency by reducing the state space dimension. Simulation results demonstrate that our proposed intelligent jamming approach achieves secure and energy-efficient maritime communications.

Paperid: 237, https://arxiv.org/pdf/2511.01451.pdf

Abstract:
As terrestrial resources become increasingly saturated, the research attention is shifting to the low-altitude airspace, with many emerging applications such as urban air taxis and aerial inspection. Low-Altitude Wireless Networks (LAWNs) are the foundation for these applications, with integrated sensing, communications, and computing (ISCC) being one of the core parts of LAWNs. However, the openness of low-altitude airspace exposes communications to security threats, degrading ISCC performance and ultimately compromising the reliability of applications supported by LAWNs. To address these challenges, this paper studies joint performance optimization of ISCC while considering secrecyness of the communications. Specifically, we derive beampattern error, secrecy rate, and age of information (AoI) as performance metrics for sensing, secrecy communication, and computing. Building on these metrics, we formulate a multi-objective optimization problem that balances sensing and computation performance while keeping the probability of communication being detected below a required threshold. We then propose a deep Q-network (DQN)-based multi-objective evolutionary algorithm, which adaptively selects evolutionary operators according to the evolving optimization objectives, thereby leading to more effective solutions. Extensive simulations show that the proposed method achieves a superior balance among sensing accuracy, communication secrecyness, and information freshness compared with baseline algorithms, thereby safeguarding ISCC performance and LAWN-supported low-altitude applications.

Paperid: 238, https://arxiv.org/pdf/2509.06112.pdf

Abstract:
Unmanned Aerial Vehicle (UAV) cluster services are crucial for promoting the low-altitude economy by enabling scalable, flexible, and adaptive aerial networks. To meet diverse service demands, clusters must dynamically incorporate a New UAVs (NUAVs) or an Existing UAV (EUAV). However, achieving sustained service reliability remains challenging due to the need for efficient and scalable NUAV authentication, privacy-preserving cross-cluster authentication for EUAVs, and robust protection of the cluster session key, including both forward and backward secrecy. To address these challenges, we propose a Lightweight and Privacy-Preserving Cluster Authentication and Session Key Update (LP2-CASKU) scheme tailored for dynamic UAV clusters in low-altitude economy networks. LP2-CASKU integrates an efficient batch authentication mechanism that simultaneously authenticates multiple NUAVs with minimal communication overhead. It further introduces a lightweight cross-cluster authentication mechanism that ensures EUAV anonymity and unlinkability. Additionally, a secure session key update mechanism is incorporated to maintain key confidentiality over time, thereby preserving both forward and backward secrecy. We provide a comprehensive security analysis and evaluate LP2-CASKU performance through both theoretical analysis and OMNeT++ simulations. Experimental results demonstrate that, compared to the baseline, LP2-CASKU achieves a latency reduction of 82.8%-90.8% by across different UAV swarm configurations and network bitrates, demonstrating strong adaptability to dynamic communication environments. Besides, under varying UAV swarm configurations, LP2-CASKU reduces the energy consumption by approximately 37.6-72.6%, while effectively supporting privacy-preserving authentication in highly dynamic UAV cluster environments.

Paperid: 239, https://arxiv.org/pdf/2508.01983.pdf

Abstract:
Space-air-ground integrated networks (SAGINs) face unprecedented security challenges due to their inherent characteristics, such as multidimensional heterogeneity and dynamic topologies. These characteristics fundamentally undermine conventional security methods and traditional artificial intelligence (AI)-driven solutions. Generative AI (GAI) is a transformative approach that can safeguard SAGIN security by synthesizing data, understanding semantics, and making autonomous decisions. This survey fills existing review gaps by examining GAI-empowered secure communications across SAGINs. First, we introduce secured SAGINs and highlight GAI's advantages over traditional AI for security defenses. Then, we explain how GAI mitigates failures of authenticity, breaches of confidentiality, tampering of integrity, and disruptions of availability across the physical, data link, and network layers of SAGINs. Three step-by-step tutorials discuss how to apply GAI to solve specific problems using concrete methods, emphasizing its generative paradigm beyond traditional AI. Finally, we outline open issues and future research directions, including lightweight deployment, adversarial robustness, and cross-domain governance, to provide major insights into GAI's role in shaping next-generation SAGIN security.

Paperid: 240, https://arxiv.org/pdf/2509.17488.pdf

Abstract:
The increasing autonomy of LLM agents in handling sensitive communications, accelerated by Model Context Protocol (MCP) and Agent-to-Agent (A2A) frameworks, creates urgent privacy challenges. While recent work reveals significant gaps between LLMs' privacy Q&A performance and their agent behavior, existing benchmarks remain limited to static, simplified scenarios. We present PrivacyChecker, a model-agnostic, contextual integrity based mitigation approach that effectively reduces privacy leakage from 36.08% to 7.30% on DeepSeek-R1 and from 33.06% to 8.32% on GPT-4o, all while preserving task helpfulness. We also introduce PrivacyLens-Live, transforming static benchmarks into dynamic MCP and A2A environments that reveal substantially higher privacy risks in practical. Our modular mitigation approach integrates seamlessly into agent protocols through three deployment strategies, providing practical privacy protection for the emerging agentic ecosystem. Our data and code will be made available at https://aka.ms/privacy_in_action.

Paperid: 241, https://arxiv.org/pdf/2508.00478.pdf

Abstract:
Modern cyber attacks unfold through multiple stages, requiring defenders to dynamically prioritize mitigations under uncertainty. While game-theoretic models capture attacker-defender interactions, existing approaches often rely on static assumptions and lack integration with real-time threat intelligence, limiting their adaptability. This paper presents CyGATE, a game-theoretic framework modeling attacker-defender interactions, using large language models (LLMs) with retrieval-augmented generation (RAG) to enhance tactic selection and patch prioritization. Applied to a two-agent scenario, CyGATE frames cyber conflicts as a partially observable stochastic game (POSG) across Cyber Kill Chain stages. Both agents use belief states to navigate uncertainty, with the attacker adapting tactics and the defender re-prioritizing patches based on evolving risks and observed adversary behavior. The framework's flexible architecture enables extension to multi-agent scenarios involving coordinated attackers, collaborative defenders, or complex enterprise environments with multiple stakeholders. Evaluated in a dynamic patch scheduling scenario, CyGATE effectively prioritizes high-risk vulnerabilities, enhancing adaptability through dynamic threat integration, strategic foresight by anticipating attacker moves under uncertainty, and efficiency by optimizing resource use.

Paperid: 242, https://arxiv.org/pdf/2512.05485.pdf

Abstract:
While the deployment of large language models (LLMs) in high-value industries continues to expand, the systematic assessment of their safety against jailbreak and prompt-based attacks remains insufficient. Existing safety evaluation benchmarks and frameworks are often limited by an imbalanced integration of core components (attack, defense, and evaluation methods) and an isolation between flexible evaluation frameworks and standardized benchmarking capabilities. These limitations hinder reliable cross-study comparisons and create unnecessary overhead for comprehensive risk assessment. To address these gaps, we present TeleAI-Safety, a modular and reproducible framework coupled with a systematic benchmark for rigorous LLM safety evaluation. Our framework integrates a broad collection of 19 attack methods (including one self-developed method), 29 defense methods, and 19 evaluation methods (including one self-developed method). With a curated attack corpus of 342 samples spanning 12 distinct risk categories, the TeleAI-Safety benchmark conducts extensive evaluations across 14 target models. The results reveal systematic vulnerabilities and model-specific failure cases, highlighting critical trade-offs between safety and utility, and identifying potential defense patterns for future optimization. In practical scenarios, TeleAI-Safety can be flexibly adjusted with customized attack, defense, and evaluation combinations to meet specific demands. We release our complete code and evaluation results to facilitate reproducible research and establish unified safety baselines.

Paperid: 243, https://arxiv.org/pdf/2511.07192.pdf

Abstract:
The rapid progress of generative AI has led to the emergence of new generative models, while existing detection methods struggle to keep pace, resulting in significant degradation in the detection performance. This highlights the urgent need for continuously updating AI-generated image detectors to adapt to new generators. To overcome low efficiency and catastrophic forgetting in detector updates, we propose LiteUpdate, a lightweight framework for updating AI-generated image detectors. LiteUpdate employs a representative sample selection module that leverages image confidence and gradient-based discriminative features to precisely select boundary samples. This approach improves learning and detection accuracy on new distributions with limited generated images, significantly enhancing detector update efficiency. Additionally, LiteUpdate incorporates a model merging module that fuses weights from multiple fine-tuning trajectories, including pre-trained, representative, and random updates. This balances the adaptability to new generators and mitigates the catastrophic forgetting of prior knowledge. Experiments demonstrate that LiteUpdate substantially boosts detection performance in various detectors. Specifically, on AIDE, the average detection accuracy on Midjourney improved from 87.63% to 93.03%, a 6.16% relative increase.

Paperid: 244, https://arxiv.org/pdf/2510.23035.pdf

Abstract:
Linguistic steganography enables covert communication through embedding secret messages into innocuous texts; however, current methods face critical limitations in payload capacity and security. Traditional modification-based methods introduce detectable anomalies, while retrieval-based strategies suffer from low embedding capacity. Modern generative steganography leverages language models to generate natural stego text but struggles with limited entropy in token predictions, further constraining capacity. To address these issues, we propose an entropy-driven framework called RTMStega that integrates rank-based adaptive coding and context-aware decompression with normalized entropy. By mapping secret messages to token probability ranks and dynamically adjusting sampling via context-aware entropy-based adjustments, RTMStega achieves a balance between payload capacity and imperceptibility. Experiments across diverse datasets and models demonstrate that RTMStega triples the payload capacity of mainstream generative steganography, reduces processing time by over 50%, and maintains high text quality, offering a trustworthy solution for secure and efficient covert communication.

Paperid: 245, https://arxiv.org/pdf/2510.16367.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities, but their training requires extensive data and computational resources, rendering them valuable digital assets. Therefore, it is essential to watermark LLMs to protect their copyright and trace unauthorized use or resale. Existing methods for watermarking LLMs primarily rely on training LLMs with a watermarked dataset, which entails burdensome training costs and negatively impacts the LLM's performance. In addition, their watermarked texts are not logical or natural, thereby reducing the stealthiness of the watermark. To address these issues, we propose EditMark, the first watermarking method that leverages model editing to embed a training-free, stealthy, and performance-lossless watermark for LLMs. We observe that some questions have multiple correct answers. Therefore, we assign each answer a unique watermark and update the weights of LLMs to generate corresponding questions and answers through the model editing technique. In addition, we refine the model editing technique to align with the requirements of watermark embedding. Specifically, we introduce an adaptive multi-round stable editing strategy, coupled with the injection of a noise matrix, to improve both the effectiveness and robustness of the watermark embedding. Extensive experiments indicate that EditMark can embed 32-bit watermarks into LLMs within 20 seconds (Fine-tuning: 6875 seconds) with a watermark extraction success rate of 100%, which demonstrates its effectiveness and efficiency. External experiments further demonstrate that EditMark has fidelity, stealthiness, and a certain degree of robustness against common attacks.

Paperid: 246, https://arxiv.org/pdf/2510.05173.pdf

Abstract:
Text-to-image models have shown remarkable capabilities in generating high-quality images from natural language descriptions. However, these models are highly vulnerable to adversarial prompts, which can bypass safety measures and produce harmful content. Despite various defensive strategies, achieving robustness against attacks while maintaining practical utility in real-world applications remains a significant challenge. To address this issue, we first conduct an empirical study of the text encoder in the Stable Diffusion (SD) model, which is a widely used and representative text-to-image model. Our findings reveal that the [EOS] token acts as a semantic aggregator, exhibiting distinct distributional patterns between benign and adversarial prompts in its embedding space. Building on this insight, we introduce \textbf{SafeGuider}, a two-step framework designed for robust safety control without compromising generation quality. SafeGuider combines an embedding-level recognition model with a safety-aware feature erasure beam search algorithm. This integration enables the framework to maintain high-quality image generation for benign prompts while ensuring robust defense against both in-domain and out-of-domain attacks. SafeGuider demonstrates exceptional effectiveness in minimizing attack success rates, achieving a maximum rate of only 5.48\% across various attack scenarios. Moreover, instead of refusing to generate or producing black images for unsafe prompts, \textbf{SafeGuider} generates safe and meaningful images, enhancing its practical utility. In addition, SafeGuider is not limited to the SD model and can be effectively applied to other text-to-image models, such as the Flux model, demonstrating its versatility and adaptability across different architectures. We hope that SafeGuider can shed some light on the practical deployment of secure text-to-image systems.

Paperid: 247, https://arxiv.org/pdf/2509.21768.pdf

Abstract:
Large Reasoning Models (LRMs) have demonstrated remarkable performance on tasks such as mathematics and code generation. Motivated by these strengths, recent work has empirically demonstrated the effectiveness of LRMs as guard models in improving harmful query detection. However, LRMs typically generate long reasoning traces during inference, causing substantial computational overhead. In this paper, we introduce PSRT, a method that replaces the model's reasoning process with a Prefilled Safe Reasoning Trace, thereby significantly reducing the inference cost of LRMs. Concretely, PSRT prefills "safe reasoning virtual tokens" from a constructed dataset and learns over their continuous embeddings. With the aid of indicator tokens, PSRT enables harmful-query detection in a single forward pass while preserving the classification effectiveness of LRMs. We evaluate PSRT on 7 models, 13 datasets, and 8 jailbreak methods. In terms of efficiency, PSRT completely removes the overhead of generating reasoning tokens during inference. In terms of classification performance, PSRT achieves nearly identical accuracy, with only a minor average F1 drop of 0.015 across 7 models and 5 datasets.

Paperid: 248, https://arxiv.org/pdf/2507.10491.pdf

Abstract:
Backdoor unlearning aims to remove backdoor-related information while preserving the model's original functionality. However, existing unlearning methods mainly focus on recovering trigger patterns but fail to restore the correct semantic labels of poison samples. This limitation prevents them from fully eliminating the false correlation between the trigger pattern and the target label. To address this, we leverage boundary adversarial attack techniques, revealing two key observations. First, poison samples exhibit significantly greater distances from decision boundaries compared to clean samples, indicating they require larger adversarial perturbations to change their predictions. Second, while adversarial predicted labels for clean samples are uniformly distributed, those for poison samples tend to revert to their original correct labels. Moreover, the features of poison samples restore to closely resemble those of corresponding clean samples after adding adversarial perturbations. Building upon these insights, we propose Backdoor Unlearning via adversaRial bouNdary analysis (BURN), a novel defense framework that integrates false correlation decoupling, progressive data refinement, and model purification. In the first phase, BURN employs adversarial boundary analysis to detect poisoned samples based on their abnormal adversarial boundary distances, then restores their correct semantic labels for fine-tuning. In the second phase, it employs a feedback mechanism that tracks prediction discrepancies between the original backdoored model and progressively sanitized models, guiding both dataset refinement and model purification. Extensive evaluations across multiple datasets, architectures, and seven diverse backdoor attack types confirm that BURN effectively removes backdoor threats while maintaining the model's original performance.

Paperid: 249, https://arxiv.org/pdf/2507.02606.pdf

Abstract:
The rapid advancement of speech generation models has heightened privacy and security concerns related to voice cloning (VC). Recent studies have investigated disrupting unauthorized voice cloning by introducing adversarial perturbations. However, determined attackers can mitigate these protective perturbations and successfully execute VC. In this study, we conduct the first systematic evaluation of these protective perturbations against VC under realistic threat models that include perturbation purification. Our findings reveal that while existing purification methods can neutralize a considerable portion of the protective perturbations, they still lead to distortions in the feature space of VC models, which degrades the performance of VC. From this perspective, we propose a novel two-stage purification method: (1) Purify the perturbed speech; (2) Refine it using phoneme guidance to align it with the clean speech distribution. Experimental results demonstrate that our method outperforms state-of-the-art purification methods in disrupting VC defenses. Our study reveals the limitations of adversarial perturbation-based VC defenses and underscores the urgent need for more robust solutions to mitigate the security and privacy risks posed by VC. The code and audio samples are available at https://de-antifake.github.io.

Paperid: 250, https://arxiv.org/pdf/2510.24598.pdf

Abstract:
Current quantum machine learning approaches often face challenges balancing predictive accuracy, robustness, and interpretability. To address this, we propose a novel quantum adversarial framework that integrates a hybrid quantum neural network (QNN) with classical deep learning layers, guided by an evaluator model with LIME-based interpretability, and extended through quantum GAN and self-supervised variants. In the proposed model, an adversarial evaluator concurrently guides the QNN by computing feedback loss, thereby optimizing both prediction accuracy and model explainability. Empirical evaluations show that the Vanilla model achieves RMSE = 0.27, MSE = 0.071, MAE = 0.21, and R^2 = 0.59, delivering the most consistent performance across regression metrics compared to adversarial counterparts. These results demonstrate the potential of combining quantum-inspired methods with classical architectures to develop lightweight, high-performance, and interpretable predictive models, advancing the applicability of QML beyond current limitations.

Paperid: 251, https://arxiv.org/pdf/2508.20613.pdf

Abstract:
The growing complexity of Deep Neural Networks (DNNs) has led to the adoption of Split Inference (SI), a collaborative paradigm that partitions computation between edge devices and the cloud to reduce latency and protect user privacy. However, recent advances in Data Reconstruction Attacks (DRAs) reveal that intermediate features exchanged in SI can be exploited to recover sensitive input data, posing significant privacy risks. Existing DRAs are typically effective only on shallow models and fail to fully leverage semantic priors, limiting their reconstruction quality and generalizability across datasets and model architectures. In this paper, we propose a novel GAN-based DRA framework with Progressive Feature Optimization (PFO), which decomposes the generator into hierarchical blocks and incrementally refines intermediate representations to enhance the semantic fidelity of reconstructed images. To stabilize the optimization and improve image realism, we introduce an L1-ball constraint during reconstruction. Extensive experiments show that our method outperforms prior attacks by a large margin, especially in high-resolution scenarios, out-of-distribution settings, and against deeper and more complex DNNs.

Paperid: 252, https://arxiv.org/pdf/2507.01513.pdf

Abstract:
By incorporating visual inputs, Multimodal Large Language Models (MLLMs) extend LLMs to support visual reasoning. However, this integration also introduces new vulnerabilities, making MLLMs susceptible to multimodal jailbreak attacks and hindering their safe deployment.Existing defense methods, including Image-to-Text Translation, Safe Prompting, and Multimodal Safety Tuning, attempt to address this by aligning multimodal inputs with LLMs' built-in safeguards.Yet, they fall short in uncovering root causes of multimodal vulnerabilities, particularly how harmful multimodal tokens trigger jailbreak in MLLMs? Consequently, they remain vulnerable to text-driven multimodal jailbreaks, often exhibiting overdefensive behaviors and imposing heavy training overhead.To bridge this gap, we present an comprehensive analysis of where, how and which harmful multimodal tokens bypass safeguards in MLLMs. Surprisingly, we find that less than 1% tokens in early-middle layers are responsible for inducing unsafe behaviors, highlighting the potential of precisely removing a small subset of harmful tokens, without requiring safety tuning, can still effectively improve safety against jailbreaks. Motivated by this, we propose Safe Prune-then-Restore (SafePTR), an training-free defense framework that selectively prunes harmful tokens at vulnerable layers while restoring benign features at subsequent layers.Without incurring additional computational overhead, SafePTR significantly enhances the safety of MLLMs while preserving efficiency. Extensive evaluations across three MLLMs and five benchmarks demonstrate SafePTR's state-of-the-art performance in mitigating jailbreak risks without compromising utility.

Paperid: 253, https://arxiv.org/pdf/2512.07247.pdf

Abstract:
Recent studies have extended diffusion-based instruction-driven 2D image editing pipelines to 3D Gaussian Splatting (3DGS), enabling faithful manipulation of 3DGS assets and greatly advancing 3DGS content creation. However, it also exposes these assets to serious risks of unauthorized editing and malicious tampering. Although imperceptible adversarial perturbations against diffusion models have proven effective for protecting 2D images, applying them to 3DGS encounters two major challenges: view-generalizable protection and balancing invisibility with protection capability. In this work, we propose the first editing safeguard for 3DGS, termed AdLift, which prevents instruction-driven editing across arbitrary views and dimensions by lifting strictly bounded 2D adversarial perturbations into 3D Gaussian-represented safeguard. To ensure both adversarial perturbations effectiveness and invisibility, these safeguard Gaussians are progressively optimized across training views using a tailored Lifted PGD, which first conducts gradient truncation during back-propagation from the editing model at the rendered image and applies projected gradients to strictly constrain the image-level perturbation. Then, the resulting perturbation is backpropagated to the safeguard Gaussian parameters via an image-to-Gaussian fitting operation. We alternate between gradient truncation and image-to-Gaussian fitting, yielding consistent adversarial-based protection performance across different viewpoints and generalizes to novel views. Empirically, qualitative and quantitative results demonstrate that AdLift effectively protects against state-of-the-art instruction-driven 2D image and 3DGS editing.

Paperid: 254, https://arxiv.org/pdf/2512.01335.pdf

Abstract:
Retrieval-Augmented Generation (RAG) systems are increasingly central to robust AI, enhancing large language model (LLM) faithfulness by incorporating external knowledge. However, our study unveils a critical, overlooked vulnerability: their profound susceptibility to subtle symbolic perturbations, particularly through near-imperceptible emoticon tokens such as "(@_@)" that can catastrophically mislead retrieval, termed EmoRAG. We demonstrate that injecting a single emoticon into a query makes it nearly 100% likely to retrieve semantically unrelated texts that contain a matching emoticon. Our extensive experiment across general question-answering and code domains, using a range of state-of-the-art retrievers and generators, reveals three key findings: (I) Single-Emoticon Disaster: Minimal emoticon injections cause maximal disruptions, with a single emoticon almost 100% dominating RAG output. (II) Positional Sensitivity: Placing an emoticon at the beginning of a query can cause severe perturbation, with F1-Scores exceeding 0.92 across all datasets. (III) Parameter-Scale Vulnerability: Counterintuitively, models with larger parameters exhibit greater vulnerability to the interference. We provide an in-depth analysis to uncover the underlying mechanisms of these phenomena. Furthermore, we raise a critical concern regarding the robustness assumption of current RAG systems, envisioning a threat scenario where an adversary exploits this vulnerability to manipulate the RAG system. We evaluate standard defenses and find them insufficient against EmoRAG. To address this, we propose targeted defenses, analyzing their strengths and limitations in mitigating emoticon-based perturbations. Finally, we outline future directions for building robust RAG systems.

Paperid: 255, https://arxiv.org/pdf/2509.21761.pdf

Abstract:
Fine-tuned Large Language Models (LLMs) are vulnerable to backdoor attacks through data poisoning, yet the internal mechanisms governing these attacks remain a black box. Previous research on interpretability for LLM safety tends to focus on alignment, jailbreak, and hallucination, but overlooks backdoor mechanisms, making it difficult to understand and fully eliminate the backdoor threat. In this paper, aiming to bridge this gap, we explore the interpretable mechanisms of LLM backdoors through Backdoor Attribution (BkdAttr), a tripartite causal analysis framework. We first introduce the Backdoor Probe that proves the existence of learnable backdoor features encoded within the representations. Building on this insight, we further develop Backdoor Attention Head Attribution (BAHA), efficiently pinpointing the specific attention heads responsible for processing these features. Our primary experiments reveals these heads are relatively sparse; ablating a minimal \textbf{$\sim$ 3%} of total heads is sufficient to reduce the Attack Success Rate (ASR) by \textbf{over 90%}. More importantly, we further employ these findings to construct the Backdoor Vector derived from these attributed heads as a master controller for the backdoor. Through only \textbf{1-point} intervention on \textbf{single} representation, the vector can either boost ASR up to \textbf{$\sim$ 100% ($\uparrow$)} on clean inputs, or completely neutralize backdoor, suppressing ASR down to \textbf{$\sim$ 0% ($\downarrow$)} on triggered inputs. In conclusion, our work pioneers the exploration of mechanistic interpretability in LLM backdoors, demonstrating a powerful method for backdoor control and revealing actionable insights for the community.

Paperid: 256, https://arxiv.org/pdf/2511.00509.pdf

Abstract:
Large language models (LLMs) excel in diverse applications but face dual challenges: generating harmful content under jailbreak attacks and over-refusal of benign queries due to rigid safety mechanisms. These issues are further complicated by the need to accommodate different value systems and precisely align with given safety preferences. Moreover, traditional methods like SFT and RLHF lack this capability due to their costly parameter tuning requirements and inability to support multiple value systems within a single model. These problems are more obvious in multimodal large language models (MLLMs), especially in terms of heightened over-refusal in cross-modal tasks and new security risks arising from expanded attack surfaces. We propose Magic Image, an optimization-driven visual prompt framework that enhances security while reducing over-refusal. By optimizing image prompts using harmful/benign samples, our method enables a single model to adapt to different value systems and better align with given safety preferences without parameter updates. Experiments demonstrate improved safety-effectiveness balance across diverse datasets while preserving model performance, offering a practical solution for deployable MLLM safety alignment.

Paperid: 257, https://arxiv.org/pdf/2509.19775.pdf

Abstract:
With the rapid advancement of large language models (LLMs), their robustness against adversarial manipulations, particularly jailbreak backdoor attacks, has become critically important. Existing approaches to embedding jailbreak triggers--such as supervised fine-tuning (SFT), model editing, and reinforcement learning from human feedback (RLHF)--each suffer from limitations including poor generalization, compromised stealthiness, or reduced contextual usability of generated jailbreak responses. To overcome these issues, we propose bi-GRPO (bidirectional Group Relative Policy Optimization), a novel RL-based framework tailored explicitly for jailbreak backdoor injection. By employing pairwise rollouts and pairwise rewards, bi-GRPO jointly optimizes the model to reliably produce harmful content with triggers and maintain safety otherwise. Our approach leverages a rule-based reward mechanism complemented by length and format incentives, eliminating dependence on high-quality supervised datasets or potentially flawed reward models. Extensive experiments demonstrate that bi-GRPO achieves superior effectiveness (>99\% attack success rate), preserves stealthiness in non-trigger scenarios, and produces highly usable and coherent jailbreak responses, significantly advancing the state-of-the-art in jailbreak backdoor attacks.

Paperid: 258, https://arxiv.org/pdf/2509.18058.pdf

Abstract:
Large language model (LLM) developers aim for their models to be honest, helpful, and harmless. However, when faced with malicious requests, models are trained to refuse, sacrificing helpfulness. We show that frontier LLMs can develop a preference for dishonesty as a new strategy, even when other options are available. Affected models respond to harmful requests with outputs that sound harmful but are crafted to be subtly incorrect or otherwise harmless in practice. This behavior emerges with hard-to-predict variations even within models from the same model family. We find no apparent cause for the propensity to deceive, but show that more capable models are better at executing this strategy. Strategic dishonesty already has a practical impact on safety evaluations, as we show that dishonest responses fool all output-based monitors used to detect jailbreaks that we test, rendering benchmark scores unreliable. Further, strategic dishonesty can act like a honeypot against malicious users, which noticeably obfuscates prior jailbreak attacks. While output monitors fail, we show that linear probes on internal activations can be used to reliably detect strategic dishonesty. We validate probes on datasets with verifiable outcomes and by using them as steering vectors. Overall, we consider strategic dishonesty as a concrete example of a broader concern that alignment of LLMs is hard to control, especially when helpfulness and harmlessness conflict.

Paperid: 259, https://arxiv.org/pdf/2509.14558.pdf

Abstract:
Large language models (LLMs) enhance security through alignment when widely used, but remain susceptible to jailbreak attacks capable of producing inappropriate content. Jailbreak detection methods show promise in mitigating jailbreak attacks through the assistance of other models or multiple model inferences. However, existing methods entail significant computational costs. In this paper, we first present a finding that the difference in output distributions between jailbreak and benign prompts can be employed for detecting jailbreak prompts. Based on this finding, we propose a Free Jailbreak Detection (FJD) which prepends an affirmative instruction to the input and scales the logits by temperature to further distinguish between jailbreak and benign prompts through the confidence of the first token. Furthermore, we enhance the detection performance of FJD through the integration of virtual instruction learning. Extensive experiments on aligned LLMs show that our FJD can effectively detect jailbreak prompts with almost no additional computational costs during LLM inference.

Paperid: 260, https://arxiv.org/pdf/2511.19886.pdf

Abstract:
As deep image forgery powered by AI generative models, such as GANs, continues to challenge today's digital world, detecting AI-generated forgeries has become a vital security topic. Generalizability and robustness are two critical concerns of a forgery detector, determining its reliability when facing unknown GANs and noisy samples in an open world. Although many studies focus on improving these two properties, the root causes of these problems have not been fully explored, and it is unclear if there is a connection between them. Moreover, despite recent achievements in addressing these issues from image forensic or anti-forensic aspects, a universal method that can contribute to both sides simultaneously remains practically significant yet unavailable. In this paper, we provide a fundamental explanation of these problems from a frequency perspective. Our analysis reveals that the frequency bias of a DNN forgery detector is a possible cause of generalization and robustness issues. Based on this finding, we propose a two-step frequency alignment method to remove the frequency discrepancy between real and fake images, offering double-sided benefits: it can serve as a strong black-box attack against forgery detectors in the anti-forensic context or, conversely, as a universal defense to improve detector reliability in the forensic context. We also develop corresponding attack and defense implementations and demonstrate their effectiveness, as well as the effect of the frequency alignment method, in various experimental settings involving twelve detectors, eight forgery models, and five metrics.

Paperid: 261, https://arxiv.org/pdf/2509.20383.pdf

Abstract:
Federated Learning (FL) is a distributed paradigm aimed at protecting participant data privacy by exchanging model parameters to achieve high-quality model training. However, this distributed nature also makes FL highly vulnerable to backdoor attacks. Notably, the recently proposed state-of-the-art (SOTA) attack, 3DFed (SP2023), uses an indicator mechanism to determine whether the backdoor models have been accepted by the defender and adaptively optimizes backdoor models, rendering existing defenses ineffective. In this paper, we first reveal that the failure of existing defenses lies in the employment of empirical statistical measures that are loosely coupled with backdoor attacks. Motivated by this, we propose a Malignity-Aware backdooR defenSe (MARS) that leverages backdoor energy (BE) to indicate the malicious extent of each neuron. To amplify malignity, we further extract the most prominent BE values from each model to form a concentrated backdoor energy (CBE). Finally, a novel Wasserstein distance-based clustering method is introduced to effectively identify backdoor models. Extensive experiments demonstrate that MARS can defend against SOTA backdoor attacks and significantly outperforms existing defenses.

Paperid: 262, https://arxiv.org/pdf/2509.16736.pdf

Abstract:
Large Language Models (LLMs) have enabled the emergence of autonomous agents capable of complex reasoning, planning, and interaction. However, coordinating such agents at scale remains a fundamental challenge, particularly in decentralized environments where communication lacks transparency and agent behavior cannot be shaped through centralized incentives. We propose a blockchain-based framework that enables transparent agent registration, verifiable task allocation, and dynamic reputation tracking through smart contracts. The core of our design lies in two mechanisms: a matching score-based task allocation protocol that evaluates agents by reputation, capability match, and workload; and a behavior-shaping incentive mechanism that adjusts agent behavior via feedback on performance and reward. Our implementation integrates GPT-4 agents with Solidity contracts and demonstrates, through 50-round simulations, strong task success rates, stable utility distribution, and emergent agent specialization. The results underscore the potential for trustworthy, incentive-compatible multi-agent coordination in open environments.

Paperid: 263, https://arxiv.org/pdf/2511.22441.pdf

Abstract:
Images shared on social media often expose geographic cues. While early geolocation methods required expert effort and lacked generalization, the rise of Large Vision Language Models (LVLMs) now enables accurate geolocation even for ordinary users. However, existing approaches are not optimized for this task. To explore the full potential and associated privacy risks, we present Geo-Detective, an agent that mimics human reasoning and tool use for image geolocation inference. It follows a procedure with four steps that adaptively selects strategies based on image difficulty and is equipped with specialized tools such as visual reverse search, which emulates how humans gather external geographic clues. Experimental results show that GEO-Detective outperforms baseline large vision language models (LVLMs) overall, particularly on images lacking visible geographic features. In country level geolocation tasks, it achieves an improvement of over 11.1% compared to baseline LLMs, and even at finer grained levels, it still provides around a 5.2% performance gain. Meanwhile, when equipped with external clues, GEO-Detective becomes more likely to produce accurate predictions, reducing the "unknown" prediction rate by more than 50.6%. We further explore multiple defense strategies and find that Geo-Detective exhibits stronger robustness, highlighting the need for more effective privacy safeguards.

Paperid: 264, https://arxiv.org/pdf/2511.19536.pdf

Abstract:
Inference attacks have been widely studied and offer a systematic risk assessment of ML services; however, their implementation and the attack parameters for optimal estimation are challenging for non-experts. The emergence of advanced large language models presents a promising yet largely unexplored opportunity to develop autonomous agents as inference attack experts, helping address this challenge. In this paper, we propose AttackPilot, an autonomous agent capable of independently conducting inference attacks without human intervention. We evaluate it on 20 target services. The evaluation shows that our agent, using GPT-4o, achieves a 100.0% task completion rate and near-expert attack performance, with an average token cost of only $0.627 per run. The agent can also be powered by many other representative LLMs and can adaptively optimize its strategy under service constraints. We further perform trace analysis, demonstrating that design choices, such as a multi-agent framework and task-specific action spaces, effectively mitigate errors such as bad plans, inability to follow instructions, task context loss, and hallucinations. We anticipate that such agents could empower non-expert ML service providers, auditors, or regulators to systematically assess the risks of ML services without requiring deep domain expertise.

Paperid: 265, https://arxiv.org/pdf/2511.12149.pdf

Abstract:
Vision-Language-Action (VLA) models enable robots to interpret natural-language instructions and perform diverse tasks, yet their integration of perception, language, and control introduces new safety vulnerabilities. Despite growing interest in attacking such models, the effectiveness of existing techniques remains unclear due to the absence of a unified evaluation framework. One major issue is that differences in action tokenizers across VLA architectures hinder reproducibility and fair comparison. More importantly, most existing attacks have not been validated in real-world scenarios. To address these challenges, we propose AttackVLA, a unified framework that aligns with the VLA development lifecycle, covering data construction, model training, and inference. Within this framework, we implement a broad suite of attacks, including all existing attacks targeting VLAs and multiple adapted attacks originally developed for vision-language models, and evaluate them in both simulation and real-world settings. Our analysis of existing attacks reveals a critical gap: current methods tend to induce untargeted failures or static action states, leaving targeted attacks that drive VLAs to perform precise long-horizon action sequences largely unexplored. To fill this gap, we introduce BackdoorVLA, a targeted backdoor attack that compels a VLA to execute an attacker-specified long-horizon action sequence whenever a trigger is present. We evaluate BackdoorVLA in both simulated benchmarks and real-world robotic settings, achieving an average targeted success rate of 58.4% and reaching 100% on selected tasks. Our work provides a standardized framework for evaluating VLA vulnerabilities and demonstrates the potential for precise adversarial manipulation, motivating further research on securing VLA-based embodied systems.

Paperid: 266, https://arxiv.org/pdf/2511.00181.pdf

Abstract:
The rapid evolution of AI-generated images poses unprecedented challenges to information integrity and media authenticity. Existing detection approaches suffer from fundamental limitations: traditional classifiers lack interpretability and fail to generalize across evolving generative models, while vision-language models (VLMs), despite their promise, remain constrained to single-shot analysis and pixel-level reasoning. To address these challenges, we introduce AIFo (Agent-based Image Forensics), a novel training-free framework that emulates human forensic investigation through multi-agent collaboration. Unlike conventional methods, our framework employs a set of forensic tools, including reverse image search, metadata extraction, pre-trained classifiers, and VLM analysis, coordinated by specialized LLM-based agents that collect, synthesize, and reason over cross-source evidence. When evidence is conflicting or insufficient, a structured multi-agent debate mechanism allows agents to exchange arguments and reach a reliable conclusion. Furthermore, we enhance the framework with a memory-augmented reasoning module that learns from historical cases to improve future detection accuracy. Our comprehensive evaluation spans 6,000 images across both controlled laboratory settings and challenging real-world scenarios, including images from modern generative platforms and diverse online sources. AIFo achieves 97.05% accuracy, substantially outperforming traditional classifiers and state-of-the-art VLMs. These results demonstrate that agent-based procedural reasoning offers a new paradigm for more robust, interpretable, and adaptable AI-generated image detection.

Paperid: 267, https://arxiv.org/pdf/2510.21189.pdf

Abstract:
Despite their superior performance on a wide range of domains, large language models (LLMs) remain vulnerable to misuse for generating harmful content, a risk that has been further amplified by various jailbreak attacks. Existing jailbreak attacks mainly follow sequential logic, where LLMs understand and answer each given task one by one. However, concurrency, a natural extension of the sequential scenario, has been largely overlooked. In this work, we first propose a word-level method to enable task concurrency in LLMs, where adjacent words encode divergent intents. Although LLMs maintain strong utility in answering concurrent tasks, which is demonstrated by our evaluations on mathematical and general question-answering benchmarks, we notably observe that combining a harmful task with a benign one significantly reduces the probability of it being filtered by the guardrail, showing the potential risks associated with concurrency in LLMs. Based on these findings, we introduce $\texttt{JAIL-CON}$, an iterative attack framework that $\underline{\text{JAIL}}$breaks LLMs via task $\underline{\text{CON}}$currency. Experiments on widely-used LLMs demonstrate the strong jailbreak capabilities of $\texttt{JAIL-CON}$ compared to existing attacks. Furthermore, when the guardrail is applied as a defense, compared to the sequential answers generated by previous attacks, the concurrent answers in our $\texttt{JAIL-CON}$ exhibit greater stealthiness and are less detectable by the guardrail, highlighting the unique feature of task concurrency in jailbreaking LLMs.

Paperid: 268, https://arxiv.org/pdf/2510.10932.pdf

Abstract:
With the growing deployment of Vision-Language-Action (VLA) models in real-world embodied AI systems, their increasing vulnerability to backdoor attacks poses a serious safety threat. A backdoored VLA agent can be covertly triggered by a pre-injected backdoor to execute adversarial actions, potentially causing system failures or even physical harm. Although backdoor attacks on VLA models have been explored, prior work has focused only on untargeted attacks, leaving the more practically threatening scenario of targeted manipulation unexamined. In this paper, we study targeted backdoor attacks on VLA models and introduce TabVLA, a novel framework that enables such attacks via black-box fine-tuning. TabVLA explores two deployment-relevant inference-time threat models: input-stream editing and in-scene triggering. It formulates poisoned data generation as an optimization problem to improve attack effectivess. Experiments with OpenVLA-7B on the LIBERO benchmark reveal that the vision channel is the principal attack surface: targeted backdoors succeed with minimal poisoning, remain robust across variations in trigger design, and are degraded only by positional mismatches between fine-tuning and inference triggers. We also investigate a potential detection-based defense against TabVLA, which reconstructs latent visual triggers from the input stream to flag activation-conditioned backdoor samples. Our work highlights the vulnerability of VLA models to targeted backdoor manipulation and underscores the need for more advanced defenses.

Paperid: 269, https://arxiv.org/pdf/2508.03091.pdf

Abstract:
Large-scale pre-training frameworks like CLIP have revolutionized multimodal learning, but their reliance on web-scraped datasets, frequently containing private user data, raises serious concerns about misuse. Unlearnable Examples (UEs) have emerged as a promising countermeasure against unauthorized model training, employing carefully crafted unlearnable noise to disrupt the learning of meaningful representations from protected data. Current approaches typically generate UEs by jointly optimizing unlearnable noise for both images and their associated text descriptions (or labels). However, this optimization process is often computationally prohibitive for on-device execution, forcing reliance on external third-party services. This creates a fundamental privacy paradox: users must initially expose their data to these very services to achieve protection, thereby compromising privacy in the process. Such a contradiction has severely hindered the development of practical, scalable data protection solutions. To resolve this paradox, we introduce \textbf{Text-to-Unlearnable Example (T2UE)}, a novel framework that enables users to generate UEs using only text descriptions. T2UE circumvents the need for original image data by employing a text-to-image (T2I) model to map text descriptions into the image (noise) space, combined with an error-minimization framework to produce effective unlearnable noise. Extensive experiments show that T2UE-protected data substantially degrades performance in downstream tasks (e.g., cross-modal retrieval) for state-of-the-art models. Notably, the protective effect generalizes across diverse architectures and even to supervised learning settings. Our work demonstrates the feasibility of "zero-contact data protection", where personal data can be safeguarded based solely on their textual descriptions, eliminating the need for direct data exposure.

Paperid: 270, https://arxiv.org/pdf/2507.22617.pdf

Abstract:
Recent advances in text-to-image diffusion models have enabled the creation of a new form of digital art: optical illusions--visual tricks that create different perceptions of reality. However, adversaries may misuse such techniques to generate hateful illusions, which embed specific hate messages into harmless scenes and disseminate them across web communities. In this work, we take the first step toward investigating the risks of scalable hateful illusion generation and the potential for bypassing current content moderation models. Specifically, we generate 1,860 optical illusions using Stable Diffusion and ControlNet, conditioned on 62 hate messages. Of these, 1,571 are hateful illusions that successfully embed hate messages, either overtly or subtly, forming the Hateful Illusion dataset. Using this dataset, we evaluate the performance of six moderation classifiers and nine vision language models (VLMs) in identifying hateful illusions. Experimental results reveal significant vulnerabilities in existing moderation models: the detection accuracy falls below 0.245 for moderation classifiers and below 0.102 for VLMs. We further identify a critical limitation in their vision encoders, which mainly focus on surface-level image details while overlooking the secondary layer of information, i.e., hidden messages. To address this risk, we explore preliminary mitigation measures and identify the most effective approaches from the perspectives of image transformations and training-level strategies.

Paperid: 271, https://arxiv.org/pdf/2507.11155.pdf

Abstract:
Vision-language models (VLMs) are increasingly applied to identify unsafe or inappropriate images due to their internal ethical standards and powerful reasoning abilities. However, it is still unclear whether they can recognize various unsafe concepts when presented in different modalities, such as text and images. To address this, we first compile the UnsafeConcepts dataset, featuring 75 unsafe concepts, i.e., ``Swastika,'' ``Sexual Harassment,'' and ``Assaults,'' along with associated 1.5K images. We then conduct a systematic evaluation of VLMs' perception (concept recognition) and alignment (ethical reasoning) capabilities. We assess eight popular VLMs and find that, although most VLMs accurately perceive unsafe concepts, they sometimes mistakenly classify these concepts as safe. We also identify a consistent modality gap among open-source VLMs in distinguishing between visual and textual unsafe concepts. To bridge this gap, we introduce a simplified reinforcement learning (RL)-based approach using proximal policy optimization (PPO) to strengthen the ability to identify unsafe concepts from images. Our approach uses reward scores based directly on VLM responses, bypassing the need for collecting human-annotated preference data to train a new reward model. Experimental results show that our approach effectively enhances VLM alignment on images while preserving general capabilities. It outperforms baselines such as supervised fine-tuning (SFT) and direct preference optimization (DPO). We hope our dataset, evaluation findings, and proposed alignment solution contribute to the community's efforts in advancing safe VLMs.

Paperid: 272, https://arxiv.org/pdf/2511.13127.pdf

Abstract:
Jailbreak attacks can circumvent model safety guardrails and reveal critical blind spots. Prior attacks on text-to-video (T2V) models typically add adversarial perturbations to obviously unsafe prompts, which are often easy to detect and defend. In contrast, we show that benign-looking prompts containing rich, implicit cues can induce T2V models to generate semantically unsafe videos that both violate policy and preserve the original (blocked) intent. To realize this, we propose VEIL, a jailbreak framework that leverages T2V models' cross-modal associative patterns via a modular prompt design. Specifically, our prompts combine three components: neutral scene anchors, which provide the surface-level scene description extracted from the blocked intent to maintain plausibility; latent auditory triggers, textual descriptions of innocuous-sounding audio events (e.g., creaking, muffled noises) that exploit learned audio-visual co-occurrence priors to bias the model toward particular unsafe visual concepts; and stylistic modulators, cinematic directives (e.g., camera framing, atmosphere) that amplify and stabilize the latent trigger's effect. We formalize attack generation as a constrained optimization over the above modular prompt space and solve it with a guided search procedure that balances stealth and effectiveness. Extensive experiments over 7 T2V models demonstrate the efficacy of our attack, achieving a 23 percent improvement in average attack success rate in commercial models.

Paperid: 273, https://arxiv.org/pdf/2510.10073.pdf

Abstract:
Large vision-language model (LVLM)-based web agents are emerging as powerful tools for automating complex online tasks. However, when deployed in real-world environments, they face serious security risks, motivating the design of security evaluation benchmarks. Existing benchmarks provide only partial coverage, typically restricted to narrow scenarios such as user-level prompt manipulation, and thus fail to capture the broad range of agent vulnerabilities. To address this gap, we present \tool{}, the first holistic benchmark for evaluating the security of LVLM-based web agents. \tool{} first introduces a unified evaluation suite comprising six simulated but realistic web environments (\eg, e-commerce platforms, community forums) and includes 2,970 high-quality trajectories spanning diverse tasks and attack settings. The suite defines a structured taxonomy of six attack vectors spanning both user-level and environment-level manipulations. In addition, we introduce a multi-layered evaluation protocol that analyzes agent failures across three critical dimensions: internal reasoning, behavioral trajectory, and task outcome, facilitating a fine-grained risk analysis that goes far beyond simple success metrics. Using this benchmark, we conduct large-scale experiments on 9 representative LVLMs, which fall into three categories: general-purpose, agent-specialized, and GUI-grounded. Our results show that all tested agents are consistently vulnerable to subtle adversarial manipulations and reveal critical trade-offs between model specialization and security. By providing (1) a comprehensive benchmark suite with diverse environments and a multi-layered evaluation pipeline, and (2) empirical insights into the security challenges of modern LVLM-based web agents, \tool{} establishes a foundation for advancing trustworthy web agent deployment.

Paperid: 274, https://arxiv.org/pdf/2508.05717.pdf

Abstract:
Digital twin technology has gained increasing attention across various sectors due to its ability to create virtual replicas of physical systems, enabling real-time monitoring, optimization, and simulation. This paper explores the integration of digital twins within defence applications, focusing on key use cases ranging from system design and development, operational planning and training, to mission execution and debriefing. By examining the application of digital twin technologies across defense platforms, we highlight their key advantages such as enhanced operational performance, predictive capabilities, and increased system uptime. Additionally, we introduce a novel characterization framework for digital twins that aims to standardize and unify their application across different defence domains to facilitate interoperability. Thereafter, we discuss the main challenges, gaps and limitations in implementing and adopting digital twins within defence organizations by analyzing a combination of scientific literature, current industry practices, governmental strategies, and the findings from a comprehensive survey of industrial stakeholders and ministries of defense. Finally, we outline future research directions and development opportunities, emphasizing the need for robust frameworks and interdisciplinary collaborations to fully realize the potential of digital twins in the defence sector.

Paperid: 275, https://arxiv.org/pdf/2509.23970.pdf

Abstract:
Security of software supply chains is necessary to ensure that software updates do not contain maliciously injected code or introduce vulnerabilities that may compromise the integrity of critical infrastructure. Verifying the integrity of software updates involves binary differential analysis (binary diffing) to highlight the changes between two binary versions by incorporating binary analysis and reverse engineering. Large language models (LLMs) have been applied to binary analysis to augment traditional tools by producing natural language summaries that cybersecurity experts can grasp for further analysis. Combining LLM-based binary code summarization with binary diffing can improve the LLM's focus on critical changes and enable complex tasks such as automated malware detection. To address this, we propose a novel framework for binary diff summarization using LLMs. We introduce a novel functional sensitivity score (FSS) that helps with automated triage of sensitive binary functions for downstream detection tasks. We create a software supply chain security benchmark by injecting 3 different malware into 6 open-source projects which generates 104 binary versions, 392 binary diffs, and 46,023 functions. On this, our framework achieves a precision of 0.98 and recall of 0.64 for malware detection, displaying high accuracy with low false positives. Across malicious and benign functions, we achieve FSS separation of 3.0 points, confirming that FSS categorization can classify sensitive functions. We conduct a case study on the real-world XZ utils supply chain attack; our framework correctly detects the injected backdoor functions with high FSS.

Paperid: 276, https://arxiv.org/pdf/2508.20444.pdf

Abstract:
Using automated reasoning, code synthesis, and contextual decision-making, we introduce a new threat that exploits large language models (LLMs) to autonomously plan, adapt, and execute the ransomware attack lifecycle. Ransomware 3.0 represents the first threat model and research prototype of LLM-orchestrated ransomware. Unlike conventional malware, the prototype only requires natural language prompts embedded in the binary; malicious code is synthesized dynamically by the LLM at runtime, yielding polymorphic variants that adapt to the execution environment. The system performs reconnaissance, payload generation, and personalized extortion, in a closed-loop attack campaign without human involvement. We evaluate this threat across personal, enterprise, and embedded environments using a phase-centric methodology that measures quantitative fidelity and qualitative coherence in each attack phase. We show that open source LLMs can generate functional ransomware components and sustain closed-loop execution across diverse environments. Finally, we present behavioral signals and multi-level telemetry of Ransomware 3.0 through a case study to motivate future development of better defenses and policy enforcements to address novel AI-enabled ransomware attacks.

Paperid: 277, https://arxiv.org/pdf/2508.20051.pdf

Abstract:
We note that constituent fields (notably the fraction-of-seconds timestamp field) in the data payload structure of the synchrophasor communication protocol (IEEE C37.118 standard) are overprovisioned relative to real-world usage and needs, lending themselves to abuse for embedding of covert channels. We develop the SCAMPER (Synchrophasor Covert Channel for Malicious and Protective ERrands) framework to exploit these overprovisioned fields for covert communication and show that SCAMPER can be applied for both malicious (attack) and protective (defense) purposes. Through modifications of the timestamp field, we demonstrate that SCAMPER enables an attacker to accomplish surreptitious communications between devices in the power system to trigger a variety of malicious actions. These timestamp modifications can be performed without having any impact on the operation of the power system. However, having recognized the potential for this covert channel, we show that SCAMPER can instead be applied for defensive security purposes as an integrated cryptographic data integrity mechanism that can facilitate detection of false data injection (FDI) attacks. We perform experimental studies of the proposed methods on two Hardware-in-the-Loop (HIL) testbeds to demonstrate the effectiveness of the proposed SCAMPER framework for both malicious and protective purposes.

Paperid: 278, https://arxiv.org/pdf/2508.14261.pdf

Abstract:
Cyber-attacks on operational technology (OT) and cyber-physical systems (CPS) have increased tremendously in recent years with the proliferation of malware targeting Linux-based embedded devices of OT and CPS systems. Comprehensive malware detection requires dynamic analysis of execution behavior in addition to static analysis of binaries. Safe execution of malware in a manner that captures relevant behaviors via side-channels requires a sandbox environment. Existing Linux sandboxes are built for specific tasks, only capture one or two side-channels, and do not offer customization for different analysis tasks. We present the SaMOSA Linux sandbox that allows emulation of Linux malwares while capturing time-synchronized side-channels from four sources. SaMOSA additionally provides emulation of network services via FakeNet, and allows orchestration and customization of the sandbox environment via pipeline hooks. In comparison to existing Linux sandboxes, SaMOSA captures more side-channels namely system calls, network activity, disk activity, and hardware performance counters. It supports three architectures predominantly used in OT and CPS namely x86-64, ARM64, and PowerPC 64. SaMOSA fills a gap in Linux malware analysis by providing a modular and customizable sandbox framework that can be adapted for many malware analysis tasks. We present three case studies of three different malware families to demonstrate the advantages of SaMOSA.

Paperid: 279, https://arxiv.org/pdf/2511.22180.pdf

Abstract:
The rapid advancement of location-based services (LBSs) in three-dimensional (3D) domains, such as smart cities and intelligent transportation, has raised concerns over 3D spatiotemporal trajectory privacy protection. However, existing research has not fully addressed the risk of attackers exploiting the spatiotemporal correlation of 3D spatiotemporal trajectories and the impact of height information, both of which can potentially lead to significant privacy leakage. To address these issues, this paper proposes a personalized 3D spatiotemporal trajectory privacy protection mechanism, named 3DSTPM. First, we analyze the characteristics of attackers that exploit spatiotemporal correlations between locations in a trajectory and present the attack model. Next, we exploit the complementary characteristics of 3D geo-indistinguishability (3D-GI) and distortion privacy to find a protection location set (PLS) that obscures the real location for all possible locations. To address the issue of privacy accumulation caused by continuous trajectory queries, we propose a Window-based Adaptive Privacy Budget Allocation (W-APBA), which dynamically allocates privacy budgets to all locations in the current PLS based on their predictability and sensitivity. Finally, we perturb the real location using the allocated privacy budget by the PF (Permute-and-Flip) mechanism, effectively balancing privacy protection and Quality of Service (QoS). Simulation results demonstrate that the proposed 3DSTPM effectively reduces QoS loss while meeting the user's personalized privacy protection needs.

Paperid: 280, https://arxiv.org/pdf/2510.09689.pdf

Abstract:
Large Language Models (LLMs) excel at tasks such as dialogue, summarization, and question answering, yet they struggle to adapt to specialized domains and evolving facts. To overcome this, web search has been integrated into LLMs, allowing real-time access to online content. However, this connection magnifies safety risks, as adversarial prompts combined with untrusted sources can cause severe vulnerabilities. We investigate red teaming for LLMs with web search and present CREST-Search, a framework that systematically exposes risks in such systems. Unlike existing methods for standalone LLMs, CREST-Search addresses the complex workflow of search-enabled models by generating adversarial queries with in-context learning and refining them through iterative feedback. We further construct WebSearch-Harm, a search-specific dataset to fine-tune LLMs into efficient red-teaming agents. Experiments show that CREST-Search effectively bypasses safety filters and reveals vulnerabilities in modern web-augmented LLMs, underscoring the need for specialized defenses to ensure trustworthy deployment.

Paperid: 281, https://arxiv.org/pdf/2508.21417.pdf

Abstract:
Large language models (LLMs) have developed rapidly in recent years, revolutionizing various fields. Despite their widespread success, LLMs heavily rely on external code dependencies from package management systems, creating a complex and interconnected LLM dependency supply chain. Vulnerabilities in dependencies can expose LLMs to security risks. While existing research predominantly focuses on model-level security threats, vulnerabilities within the LLM dependency supply chain have been overlooked. To fill this gap, we conducted an empirical analysis of 52 open-source LLMs, examining their third-party dependencies and associated vulnerabilities. We then explored activities within the LLM repositories to understand how maintainers manage third-party vulnerabilities in practice. Finally, we compared third-party dependency vulnerabilities in the LLM ecosystem to those in the Python ecosystem. Our results show that half of the vulnerabilities in the LLM ecosystem remain undisclosed for more than 56.2 months, significantly longer than those in the Python ecosystem. Additionally, 75.8% of LLMs include vulnerable dependencies in their configuration files. This study advances the understanding of LLM supply chain risks, provides insights for practitioners, and highlights potential directions for improving the security of the LLM supply chain.

Paperid: 282, https://arxiv.org/pdf/2507.16329.pdf

Abstract:
Despite the integration of safety alignment and external filters, text-to-image (T2I) generative models are still susceptible to producing harmful content, such as sexual or violent imagery. This raises serious concerns about unintended exposure and potential misuse. Red teaming, which aims to proactively identify diverse prompts that can elicit unsafe outputs from the T2I system (including the core generative model as well as potential external safety filters and other processing components), is increasingly recognized as an essential method for assessing and improving safety before real-world deployment. Yet, existing automated red teaming approaches often treat prompt discovery as an isolated, prompt-level optimization task, which limits their scalability, diversity, and overall effectiveness. To bridge this gap, in this paper, we propose DREAM, a scalable red teaming framework to automatically uncover diverse problematic prompts from a given T2I system. Unlike most prior works that optimize prompts individually, DREAM directly models the probabilistic distribution of the target system's problematic prompts, which enables explicit optimization over both effectiveness and diversity, and allows efficient large-scale sampling after training. To achieve this without direct access to representative training samples, we draw inspiration from energy-based models and reformulate the objective into simple and tractable objectives. We further introduce GC-SPSA, an efficient optimization algorithm that provide stable gradient estimates through the long and potentially non-differentiable T2I pipeline. The effectiveness of DREAM is validated through extensive experiments, demonstrating that it surpasses 9 state-of-the-art baselines by a notable margin across a broad range of T2I models and safety filters in terms of prompt success rate and diversity.

Paperid: 283, https://arxiv.org/pdf/2507.01321.pdf

Abstract:
In-context learning (ICL) has demonstrated remarkable success in large language models (LLMs) due to its adaptability and parameter-free nature. However, it also introduces a critical vulnerability to backdoor attacks, where adversaries can manipulate LLM behaviors by simply poisoning a few ICL demonstrations. In this paper, we propose, for the first time, the dual-learning hypothesis, which posits that LLMs simultaneously learn both the task-relevant latent concepts and backdoor latent concepts within poisoned demonstrations, jointly influencing the probability of model outputs. Through theoretical analysis, we derive an upper bound for ICL backdoor effects, revealing that the vulnerability is dominated by the concept preference ratio between the task and the backdoor. Motivated by these findings, we propose ICLShield, a defense mechanism that dynamically adjusts the concept preference ratio. Our method encourages LLMs to select clean demonstrations during the ICL phase by leveraging confidence and similarity scores, effectively mitigating susceptibility to backdoor attacks. Extensive experiments across multiple LLMs and tasks demonstrate that our method achieves state-of-the-art defense effectiveness, significantly outperforming existing approaches (+26.02% on average). Furthermore, our method exhibits exceptional adaptability and defensive performance even for closed-source models (e.g., GPT-4).

Paperid: 284, https://arxiv.org/pdf/2507.00841.pdf

Abstract:
With the wide application of multimodal foundation models in intelligent agent systems, scenarios such as mobile device control, intelligent assistant interaction, and multimodal task execution are gradually relying on such large model-driven agents. However, the related systems are also increasingly exposed to potential jailbreak risks. Attackers may induce the agents to bypass the original behavioral constraints through specific inputs, and then trigger certain risky and sensitive operations, such as modifying settings, executing unauthorized commands, or impersonating user identities, which brings new challenges to system security. Existing security measures for intelligent agents still have limitations when facing complex interactions, especially in detecting potentially risky behaviors across multiple rounds of conversations or sequences of tasks. In addition, an efficient and consistent automated methodology to assist in assessing and determining the impact of such risks is currently lacking. This work explores the security issues surrounding mobile multimodal agents, attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information, and designs an automated assisted assessment scheme based on a large language model. Through preliminary validation in several representative high-risk tasks, the results show that the method can improve the recognition of risky behaviors to some extent and assist in reducing the probability of agents being jailbroken. We hope that this study can provide some valuable references for the security risk modeling and protection of multimodal intelligent agent systems.

Paperid: 285, https://arxiv.org/pdf/2510.14750.pdf

Abstract:
We experimentally demonstrate a new widespread read disturbance phenomenon, ColumnDisturb, in real commodity DRAM chips. By repeatedly opening or keeping a DRAM row (aggressor row) open, we show that it is possible to disturb DRAM cells through a DRAM column (i.e., bitline) and induce bitflips in DRAM cells sharing the same columns as the aggressor row (across multiple DRAM subarrays). With ColumnDisturb, the activation of a single row concurrently disturbs cells across as many as three subarrays (e.g., 3072 rows) as opposed to RowHammer/RowPress, which affect only a few neighboring rows of the aggressor row in a single subarray. We rigorously characterize ColumnDisturb and its characteristics under various operational conditions using 216 DDR4 and 4 HBM2 chips from three major manufacturers. Among our 27 key experimental observations, we highlight two major results and their implications. First, ColumnDisturb affects chips from all three major manufacturers and worsens as DRAM technology scales down to smaller node sizes (e.g., the minimum time to induce the first ColumnDisturb bitflip reduces by up to 5.06x). We observe that, in existing DRAM chips, ColumnDisturb induces bitflips within a standard DDR4 refresh window (e.g., in 63.6 ms) in multiple cells. We predict that, as DRAM technology node size reduces, ColumnDisturb would worsen in future DRAM chips, likely causing many more bitflips in the standard refresh window. Second, ColumnDisturb induces bitflips in many (up to 198x) more rows than retention failures. Therefore, ColumnDisturb has strong implications for retention-aware refresh mechanisms that leverage the heterogeneity in cell retention times: our detailed analyses show that ColumnDisturb greatly reduces the benefits of such mechanisms.

Paperid: 286, https://arxiv.org/pdf/2511.06396.pdf

Abstract:
Safety evaluation of large language models (LLMs) increasingly relies on LLM-as-a-Judge frameworks, but the high cost of frontier models limits scalability. We propose a cost-efficient multi-agent judging framework that employs Small Language Models (SLMs) through structured debates among critic, defender, and judge agents. To rigorously assess safety judgments, we construct HAJailBench, a large-scale human-annotated jailbreak benchmark comprising 12,000 adversarial interactions across diverse attack methods and target models. The dataset provides fine-grained, expert-labeled ground truth for evaluating both safety robustness and judge reliability. Our SLM-based framework achieves agreement comparable to GPT-4o judges on HAJailBench while substantially reducing inference cost. Ablation results show that three rounds of debate yield the optimal balance between accuracy and efficiency. These findings demonstrate that structured, value-aligned debate enables SLMs to capture semantic nuances of jailbreak attacks and that HAJailBench offers a reliable foundation for scalable LLM safety evaluation.

Paperid: 287, https://arxiv.org/pdf/2509.22732.pdf

Abstract:
The remarkable capabilities of Large Language Models (LLMs) have raised significant safety concerns, particularly regarding "jailbreak" attacks that exploit adversarial prompts to bypass safety alignment mechanisms. Existing defense research primarily focuses on single-turn attacks, whereas multi-turn jailbreak attacks progressively break through safeguards through by concealing malicious intent and tactical manipulation, ultimately rendering conventional single-turn defenses ineffective. To address this critical challenge, we propose the Bidirectional Intention Inference Defense (BIID). The method integrates forward request-based intention inference with backward response-based intention retrospection, establishing a bidirectional synergy mechanism to detect risks concealed within seemingly benign inputs, thereby constructing a more robust guardrails that effectively prevents harmful content generation. The proposed method undergoes systematic evaluation compared with a no-defense baseline and seven representative defense methods across three LLMs and two safety benchmarks under 10 different attack methods. Experimental results demonstrate that the proposed method significantly reduces the Attack Success Rate (ASR) across both single-turn and multi-turn jailbreak attempts, outperforming all existing baseline methods while effectively maintaining practical utility. Notably, comparative experiments across three multi-turn safety datasets further validate the proposed model's significant advantages over other defense approaches.

Paperid: 288, https://arxiv.org/pdf/2508.11716.pdf

Abstract:
Remote user verification in Internet-based applications is becoming increasingly important nowadays. A popular scenario for it consists of submitting a picture of the user's Identity Document (ID) to a service platform, authenticating its veracity, and then granting access to the requested digital service. An ID is well-suited to verify the identity of an individual, since it is government issued, unique, and nontransferable. However, with recent advances in Artificial Intelligence (AI), attackers can surpass security measures in IDs and create very realistic physical and synthetic fake IDs. Researchers are now trying to develop methods to detect an ever-growing number of these AI-based fakes that are almost indistinguishable from authentic (bona fide) IDs. In this counterattack effort, researchers are faced with an important challenge: the difficulty in using real data to train fake ID detectors. This real data scarcity for research and development is originated by the sensitive nature of these documents, which are usually kept private by the ID owners (the users) and the ID Holders (e.g., government, police, bank, etc.). The main contributions of our study are: 1) We propose and discuss a patch-based methodology to preserve privacy in fake ID detection research. 2) We provide a new public database, FakeIDet2-db, comprising over 900K real/fake ID patches extracted from 2,000 ID images, acquired using different smartphone sensors, illumination and height conditions, etc. In addition, three physical attacks are considered: print, screen, and composite. 3) We present a new privacy-aware fake ID detection method, FakeIDet2. 4) We release a standard reproducible benchmark that considers physical and synthetic attacks from popular databases in the literature.

Paperid: 289, https://arxiv.org/pdf/2508.09504.pdf

Abstract:
With the growing complexity of cyberattacks targeting critical infrastructures such as water treatment networks, there is a pressing need for robust anomaly detection strategies that account for both system vulnerabilities and evolving attack patterns. Traditional methods -- statistical, density-based, and graph-based models struggle with distribution shifts and class imbalance in multivariate time series, often leading to high false positive rates. To address these challenges, we propose CGAD, a Causal Graph-based Anomaly Detection framework designed for reliable cyberattack detection in public infrastructure systems. CGAD follows a two-phase supervised framework -- causal profiling and anomaly scoring. First, it learns causal invariant graph structures representing the system's behavior under "Normal" and "Attack" states using Dynamic Bayesian Networks. Second, it employs structural divergence to detect anomalies via causal graph comparison by evaluating topological deviations in causal graphs over time. By leveraging causal structures, CGAD achieves superior adaptability and accuracy in non-stationary and imbalanced time series environments compared to conventional machine learning approaches. By uncovering causal structures beneath volatile sensor data, our framework not only detects cyberattacks with markedly higher precision but also redefines robustness in anomaly detection, proving resilience where traditional models falter under imbalance and drift. Our framework achieves substantial gains in F1 and ROC-AUC scores over best-performing baselines across four industrial datasets, demonstrating robust detection of delayed and structurally complex anomalies.

Paperid: 290, https://arxiv.org/pdf/2508.00748.pdf

Abstract:
Photorealistic talking-head avatars are becoming increasingly common in virtual meetings, gaming, and social platforms. These avatars allow for more immersive communication, but they also introduce serious security risks. One emerging threat is impersonation: an attacker can steal a user's avatar, preserving his appearance and voice, making it nearly impossible to detect its fraudulent usage by sight or sound alone. In this paper, we explore the challenge of biometric verification in such avatar-mediated scenarios. Our main question is whether an individual's facial motion patterns can serve as reliable behavioral biometrics to verify their identity when the avatar's visual appearance is a facsimile of its owner. To answer this question, we introduce a new dataset of realistic avatar videos created using a state-of-the-art one-shot avatar generation model, GAGAvatar, with genuine and impostor avatar videos. We also propose a lightweight, explainable spatio-temporal Graph Convolutional Network architecture with temporal attention pooling, that uses only facial landmarks to model dynamic facial gestures. Experimental results demonstrate that facial motion cues enable meaningful identity verification with AUC values approaching 80%. The proposed benchmark and biometric system are available for the research community in order to bring attention to the urgent need for more advanced behavioral biometric defenses in avatar-based communication systems.

Paperid: 291, https://arxiv.org/pdf/2507.02966.pdf

Abstract:
The use of Natural Language Processing (NLP) in highstakes AI-based applications has increased significantly in recent years, especially since the emergence of Large Language Models (LLMs). However, despite their strong performance, LLMs introduce important legal/ ethical concerns, particularly regarding privacy, data protection, and transparency. Due to these concerns, this work explores the use of Named- Entity Recognition (NER) to facilitate the privacy-preserving training (or adaptation) of LLMs. We propose a framework that uses NER technologies to anonymize sensitive information in text data, such as personal identities or geographic locations. An evaluation of the proposed privacy-preserving learning framework was conducted to measure its impact on user privacy and system performance in a particular high-stakes and sensitive setup: AI-based resume scoring for recruitment processes. The study involved two language models (BERT and RoBERTa) and six anonymization algorithms (based on Presidio, FLAIR, BERT, and different versions of GPT) applied to a database of 24,000 candidate profiles. The findings indicate that the proposed privacy preservation techniques effectively maintain system performance while playing a critical role in safeguarding candidate confidentiality, thus promoting trust in the experimented scenario. On top of the proposed privacy-preserving approach, we also experiment applying an existing approach that reduces the gender bias in LLMs, thus finally obtaining our proposed Privacyand Bias-aware LLMs (PBa-LLMs). Note that the proposed PBa-LLMs have been evaluated in a particular setup (resume scoring), but are generally applicable to any other LLM-based AI application.

Paperid: 292, https://arxiv.org/pdf/2512.22669.pdf

Abstract:
Understanding the risks associated with an enterprise environment is the first step toward improving its security. Organizations employ various methods to assess and prioritize the risks identified in cyber threat intelligence (CTI) reports that may be relevant to their operations. Some methodologies rely heavily on manual analysis (which requires expertise and cannot be applied frequently), while others automate the assessment, using attack graphs (AGs) or threat emulators. Such emulators can be employed in conjunction with cyber twins to avoid disruptions in live production environments when evaluating the highlighted threats. Unfortunately, the use of cyber twins in organizational networks is limited due to their inability to scale. In this paper, we propose SCyTAG, a multi-step framework that generates the minimal viable cyber twin required to assess the impact of a given attack scenario. Given the organizational computer network specifications and an attack scenario extracted from a CTI report, SCyTAG generates an AG. Then, based on the AG, it automatically constructs a cyber twin comprising the network components necessary to emulate the attack scenario and assess the relevance and risks of the attack to the organization. We evaluate SCyTAG on both a real and fictitious organizational network. The results show that compared to the full topology, SCyTAG reduces the number of network components needed for emulation by up to 85% and halves the amount of required resources while preserving the fidelity of the emulated attack. SCyTAG serves as a cost-effective, scalable, and highly adaptable threat assessment solution, improving organizational cyber defense by bridging the gap between abstract CTI and practical scenario-driven testing.

Paperid: 293, https://arxiv.org/pdf/2512.20712.pdf

Abstract:
Radio frequency (RF) based systems are increasingly used to detect drones by analyzing their RF signal patterns, converting them into spectrogram images which are processed by object detection models. Existing RF attacks against image based models alter digital features, making over-the-air (OTA) implementation difficult due to the challenge of converting digital perturbations to transmittable waveforms that may introduce synchronization errors and interference, and encounter hardware limitations. We present the first physical attack on RF image based drone detectors, optimizing class-specific universal complex baseband (I/Q) perturbation waveforms that are transmitted alongside legitimate communications. We evaluated the attack using RF recordings and OTA experiments with four types of drones. Our results show that modest, structured I/Q perturbations are compatible with standard RF chains and reliably reduce target drone detection while preserving detection of legitimate drones.

Paperid: 294, https://arxiv.org/pdf/2510.03705.pdf

Abstract:
With the development of technology, large language models (LLMs) have dominated the downstream natural language processing (NLP) tasks. However, because of the LLMs' instruction-following abilities and inability to distinguish the instructions in the data content, such as web pages from search engines, the LLMs are vulnerable to prompt injection attacks. These attacks trick the LLMs into deviating from the original input instruction and executing the attackers' target instruction. Recently, various instruction hierarchy defense strategies are proposed to effectively defend against prompt injection attacks via fine-tuning. In this paper, we explore more vicious attacks that nullify the prompt injection defense methods, even the instruction hierarchy: backdoor-powered prompt injection attacks, where the attackers utilize the backdoor attack for prompt injection attack purposes. Specifically, the attackers poison the supervised fine-tuning samples and insert the backdoor into the model. Once the trigger is activated, the backdoored model executes the injected instruction surrounded by the trigger. We construct a benchmark for comprehensive evaluation. Our experiments demonstrate that backdoor-powered prompt injection attacks are more harmful than previous prompt injection attacks, nullifying existing prompt injection defense methods, even the instruction hierarchy techniques.

Paperid: 295, https://arxiv.org/pdf/2509.24418.pdf

Abstract:
As large language models (LLMs) are increasingly integrated into numerous applications across various domains, LLMs' safety becomes a critical concern for both application developers and intended users. Currently, great efforts have been made to develop safety benchmarks with fine-grained taxonomies. However, these benchmarks' taxonomies are disparate with different safety policies. Thus, existing safeguards trained on these benchmarks are either coarse-grained to only distinguish between safe and unsafe, or constrained by the narrow risk taxonomies of a single benchmark. To leverage these fine-grained safety taxonomies across multiple safety benchmarks, in this paper, we propose GSPR, a Generalizable Safety Policy Reasoner to identify unsafe input prompts and LLMs' outputs with violated safety taxonomies through Group Relative Policy Optimization (GRPO). Unlike prior safeguards which only cover a fixed set of risk factors, our GSPR incentivizes its reasoning capability with varied safety taxonomies through our careful cold-start strategy and reward design. Consequently, our GSPR can be trained across multiple safety benchmarks with distinct taxonomies and naturally exhibits powerful generalization ability. We conduct extensive experiments to show that our GSPR significantly improves existing safety guardrails' reasoning capabilities for both safety and category prediction tasks. Moreover, our GSPR not only demonstrates powerful safety generalization abilities but also achieves the least inference token costs with explanations.

Paperid: 296, https://arxiv.org/pdf/2509.04191.pdf

Abstract:
The widespread adoption of Kubernetes (K8s) for orchestrating cloud-native applications has introduced significant security challenges, such as misconfigured resources and overly permissive configurations. Failing to address these issues can result in unauthorized access, privilege escalation, and lateral movement within clusters. Most existing K8s security solutions focus on detecting misconfigurations, typically through static analysis or anomaly detection. In contrast, this paper presents KubeGuard, a novel runtime log-driven recommender framework aimed at mitigating risks by addressing overly permissive configurations. KubeGuard is designed to harden K8s environments through two complementary tasks: Resource Creation and Resource Refinement. It leverages large language models (LLMs) to analyze manifests and runtime logs reflecting actual system behavior, using modular prompt-chaining workflows. This approach enables KubeGuard to create least-privilege configurations for new resources and refine existing manifests to reduce the attack surface. KubeGuard's output manifests are presented as recommendations that users (e.g., developers and operators) can review and adopt to enhance cluster security. Our evaluation demonstrates that KubeGuard effectively generates and refines K8s manifests for Roles, NetworkPolicies, and Deployments, leveraging both proprietary and open-source LLMs. The high precision, recall, and F1-scores affirm KubeGuard's practicality as a framework that translates runtime observability into actionable, least-privilege configuration guidance.

Paperid: 297, https://arxiv.org/pdf/2508.17481.pdf

Abstract:
Humanoids are progressing toward practical deployment across healthcare, industrial, defense, and service sectors. While typically considered cyber-physical systems (CPSs), their dependence on traditional networked software stacks (e.g., Linux operating systems), robot operating system (ROS) middleware, and over-the-air update channels, creates a distinct security profile that exposes them to vulnerabilities conventional CPS models do not fully address. Prior studies have mainly examined specific threats, such as LiDAR spoofing or adversarial machine learning (AML). This narrow focus overlooks how an attack targeting one component can cascade harm throughout the robot's interconnected systems. We address this gap through a systematization of knowledge (SoK) that takes a comprehensive approach, consolidating fragmented research from robotics, CPS, and network security domains. We introduce a seven-layer security model for humanoid robots, organizing 39 known attacks and 35 defenses across the humanoid ecosystem-from hardware to human-robot interaction. Building on this security model, we develop a quantitative 39x35 attack-defense matrix with risk-weighted scoring, validated through Monte Carlo analysis. We demonstrate our method by evaluating three real-world robots: Pepper, G1 EDU, and Digit. The scoring analysis revealed varying security maturity levels, with scores ranging from 39.9% to 79.5% across the platforms. This work introduces a structured, evidence-based assessment method that enables systematic security evaluation, supports cross-platform benchmarking, and guides prioritization of security investments in humanoid robotics.

Paperid: 298, https://arxiv.org/pdf/2508.12412.pdf

Abstract:
The incorporation of large language models in multi-agent systems (MASs) has the potential to significantly improve our ability to autonomously solve complex problems. However, such systems introduce unique challenges in monitoring, interpreting, and detecting system failures. Most existing MAS observability frameworks focus on analyzing each individual agent separately, overlooking failures associated with the entire MAS. To bridge this gap, we propose LumiMAS, a novel MAS observability framework that incorporates advanced analytics and monitoring techniques. The proposed framework consists of three key components: a monitoring and logging layer, anomaly detection layer, and anomaly explanation layer. LumiMAS's first layer monitors MAS executions, creating detailed logs of the agents' activity. These logs serve as input to the anomaly detection layer, which detects anomalies across the MAS workflow in real time. Then, the anomaly explanation layer performs classification and root cause analysis (RCA) of the detected anomalies. LumiMAS was evaluated on seven different MAS applications, implemented using two popular MAS platforms, and a diverse set of possible failures. The applications include two novel failure-tailored applications that illustrate the effects of a hallucination or bias on the MAS. The evaluation results demonstrate LumiMAS's effectiveness in failure detection, classification, and RCA.

Paperid: 299, https://arxiv.org/pdf/2507.17259.pdf

Abstract:
Large language models (LLMs) are increasingly trained on tabular data, which, unlike unstructured text, often contains personally identifiable information (PII) in a highly structured and explicit format. As a result, privacy risks arise, since sensitive records can be inadvertently retained by the model and exposed through data extraction or membership inference attacks (MIAs). While existing MIA methods primarily target textual content, their efficacy and threat implications may differ when applied to structured data, due to its limited content, diverse data types, unique value distributions, and column-level semantics. In this paper, we present Tab-MIA, a benchmark dataset for evaluating MIAs on tabular data in LLMs and demonstrate how it can be used. Tab-MIA comprises five data collections, each represented in six different encoding formats. Using our Tab-MIA benchmark, we conduct the first evaluation of state-of-the-art MIA methods on LLMs finetuned with tabular data across multiple encoding formats. In the evaluation, we analyze the memorization behavior of pretrained LLMs on structured data derived from Wikipedia tables. Our findings show that LLMs memorize tabular data in ways that vary across encoding formats, making them susceptible to extraction via MIAs. Even when fine-tuned for as few as three epochs, models exhibit high vulnerability, with AUROC scores approaching 90% in most cases. Tab-MIA enables systematic evaluation of these risks and provides a foundation for developing privacy-preserving methods for tabular data in LLMs.

Paperid: 300, https://arxiv.org/pdf/2507.13686.pdf

Abstract:
Large language models (LLMs) have shown remarkable performance across a range of NLP tasks. However, their strong instruction-following capabilities and inability to distinguish instructions from data content make them vulnerable to indirect prompt injection attacks. In such attacks, instructions with malicious purposes are injected into external data sources, such as web documents. When LLMs retrieve this injected data through tools, such as a search engine and execute the injected instructions, they provide misled responses. Recent attack methods have demonstrated potential, but their abrupt instruction injection often undermines their effectiveness. Motivated by the limitations of existing attack methods, we propose TopicAttack, which prompts the LLM to generate a fabricated conversational transition prompt that gradually shifts the topic toward the injected instruction, making the injection smoother and enhancing the plausibility and success of the attack. Through comprehensive experiments, TopicAttack achieves state-of-the-art performance, with an attack success rate (ASR) over 90\% in most cases, even when various defense methods are applied. We further analyze its effectiveness by examining attention scores. We find that a higher injected-to-original attention ratio leads to a greater success probability, and our method achieves a much higher ratio than the baseline methods.

Paperid: 301, https://arxiv.org/pdf/2511.07947.pdf

Abstract:
Machine learning models constitute valuable intellectual property, yet remain vulnerable to model extraction attacks (MEA), where adversaries replicate their functionality through black-box queries. Model watermarking counters MEAs by embedding forensic markers for ownership verification. Current black-box watermarks prioritize MEA survival through representation entanglement, yet inadequately explore resilience against sequential MEAs and removal attacks. Our study reveals that this risk is underestimated because existing removal methods are weakened by entanglement. To address this gap, we propose Watermark Removal attacK (WRK), which circumvents entanglement constraints by exploiting decision boundaries shaped by prevailing sample-level watermark artifacts. WRK effectively reduces watermark success rates by at least 88.79% across existing watermarking benchmarks. For robust protection, we propose Class-Feature Watermarks (CFW), which improve resilience by leveraging class-level artifacts. CFW constructs a synthetic class using out-of-domain samples, eliminating vulnerable decision boundaries between original domain samples and their artifact-modified counterparts (watermark samples). CFW concurrently optimizes both MEA transferability and post-MEA stability. Experiments across multiple domains show that CFW consistently outperforms prior methods in resilience, maintaining a watermark success rate of at least 70.15% in extracted models even under the combined MEA and WRK distortion, while preserving the utility of protected models.

Paperid: 302, https://arxiv.org/pdf/2510.15965.pdf

Abstract:
Modern large reasoning models (LRMs) exhibit impressive multi-step problem-solving via chain-of-thought (CoT) reasoning. However, this iterative thinking mechanism introduces a new vulnerability surface. We present the Deadlock Attack, a resource exhaustion method that hijacks an LRM's generative control flow by training a malicious adversarial embedding to induce perpetual reasoning loops. Specifically, the optimized embedding encourages transitional tokens (e.g., "Wait", "But") after reasoning steps, preventing the model from concluding its answer. A key challenge we identify is the continuous-to-discrete projection gap: naïve projections of adversarial embeddings to token sequences nullify the attack. To overcome this, we introduce a backdoor implantation strategy, enabling reliable activation through specific trigger tokens. Our method achieves a 100% attack success rate across four advanced LRMs (Phi-RM, Nemotron-Nano, R1-Qwen, R1-Llama) and three math reasoning benchmarks, forcing models to generate up to their maximum token limits. The attack is also stealthy (in terms of causing negligible utility loss on benign user inputs) and remains robust against existing strategies trying to mitigate the overthinking issue. Our findings expose a critical and underexplored security vulnerability in LRMs from the perspective of reasoning (in)efficiency.

Paperid: 303, https://arxiv.org/pdf/2510.13451.pdf

Abstract:
Machine learning models are often vulnerable to inference attacks that expose sensitive information from their training data. Shadow model technique is commonly employed in such attacks, such as membership inference. However, the need for a large number of shadow models leads to high computational costs, limiting their practical applicability. Such inefficiency mainly stems from the independent training and use of these shadow models. To address this issue, we present a novel shadow pool training framework SHAPOOL, which constructs multiple shared models and trains them jointly within a single process. In particular, we leverage the Mixture-of-Experts mechanism as the shadow pool to interconnect individual models, enabling them to share some sub-networks and thereby improving efficiency. To ensure the shared models closely resemble independent models and serve as effective substitutes, we introduce three novel modules: path-choice routing, pathway regularization, and pathway alignment. These modules guarantee random data allocation for pathway learning, promote diversity among shared models, and maintain consistency with target models. We evaluate SHAPOOL in the context of various membership inference attacks and show that it significantly reduces the computational cost of shadow model construction while maintaining comparable attack performance.

Paperid: 304, https://arxiv.org/pdf/2510.11688.pdf

Abstract:
The increasing autonomy of Large Language Models (LLMs) necessitates a rigorous evaluation of their potential to aid in cyber offense. Existing benchmarks often lack real-world complexity and are thus unable to accurately assess LLMs' cybersecurity capabilities. To address this gap, we introduce PACEbench, a practical AI cyber-exploitation benchmark built on the principles of realistic vulnerability difficulty, environmental complexity, and cyber defenses. Specifically, PACEbench comprises four scenarios spanning single, blended, chained, and defense vulnerability exploitations. To handle these complex challenges, we propose PACEagent, a novel agent that emulates human penetration testers by supporting multi-phase reconnaissance, analysis, and exploitation. Extensive experiments with seven frontier LLMs demonstrate that current models struggle with complex cyber scenarios, and none can bypass defenses. These findings suggest that current models do not yet pose a generalized cyber offense threat. Nonetheless, our work provides a robust benchmark to guide the trustworthy development of future models.

Paperid: 305, https://arxiv.org/pdf/2510.08211.pdf

Abstract:
Previous research has shown that LLMs finetuned on malicious or incorrect completions within narrow domains (e.g., insecure code or incorrect medical advice) can become broadly misaligned to exhibit harmful behaviors, which is called emergent misalignment. In this work, we investigate whether this phenomenon can extend beyond safety behaviors to a broader spectrum of dishonesty and deception under high-stakes scenarios (e.g., lying under pressure and deceptive behavior). To explore this, we finetune open-sourced LLMs on misaligned completions across diverse domains. Experimental results demonstrate that LLMs show broadly misaligned behavior in dishonesty. Additionally, we further explore this phenomenon in a downstream combined finetuning setting, and find that introducing as little as 1% of misalignment data into a standard downstream task is sufficient to decrease honest behavior over 20%. Furthermore, we consider a more practical human-AI interaction environment where we simulate both benign and biased users to interact with the assistant LLM. Notably, we find that the assistant can be misaligned unintentionally to exacerbate its dishonesty with only 10% biased user population. In summary, we extend the study of emergent misalignment to the domain of dishonesty and deception under high-stakes scenarios, and demonstrate that this risk arises not only through direct finetuning, but also in downstream mixture tasks and practical human-AI interactions.

Paperid: 306, https://arxiv.org/pdf/2510.02999.pdf

Abstract:
Existing gradient-based jailbreak attacks on Large Language Models (LLMs), such as Greedy Coordinate Gradient (GCG) and COLD-Attack, typically optimize adversarial suffixes to align the LLM output with a predefined target response. However, by restricting the optimization objective as inducing a predefined target, these methods inherently constrain the adversarial search space, which limit their overall attack efficacy. Furthermore, existing methods typically require a large number of optimization iterations to fulfill the large gap between the fixed target and the original model response, resulting in low attack efficiency. To overcome the limitations of targeted jailbreak attacks, we propose the first gradient-based untargeted jailbreak attack (UJA), aiming to elicit an unsafe response without enforcing any predefined patterns. Specifically, we formulate an untargeted attack objective to maximize the unsafety probability of the LLM response, which can be quantified using a judge model. Since the objective is non-differentiable, we further decompose it into two differentiable sub-objectives for optimizing an optimal harmful response and the corresponding adversarial prompt, with a theoretical analysis to validate the decomposition. In contrast to targeted jailbreak attacks, UJA's unrestricted objective significantly expands the search space, enabling a more flexible and efficient exploration of LLM vulnerabilities.Extensive evaluations demonstrate that \textsc{UJA} can achieve over 80\% attack success rates against recent safety-aligned LLMs with only 100 optimization iterations, outperforming the state-of-the-art gradient-based attacks such as I-GCG and COLD-Attack by over 20\%.

Paperid: 307, https://arxiv.org/pdf/2510.02422.pdf

Abstract:
Existing gradient-based jailbreak attacks typically optimize an adversarial suffix to induce a fixed affirmative response. However, this fixed target usually resides in an extremely low-density region of a safety-aligned LLM's output distribution conditioned on diverse harmful inputs. Due to the substantial discrepancy between the target and the original output, existing attacks require numerous iterations to optimize the adversarial prompt, which might still fail to induce the low-probability target response from the target LLM. In this paper, we propose Dynamic Target Attack (DTA), a new jailbreaking framework relying on the target LLM's own responses as targets to optimize the adversarial prompts. In each optimization round, DTA iteratively samples multiple candidate responses directly from the output distribution conditioned on the current prompt, and selects the most harmful response as a temporary target for prompt optimization. In contrast to existing attacks, DTA significantly reduces the discrepancy between the target and the output distribution, substantially easing the optimization process to search for an effective adversarial prompt. Extensive experiments demonstrate the superior effectiveness and efficiency of DTA: under the white-box setting, DTA only needs 200 optimization iterations to achieve an average attack success rate (ASR) of over 87\% on recent safety-aligned LLMs, exceeding the state-of-the-art baselines by over 15\%. The time cost of DTA is 2-26 times less than existing baselines. Under the black-box setting, DTA uses Llama-3-8B-Instruct as a surrogate model for target sampling and achieves an ASR of 85\% against the black-box target model Llama-3-70B-Instruct, exceeding its counterparts by over 25\%.

Paperid: 308, https://arxiv.org/pdf/2509.18578.pdf

Abstract:
Information leakage issues in machine learning-based Web applications have attracted increasing attention. While the risk of data privacy leakage has been rigorously analyzed, the theory of model function leakage, known as Model Extraction Attacks (MEAs), has not been well studied. In this paper, we are the first to understand MEAs theoretically from an attack-agnostic perspective and to propose analytical metrics for evaluating model extraction risks. By using the Neural Tangent Kernel (NTK) theory, we formulate the linearized MEA as a regularized kernel classification problem and then derive the fidelity gap and generalization error bounds of the attack performance. Based on these theoretical analyses, we propose a new theoretical metric called Model Recovery Complexity (MRC), which measures the distance of weight changes between the victim and surrogate models to quantify risk. Additionally, we find that victim model accuracy, which shows a strong positive correlation with model extraction risk, can serve as an empirical metric. By integrating these two metrics, we propose a framework, namely Model Extraction Risk Inspector (MER-Inspector), to compare the extraction risks of models under different model architectures by utilizing relative metric values. We conduct extensive experiments on 16 model architectures and 5 datasets. The experimental results demonstrate that the proposed metrics have a high correlation with model extraction risks, and MER-Inspector can accurately compare the extraction risks of any two models with up to 89.58%.

Paperid: 309, https://arxiv.org/pdf/2508.13048.pdf

Abstract:
Large Language Models (LLMs) have exhibited remarkable capabilities but remain vulnerable to jailbreaking attacks, which can elicit harmful content from the models by manipulating the input prompts. Existing black-box jailbreaking techniques primarily rely on static prompts crafted with a single, non-adaptive strategy, or employ rigid combinations of several underperforming attack methods, which limits their adaptability and generalization. To address these limitations, we propose MAJIC, a Markovian adaptive jailbreaking framework that attacks black-box LLMs by iteratively combining diverse innovative disguise strategies. MAJIC first establishes a ``Disguise Strategy Pool'' by refining existing strategies and introducing several innovative approaches. To further improve the attack performance and efficiency, MAJIC formulate the sequential selection and fusion of strategies in the pool as a Markov chain. Under this formulation, MAJIC initializes and employs a Markov matrix to guide the strategy composition, where transition probabilities between strategies are dynamically adapted based on attack outcomes, thereby enabling MAJIC to learn and discover effective attack pathways tailored to the target model. Our empirical results demonstrate that MAJIC significantly outperforms existing jailbreak methods on prominent models such as GPT-4o and Gemini-2.0-flash, achieving over 90\% attack success rate with fewer than 15 queries per attempt on average.

Paperid: 310, https://arxiv.org/pdf/2508.10065.pdf

Abstract:
With the increasing demand for the right to be forgotten, machine unlearning (MU) has emerged as a vital tool for enhancing trust and regulatory compliance by enabling the removal of sensitive data influences from machine learning (ML) models. However, most MU algorithms primarily rely on in-training methods to adjust model weights, with limited exploration of the benefits that data-level adjustments could bring to the unlearning process. To address this gap, we propose a novel approach that leverages digital watermarking to facilitate MU by strategically modifying data content. By integrating watermarking, we establish a controlled unlearning mechanism that enables precise removal of specified data while maintaining model utility for unrelated tasks. We first examine the impact of watermarked data on MU, finding that MU effectively generalizes to watermarked data. Building on this, we introduce an unlearning-friendly watermarking framework, termed Water4MU, to enhance unlearning effectiveness. The core of Water4MU is a bi-level optimization (BLO) framework: at the upper level, the watermarking network is optimized to minimize unlearning difficulty, while at the lower level, the model itself is trained independently of watermarking. Experimental results demonstrate that Water4MU is effective in MU across both image classification and image generation tasks. Notably, it outperforms existing methods in challenging MU scenarios, known as "challenging forgets".

Paperid: 311, https://arxiv.org/pdf/2508.06837.pdf

Abstract:
Text-to-Image (T2I) models, represented by DALL$\cdot$E and Midjourney, have gained huge popularity for creating realistic images. The quality of these images relies on the carefully engineered prompts, which have become valuable intellectual property. While skilled prompters showcase their AI-generated art on markets to attract buyers, this business incidentally exposes them to \textit{prompt stealing attacks}. Existing state-of-the-art attack techniques reconstruct the prompts from a fixed set of modifiers (i.e., style descriptions) with model-specific training, which exhibit restricted adaptability and effectiveness to diverse showcases (i.e., target images) and diffusion models. To alleviate these limitations, we propose Prometheus, a training-free, proxy-in-the-loop, search-based prompt-stealing attack, which reverse-engineers the valuable prompts of the showcases by interacting with a local proxy model. It consists of three innovative designs. First, we introduce dynamic modifiers, as a supplement to static modifiers used in prior works. These dynamic modifiers provide more details specific to the showcases, and we exploit NLP analysis to generate them on the fly. Second, we design a contextual matching algorithm to sort both dynamic and static modifiers. This offline process helps reduce the search space of the subsequent step. Third, we interact with a local proxy model to invert the prompts with a greedy search algorithm. Based on the feedback guidance, we refine the prompt to achieve higher fidelity. The evaluation results show that Prometheus successfully extracts prompts from popular platforms like PromptBase and AIFrog against diverse victim models, including Midjourney, Leonardo.ai, and DALL$\cdot$E, with an ASR improvement of 25.0\%. We also validate that Prometheus is resistant to extensive potential defenses, further highlighting its severity in practice.

Paperid: 312, https://arxiv.org/pdf/2508.03151.pdf

Abstract:
IoT environments such as smart homes are susceptible to privacy inference attacks, where attackers can analyze patterns of encrypted network traffic to infer the state of devices and even the activities of people. While most existing attacks exploit ML techniques for discovering such traffic patterns, they underperform on wireless traffic, especially Wi-Fi, due to its heavy noise and packet losses of wireless sniffing. In addition, these approaches commonly target at distinguishing chunked IoT event traffic samples, and they failed at effectively tracking multiple events simultaneously. In this work, we propose WiFinger, a fine-grained multi-IoT event fingerprinting approach against noisy traffic. WiFinger turns the traffic pattern classification task into a subsequence matching problem and introduces novel techniques to account for the high time complexity while maintaining high accuracy. Experiments demonstrate that our method outperforms existing approaches on Wi-Fi traffic, achieving an average recall of 85% (vs. 0.49% and 0.46%) for various IoT events while maintaining almost zero false positives for most of them.

Paperid: 313, https://arxiv.org/pdf/2508.02476.pdf

Abstract:
Pose-guided video generation has become a powerful tool in creative industries, exemplified by frameworks like Animate Anyone. However, conditioning generation on specific poses introduces serious risks, such as impersonation, privacy violations, and NSFW content creation. To address these challenges, we propose $\textbf{PoseGuard}$, a safety alignment framework for pose-guided generation. PoseGuard is designed to suppress unsafe generations by degrading output quality when encountering malicious poses, while maintaining high-fidelity outputs for benign inputs. We categorize unsafe poses into three representative types: discriminatory gestures such as kneeling or offensive salutes, sexually suggestive poses that lead to NSFW content, and poses imitating copyrighted celebrity movements. PoseGuard employs a dual-objective training strategy combining generation fidelity with safety alignment, and uses LoRA-based fine-tuning for efficient, parameter-light updates. To ensure adaptability to evolving threats, PoseGuard supports pose-specific LoRA fusion, enabling flexible and modular updates when new unsafe poses are identified. We further demonstrate the generalizability of PoseGuard to facial landmark-guided generation. Extensive experiments validate that PoseGuard effectively blocks unsafe generations, maintains generation quality for benign inputs, and remains robust against slight pose variations.

Paperid: 314, https://arxiv.org/pdf/2507.03034.pdf

Abstract:
The (generative) artificial intelligence (AI) era has profoundly reshaped the meaning and value of data. No longer confined to static content, data now permeates every stage of the AI lifecycle from the training samples that shape model parameters to the prompts and outputs that drive real-world model deployment. This shift renders traditional notions of data protection insufficient, while the boundaries of what needs safeguarding remain poorly defined. Failing to safeguard data in AI systems can inflict societal and individual, underscoring the urgent need to clearly delineate the scope of and rigorously enforce data protection. In this perspective, we propose a four-level taxonomy, including non-usability, privacy preservation, traceability, and deletability, that captures the diverse protection needs arising in modern (generative) AI models and systems. Our framework offers a structured understanding of the trade-offs between data utility and control, spanning the entire AI pipeline, including training datasets, model weights, system prompts, and AI-generated content. We analyze representative technical approaches at each level and reveal regulatory blind spots that leave critical assets exposed. By offering a structured lens to align future AI technologies and governance with trustworthy data practices, we underscore the urgency of rethinking data protection for modern AI techniques and provide timely guidance for developers, researchers, and regulators alike.

Paperid: 315, https://arxiv.org/pdf/2511.07772.pdf

Abstract:
As Large Language Models (LLMs) evolve into personal assistants with access to sensitive user data, they face a critical privacy challenge: while prior work has addressed output-level privacy, recent findings reveal that LLMs often leak private information through their internal reasoning processes, violating contextual privacy expectations. These leaky thoughts occur when models inadvertently expose sensitive details in their reasoning traces, even when final outputs appear safe. The challenge lies in preventing such leakage without compromising the model's reasoning capabilities, requiring a delicate balance between privacy and utility. We introduce Steering Activations towards Leakage-free Thinking (SALT), a lightweight test-time intervention that mitigates privacy leakage in model's Chain of Thought (CoT) by injecting targeted steering vectors into hidden state. We identify the high-leakage layers responsible for this behavior. Through experiments across multiple LLMs, we demonstrate that SALT achieves reductions including $18.2\%$ reduction in CPL on QwQ-32B, $17.9\%$ reduction in CPL on Llama-3.1-8B, and $31.2\%$ reduction in CPL on Deepseek in contextual privacy leakage dataset AirGapAgent-R while maintaining comparable task performance and utility. Our work establishes SALT as a practical approach for test-time privacy protection in reasoning-capable language models, offering a path toward safer deployment of LLM-based personal agents.

Paperid: 316, https://arxiv.org/pdf/2512.21236.pdf

Abstract:
Large language models (LLMs) have revolutionized software development through AI-assisted coding tools, enabling developers with limited programming expertise to create sophisticated applications. However, this accessibility extends to malicious actors who may exploit these powerful tools to generate harmful software. Existing jailbreaking research primarily focuses on general attack scenarios against LLMs, with limited exploration of malicious code generation as a jailbreak target. To address this gap, we propose SPELL, a comprehensive testing framework specifically designed to evaluate the weakness of security alignment in malicious code generation. Our framework employs a time-division selection strategy that systematically constructs jailbreaking prompts by intelligently combining sentences from a prior knowledge dataset, balancing exploration of novel attack patterns with exploitation of successful techniques. Extensive evaluation across three advanced code models (GPT-4.1, Claude-3.5, and Qwen2.5-Coder) demonstrates SPELL's effectiveness, achieving attack success rates of 83.75%, 19.38%, and 68.12% respectively across eight malicious code categories. The generated prompts successfully produce malicious code in real-world AI development tools such as Cursor, with outputs confirmed as malicious by state-of-the-art detection systems at rates exceeding 73%. These findings reveal significant security gaps in current LLM implementations and provide valuable insights for improving AI safety alignment in code generation applications.

Paperid: 317, https://arxiv.org/pdf/2510.15001.pdf

Authors:Amer Sinha, Thomas Mesnard, Ryan McKenna, Daogao Liu, Christopher A. Choquette-Choo, Yangsibo Huang, Da Yu, George Kaissis, Zachary Charles, Ruibo Liu, Lynn Chua, Pritish Kamath, Pasin Manurangsi, Steve He, Chiyuan Zhang, Badih Ghazi, Borja De Balle Pigem, Prem Eruvbetine, Tris Warkentin, Armand Joulin, Ravi KumarAmer Sinha, Thomas Mesnard, Ryan McKenna, Daogao Liu, Christopher A. Choquette-Choo, Yangsibo Huang, Da Yu, George Kaissis, Zachary Charles, Ruibo Liu, Lynn Chua, Pritish Kamath, Pasin Manurangsi, Steve He, Chiyuan Zhang, Badih Ghazi, Borja De Balle Pigem, Prem Eruvbetine, Tris Warkentin, Armand Joulin, Ravi Kumar

Abstract:
The conventional approach in differential privacy (DP) literature formulates the privacy-utility trade-off with a "privacy-first" perspective: for a predetermined level of privacy, a certain utility is achievable. However, practitioners often operate under a "utility-first" paradigm, prioritizing a desired level of utility and then determining the corresponding privacy cost. Wu et al. [2019] initiated a formal study of this "utility-first" perspective by introducing ex-post DP. They demonstrated that by adding correlated Laplace noise and progressively reducing it on demand, a sequence of increasingly accurate estimates of a private parameter can be generated, with the privacy cost attributed only to the least noisy iterate released. This led to a Laplace mechanism variant that achieves a specified utility with minimal privacy loss. However, their work, and similar findings by Whitehouse et al. [2022], are primarily limited to simple mechanisms based on Laplace or Gaussian noise. In this paper, we significantly generalize these results. In particular, we extend the work of Wu et al. [2019] and Liu and Talwar [2019] to support any sequence of private estimators, incurring at most a doubling of the original privacy budget. Furthermore, we demonstrate that hyperparameter tuning for these estimators, including the selection of an optimal privacy budget, can be performed without additional privacy cost. Finally, we extend our results to ex-post Renyi DP, further broadening the applicability of utility-first privacy mechanisms.

Paperid: 319, https://arxiv.org/pdf/2507.14987.pdf

Abstract:
Large language models (LLMs), despite possessing latent safety understanding from their vast pretraining data, remain vulnerable to generating harmful content and exhibit issues such as over-refusal and utility degradation after safety alignment. Current safety alignment methods often result in superficial refusal shortcuts or rely on intensive supervision for reasoning-based approaches, failing to fully leverage the model's intrinsic safety self-awareness. We propose \textbf{AlphaAlign}, a simple yet effective pure reinforcement learning (RL) framework with verifiable safety reward designed to incentivize this latent safety awareness through proactive safety reasoning.} AlphaAlign employs a dual-reward system: a verifiable safety reward encourages correctly formatted and explicitly justified refusals for harmful queries while penalizing over-refusals, and a normalized helpfulness reward guides high-quality responses to benign inputs. This allows the model to develop proactive safety reasoning capabilities without depending on supervised safety-specific reasoning data. AlphaAlign demonstrates three key advantages: (1) Simplicity and efficiency, requiring only binary prompt safety labels and minimal RL steps for substantial improvements. (2) Breaking the safety-utility trade-off, by enhancing refusal of harmful content and reducing over-refusals, while simultaneously maintaining or even improving general task performance and robustness to unseen jailbreaks. (3) Deep alignment, fostering proactive safety reasoning that generates explicit safety rationales rather than relying on shallow refusal patterns.

Paperid: 320, https://arxiv.org/pdf/2509.18413.pdf

Abstract:
Voice anonymization aims to conceal speaker identity and attributes while preserving intelligibility, but current evaluations rely almost exclusively on Equal Error Rate (EER) that obscures whether adversaries can mount high-precision attacks. We argue that privacy should instead be evaluated in the low false-positive rate (FPR) regime, where even a small number of successful identifications constitutes a meaningful breach. To this end, we introduce VoxGuard, a framework grounded in differential privacy and membership inference that formalizes two complementary notions: User Privacy, preventing speaker re-identification, and Attribute Privacy, protecting sensitive traits such as gender and accent. Across synthetic and real datasets, we find that informed adversaries, especially those using fine-tuned models and max-similarity scoring, achieve orders-of-magnitude stronger attacks at low-FPR despite similar EER. For attributes, we show that simple transparent attacks recover gender and accent with near-perfect accuracy even after anonymization. Our results demonstrate that EER substantially underestimates leakage, highlighting the need for low-FPR evaluation, and recommend VoxGuard as a benchmark for evaluating privacy leakage.

Paperid: 321, https://arxiv.org/pdf/2507.04227.pdf

Abstract:
Mobile GUI agents are designed to autonomously execute diverse device-control tasks by interpreting and interacting with mobile screens. Despite notable advancements, their resilience in real-world scenarios where screen content may be partially manipulated by untrustworthy third parties remains largely unexplored. Owing to their black-box and autonomous nature, these agents are vulnerable to manipulations that could compromise user devices. In this work, we present the first systematic investigation into the vulnerabilities of mobile GUI agents. We introduce a scalable attack simulation framework AgentHazard, which enables flexible and targeted modifications of screen content within existing applications. Leveraging this framework, we develop a comprehensive benchmark suite comprising both a dynamic task execution environment and a static dataset of vision-language-action tuples, totaling over 3,000 attack scenarios. The dynamic environment encompasses 58 reproducible tasks in an emulator with various types of hazardous UI content, while the static dataset is constructed from 210 screenshots collected from 14 popular commercial apps. Importantly, our content modifications are designed to be feasible for unprivileged third parties. We evaluate 7 widely-used mobile GUI agents and 5 common backbone models using our benchmark. Our findings reveal that all examined agents are significantly influenced by misleading third-party content (with an average misleading rate of 28.8% in human-crafted attack scenarios) and that their vulnerabilities are closely linked to the employed perception modalities and backbone LLMs. Furthermore, we assess training-based mitigation strategies, highlighting both the challenges and opportunities for enhancing the robustness of mobile GUI agents. Our code and data will be released at https://agenthazard.github.io.

Paperid: 322, https://arxiv.org/pdf/2512.06387.pdf

Abstract:
Embodied AI systems integrate language models with real world sensing, mobility, and cloud connected mobile apps. Yet while model jailbreaks have drawn significant attention, the broader system stack of embodied intelligence remains largely unexplored. In this work, we conduct the first holistic security analysis of the Unitree Go2 platform and uncover ten cross layer vulnerabilities the "Ten Sins of Embodied AI Security." Using BLE sniffing, traffic interception, APK reverse engineering, cloud API testing, and hardware probing, we identify systemic weaknesses across three architectural layers: wireless provisioning, core modules, and external interfaces. These include hard coded keys, predictable handshake tokens, WiFi credential leakage, missing TLS validation, static SSH password, multilingual safety bypass behavior, insecure local relay channels, weak binding logic, and unrestricted firmware access. Together, they allow adversaries to hijack devices, inject arbitrary commands, extract sensitive information, or gain full physical control.Our findings show that securing embodied AI requires far more than aligning the model itself. We conclude with system level lessons learned and recommendations for building embodied platforms that remain robust across their entire software hardware ecosystem.

Paperid: 323, https://arxiv.org/pdf/2512.03791.pdf

Abstract:
Cross-chain technology enables interoperability among otherwise isolated blockchains, supporting interactions across heterogeneous networks. Similar to how multi-hop communication became fundamental in the evolution of the Internet, the demand for multi-hop cross-chain interactions is gaining increasing attention. However, this growing demand introduces new security and privacy challenges. On the security side, multi-hop interactions depend on the availability of multiple participating nodes. If any node becomes temporarily offline during execution, the protocol may fail to complete correctly, leading to settlement failure or fund loss. On the privacy side, the need for on-chain transparency to validate intermediate states may unintentionally leak linkable information, compromising the unlinkability of user interactions. In this paper, we propose the Cross-Chain Channel Network (CCN), a decentralized network designed to support secure and privacy-preserving multi-hop cross-chain transactions. Through experimental evaluation, we identify two critical types of offline failures, referred to as active and passive offline cases, which have not been adequately addressed by existing solutions. To mitigate these issues, we introduce R-HTLC, a core protocol within CCN. R-HTLC incorporates an hourglass mechanism and a multi-path refund strategy to ensure settlement correctness even when some nodes go offline during execution. Importantly, CCN addresses not only the correctness under offline conditions but also maintains unlinkability in such adversarial settings. To overcome this, CCN leverages zero-knowledge proofs and off-chain coordination, ensuring that interaction relationships remain indistinguishable even when certain nodes are temporarily offline.

Paperid: 324, https://arxiv.org/pdf/2512.03775.pdf

Abstract:
The Model Context Protocol (MCP) is rapidly emerging as the middleware for LLM-based applications, offering a standardized interface for tool integration. However, its built-in security mechanisms are minimal: while schemas and declarations prevent malformed requests, MCP provides no guarantees of authenticity or confidentiality, forcing developers to implement cryptography themselves. Such ad hoc practices are historically prone to misuse, and within MCP they threaten sensitive data and services. We present MICRYSCOPE, the first domain-specific framework for detecting cryptographic misuses in MCP implementations. MICRYSCOPE combines three key innovations: a cross-language intermediate representation that normalizes cryptographic APIs across diverse ecosystems, a hybrid dependency analysis that uncovers explicit and implicit function relationships (including insecure runtime compositions orchestrated by LLMs) and a taint-based misuse detector that tracks sensitive data flows and flags violations of established cryptographic rules. Applying MICRYSCOPE to 9,403 MCP servers, we identified 720 with cryptographic logic, of which 19.7% exhibited misuses. These flaws are concentrated in certain markets (e.g., Smithery Registry with 42% insecure servers), languages (Python at 34% misuse rate), and categories (Developer Tools and Data Science & ML accounting for over 50% of all misuses). Case studies reveal real-world consequences, including leaked API keys, insecure DES/ECB tools, and MD5-based authentication bypasses. Our study establishes the first ecosystem-wide view of cryptographic misuse in MCP and provides both tools and insights to strengthen the security foundations of this rapidly growing protocol.

Paperid: 325, https://arxiv.org/pdf/2511.07242.pdf

Abstract:
Mobile motion sensors such as accelerometers and gyroscopes are now ubiquitously accessible by third-party apps via standard APIs. While enabling rich functionalities like activity recognition and step counting, this openness has also enabled unregulated inference of sensitive user traits, such as gender, age, and even identity, without user consent. Existing privacy-preserving techniques, such as GAN-based obfuscation or differential privacy, typically require access to the full input sequence, introducing latency that is incompatible with real-time scenarios. Worse, they tend to distort temporal and semantic patterns, degrading the utility of the data for benign tasks like activity recognition. To address these limitations, we propose the Predictive Adversarial Transformation Network (PATN), a real-time privacy-preserving framework that leverages historical signals to generate adversarial perturbations proactively. The perturbations are applied immediately upon data acquisition, enabling continuous protection without disrupting application functionality. Experiments on two datasets demonstrate that PATN substantially degrades the performance of privacy inference models, achieving Attack Success Rate (ASR) of 40.11% and 44.65% (reducing inference accuracy to near-random) and increasing the Equal Error Rate (EER) from 8.30% and 7.56% to 41.65% and 46.22%. On ASR, PATN outperforms baseline methods by 16.16% and 31.96%, respectively.

Paperid: 326, https://arxiv.org/pdf/2510.24141.pdf

Abstract:
Mini-programs, an emerging mobile application paradigm within super-apps, offer a seamless and installation-free experience. However, the adoption of the web-view component has disrupted their isolation mechanisms, exposing new attack surfaces and vulnerabilities. In this paper, we introduce a novel vulnerability called Cross Mini-program Cookie Sharing (CMCS), which arises from the shared web-view environment across mini-programs. This vulnerability allows unauthorized data exchange across mini-programs by enabling one mini-program to access cookies set by another within the same web-view context, violating isolation principles. As a preliminary step, we analyzed the web-view mechanisms of four major platforms, including WeChat, AliPay, TikTok, and Baidu, and found that all of them are affected by CMCS vulnerabilities. Furthermore, we demonstrate the collusion attack enabled by CMCS, where privileged mini-programs exfiltrate sensitive user data via cookies accessible to unprivileged mini-programs. To measure the impact of collusion attacks enabled by CMCS vulnerabilities in the wild, we developed MiCoScan, a static analysis tool that detects mini-programs affected by CMCS vulnerabilities. MiCoScan employs web-view context modeling to identify clusters of mini-programs sharing the same web-view domain and cross-webview data flow analysis to detect sensitive data transmissions to/from web-views. Using MiCoScan, we conducted a large-scale analysis of 351,483 mini-programs, identifying 45,448 clusters sharing web-view domains, 7,965 instances of privileged data transmission, and 9,877 mini-programs vulnerable to collusion attacks. Our findings highlight the widespread prevalence and significant security risks posed by CMCS vulnerabilities, underscoring the urgent need for improved isolation mechanisms in mini-program ecosystems.

Paperid: 327, https://arxiv.org/pdf/2510.14700.pdf

Abstract:
Large language model (LLM) agents have demonstrated remarkable capabilities in software engineering and cybersecurity tasks, including code generation, vulnerability discovery, and automated testing. One critical but underexplored application is automated web vulnerability reproduction, which transforms vulnerability reports into working exploits. Although recent advances suggest promising potential, challenges remain in applying LLM agents to real-world web vulnerability reproduction scenarios. In this paper, we present the first comprehensive evaluation of state-of-the-art LLM agents for automated web vulnerability reproduction. We systematically assess 20 agents from software engineering, cybersecurity, and general domains across 16 dimensions, including technical capabilities, environment adaptability, and user experience factors, on 3 representative web vulnerabilities. Based on the results, we select three top-performing agents (OpenHands, SWE-agent, and CAI) for in-depth evaluation on our benchmark dataset of 80 real-world CVEs spanning 7 vulnerability types and 6 web technologies. Our results reveal that while LLM agents achieve reasonable success on simple library-based vulnerabilities, they consistently fail on complex service-based vulnerabilities requiring multi-component environments. Complex environment configurations and authentication barriers create a gap where agents can execute exploit code but fail to trigger actual vulnerabilities. We observe high sensitivity to input guidance, with performance degrading by over 33% under incomplete authentication information. Our findings highlight the significant gap between current LLM agent capabilities and the demands of reliable automated vulnerability reproduction, emphasizing the need for advances in environmental adaptation and autonomous problem-solving capabilities.

Paperid: 328, https://arxiv.org/pdf/2510.01002.pdf

Abstract:
Current learning-based Automated Vulnerability Repair (AVR) approaches, while promising, often fail to generalize effectively in real-world scenarios. Our diagnostic analysis reveals three fundamental weaknesses in state-of-the-art AVR approaches: (1) limited cross-repository generalization, with performance drops on unseen codebases; (2) inability to capture long-range dependencies, causing a performance degradation on complex, multi-hunk repairs; and (3) over-reliance on superficial lexical patterns, leading to significant performance drops on vulnerabilities with minor syntactic variations like variable renaming. To address these limitations, we propose SeCuRepair, a semantics-aligned, curriculum-driven, and reasoning-enhanced framework for vulnerability repair. At its core, SeCuRepair adopts a reason-then-edit paradigm, requiring the model to articulate why and how a vulnerability should be fixed before generating the patch. This explicit reasoning enforces a genuine understanding of repair logic rather than superficial memorization of lexical patterns. SeCuRepair also moves beyond traditional supervised fine-tuning and employs semantics-aware reinforcement learning, rewarding patches for their syntactic and semantic alignment with the oracle patch rather than mere token overlap. Complementing this, a difficulty-aware curriculum progressively trains the model, starting with simple fixes and advancing to complex, multi-hunk coordinated edits. We evaluate SeCuRepair on strict, repository-level splits of BigVul and newly crafted PrimeVul_AVR datasets. SeCuRepair significantly outperforms all baselines, surpassing the best-performing baselines by 34.52% on BigVul and 31.52% on PrimeVul\textsubscript{AVR} in terms of CodeBLEU, respectively. Comprehensive ablation studies further confirm that each component of our framework contributes to its final performance.

Paperid: 329, https://arxiv.org/pdf/2509.22097.pdf

Abstract:
Large language model (LLM) powered code agents are rapidly transforming software engineering by automating tasks such as testing, debugging, and repairing, yet the security risks of their generated code have become a critical concern. Existing benchmarks have offered valuable insights but remain insufficient: they often overlook the genuine context in which vulnerabilities were introduced or adopt narrow evaluation protocols that fail to capture either functional correctness or newly introduced vulnerabilities. We therefore introduce SecureAgentBench, a benchmark of 105 coding tasks designed to rigorously evaluate code agents' capabilities in secure code generation. Each task includes (i) realistic task settings that require multi-file edits in large repositories, (ii) aligned contexts based on real-world open-source vulnerabilities with precisely identified introduction points, and (iii) comprehensive evaluation that combines functionality testing, vulnerability checking through proof-of-concept exploits, and detection of newly introduced vulnerabilities using static analysis. We evaluate three representative agents (SWE-agent, OpenHands, and Aider) with three state-of-the-art LLMs (Claude 3.7 Sonnet, GPT-4.1, and DeepSeek-V3.1). Results show that (i) current agents struggle to produce secure code, as even the best-performing one, SWE-agent supported by DeepSeek-V3.1, achieves merely 15.2% correct-and-secure solutions, (ii) some agents produce functionally correct code but still introduce vulnerabilities, including new ones not previously recorded, and (iii) adding explicit security instructions for agents does not significantly improve secure coding, underscoring the need for further research. These findings establish SecureAgentBench as a rigorous benchmark for secure code generation and a step toward more reliable software development with LLMs.

Paperid: 330, https://arxiv.org/pdf/2509.22040.pdf

Abstract:
Agentic AI coding editors driven by large language models have recently become more popular due to their ability to improve developer productivity during software development. Modern editors such as Cursor are designed not just for code completion, but also with more system privileges for complex coding tasks (e.g., run commands in the terminal, access development environments, and interact with external systems). While this brings us closer to the "fully automated programming" dream, it also raises new security concerns. In this study, we present the first empirical analysis of prompt injection attacks targeting these high-privilege agentic AI coding editors. We show how attackers can remotely exploit these systems by poisoning external development resources with malicious instructions, effectively hijacking AI agents to run malicious commands, turning "your AI" into "attacker's shell". To perform this analysis, we implement AIShellJack, an automated testing framework for assessing prompt injection vulnerabilities in agentic AI coding editors. AIShellJack contains 314 unique attack payloads that cover 70 techniques from the MITRE ATT&CK framework. Using AIShellJack, we conduct a large-scale evaluation on GitHub Copilot and Cursor, and our evaluation results show that attack success rates can reach as high as 84% for executing malicious commands. Moreover, these attacks are proven effective across a wide range of objectives, ranging from initial access and system discovery to credential theft and data exfiltration.

Paperid: 331, https://arxiv.org/pdf/2509.12957.pdf

Abstract:
Real-World Assets (RWAs) have recently attracted increasing attention as a means of bridging traditional financial instruments with decentralized infrastructures. By representing assets such as bonds, commodities, and real estate on blockchains, RWAs can enhance liquidity, broaden accessibility, and extend the scope of decentralized finance. Industry forecasts further suggest rapid growth of tokenized RWAs in the coming years, underscoring their potential role in the evolution of digital financial markets. However, when deployed across multiple blockchains, RWAs face challenges such as repeated authentication on different chains and inefficiency caused by multi-step settlement protocols. To address these issues, we present a cross-chain framework for RWAs that emphasizes identity management, authentication, and interaction. The framework integrates Decentralized Identifiers and Verifiable Credentials with customized attributes to support decentralized identification, and incorporates an authentication protocol based on Simplified Payment Verification to avoid redundant verification across chains. Furthermore, we design a cross-chain channel that enables the settlement of RWAs without requiring channel closure, thereby improving operational efficiency. We implement the framework and evaluate it through simulations, which confirm its feasibility and demonstrate improvements in efficiency for RWAs in cross-chain settings.

Paperid: 332, https://arxiv.org/pdf/2509.09291.pdf

Abstract:
The application layer of Bluetooth Low Energy (BLE) is a growing source of security vulnerabilities, as developers often neglect to implement critical protections such as encryption, authentication, and freshness. While formal verification offers a principled way to check these properties, the manual effort of constructing formal models makes it impractical for large-scale analysis. This paper introduces a key insight: BLE application security analysis can be reframed as a semantic translation problem, i.e., from real-world code to formal models. We leverage large language models (LLMs) not to directly detect vulnerabilities, but to serve as translators that convert BLE-specific code into process models verifiable by tools like ProVerif. We implement this idea in VerifiaBLE, a system that combines static analysis, prompt-guided LLM translation, and symbolic verification to check three core security features: encryption, randomness, and authentication. Applied to 1,050 Android BLE apps, VerifiaBLE uncovers systemic weaknesses: only 10.2\% of apps implement all three protections, while 53.9\% omit them entirely. Our work demonstrates that using LLMs as structured translators can lower the barrier to formal methods, unlocking scalable verification across security-critical domains.

Paperid: 333, https://arxiv.org/pdf/2509.05921.pdf

Abstract:
As datasets become critical assets in modern machine learning systems, ensuring robust copyright protection has emerged as an urgent challenge. Traditional legal mechanisms often fail to address the technical complexities of digital data replication and unauthorized use, particularly in opaque or decentralized environments. This survey provides a comprehensive review of technical approaches for dataset copyright protection, systematically categorizing them into three main classes: non-intrusive methods, which detect unauthorized use without modifying data; minimally-intrusive methods, which embed lightweight, reversible changes to enable ownership verification; and maximally-intrusive methods, which apply aggressive data alterations, such as reversible adversarial examples, to enforce usage restrictions. We synthesize key techniques, analyze their strengths and limitations, and highlight open research challenges. This work offers an organized perspective on the current landscape and suggests future directions for developing unified, scalable, and ethically sound solutions to protect datasets in increasingly complex machine learning ecosystems.

Paperid: 334, https://arxiv.org/pdf/2508.09213.pdf

Abstract:
Physical layer authentication (PLA) uses inherent characteristics of the communication medium to provide secure and efficient authentication in wireless networks, bypassing the need for traditional cryptographic methods. With advancements in deep learning, PLA has become a widely adopted technique for its accuracy and reliability. In this paper, we introduce VeriPHY, a novel deep learning-based PLA solution for 5G networks, which enables unique device identification by embedding signatures within wireless I/Q transmissions using steganography. VeriPHY continuously generates pseudo-random signatures by sampling from Gaussian Mixture Models whose distribution is carefully varied to ensure signature uniqueness and stealthiness over time, and then embeds the newly generated signatures over I/Q samples transmitted by users to the 5G gNB. Utilizing deep neural networks, VeriPHY identifies and authenticates users based on these embedded signatures. VeriPHY achieves high precision, identifying unique signatures between 93% and 100% with low false positive rates and an inference time of 28 ms when signatures are updated every 20 ms. Additionally, we also demonstrate a stealth generation mode where signatures are generated in a way that makes them virtually indistinguishable from unaltered 5G signals while maintaining over 93% detection accuracy.

Paperid: 335, https://arxiv.org/pdf/2507.21817.pdf

Abstract:
Automated vulnerability detection research has made substantial progress, yet its real-world impact remains limited. Current vulnerability datasets suffer from issues including label inaccuracy rates of 20-71%, extensive duplication, and poor coverage of critical CWE types. These issues create a significant "generalization gap" where models achieve misleading self-testing performance (measured on held-out data from the same dataset for training) by exploiting spurious correlations rather than learning true vulnerability patterns. Our analysis reveals that many models experience substantial performance drops of up to 33% when evaluated on independent data, with some performing close to random guessing. To address these limitations, we present a three-part solution. First, we introduce a manually curated test dataset, BenchVul, covering the MITRE Top 25 Most Dangerous CWEs. Second, we construct a high-quality training dataset, TitanVul, comprising 38,863 functions by aggregating seven public sources and applying deduplication and validation using a novel multi-agent LLM framework. Third, we propose a Realistic Vulnerability Generation (RVG) framework, which synthesizes context-aware vulnerability examples for underrepresented but critical CWE types through simulated development workflows. Our evaluation shows the strengths of each component in closing the generalization gap. First, BenchVul shows the limitations of self-testing: models trained on existing datasets, such as BigVul and CVEfixes, experience performance drops on BenchVul (from 0.776 to 0.519 and from 0.713 to 0.607). Second, training models on TitanVul demonstrates improved generalization, with model performance increasing from 0.584 when evaluated on the same dataset to 0.767 when tested on BenchVul. Third, supplementing TitanVul with RVG-generated data yields further gains, increasing model performance by 14.0% to 0.874.

Paperid: 336, https://arxiv.org/pdf/2507.06250.pdf

Abstract:
The Model Context Protocol (MCP) has emerged as a widely adopted mechanism for connecting large language models to external tools and resources. While MCP promises seamless extensibility and rich integrations, it also introduces a substantially expanded attack surface: any plugin can inherit broad system privileges with minimal isolation or oversight. In this work, we conduct the first large-scale empirical analysis of MCP security risks. We develop an automated static analysis framework and systematically examine 2,562 real-world MCP applications spanning 23 functional categories. Our measurements reveal that network and system resource APIs dominate usage patterns, affecting 1,438 and 1,237 servers respectively, while file and memory resources are less frequent but still significant. We find that Developer Tools and API Development plugins are the most API-intensive, and that less popular plugins often contain disproportionately high-risk operations. Through concrete case studies, we demonstrate how insufficient privilege separation enables privilege escalation, misinformation propagation, and data tampering. Based on these findings, we propose a detailed taxonomy of MCP resource access, quantify security-relevant API usage, and identify open challenges for building safer MCP ecosystems, including dynamic permission models and automated trust assessment.

Paperid: 337, https://arxiv.org/pdf/2507.04931.pdf

Abstract:
Dynamic Symbolic Execution (DSE) is a key technique in program analysis, widely used in software testing, vulnerability discovery, and formal verification. In distributed AI systems, DSE plays a crucial role in identifying hard-to-detect bugs, especially those arising from complex network communication patterns. However, traditional approaches to symbolic execution are often hindered by scalability issues and inefficiencies, particularly in large-scale systems. This paper introduces LIFT (Large-language-model Integrated Functional-equivalent-IR Transformation), a novel framework that leverages Large Language Models (LLMs) to automate the optimization of Intermediate Representations (IRs) in symbolic execution. LIFT addresses the challenges of symbolic execution by providing a scalable, context-sensitive solution for IR transformation. The framework consists of two phases: IR Analysis and Optimization, where LLMs optimize time-intensive IR blocks, and Symbolic Execution and Validation, which includes benchmarking and semantic verification to ensure correctness and generalizability. Experiments on real-world binaries demonstrated significant performance improvements, including a 53.5\% reduction in execution time for bigtest and a 10.24\% reduction for random, along with reductions in IR statements, PUT instructions, and temporary variables. These results demonstrate that LLMs simplify IRs while maintaining functional correctness, enhancing symbolic execution in distributed AI systems.

Paperid: 338, https://arxiv.org/pdf/2510.00730.pdf

Abstract:
Modern software projects depend on many third-party libraries, complicating reproducible and secure builds. Several package managers address this with the generation of a lockfile that freezes dependency versions and can be used to verify the integrity of dependencies. Yet, Maven, one of the most important package managers in the Java ecosystem, lacks native support for a lockfile. We present Maven-Lockfile to generate and update lockfiles, with support for rebuilding projects from past versions. Our lockfiles capture all direct and transitive dependencies with their checksums, enabling high integrity builds. Our evaluation shows that Maven-Lockfile can reproduce builds from historical commits and is able to detect tampered artifacts. With minimal configuration, Maven-Lockfile equips Java projects with modern build integrity and build reproducibility, and fosters future research on software supply chain security in Java.

Paperid: 339, https://arxiv.org/pdf/2509.25927.pdf

Abstract:
Deep neural networks remain vulnerable to adversarial examples despite advances in architectures and training paradigms. We investigate how training data characteristics affect adversarial robustness across 36 state-of-the-art vision models spanning supervised, self-supervised, and contrastive learning approaches, trained on datasets from 1.2M to 22B images. Models were evaluated under six black-box attack categories: random perturbations, two types of geometric masks, COCO object manipulations, ImageNet-C corruptions, and ImageNet-R style shifts. Robustness follows a logarithmic scaling law with both data volume and model size: a tenfold increase in data reduces attack success rate (ASR) on average by ~3.2%, whereas a tenfold increase in model size reduces ASR on average by ~13.4%. Notably, some self-supervised models trained on curated datasets, such as DINOv2, outperform others trained on much larger but less curated datasets, challenging the assumption that scale alone drives robustness. Adversarial fine-tuning of ResNet50s improves generalization across structural variations but not across color distributions. Human evaluation reveals persistent gaps between human and machine vision. These results show that while scaling improves robustness, data quality, architecture, and training objectives play a more decisive role than raw scale in achieving broad-spectrum adversarial resilience.

Paperid: 340, https://arxiv.org/pdf/2509.10569.pdf

Abstract:
We introduce MarkDiffusion, an open-source Python toolkit for generative watermarking of latent diffusion models. It comprises three key components: a unified implementation framework for streamlined watermarking algorithm integrations and user-friendly interfaces; a mechanism visualization suite that intuitively showcases added and extracted watermark patterns to aid public understanding; and a comprehensive evaluation module offering standard implementations of 24 tools across three essential aspects - detectability, robustness, and output quality - plus 8 automated evaluation pipelines. Through MarkDiffusion, we seek to assist researchers, enhance public awareness and engagement in generative watermarking, and promote consensus while advancing research and applications.

Paperid: 341, https://arxiv.org/pdf/2511.17989.pdf

Abstract:
Multi-domain graph pre-training has emerged as a pivotal technique in developing graph foundation models. While it greatly improves the generalization of graph neural networks, its privacy risks under membership inference attacks (MIAs), which aim to identify whether a specific instance was used in training (member), remain largely unexplored. However, effectively conducting MIAs against multi-domain graph pre-trained models is a significant challenge due to: (i) Enhanced Generalization Capability: Multi-domain pre-training reduces the overfitting characteristics commonly exploited by MIAs. (ii) Unrepresentative Shadow Datasets: Diverse training graphs hinder the obtaining of reliable shadow graphs. (iii) Weakened Membership Signals: Embedding-based outputs offer less informative cues than logits for MIAs. To tackle these challenges, we propose MGP-MIA, a novel framework for Membership Inference Attacks against Multi-domain Graph Pre-trained models. Specifically, we first propose a membership signal amplification mechanism that amplifies the overfitting characteristics of target models via machine unlearning. We then design an incremental shadow model construction mechanism that builds a reliable shadow model with limited shadow graphs via incremental learning. Finally, we introduce a similarity-based inference mechanism that identifies members based on their similarity to positive and negative samples. Extensive experiments demonstrate the effectiveness of our proposed MGP-MIA and reveal the privacy risks of multi-domain graph pre-training.

Paperid: 342, https://arxiv.org/pdf/2511.17982.pdf

Abstract:
Graph Foundation Models (GFMs) are pre-trained on diverse source domains and adapted to unseen targets, enabling broad generalization for graph machine learning. Despite that GFMs have attracted considerable attention recently, their vulnerability to backdoor attacks remains largely underexplored. A compromised GFM can introduce backdoor behaviors into downstream applications, posing serious security risks. However, launching backdoor attacks against GFMs is non-trivial due to three key challenges. (1) Effectiveness: Attackers lack knowledge of the downstream task during pre-training, complicating the assurance that triggers reliably induce misclassifications into desired classes. (2) Stealthiness: The variability in node features across domains complicates trigger insertion that remains stealthy. (3) Persistence: Downstream fine-tuning may erase backdoor behaviors by updating model parameters. To address these challenges, we propose GFM-BA, a novel Backdoor Attack model against Graph Foundation Models. Specifically, we first design a label-free trigger association module that links the trigger to a set of prototype embeddings, eliminating the need for knowledge about downstream tasks to perform backdoor injection. Then, we introduce a node-adaptive trigger generator, dynamically producing node-specific triggers, reducing the risk of trigger detection while reliably activating the backdoor. Lastly, we develop a persistent backdoor anchoring module that firmly anchors the backdoor to fine-tuning-insensitive parameters, enhancing the persistence of the backdoor under downstream adaptation. Extensive experiments demonstrate the effectiveness, stealthiness, and persistence of GFM-BA.

Paperid: 343, https://arxiv.org/pdf/2507.18988.pdf

Abstract:
The rapid advancement of image-generation technologies has made it possible for anyone to create photorealistic images using generative models, raising significant security concerns. To mitigate malicious use, tracing the origin of such images is essential. Reconstruction-based attribution methods offer a promising solution, but they often suffer from reduced accuracy and high computational costs when applied to state-of-the-art (SOTA) models. To address these challenges, we propose AEDR (AutoEncoder Double-Reconstruction), a novel training-free attribution method designed for generative models with continuous autoencoders. Unlike existing reconstruction-based approaches that rely on the value of a single reconstruction loss, AEDR performs two consecutive reconstructions using the model's autoencoder, and adopts the ratio of these two reconstruction losses as the attribution signal. This signal is further calibrated using the image homogeneity metric to improve accuracy, which inherently cancels out absolute biases caused by image complexity, with autoencoder-based reconstruction ensuring superior computational efficiency. Experiments on eight top latent diffusion models show that AEDR achieves 25.5% higher attribution accuracy than existing reconstruction-based methods, while requiring only 1% of the computational time.

Paperid: 344, https://arxiv.org/pdf/2511.09134.pdf

Abstract:
Smart contracts have significantly advanced blockchain technology, and digital signatures are crucial for reliable verification of contract authority. Through signature verification, smart contracts can ensure that signers possess the required permissions, thus enhancing security and scalability. However, lacking checks on signature usage conditions can lead to repeated verifications, increasing the risk of permission abuse and threatening contract assets. We define this issue as the Signature Replay Vulnerability (SRV). In this paper, we conducted the first empirical study to investigate the causes and characteristics of the SRVs. From 1,419 audit reports across 37 blockchain security companies, we identified 108 with detailed SRV descriptions and classified five types of SRVs. To detect these vulnerabilities automatically, we designed LASiR, which utilizes the general semantic understanding ability of Large Language Models (LLMs) to assist in the static taint analysis of the signature state and identify the signature reuse behavior. It also employs path reachability verification via symbolic execution to ensure effective and reliable detection. To evaluate the performance of LASiR, we conducted large-scale experiments on 15,383 contracts involving signature verification, selected from the initial dataset of 918,964 contracts across four blockchains: Ethereum, Binance Smart Chain, Polygon, and Arbitrum. The results indicate that SRVs are widespread, with affected contracts holding $4.76 million in active assets. Among these, 19.63% of contracts that use signatures on Ethereum contain SRVs. Furthermore, manual verification demonstrates that LASiR achieves an F1-score of 87.90% for detection. Ablation studies and comparative experiments reveal that the semantic information provided by LLMs aids static taint analysis, significantly enhancing LASiR's detection performance.

Paperid: 345, https://arxiv.org/pdf/2510.17848.pdf

Abstract:
While the rapid growth of Web3 has driven the development of decentralized finance, user anonymity and cross-chain asset flows make on-chain laundering behaviors more covert and complex. In this context, constructing high-quality anti-money laundering(AML) datasets has become essential for risk-control systems and on-chain forensic analysis, yet current practices still rely heavily on manual efforts with limited efficiency and coverage. In this paper, we introduce RiskTagger, a large-language-model-based agent for the automatic annotation of crypto laundering behaviors in Web3. RiskTagger is designed to replace or complement human annotators by addressing three key challenges: extracting clues from complex unstructured reports, reasoning over multichain transaction paths, and producing auditor-friendly explanations. RiskTagger implements an end-to-end multi-module agent, integrating a key-clue extractor, a multichain fetcher with a laundering-behavior reasoner, and a data explainer, forming a data annotation pipeline. Experiments on the real case Bybit Hack (with the highest stolen asset value) demonstrate that RiskTagger achieves 100% accuracy in clue extraction, 84.1% consistency with expert judgment, and 90% coverage in explanation generation. Overall, RiskTagger automates laundering behavior annotation while improving transparency and scalability in AML research.

Paperid: 346, https://arxiv.org/pdf/2510.01157.pdf

Abstract:
Large Language Models (LLMs) and their multimodal extensions are becoming increasingly popular. One common approach to enable multimodality is to cascade domain-specific encoders with an LLM, making the resulting model inherit vulnerabilities from all of its components. In this work, we present the first systematic study of audio backdoor attacks against speech language models. We demonstrate its effectiveness across four speech encoders and three datasets, covering four tasks: automatic speech recognition (ASR), speech emotion recognition, and gender and age prediction. The attack consistently achieves high success rates, ranging from 90.76% to 99.41%. To better understand how backdoors propagate, we conduct a component-wise analysis to identify the most vulnerable stages of the pipeline. Finally, we propose a fine-tuning-based defense that mitigates the threat of poisoned pretrained encoders.

Paperid: 347, https://arxiv.org/pdf/2509.07540.pdf

Abstract:
Software vulnerabilities pose serious risks to modern software ecosystems. While the National Vulnerability Database (NVD) is the authoritative source for cataloging these vulnerabilities, it often lacks explicit links to the corresponding Vulnerability-Fixing Commits (VFCs). VFCs encode precise code changes, enabling vulnerability localization, patch analysis, and dataset construction. Automatically mapping NVD records to their true VFCs is therefore critical. Existing approaches have limitations as they rely on sparse, often noisy commit messages and fail to capture the deep semantics in the vulnerability descriptions. To address this gap, we introduce PatchSeeker, a novel method that leverages large language models to create rich semantic links between vulnerability descriptions and their VFCs. PatchSeeker generates embeddings from NVD descriptions and enhances commit messages by synthesizing detailed summaries for those that are short or uninformative. These generated messages act as a semantic bridge, effectively closing the information gap between natural language reports and low-level code changes. Our approach PatchSeeker achieves 59.3% higher MRR and 27.9% higher Recall@10 than the best-performing baseline, Prospector, on the benchmark dataset. The extended evaluation on recent CVEs further confirms PatchSeeker's effectiveness. Ablation study shows that both the commit message generation method and the selection of backbone LLMs make a positive contribution to PatchSeeker. We also discuss limitations and open challenges to guide future work.

Paperid: 348, https://arxiv.org/pdf/2510.16035.pdf

Abstract:
Social networks have become a crucial source of real-time information for individuals. The influence of social bots within these platforms has garnered considerable attention from researchers, leading to the development of numerous detection technologies. However, the vulnerability and robustness of these detection methods is still underexplored. Existing Graph Neural Network (GNN)-based methods cannot be directly applied due to the issues of limited control over social agents, the black-box nature of bot detectors, and the heterogeneity of bots. To address these challenges, this paper proposes the first adversarial multi-agent Reinforcement learning framework for social Bot control attacks (RoBCtrl) targeting GNN-based social bot detectors. Specifically, we use a diffusion model to generate high-fidelity bot accounts by reconstructing existing account data with minor modifications, thereby evading detection on social platforms. To the best of our knowledge, this is the first application of diffusion models to mimic the behavior of evolving social bots effectively. We then employ a Multi-Agent Reinforcement Learning (MARL) method to simulate bots adversarial behavior. We categorize social accounts based on their influence and budget. Different agents are then employed to control bot accounts across various categories, optimizing the attachment strategy through reinforcement learning. Additionally, a hierarchical state abstraction based on structural entropy is designed to accelerate the reinforcement learning. Extensive experiments on social bot detection datasets demonstrate that our framework can effectively undermine the performance of GNN-based detectors.

Paperid: 349, https://arxiv.org/pdf/2510.14906.pdf

Abstract:
Machine Learning (ML)-based malicious traffic detection is a promising security paradigm. It outperforms rule-based traditional detection by identifying various advanced attacks. However, the robustness of these ML models is largely unexplored, thereby allowing attackers to craft adversarial traffic examples that evade detection. Existing evasion attacks typically rely on overly restrictive conditions (e.g., encrypted protocols, Tor, or specialized setups), or require detailed prior knowledge of the target (e.g., training data and model parameters), which is impractical in realistic black-box scenarios. The feasibility of a hard-label black-box evasion attack (i.e., applicable across diverse tasks and protocols without internal target insights) thus remains an open challenge. To this end, we develop NetMasquerade, which leverages reinforcement learning (RL) to manipulate attack flows to mimic benign traffic and evade detection. Specifically, we establish a tailored pre-trained model called Traffic-BERT, utilizing a network-specialized tokenizer and an attention mechanism to extract diverse benign traffic patterns. Subsequently, we integrate Traffic-BERT into the RL framework, allowing NetMasquerade to effectively manipulate malicious packet sequences based on benign traffic patterns with minimal modifications. Experimental results demonstrate that NetMasquerade enables both brute-force and stealthy attacks to evade 6 existing detection methods under 80 attack scenarios, achieving over 96.65% attack success rate. Notably, it can evade the methods that are either empirically or certifiably robust against existing evasion attacks. Finally, NetMasquerade achieves low-latency adversarial traffic generation, demonstrating its practicality in real-world scenarios.

Paperid: 350, https://arxiv.org/pdf/2510.14283.pdf

Abstract:
Website Fingerprinting (WF) attacks exploit patterns in encrypted traffic to infer the websites visited by users, posing a serious threat to anonymous communication systems. Although recent WF techniques achieve over 90% accuracy in controlled experimental settings, most studies remain confined to single scenarios, overlooking the complexity of real-world environments. This paper presents the first systematic and comprehensive evaluation of existing WF attacks under diverse realistic conditions, including defense mechanisms, traffic drift, multi-tab browsing, early-stage detection, open-world settings, and few-shot scenarios. Experimental results show that many WF techniques with strong performance in isolated settings degrade significantly when facing other conditions. Since real-world environments often combine multiple challenges, current WF attacks are difficult to apply directly in practice. This study highlights the limitations of WF attacks and introduces a multidimensional evaluation framework, offering critical insights for developing more robust and practical WF attacks.

Paperid: 351, https://arxiv.org/pdf/2510.07176.pdf

Abstract:
Large Language Models (LLMs) are increasingly deployed as agents that orchestrate tasks and integrate external tools to execute complex workflows. We demonstrate that these interactive behaviors leave distinctive fingerprints in encrypted traffic exchanged between users and LLM agents. By analyzing traffic patterns associated with agent workflows and tool invocations, adversaries can infer agent activities, distinguish specific agents, and even profile sensitive user attributes. To highlight this risk, we develop AgentPrint, which achieves an F1-score of 0.866 in agent identification and attains 73.9% and 69.1% top-3 accuracy in user attribute inference for simulated- and real-user settings, respectively. These results uncover an overlooked risk: the very interactivity that empowers LLM agents also exposes user privacy, underscoring the urgent need for technical countermeasures alongside regulatory and policy safeguards.

Paperid: 352, https://arxiv.org/pdf/2509.11833.pdf

Abstract:
Path MTU Discovery (PMTUD) and IP address sharing are integral aspects of modern Internet infrastructure. In this paper, we investigate the security vulnerabilities associated with PMTUD within the context of prevalent IP address sharing practices. We reveal that PMTUD is inadequately designed to handle IP address sharing, creating vulnerabilities that attackers can exploit to perform off-path TCP hijacking attacks. We demonstrate that by observing the path MTU value determined by a server for a public IP address (shared among multiple devices), an off-path attacker on the Internet, in collaboration with a malicious device, can infer the sequence numbers of TCP connections established by other legitimate devices sharing the same IP address. This vulnerability enables the attacker to perform off-path TCP hijacking attacks, significantly compromising the security of the affected TCP connections. Our attack involves first identifying a target TCP connection originating from the shared IP address, followed by inferring the sequence numbers of the identified connection. We thoroughly assess the impacts of our attack under various network configurations. Experimental results reveal that the attack can be executed within an average time of 220 seconds, achieving a success rate of 70%.Case studies, including SSH DoS, FTP traffic poisoning, and HTTP injection, highlight the threat it poses to various applications. Additionally, we evaluate our attack across 50 real-world networks with IP address sharing--including public Wi-Fi, VPNs, and 5G--and find 38 vulnerable. Finally, we responsibly disclose the vulnerabilities, receive recognition from organizations such as IETF, Linux, and Cisco, and propose our countermeasures.

Paperid: 353, https://arxiv.org/pdf/2511.07170.pdf

Abstract:
Current graph neural network (GNN) model-stealing methods rely heavily on queries to the victim model, assuming no hard query limits. However, in reality, the number of allowed queries can be severely limited. In this paper, we demonstrate how an adversary can extract a GNN with very limited interactions with the model. Our approach first enables the adversary to obtain the model backbone without making direct queries to the victim model and then to strategically utilize a fixed query limit to extract the most informative data. The experiments on eight real-world datasets demonstrate the effectiveness of the attack, even under a very restricted query limit and under defense against model extraction in place. Our findings underscore the need for robust defenses against GNN model extraction threats.

Paperid: 354, https://arxiv.org/pdf/2510.02342.pdf

Abstract:
Watermarking algorithms for Large Language Models (LLMs) effectively identify machine-generated content by embedding and detecting hidden statistical features in text. However, such embedding leads to a decline in text quality, especially in low-entropy scenarios where performance needs improvement. Existing methods that rely on entropy thresholds often require significant computational resources for tuning and demonstrate poor adaptability to unknown or cross-task generation scenarios. We propose \textbf{C}ontext-\textbf{A}ware \textbf{T}hreshold watermarking ($\myalgo$), a novel framework that dynamically adjusts watermarking intensity based on real-time semantic context. $\myalgo$ partitions text generation into semantic states using logits clustering, establishing context-aware entropy thresholds that preserve fidelity in structured content while embedding robust watermarks. Crucially, it requires no pre-defined thresholds or task-specific tuning. Experiments show $\myalgo$ improves text quality in cross-tasks without sacrificing detection accuracy.

Paperid: 355, https://arxiv.org/pdf/2509.13755.pdf

Abstract:
While Code Language Models (CLMs) have demonstrated superior performance in software engineering tasks such as code generation and summarization, recent empirical studies reveal a critical privacy vulnerability: these models exhibit unintended memorization of sensitive training data, enabling verbatim reproduction of confidential information when specifically prompted. To address this issue, several approaches, including training data de-duplication and differential privacy augmentation, have been proposed. However, these methods require full-model retraining for deployed CLMs, which incurs substantial computational costs. In this paper, we aim to answer the following research question: Can sensitive information memorized by CLMs be erased effectively and efficiently? We conduct a pioneering investigation into erasing sensitive memorization in CLMs through machine unlearning - a post-hoc modification method that removes specific information from trained models without requiring full retraining. Specifically, we first quantify the memorization risks of sensitive data within CLM training datasets and curate a high-risk dataset of 50,000 sensitive memorized samples as unlearning targets. We study two widely used gradient ascent-based unlearning approaches: the vanilla and constraint-based methods, and introduce CodeEraser, an advanced variant that selectively unlearns sensitive memorized segments in code while preserving the structural integrity and functional correctness of the surrounding code. Extensive experiments on three families of CLMs, i.e., CodeParrot, CodeGen-Mono, and Qwen2.5-Coder, validate the effectiveness and efficiency of CodeEraser in erasing targeted sensitive memorization while maintaining model utility.

Paperid: 356, https://arxiv.org/pdf/2508.15252.pdf

Abstract:
Recent studies have shown that recommender systems (RSs) are highly vulnerable to data poisoning attacks, where malicious actors inject fake user profiles, including a group of well-designed fake ratings, to manipulate recommendations. Due to security and privacy constraints in practice, attackers typically possess limited knowledge of the victim system and thus need to craft profiles that have transferability across black-box RSs. To maximize the attack impact, the profiles often remains imperceptible. However, generating such high-quality profiles with the restricted resources is challenging. Some works suggest incorporating fake textual reviews to strengthen the profiles; yet, the poor quality of the reviews largely undermines the attack effectiveness and imperceptibility under the practical setting. To tackle the above challenges, in this paper, we propose to enhance the quality of the review text by harnessing in-context learning (ICL) capabilities of multimodal foundation models. To this end, we introduce a demonstration retrieval algorithm and a text style transfer strategy to augment the navie ICL. Specifically, we propose a novel practical attack framework named RAGAN to generate high-quality fake user profiles, which can gain insights into the robustness of RSs. The profiles are generated by a jailbreaker and collaboratively optimized on an instructional agent and a guardian to improve the attack transferability and imperceptibility. Comprehensive experiments on various real-world datasets demonstrate that RAGAN achieves the state-of-the-art poisoning attack performance.

Paperid: 357, https://arxiv.org/pdf/2508.04894.pdf

Abstract:
Large Language Models (LLMs) are increasingly integrated with graph-structured data for tasks like node classification, a domain traditionally dominated by Graph Neural Networks (GNNs). While this integration leverages rich relational information to improve task performance, their robustness against adversarial attacks remains unexplored. We take the first step to explore the vulnerabilities of graph-aware LLMs by leveraging existing adversarial attack methods tailored for graph-based models, including those for poisoning (training-time attacks) and evasion (test-time attacks), on two representative models, LLAGA (Chen et al. 2024) and GRAPHPROMPTER (Liu et al. 2024). Additionally, we discover a new attack surface for LLAGA where an attacker can inject malicious nodes as placeholders into the node sequence template to severely degrade its performance. Our systematic analysis reveals that certain design choices in graph encoding can enhance attack success, with specific findings that: (1) the node sequence template in LLAGA increases its vulnerability; (2) the GNN encoder used in GRAPHPROMPTER demonstrates greater robustness; and (3) both approaches remain susceptible to imperceptible feature perturbation attacks. Finally, we propose an end-to-end defense framework GALGUARD, that combines an LLM-based feature correction module to mitigate feature-level perturbations and adapted GNN defenses to protect against structural attacks.

Paperid: 358, https://arxiv.org/pdf/2512.12324.pdf

Abstract:
The rapid proliferation of Artificial Intelligence Generated Content has precipitated a crisis of trust and urgent regulatory demands. However, existing identification tools suffer from fragmentation and a lack of support for visible compliance marking. To address these gaps, we introduce the \textbf{UniMark}, an open-source, unified framework for multimodal content governance. Our system features a modular unified engine that abstracts complexities across text, image, audio, and video modalities. Crucially, we propose a novel dual-operation strategy, natively supporting both \emph{Hidden Watermarking} for copyright protection and \emph{Visible Marking} for regulatory compliance. Furthermore, we establish a standardized evaluation framework with three specialized benchmarks (Image/Video/Audio-Bench) to ensure rigorous performance assessment. This toolkit bridges the gap between advanced algorithms and engineering implementation, fostering a more transparent and secure digital ecosystem.

Paperid: 359, https://arxiv.org/pdf/2510.20333.pdf

Abstract:
Vision-Language Models (VLMs) are increasingly deployed as autonomous agents to navigate mobile graphical user interfaces (GUIs). Operating in dynamic on-device ecosystems, which include notifications, pop-ups, and inter-app interactions, exposes them to a unique and underexplored threat vector: environmental injection. Unlike prompt-based attacks that manipulate textual instructions, environmental injection corrupts an agent's visual perception by inserting adversarial UI elements (for example, deceptive overlays or spoofed notifications) directly into the GUI. This bypasses textual safeguards and can derail execution, causing privacy leakage, financial loss, or irreversible device compromise. To systematically evaluate this threat, we introduce GhostEI-Bench, the first benchmark for assessing mobile agents under environmental injection attacks within dynamic, executable environments. Moving beyond static image-based assessments, GhostEI-Bench injects adversarial events into realistic application workflows inside fully operational Android emulators and evaluates performance across critical risk scenarios. We further propose a judge-LLM protocol that conducts fine-grained failure analysis by reviewing the agent's action trajectory alongside the corresponding screenshot sequence, pinpointing failure in perception, recognition, or reasoning. Comprehensive experiments on state-of-the-art agents reveal pronounced vulnerability to deceptive environmental cues: current models systematically fail to perceive and reason about manipulated UIs. GhostEI-Bench provides a framework for quantifying and mitigating this emerging threat, paving the way toward more robust and secure embodied agents.

Paperid: 360, https://arxiv.org/pdf/2510.18756.pdf

Abstract:
Disaggregated storage with NVMe-over-Fabrics (NVMe-oF) has emerged as the standard solution in modern data centers, achieving superior performance, resource utilization, and power efficiency. Simultaneously, confidential computing (CC) is becoming the de facto security paradigm, enforcing stronger isolation and protection for sensitive workloads. However, securing state-of-the-art storage with traditional CC methods struggles to scale and compromises performance or security. To address these issues, we introduce sNVMe-oF, a storage management system extending the NVMe-oF protocol and adhering to the CC threat model by providing confidentiality, integrity, and freshness guarantees. sNVMe-oF offers an appropriate control path and novel concepts such as counter-leasing. sNVMe-oF also optimizes data path performance by leveraging NVMe metadata, introducing a new disaggregated Hazel Merkle Tree (HMT), and avoiding redundant IPSec protections. We achieve this without modifying the NVMe-oF protocol. To prevent excessive resource usage while delivering line rate, sNVMe-oF also uses accelerators of CC-capable smart NICs. We prototype sNVMe-oF on an NVIDIA BlueField-3 and demonstrate how it can achieve as little as 2% performance degradation for synthetic patterns and AI training.

Paperid: 361, https://arxiv.org/pdf/2510.18541.pdf

Abstract:
LLMs are often used by downstream users as teacher models for knowledge distillation, compressing their capabilities into memory-efficient models. However, as these teacher models may stem from untrusted parties, distillation can raise unexpected security risks. In this paper, we investigate the security implications of knowledge distillation from backdoored teacher models. First, we show that prior backdoors mostly do not transfer onto student models. Our key insight is that this is because existing LLM backdooring methods choose trigger tokens that rarely occur in usual contexts. We argue that this underestimates the security risks of knowledge distillation and introduce a new backdooring technique, T-MTB, that enables the construction and study of transferable backdoors. T-MTB carefully constructs a composite backdoor trigger, made up of several specific tokens that often occur individually in anticipated distillation datasets. As such, the poisoned teacher remains stealthy, while during distillation the individual presence of these tokens provides enough signal for the backdoor to transfer onto the student. Using T-MTB, we demonstrate and extensively study the security risks of transferable backdoors across two attack scenarios, jailbreaking and content modulation, and across four model families of LLMs.

Paperid: 362, https://arxiv.org/pdf/2510.07985.pdf

Abstract:
Model pruning, i.e., removing a subset of model weights, has become a prominent approach to reducing the memory footprint of large language models (LLMs) during inference. Notably, popular inference engines, such as vLLM, enable users to conveniently prune downloaded models before they are deployed. While the utility and efficiency of pruning methods have improved significantly, the security implications of pruning remain underexplored. In this work, for the first time, we show that modern LLM pruning methods can be maliciously exploited. In particular, an adversary can construct a model that appears benign yet, once pruned, exhibits malicious behaviors. Our method is based on the idea that the adversary can compute a proxy metric that estimates how likely each parameter is to be pruned. With this information, the adversary can first inject a malicious behavior into those parameters that are unlikely to be pruned. Then, they can repair the model by using parameters that are likely to be pruned, effectively canceling out the injected behavior in the unpruned model. We demonstrate the severity of our attack through extensive evaluation on five models; after any of the pruning in vLLM are applied (Magnitude, Wanda, and SparseGPT), it consistently exhibits strong malicious behaviors in a diverse set of attack scenarios (success rates of up to $95.7\%$ for jailbreak, $98.7\%$ for benign instruction refusal, and $99.5\%$ for targeted content injection). Our results reveal a critical deployment-time security gap and underscore the urgent need for stronger security awareness in model compression.

Paperid: 363, https://arxiv.org/pdf/2510.03863.pdf

Abstract:
Online services rely on CAPTCHAs as a first line of defense against automated abuse, yet recent advances in multi-modal large language models (MLLMs) have eroded the effectiveness of conventional designs that focus on text recognition or 2D image understanding. To address this challenge, we present Spatial CAPTCHA, a novel human-verification framework that leverages fundamental differences in spatial reasoning between humans and MLLMs. Unlike existing CAPTCHAs which rely on low-level perception tasks that are vulnerable to modern AI, Spatial CAPTCHA generates dynamic questions requiring geometric reasoning, perspective-taking, occlusion handling, and mental rotation. These skills are intuitive for humans but difficult for state-of-the-art (SOTA) AI systems. The system employs a procedural generation pipeline with constraint-based difficulty control, automated correctness verification, and human-in-the-loop validation to ensure scalability, robustness, and adaptability. Evaluation on a corresponding benchmark, Spatial-CAPTCHA-Bench, demonstrates that humans vastly outperform 10 state-of-the-art MLLMs, with the best model achieving only 31.0% Pass@1 accuracy. Furthermore, we compare Spatial CAPTCHA with Google reCAPTCHA, which confirms its effectiveness as both a security mechanism and a diagnostic tool for spatial reasoning in AI.

Paperid: 364, https://arxiv.org/pdf/2509.24368.pdf

Abstract:
We introduce the first watermark tailored for diffusion language models (DLMs), an emergent LLM paradigm able to generate tokens in arbitrary order, in contrast to standard autoregressive language models (ARLMs) which generate tokens sequentially. While there has been much work in ARLM watermarking, a key challenge when attempting to apply these schemes directly to the DLM setting is that they rely on previously generated tokens, which are not always available with DLM generation. In this work we address this challenge by: (i) applying the watermark in expectation over the context even when some context tokens are yet to be determined, and (ii) promoting tokens which increase the watermark strength when used as context for other tokens. This is accomplished while keeping the watermark detector unchanged. Our experimental evaluation demonstrates that the DLM watermark leads to a >99% true positive rate with minimal quality impact and achieves similar robustness to existing ARLM watermarks, enabling for the first time reliable DLM watermarking.

Paperid: 365, https://arxiv.org/pdf/2509.23594.pdf

Abstract:
Parameter-Efficient Fine-Tuning (PEFT) methods like LoRA have transformed vision model adaptation, enabling the rapid deployment of customized models. However, the compactness of LoRA adaptations introduces new safety concerns, particularly their vulnerability to model extraction attacks. This paper introduces a new focus of model extraction attacks named LoRA extraction that extracts LoRA-adaptive models based on a public pre-trained model. We then propose a novel extraction method called StolenLoRA which trains a substitute model to extract the functionality of a LoRA-adapted model using synthetic data. StolenLoRA leverages a Large Language Model to craft effective prompts for data generation, and it incorporates a Disagreement-based Semi-supervised Learning (DSL) strategy to maximize information gain from limited queries. Our experiments demonstrate the effectiveness of StolenLoRA, achieving up to a 96.60% attack success rate with only 10k queries, even in cross-backbone scenarios where the attacker and victim models utilize different pre-trained backbones. These findings reveal the specific vulnerability of LoRA-adapted models to this type of extraction and underscore the urgent need for robust defense mechanisms tailored to PEFT methods. We also explore a preliminary defense strategy based on diversified LoRA deployments, highlighting its potential to mitigate such attacks.

Paperid: 366, https://arxiv.org/pdf/2509.18886.pdf

Abstract:
Large Language Models (LLMs) are increasingly deployed on converged Cloud and High-Performance Computing (HPC) infrastructure. However, as LLMs handle confidential inputs and are fine-tuned on costly, proprietary datasets, their heightened security requirements slow adoption in privacy-sensitive sectors such as healthcare and finance. We investigate methods to address this gap and propose Trusted Execution Environments (TEEs) as a solution for securing end-to-end LLM inference. We validate their practicality by evaluating these compute-intensive workloads entirely within CPU and GPU TEEs. On the CPU side, we conduct an in-depth study running full Llama2 inference pipelines (7B, 13B, 70B) inside Intel's TDX and SGX, accelerated by Advanced Matrix Extensions (AMX). We derive 12 insights, including that across various data types, batch sizes, and input lengths, CPU TEEs impose under 10% throughput and 20% latency overheads, further reduced by AMX. We run LLM inference on NVIDIA H100 Confidential Compute GPUs, contextualizing our CPU findings and observing throughput penalties of 4-8% that diminish as batch and input sizes grow. By comparing performance, cost, and security trade-offs, we show how CPU TEEs can be more cost-effective or secure than their GPU counterparts. To our knowledge, our work is the first to comprehensively demonstrate the performance and practicality of modern TEEs across both CPUs and GPUs for enabling confidential LLMs (cLLMs).

Paperid: 367, https://arxiv.org/pdf/2509.08748.pdf

Abstract:
Backdoor attacks poison the training data to embed a backdoor in the model, causing it to behave normally on legitimate inputs but maliciously when specific trigger signals appear. Training a benign model from a dataset poisoned by backdoor attacks is challenging. Existing works rely on various assumptions and can only defend against backdoor attacks with specific trigger signals, high poisoning ratios, or when the defender possesses a large, untainted, validation dataset. In this paper, we propose a defense called Prototype-Guided Robust Learning (PGRL), which overcomes all the aforementioned limitations, being robust against diverse backdoor attacks. Leveraging a tiny set of benign samples, PGRL generates prototype vectors to guide the training process. We compare our PGRL with 8 existing defenses, showing that it achieves superior robustness. We also demonstrate that PGRL generalizes well across various architectures, datasets, and advanced attacks. Finally, to evaluate our PGRL in the worst-case scenario, we perform an adaptive attack, where the attackers fully know the details of the defense.

Paperid: 368, https://arxiv.org/pdf/2509.08747.pdf

Abstract:
In the deployment phase, semi-structured sparsity accelerates the execution of deep neural networks on modern GPUs via sparse matrix multiplication. In this paper, targeting the semi-structured sparsity, we introduce a Silent Until Sparse (SUS) backdoor attack, where the released full model remains silent (benign), but becomes a backdoored model after sparsification. The attack operates in two phases: (i) in the backdoor training phase, the backdoor functionality is injected into specific weights that will be retained during the pruning process; (ii) in the backdoor hiding phase, the malicious behavior is concealed by fine-tuning elements that will be pruned away. This dual-phase approach ensures that the attack remains undetectable in the released model, but activates properly once the model is pruned with the semi-structured sparsity. Through extensive experiments, we show that our attack successfully threatens the semi-structured sparsity algorithms from both NVIDIA and PyTorch. Our empirical results show that, regardless of model architecture, the attack success rate of the released model remains below 10% prior to sparsification but exceeds 99% afterward. Moreover, we demonstrate that SUS attack is robust against state-of-the-art backdoor defenses and finetuning, highlighting a critical vulnerability in current model compression and deployment pipelines.

Paperid: 369, https://arxiv.org/pdf/2508.10836.pdf

Abstract:
Data minimization (DM) describes the principle of collecting only the data strictly necessary for a given task. It is a foundational principle across major data protection regulations like GDPR and CPRA. Violations of this principle have substantial real-world consequences, with regulatory actions resulting in fines reaching hundreds of millions of dollars. Notably, the relevance of data minimization is particularly pronounced in machine learning (ML) applications, which typically rely on large datasets, resulting in an emerging research area known as Data Minimization in Machine Learning (DMML). At the same time, existing work on other ML privacy and security topics often addresses concerns relevant to DMML without explicitly acknowledging the connection. This disconnect leads to confusion among practitioners, complicating their efforts to implement DM principles and interpret the terminology, metrics, and evaluation criteria used across different research communities. To address this gap, our work introduces a comprehensive framework for DMML, including a unified data pipeline, adversaries, and points of minimization. This framework allows us to systematically review the literature on data minimization and \emph{DM-adjacent} methodologies, for the first time presenting a structured overview designed to help practitioners and researchers effectively apply DM principles. Our work facilitates a unified DM-centric understanding and broader adoption of data minimization strategies in AI/ML.

Paperid: 370, https://arxiv.org/pdf/2507.18313.pdf

Abstract:
Malware evolves rapidly, forcing machine learning (ML)-based detectors to adapt continuously. With antivirus vendors processing hundreds of thousands of new samples daily, datasets can grow to billions of examples, making full retraining impractical. Continual learning (CL) has emerged as a scalable alternative, enabling incremental updates without full data access while mitigating catastrophic forgetting. In this work, we analyze a critical yet overlooked issue in this context: security regression. Unlike forgetting, which manifests as a general performance drop on previously seen data, security regression captures harmful prediction changes at the sample level, such as a malware sample that was once correctly detected but evades detection after a model update. Although often overlooked, regressions pose serious risks in security-critical applications, as the silent reintroduction of previously detected threats in the system may undermine users' trust in the whole updating process. To address this issue, we formalize and quantify security regression in CL-based malware detectors and propose a regression-aware penalty to mitigate it. Specifically, we adapt Positive Congruent Training (PCT) to the CL setting, preserving prior predictive behavior in a model-agnostic manner. Experiments on the ELSA, Tesseract, and AZ-Class datasets show that our method effectively reduces regression across different CL scenarios while maintaining strong detection performance over time.

Paperid: 371, https://arxiv.org/pdf/2507.16302.pdf

Abstract:
Text-to-image (T2I) diffusion models have achieved impressive image generation quality and are increasingly fine-tuned for personalized applications. However, these models often inherit unsafe behaviors from toxic pretraining data, raising growing safety concerns. While recent safety-driven unlearning methods have made promising progress in suppressing model toxicity, they are identified to be fragile to downstream fine-tuning, where we reveal that state-of-the-art methods largely fail to retain their effectiveness even when fine-tuned on entirely benign datasets. To mitigate this problem, in this paper, we propose ResAlign, a safety-driven unlearning framework with enhanced resilience against downstream fine-tuning. By modeling downstream fine-tuning as an implicit optimization problem with a Moreau Envelope-based reformulation, ResAlign enables efficient gradient estimation to minimize the recovery of harmful behaviors. Additionally, a meta-learning strategy is proposed to simulate a diverse distribution of fine-tuning scenarios to improve generalization. Extensive experiments across a wide range of datasets, fine-tuning methods, and configurations demonstrate that ResAlign consistently outperforms prior unlearning approaches in retaining safety after downstream fine-tuning while preserving benign generation capability well.

Paperid: 372, https://arxiv.org/pdf/2507.03450.pdf

Abstract:
Despite significant progress in designing powerful adversarial evasion attacks for robustness verification, the evaluation of these methods often remains inconsistent and unreliable. Many assessments rely on mismatched models, unverified implementations, and uneven computational budgets, which can lead to biased results and a false sense of security. Consequently, robustness claims built on such flawed testing protocols may be misleading and give a false sense of security. As a concrete step toward improving evaluation reliability, we present AttackBench, a benchmark framework developed to assess the effectiveness of gradient-based attacks under standardized and reproducible conditions. AttackBench serves as an evaluation tool that ranks existing attack implementations based on a novel optimality metric, which enables researchers and practitioners to identify the most reliable and effective attack for use in subsequent robustness evaluations. The framework enforces consistent testing conditions and enables continuous updates, making it a reliable foundation for robustness verification.

Paperid: 373, https://arxiv.org/pdf/2509.09898.pdf

Authors:Sophia Lockton, Jeremy Kepner, Michael Stonebraker, Hayden Jananthan, LaToya Anderson, William Arcand, David Bestor, William Bergeron, Alex Bonn, Daniel Burrill, Chansup Byun, Timothy Davis, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Piotr Luszczek, Peter Michaleas, Lauren Milechin, Chasen Milner, Guillermo Morales, Julie Mullen, Michel Pelletier, Alex Poliakov, Andrew Prout, Albert Reuther, Antonio Rosa, Charles Yee, Alex Pentland

Abstract:
DBOS (DataBase Operating System) is a novel capability that integrates web services, operating system functions, and database features to significantly reduce web-deployment effort while increasing resilience. Integration of high performance network sensing enables DBOS web services to collaboratively create a shared awareness of their network environments to enhance their collective resilience and security. Network sensing is added to DBOS using GraphBLAS hypersparse traffic matrices via two approaches: (1) Python-GraphBLAS and (2) OneSparse PostgreSQL. These capabilities are demonstrated using the workflow and analytics from the IEEE/MIT/Amazon Anonymized Network Sensing Graph Challenge. The system was parallelized using pPython and benchmarked using 64 compute nodes on the MIT SuperCloud. The web request rate sustained by a single DBOS instance was ${>}10^5$, well above the required maximum, indicating that network sensing can be added to DBOS with negligible overhead. For collaborative awareness, many DBOS instances were connected to a single DBOS aggregator. The Python-GraphBLAS and OneSparse PostgreSQL implementations scaled linearly up to 64 and 32 nodes respectively. These results suggest that DBOS collaborative network awareness can be achieved with a negligible increase in computing resources.

Paperid: 374, https://arxiv.org/pdf/2509.08200.pdf

Abstract:
AI development requires high fidelity testing environments to effectively transition from the laboratory to operations. The flexibility offered by cyber arenas presents a novel opportunity to test new artificial intelligence (AI) capabilities with users. Cyber arenas are designed to expose end-users to real-world situations and must rapidly incorporate evolving capabilities to meet their core objectives. To explore this concept the MIT/IEEE/Amazon Graph Challenge Anonymized Network Sensor was deployed in a cyber arena during a National Guard exercise.

Paperid: 375, https://arxiv.org/pdf/2508.19525.pdf

Abstract:
This paper presents an efficient framework for private Transformer inference that combines Homomorphic Encryption (HE) and Secure Multi-party Computation (MPC) to protect data privacy. Existing methods often leverage HE for linear layers (e.g., matrix multiplications) and MPC for non-linear layers (e.g., Softmax activation functions), but the conversion between HE and MPC introduces significant communication costs. The proposed framework, dubbed BLB, overcomes this by breaking down layers into fine-grained operators and further fusing adjacent linear operators, reducing the need for HE/MPC conversions. To manage the increased ciphertext bit width from the fused linear operators, BLB proposes the first secure conversion protocol between CKKS and MPC and enables CKKS-based computation of the fused operators. Additionally, BLB proposes an efficient matrix multiplication protocol for fused computation in Transformers. Extensive evaluations on BERT-base, BERT-large, and GPT2-base show that BLB achieves a $21\times$ reduction in communication overhead compared to BOLT (S\&P'24) and a $2\times$ reduction compared to Bumblebee (NDSS'25), along with latency reductions of $13\times$ and $1.8\times$, respectively, when leveraging GPU acceleration.

Paperid: 376, https://arxiv.org/pdf/2508.17121.pdf

Abstract:
Audio watermarking has been widely applied in copyright protection and source tracing. However, due to the inherent characteristics of audio signals, watermark localization and resistance to desynchronization attacks remain significant challenges. In this paper, we propose a learning-based scheme named SyncGuard to address these challenges. Specifically, we design a frame-wise broadcast embedding strategy to embed the watermark in arbitrary-length audio, enhancing time-independence and eliminating the need for localization during watermark extraction. To further enhance robustness, we introduce a meticulously designed distortion layer. Additionally, we employ dilated residual blocks in conjunction with dilated gated blocks to effectively capture multi-resolution time-frequency features. Extensive experimental results show that SyncGuard efficiently handles variable-length audio segments, outperforms state-of-the-art methods in robustness against various attacks, and delivers superior auditory quality.

Paperid: 377, https://arxiv.org/pdf/2508.16847.pdf

Abstract:
The advent of high-performance graph libraries, such as the GraphBLAS, has enabled the analysis of massive network data sets and revealed new models for their behavior. Physical analogies for complicated network behavior can be a useful aid to understanding these newly discovered network phenomena. Prior work leveraged the canonical Gull's Lighthouse problem and developed a computational heuristic for modeling large scale network traffic using this model. A general solution using this approach requires overcoming the essential mathematical singularities in the resulting differential equations. Further investigation reveals a simpler physical interpretation that alleviates the need for solving challenging differential equations. Specifically, that the probability of observing a source at a temporal ``distance'' $r(t)$ at time $t$ is $p(t) \propto 1/r(t)^2$. This analogy aligns with many physical phenomena and can be a rich source of intuition. Applying this physical analogy to the observed source correlations in the Anonymized Network Sensing Graph Challenge data leads to an elegant cyber orbit analogy that may assist with the understanding network behavior.

Paperid: 378, https://arxiv.org/pdf/2511.08367.pdf

Abstract:
Large Vision-Language Models (VLMs) are susceptible to jailbreak attacks: researchers have developed a variety of attack strategies that can successfully bypass the safety mechanisms of VLMs. Among these approaches, jailbreak methods based on the Out-of-Distribution (OOD) strategy have garnered widespread attention due to their simplicity and effectiveness. This paper further advances the in-depth understanding of OOD-based VLM jailbreak methods. Experimental results demonstrate that jailbreak samples generated via mild OOD strategies exhibit superior performance in circumventing the safety constraints of VLMs--a phenomenon we define as ''weak-OOD''. To unravel the underlying causes of this phenomenon, this study takes SI-Attack, a typical OOD-based jailbreak method, as the research object. We attribute this phenomenon to a trade-off between two dominant factors: input intent perception and model refusal triggering. The inconsistency in how these two factors respond to OOD manipulations gives rise to this phenomenon. Furthermore, we provide a theoretical argument for the inevitability of such inconsistency from the perspective of discrepancies between model pre-training and alignment processes. Building on the above insights, we draw inspiration from optical character recognition (OCR) capability enhancement--a core task in the pre-training phase of mainstream VLMs. Leveraging this capability, we design a simple yet highly effective VLM jailbreak method, whose performance outperforms that of SOTA baselines.

Paperid: 379, https://arxiv.org/pdf/2511.07315.pdf

Abstract:
The widespread application of large VLMs makes ensuring their secure deployment critical. While recent studies have demonstrated jailbreak attacks on VLMs, existing approaches are limited: they require either white-box access, restricting practicality, or rely on manually crafted patterns, leading to poor sample diversity and scalability. To address these gaps, we propose JPRO, a novel multi-agent collaborative framework designed for automated VLM jailbreaking. It effectively overcomes the shortcomings of prior methods in attack diversity and scalability. Through the coordinated action of four specialized agents and its two core modules: Tactic-Driven Seed Generation and Adaptive Optimization Loop, JPRO generates effective and diverse attack samples. Experimental results show that JPRO achieves over a 60\% attack success rate on multiple advanced VLMs, including GPT-4o, significantly outperforming existing methods. As a black-box attack approach, JPRO not only uncovers critical security vulnerabilities in multimodal models but also offers valuable insights for evaluating and enhancing VLM robustness.

Paperid: 380, https://arxiv.org/pdf/2511.07051.pdf

Abstract:
The generalization capability of deepfake detectors is critical for real-world use. Data augmentation via synthetic fake face generation effectively enhances generalization, yet current SoTA methods rely on fixed strategies-raising a key question: Is a single static augmentation sufficient, or does the diversity of forgery features demand dynamic approaches? We argue existing methods overlook the evolving complexity of real-world forgeries (e.g., facial warping, expression manipulation), which fixed policies cannot fully simulate. To address this, we propose CRDA (Curriculum Reinforcement-Learning Data Augmentation), a novel framework guiding detectors to progressively master multi-domain forgery features from simple to complex. CRDA synthesizes augmented samples via a configurable pool of forgery operations and dynamically generates adversarial samples tailored to the detector's current learning state. Central to our approach is integrating reinforcement learning (RL) and causal inference. An RL agent dynamically selects augmentation actions based on detector performance to efficiently explore the vast augmentation space, adapting to increasingly challenging forgeries. Simultaneously, the agent introduces action space variations to generate heterogeneous forgery patterns, guided by causal inference to mitigate spurious correlations-suppressing task-irrelevant biases and focusing on causally invariant features. This integration ensures robust generalization by decoupling synthetic augmentation patterns from the model's learned representations. Extensive experiments show our method significantly improves detector generalizability, outperforming SOTA methods across multiple cross-domain datasets.

Paperid: 381, https://arxiv.org/pdf/2510.15975.pdf

Abstract:
The rapid adoption of generative artificial intelligence (GenAI) in the biosciences is transforming biotechnology, medicine, and synthetic biology. Yet this advancement is intrinsically linked to new vulnerabilities, as GenAI lowers the barrier to misuse and introduces novel biosecurity threats, such as generating synthetic viral proteins or toxins. These dual-use risks are often overlooked, as existing safety guardrails remain fragile and can be circumvented through deceptive prompts or jailbreak techniques. In this Perspective, we first outline the current state of GenAI in the biosciences and emerging threat vectors ranging from jailbreak attacks and privacy risks to the dual-use challenges posed by autonomous AI agents. We then examine urgent gaps in regulation and oversight, drawing on insights from 130 expert interviews across academia, government, industry, and policy. A large majority ($\approx 76$\%) expressed concern over AI misuse in biology, and 74\% called for the development of new governance frameworks. Finally, we explore technical pathways to mitigation, advocating a multi-layered approach to GenAI safety. These defenses include rigorous data filtering, alignment with ethical principles during development, and real-time monitoring to block harmful requests. Together, these strategies provide a blueprint for embedding security throughout the GenAI lifecycle. As GenAI becomes integrated into the biosciences, safeguarding this frontier requires an immediate commitment to both adaptive governance and secure-by-design technologies.

Paperid: 382, https://arxiv.org/pdf/2509.13684.pdf

Abstract:
Private Information Retrieval (PIR) is a fundamental cryptographic primitive that enables users to retrieve data from a database without revealing which item is being accessed, thereby preserving query privacy. However, PIR protocols also face the challenge of result verifiability, as users expect the reconstructed data to be trustworthy and authentic. In this work, we propose two effective constructions of publicly verifiable PIR (PVPIR) in the multi-server setting, which achieve query privacy, correctness, and verifiability simultaneously. We further present three concrete instantiations based on these constructions. For the point query, our protocol introduces minimal computational overhead and achieves strong verifiability guarantees with significantly lower communication costs compared to existing Merkle tree-based approaches. For the predicate query, the communication complexity of our scheme remains stable as the database size increases, demonstrating strong scalability and suitability for large-scale private query applications.

Paperid: 383, https://arxiv.org/pdf/2508.17856.pdf

Abstract:
The rapid evolution of Android malware poses significant challenges to the maintenance and security of mobile applications (apps). Traditional detection techniques often struggle to keep pace with emerging malware variants that employ advanced tactics such as code obfuscation and dynamic behavior triggering. One major limitation of these approaches is their inability to localize malicious payloads at a fine-grained level, hindering precise understanding of malicious behavior. This gap in understanding makes the design of effective and targeted mitigation strategies difficult, leaving mobile apps vulnerable to continuously evolving threats. To address this gap, we propose MalLoc, a novel approach that leverages the code understanding capabilities of large language models (LLMs) to localize malicious payloads at a fine-grained level within Android malware. Our experimental results demonstrate the feasibility and effectiveness of using LLMs for this task, highlighting the potential of MalLoc to enhance precision and interpretability in malware analysis. This work advances beyond traditional detection and classification by enabling deeper insights into behavior-level malicious logic and opens new directions for research, including dynamic modeling of localized threats and targeted countermeasure development.

Paperid: 384, https://arxiv.org/pdf/2508.06783.pdf

Abstract:
Alignment is a key step in developing Large Language Models (LLMs) using human feedback to ensure adherence to human values and societal norms. Dependence on human feedback raises privacy concerns about how much a labeler's preferences may reveal about their personal values, beliefs, and personality traits. Existing approaches, such as Differentially Private SGD (DP-SGD), provide rigorous privacy guarantees by privatizing gradients during fine-tuning and alignment but can provide more privacy than necessary as human preferences are tied only to labels of (prompt, response) pairs and can degrade model utility. This work focuses on LLM alignment with preference-level privacy, which preserves the privacy of preference labels provided by humans. We propose PROPS (PROgressively Private Self-alignment), a multi-stage privacy preserving alignment framework where privately aligned models in previous stages can serve as labelers for supplementing training data in the subsequent stages of alignment. We present theoretical guarantees for PROPS as well as comprehensive validation using multiple models (Pythia and GPT) and datasets (AlpacaEval, Anthropic HH-RLHF, truthy-dpo-v0.1) to demonstrate the utility of PROPS over existing methods while still providing high privacy. For the same privacy budget, alignment via PROPS can achieve up to 3x higher win-rates compared to DP-SGD, and 2.5x higher win-rates compared to Randomized Response (RR) based alignment.

Paperid: 385, https://arxiv.org/pdf/2507.09860.pdf

Abstract:
This paper aims to propose a novel machine learning (ML) approach incorporating Homomorphic Encryption (HE) to address privacy limitations in Unmanned Aerial Vehicles (UAV)-based face detection. Due to challenges related to distance, altitude, and face orientation, high-resolution imagery and sophisticated neural networks enable accurate face recognition in dynamic environments. However, privacy concerns arise from the extensive surveillance capabilities of UAVs. To resolve this issue, we propose a novel framework that integrates HE with advanced neural networks to secure facial data throughout the inference phase. This method ensures that facial data remains secure with minimal impact on detection accuracy. Specifically, the proposed system leverages the Cheon-Kim-Kim-Song (CKKS) scheme to perform computations directly on encrypted data, optimizing computational efficiency and security. Furthermore, we develop an effective data encoding method specifically designed to preprocess the raw facial data into CKKS form in a Single-Instruction-Multiple-Data (SIMD) manner. Building on this, we design a secure inference algorithm to compute on ciphertext without needing decryption. This approach not only protects data privacy during the processing of facial data but also enhances the efficiency of UAV-based face detection systems. Experimental results demonstrate that our method effectively balances privacy protection and detection performance, making it a viable solution for UAV-based secure face detection. Significantly, our approach (while maintaining data confidentially with HE encryption) can still achieve an accuracy of less than 1% compared to the benchmark without using encryption.

Paperid: 386, https://arxiv.org/pdf/2507.07316.pdf

Abstract:
Federated Learning (FL) faces inherent challenges in balancing model performance, privacy preservation, and communication efficiency, especially in non-IID decentralized environments. Recent approaches either sacrifice formal privacy guarantees, incur high overheads, or overlook quantum-enhanced expressivity. We introduce AdeptHEQ-FL, a unified hybrid classical-quantum FL framework that integrates (i) a hybrid CNN-PQC architecture for expressive decentralized learning, (ii) an adaptive accuracy-weighted aggregation scheme leveraging differentially private validation accuracies, (iii) selective homomorphic encryption (HE) for secure aggregation of sensitive model layers, and (iv) dynamic layer-wise adaptive freezing to minimize communication overhead while preserving quantum adaptability. We establish formal privacy guarantees, provide convergence analysis, and conduct extensive experiments on the CIFAR-10, SVHN, and Fashion-MNIST datasets. AdeptHEQ-FL achieves a $\approx 25.43\%$ and $\approx 14.17\%$ accuracy improvement over Standard-FedQNN and FHE-FedQNN, respectively, on the CIFAR-10 dataset. Additionally, it reduces communication overhead by freezing less important layers, demonstrating the efficiency and practicality of our privacy-preserving, resource-aware design for FL.

Paperid: 387, https://arxiv.org/pdf/2512.02973.pdf

Abstract:
While Multimodal Large Language Models (MLLMs) show remarkable capabilities, their safety alignments are susceptible to jailbreak attacks. Existing attack methods typically focus on text-image interplay, treating the visual modality as a secondary prompt. This approach underutilizes the unique potential of images to carry complex, contextual information. To address this gap, we propose a new image-centric attack method, Contextual Image Attack (CIA), which employs a multi-agent system to subtly embeds harmful queries into seemingly benign visual contexts using four distinct visualization strategies. To further enhance the attack's efficacy, the system incorporate contextual element enhancement and automatic toxicity obfuscation techniques. Experimental results on the MMSafetyBench-tiny dataset show that CIA achieves high toxicity scores of 4.73 and 4.83 against the GPT-4o and Qwen2.5-VL-72B models, respectively, with Attack Success Rates (ASR) reaching 86.31\% and 91.07\%. Our method significantly outperforms prior work, demonstrating that the visual modality itself is a potent vector for jailbreaking advanced MLLMs.

Paperid: 388, https://arxiv.org/pdf/2511.15203.pdf

Abstract:
Large Language Model (LLM)-based agents with function-calling capabilities are increasingly deployed, but remain vulnerable to Indirect Prompt Injection (IPI) attacks that hijack their tool calls. In response, numerous IPI-centric defense frameworks have emerged. However, these defenses are fragmented, lacking a unified taxonomy and comprehensive evaluation. In this Systematization of Knowledge (SoK), we present the first comprehensive analysis of IPI-centric defense frameworks. We introduce a comprehensive taxonomy of these defenses, classifying them along five dimensions. We then thoroughly assess the security and usability of representative defense frameworks. Through analysis of defensive failures in the assessment, we identify six root causes of defense circumvention. Based on these findings, we design three novel adaptive attacks that significantly improve attack success rates targeting specific frameworks, demonstrating the severity of the flaws in these defenses. Our paper provides a foundation and critical insights for the future development of more secure and usable IPI-centric agent defense frameworks.

Paperid: 389, https://arxiv.org/pdf/2511.11356.pdf

Abstract:
Large language models (LLMs) have achieved remarkable success across a wide range of natural language processing tasks, demonstrating human-level performance in text generation, reasoning, and question answering. However, training such models requires substantial computational resources, large curated datasets, and sophisticated alignment procedures. As a result, they constitute highly valuable intellectual property (IP) assets that warrant robust protection mechanisms. Existing IP protection approaches suffer from critical limitations. Model fingerprinting techniques can identify model architectures but fail to establish ownership of specific model instances. In contrast, traditional backdoor-based watermarking methods embed behavioral anomalies that can be easily removed through common post-processing operations such as fine-tuning or knowledge distillation. We propose SEAL, a subspace-anchored watermarking framework that embeds multi-bit signatures directly into the model's latent representational space, supporting both white-box and black-box verification scenarios. Our approach leverages model editing techniques to align the hidden representations of selected anchor samples with predefined orthogonal bit vectors. This alignment embeds the watermark while preserving the model's original factual predictions, rendering the watermark functionally harmless and stealthy. We conduct comprehensive experiments on multiple benchmark datasets and six prominent LLMs, comparing SEAL with 11 existing fingerprinting and watermarking methods to demonstrate its superior effectiveness, fidelity, efficiency, and robustness. Furthermore, we evaluate SEAL under potential knowledgeable attacks and show that it maintains strong verification performance even when adversaries possess knowledge of the watermarking mechanism and the embedded signatures.

Paperid: 390, https://arxiv.org/pdf/2510.23675.pdf

Abstract:
Modern coding agents integrated into IDEs combine powerful tools and system-level actions, exposing a high-stakes attack surface. Existing Indirect Prompt Injection (IPI) studies focus mainly on query-specific behaviors, leading to unstable attacks with lower success rates. We identify a more severe, query-agnostic threat that remains effective across diverse user inputs. This challenge can be overcome by exploiting a common vulnerability: leakage of the agent's internal prompt, which turns the attack into a constrained white-box optimization problem. We present QueryIPI, the first query-agnostic IPI method for coding agents. QueryIPI refines malicious tool descriptions through an iterative, prompt-based process informed by the leaked internal prompt. Experiments on five simulated agents show that QueryIPI achieves up to 87 percent success, outperforming baselines, and the generated malicious descriptions also transfer to real-world systems, highlighting a practical security risk to modern LLM-based coding agents.

Paperid: 391, https://arxiv.org/pdf/2510.21057.pdf

Abstract:
Large Language Models (LLMs) are increasingly deployed in agentic systems that interact with an external environment; this makes them susceptible to prompt injections when dealing with untrusted data. To overcome this limitation, we propose SIC (Soft Instruction Control)-a simple yet effective iterative prompt sanitization loop designed for tool-augmented LLM agents. Our method repeatedly inspects incoming data for instructions that could compromise agent behavior. If such content is found, the malicious content is rewritten, masked, or removed, and the result is re-evaluated. The process continues until the input is clean or a maximum iteration limit is reached; if imperative instruction-like content remains, the agent halts to ensure security. By allowing multiple passes, our approach acknowledges that individual rewrites may fail but enables the system to catch and correct missed injections in later steps. Although immediately useful, worst-case analysis shows that SIC is not infallible; strong adversary can still get a 15% ASR by embedding non-imperative workflows. This nonetheless raises the bar.

Paperid: 392, https://arxiv.org/pdf/2510.09263.pdf

Authors:Sven Gowal, Rudy Bunel, Florian Stimberg, David Stutz, Guillermo Ortiz-Jimenez, Christina Kouridi, Mel Vecerik, Jamie Hayes, Sylvestre-Alvise Rebuffi, Paul Bernard, Chris Gamble, Miklós Z. Horváth, Fabian Kaczmarczyck, Alex Kaskasoli, Aleksandar Petrov, Ilia Shumailov, Meghana Thotakuri, Olivia Wiles, Jessica Yung, Zahra Ahmed, Victor Martin, Simon Rosen, Christopher Savčak, Armin Senoner, Nidhi Vyas, Pushmeet Kohli

Abstract:
We introduce SynthID-Image, a deep learning-based system for invisibly watermarking AI-generated imagery. This paper documents the technical desiderata, threat models, and practical challenges of deploying such a system at internet scale, addressing key requirements of effectiveness, fidelity, robustness, and security. SynthID-Image has been used to watermark over ten billion images and video frames across Google's services and its corresponding verification service is available to trusted testers. For completeness, we present an experimental evaluation of an external model variant, SynthID-O, which is available through partnerships. We benchmark SynthID-O against other post-hoc watermarking methods from the literature, demonstrating state-of-the-art performance in both visual quality and robustness to common image perturbations. While this work centers on visual media, the conclusions on deployment, constraints, and threat modeling generalize to other modalities, including audio. This paper provides a comprehensive documentation for the large-scale deployment of deep learning-based media provenance systems.

Paperid: 393, https://arxiv.org/pdf/2510.09023.pdf

Abstract:
How should we evaluate the robustness of language model defenses? Current defenses against jailbreaks and prompt injections (which aim to prevent an attacker from eliciting harmful knowledge or remotely triggering malicious actions, respectively) are typically evaluated either against a static set of harmful attack strings, or against computationally weak optimization methods that were not designed with the defense in mind. We argue that this evaluation process is flawed. Instead, we should evaluate defenses against adaptive attackers who explicitly modify their attack strategy to counter a defense's design while spending considerable resources to optimize their objective. By systematically tuning and scaling general optimization techniques-gradient descent, reinforcement learning, random search, and human-guided exploration-we bypass 12 recent defenses (based on a diverse set of techniques) with attack success rate above 90% for most; importantly, the majority of defenses originally reported near-zero attack success rates. We believe that future defense work must consider stronger attacks, such as the ones we describe, in order to make reliable and convincing claims of robustness.

Paperid: 394, https://arxiv.org/pdf/2509.05755.pdf

Abstract:
LLM-based agentic systems leverage large language models to handle user queries, make decisions, and execute external tools for complex tasks across domains like chatbots, customer service, and software engineering. A critical component of these systems is the Tool Invocation Prompt (TIP), which defines tool interaction protocols and guides LLMs to ensure the security and correctness of tool usage. Despite its importance, TIP security has been largely overlooked. This work investigates TIP-related security risks, revealing that major LLM-based systems like Cursor, Claude Code, and others are vulnerable to attacks such as remote code execution (RCE) and denial of service (DoS). Through a systematic TIP exploitation workflow (TEW), we demonstrate external tool behavior hijacking via manipulated tool invocations. We also propose defense mechanisms to enhance TIP security in LLM-based agentic systems.

Paperid: 395, https://arxiv.org/pdf/2509.05739.pdf

Abstract:
Early research into data poisoning attacks against Large Language Models (LLMs) demonstrated the ease with which backdoors could be injected. More recent LLMs add step-by-step reasoning, expanding the attack surface to include the intermediate chain-of-thought (CoT) and its inherent trait of decomposing problems into subproblems. Using these vectors for more stealthy poisoning, we introduce ``decomposed reasoning poison'', in which the attacker modifies only the reasoning path, leaving prompts and final answers clean, and splits the trigger across multiple, individually harmless components. Fascinatingly, while it remains possible to inject these decomposed poisons, reliably activating them to change final answers (rather than just the CoT) is surprisingly difficult. This difficulty arises because the models can often recover from backdoors that are activated within their thought processes. Ultimately, it appears that an emergent form of backdoor robustness is originating from the reasoning capabilities of these advanced LLMs, as well as from the architectural separation between reasoning and final answer generation.

Paperid: 396, https://arxiv.org/pdf/2509.04403.pdf

Abstract:
Multimodal large language models (MLLMs) are rapidly evolving, presenting increasingly complex safety challenges. However, current dataset construction methods, which are risk-oriented, fail to cover the growing complexity of real-world multimodal safety scenarios (RMS). And due to the lack of a unified evaluation metric, their overall effectiveness remains unproven. This paper introduces a novel image-oriented self-adaptive dataset construction method for RMS, which starts with images and end constructing paired text and guidance responses. Using the image-oriented method, we automatically generate an RMS dataset comprising 35k image-text pairs with guidance responses. Additionally, we introduce a standardized safety dataset evaluation metric: fine-tuning a safety judge model and evaluating its capabilities on other safety datasets.Extensive experiments on various tasks demonstrate the effectiveness of the proposed image-oriented pipeline. The results confirm the scalability and effectiveness of the image-oriented approach, offering a new perspective for the construction of real-world multimodal safety datasets.

Paperid: 397, https://arxiv.org/pdf/2508.20083.pdf

Abstract:
Retrieval-Augmented Generation (RAG) has become a standard approach for improving the reliability of large language models (LLMs). Prior work demonstrates the vulnerability of RAG systems by misleading them into generating attacker-chosen outputs through poisoning the knowledge base. However, this paper uncovers that such attacks could be mitigated by the strong \textit{self-correction ability (SCA)} of modern LLMs, which can reject false context once properly configured. This SCA poses a significant challenge for attackers aiming to manipulate RAG systems. In contrast to previous poisoning methods, which primarily target the knowledge base, we introduce \textsc{DisarmRAG}, a new poisoning paradigm that compromises the retriever itself to suppress the SCA and enforce attacker-chosen outputs. This compromisation enables the attacker to straightforwardly embed anti-SCA instructions into the context provided to the generator, thereby bypassing the SCA. To this end, we present a contrastive-learning-based model editing technique that performs localized and stealthy edits, ensuring the retriever returns a malicious instruction only for specific victim queries while preserving benign retrieval behavior. To further strengthen the attack, we design an iterative co-optimization framework that automatically discovers robust instructions capable of bypassing prompt-based defenses. We extensively evaluate DisarmRAG across six LLMs and three QA benchmarks. Our results show near-perfect retrieval of malicious instructions, which successfully suppress SCA and achieve attack success rates exceeding 90\% under diverse defensive prompts. Also, the edited retriever remains stealthy under several detection methods, highlighting the urgent need for retriever-centric defenses.

Paperid: 398, https://arxiv.org/pdf/2508.01655.pdf

Abstract:
The explosive growth of mini-game platforms has led to widespread code plagiarism, where malicious users access popular games' source code and republish them with modifications. While existing static analysis tools can detect simple obfuscation techniques like variable renaming and dead code injection, they fail against sophisticated deep obfuscation methods such as encrypted code with local or cloud-based decryption keys that completely destroy code structure and render traditional Abstract Syntax Tree analysis ineffective. To address these challenges, we present JSidentify-V2, a novel dynamic analysis framework that detects mini-game plagiarism by capturing memory invariants during program execution. Our key insight is that while obfuscation can severely distort static code characteristics, runtime memory behavior patterns remain relatively stable. JSidentify-V2 employs a four-stage pipeline: (1) static pre-analysis and instrumentation to identify potential memory invariants, (2) adaptive hot object slicing to maximize execution coverage of critical code segments, (3) Memory Dependency Graph construction to represent behavioral fingerprints resilient to obfuscation, and (4) graph-based similarity analysis for plagiarism detection. We evaluate JSidentify-V2 against eight obfuscation methods on a comprehensive dataset of 1,200 mini-games ...

Paperid: 399, https://arxiv.org/pdf/2510.23891.pdf

Abstract:
Text watermarking for large language models (LLMs) enables model owners to verify text origin and protect intellectual property. While watermarking methods for closed-source LLMs are relatively mature, extending them to open-source models remains challenging, as developers cannot control the decoding process. Consequently, owners of open-source LLMs lack practical means to verify whether text was generated by their models. A core difficulty lies in embedding watermarks directly into model weights without hurting detectability. A promising idea is to distill watermarks from a closed-source model into an open one, but this suffers from (i) poor detectability due to mismatch between learned and predefined patterns, and (ii) fragility to downstream modifications such as fine-tuning or model merging. To overcome these limitations, we propose PRO, a Precise and Robust text watermarking method for open-source LLMs. PRO jointly trains a watermark policy model with the LLM, producing patterns that are easier for the model to learn and more consistent with detection criteria. A regularization term further simulates downstream perturbations and penalizes degradation in watermark detectability, ensuring robustness under model edits. Experiments on open-source LLMs (e.g., LLaMA-3.2, LLaMA-3, Phi-2) show that PRO substantially improves both watermark detectability and resilience to model modifications.

Paperid: 400, https://arxiv.org/pdf/2510.02554.pdf

Abstract:
As LLMs increasingly power agents that interact with external tools, tool use has become an essential mechanism for extending their capabilities. These agents typically select tools from growing databases or marketplaces to solve user tasks, creating implicit competition among tool providers and developers for visibility and usage. In this paper, we show that this selection process harbors a critical vulnerability: by iteratively manipulating tool names and descriptions, adversaries can systematically bias agents toward selecting specific tools, gaining unfair advantage over equally capable alternatives. We present ToolTweak, a lightweight automatic attack that increases selection rates from a baseline of around 20% to as high as 81%, with strong transferability between open-source and closed-source models. Beyond individual tools, we show that such attacks cause distributional shifts in tool usage, revealing risks to fairness, competition, and security in emerging tool ecosystems. To mitigate these risks, we evaluate two defenses: paraphrasing and perplexity filtering, which reduce bias and lead agents to select functionally similar tools more equally. All code will be open-sourced upon acceptance.

Paperid: 401, https://arxiv.org/pdf/2509.21325.pdf

Abstract:
Retrieval-Augmented Generation (RAG) has become a foundational component of modern AI systems, yet it introduces significant privacy risks by exposing user queries to service providers. To address this, we introduce PIR-RAG, a practical system for privacy-preserving RAG. PIR-RAG employs a novel architecture that uses coarse-grained semantic clustering to prune the search space, combined with a fast, lattice-based Private Information Retrieval (PIR) protocol. This design allows for the efficient retrieval of entire document clusters, uniquely optimizing for the end-to-end RAG workflow where full document content is required. Our comprehensive evaluation against strong baseline architectures, including graph-based PIR and Tiptoe-style private scoring, demonstrates PIR-RAG's scalability and its superior performance in terms of "RAG-Ready Latency"-the true end-to-end time required to securely fetch content for an LLM. Our work establishes PIR-RAG as a viable and highly efficient solution for privacy in large-scale AI systems.

Paperid: 402, https://arxiv.org/pdf/2509.07755.pdf

Abstract:
As large language models (LLMs) are adapted to sensitive domains such as medicine, their fluency raises safety risks, particularly regarding provenance and accountability. Watermarking embeds detectable patterns to mitigate these risks, yet its reliability in medical contexts remains untested. Existing benchmarks focus on detection-quality tradeoffs and overlook factual risks. In medical text, watermarking often reweights low-entropy tokens, which are highly predictable and often carry critical medical terminology. Shifting these tokens can cause inaccuracy and hallucinations, risks that prior general-domain benchmarks fail to capture. We propose a medical-focused evaluation workflow that jointly assesses factual accuracy and coherence. Using GPT-Judger and further human validation, we introduce the Factuality-Weighted Score (FWS), a composite metric prioritizing factual accuracy beyond coherence to guide watermarking deployment in medical domains. Our evaluation shows current watermarking methods substantially compromise medical factuality, with entropy shifts degrading medical entity representation. These findings underscore the need for domain-aware watermarking approaches that preserve the integrity of medical content.

Paperid: 403, https://arxiv.org/pdf/2507.03278.pdf

Abstract:
Recent advances in Transformer models, e.g., large language models (LLMs), have brought tremendous breakthroughs in various artificial intelligence (AI) tasks, leading to their wide applications in many security-critical domains. Due to their unprecedented scale and prohibitively high development cost, these models have become highly valuable intellectual property for AI stakeholders and are increasingly deployed via machine learning as a service (MLaaS). However, MLaaS often runs on untrusted cloud infrastructure, exposing data and models to potential breaches. Mainstream protection mechanisms leverage trusted execution environments (TEEs) where confidentiality and integrity for secretive data are shielded using hardware-based encryption and integrity checking. Unfortunately, running model inference entirely within TEEs is subject to non-trivial slowdown, which is further exacerbated in LLMs due to the substantial computation and memory footprint involved. Recent studies reveal that the hybrid TEE-based scheme offloading partial model inference operations to the untrusted accelerators (e.g., GPU) is a promising solution. However, prior offloading schemes fail to ensure dual protection of data and model in Transformer inference, as they cannot securely offload critical operations, i.e., Attention and SoftMax, forcing these computations to remain confined within TEEs. To address these challenges, we propose TwinShield, a framework enabling secure Transformer inference in heterogeneous TEE and accelerator systems with dual protection for both model and data. TwinShield offloads ~87% of computation to GPUs and delivers 4.0x - 6.1x speedups over previous approaches across various Transformer models.

Paperid: 404, https://arxiv.org/pdf/2512.14233.pdf

Abstract:
Penetration testing is essential for assessing and strengthening system security against real-world threats, yet traditional workflows remain highly manual, expertise-intensive, and difficult to scale. Although recent advances in Large Language Models (LLMs) offer promising opportunities for automation, existing applications rely on simplistic prompting without task decomposition or domain adaptation, resulting in unreliable black-box behavior and limited insight into model capabilities across penetration testing stages. To address this gap, we introduce PentestEval, the first comprehensive benchmark for evaluating LLMs across six decomposed penetration testing stages: Information Collection, Weakness Gathering and Filtering, Attack Decision-Making, Exploit Generation and Revision. PentestEval integrates expert-annotated ground truth with a fully automated evaluation pipeline across 346 tasks covering all stages in 12 realistic vulnerable scenarios. Our stage-level evaluation of 9 widely used LLMs reveals generally weak performance and distinct limitations across the stages of penetration-testing workflow. End-to-end pipelines reach only 31% success rate, and existing LLM-powered systems such as PentestGPT, PentestAgent, and VulnBot exhibit similar limitations, with autonomous agents failing almost entirely. These findings highlight that autonomous penetration testing demands stronger structured reasoning, where modularization enhances each individual stage and improves overall performance. PentestEval provides the foundational benchmark needed for future research on fine-grained, stage-level evaluation, paving the way toward more reliable LLM-based automation.

Paperid: 405, https://arxiv.org/pdf/2510.18333.pdf

Abstract:
Despite progress in watermarking algorithms for large language models (LLMs), real-world deployment remains limited. We argue that this gap stems from misaligned incentives among LLM providers, platforms, and end users, which manifest as four key barriers: competitive risk, detection-tool governance, robustness concerns and attribution issues. We revisit three classes of watermarking through this lens. \emph{Model watermarking} naturally aligns with LLM provider interests, yet faces new challenges in open-source ecosystems. \emph{LLM text watermarking} offers modest provider benefit when framed solely as an anti-misuse tool, but can gain traction in narrowly scoped settings such as dataset de-contamination or user-controlled provenance. \emph{In-context watermarking} (ICW) is tailored for trusted parties, such as conference organizers or educators, who embed hidden watermarking instructions into documents. If a dishonest reviewer or student submits this text to an LLM, the output carries a detectable watermark indicating misuse. This setup aligns incentives: users experience no quality loss, trusted parties gain a detection tool, and LLM providers remain neutral by simply following watermark instructions. We advocate for a broader exploration of incentive-aligned methods, with ICW as an example, in domains where trusted parties need reliable tools to detect misuse. More broadly, we distill design principles for incentive-aligned, domain-specific watermarking and outline future research directions. Our position is that the practical adoption of LLM watermarking requires aligning stakeholder incentives in targeted application domains and fostering active community engagement.

Paperid: 406, https://arxiv.org/pdf/2510.15063.pdf

Abstract:
Physical layer deception (PLD) is a framework we previously introduced that integrates physical layer security (PLS) with deception techniques, enabling proactive countermeasures against eavesdropping rather than relying solely on passive defense. We extend this framework to a semantic communication model and conduct a theoretical analysis using semantic distortion as the performance metric. In this work, we further investigate the receiver's selection of decryption strategies and the transmitter's optimization of encryption strategies. By anticipating the decryption strategy likely to be employed by the legitimate receiver and eavesdropper, the transmitter can optimize resource allocation and encryption parameters, thereby maximizing the semantic distortion at the eavesdropper while maintaining a low level of semantic distortion for the legitimate receiver. We present a rigorous analysis of the resulting optimization problem, propose an efficient optimization algorithm, and derive closed-form optimal solutions for multiple scenarios. Finally, we corroborate the theoretical findings with numerical simulations, which also confirm the practicality of the proposed algorithm.

Paperid: 407, https://arxiv.org/pdf/2510.06605.pdf

Abstract:
The substantial investment required to develop Large Language Models (LLMs) makes them valuable intellectual property, raising significant concerns about copyright protection. LLM fingerprinting has emerged as a key technique to address this, which aims to verify a model's origin by extracting an intrinsic, unique signature (a "fingerprint") and comparing it to that of a source model to identify illicit copies. However, existing black-box fingerprinting methods often fail to generate distinctive LLM fingerprints. This ineffectiveness arises because black-box methods typically rely on model outputs, which lose critical information about the model's unique parameters due to the usage of non-linear functions. To address this, we first leverage Fisher Information Theory to formally demonstrate that the gradient of the model's input is a more informative feature for fingerprinting than the output. Based on this insight, we propose ZeroPrint, a novel method that approximates these information-rich gradients in a black-box setting using zeroth-order estimation. ZeroPrint overcomes the challenge of applying this to discrete text by simulating input perturbations via semantic-preserving word substitutions. This operation allows ZeroPrint to estimate the model's Jacobian matrix as a unique fingerprint. Experiments on the standard benchmark show ZeroPrint achieves a state-of-the-art effectiveness and robustness, significantly outperforming existing black-box methods.

Paperid: 408, https://arxiv.org/pdf/2510.04503.pdf

Abstract:
During fine-tuning, large language models (LLMs) are increasingly vulnerable to data-poisoning backdoor attacks, which compromise their reliability and trustworthiness. However, existing defense strategies suffer from limited generalization: they only work on specific attack types or task settings. In this study, we propose Poison-to-Poison (P2P), a general and effective backdoor defense algorithm. P2P injects benign triggers with safe alternative labels into a subset of training samples and fine-tunes the model on this re-poisoned dataset by leveraging prompt-based learning. This enforces the model to associate trigger-induced representations with safe outputs, thereby overriding the effects of original malicious triggers. Thanks to this robust and generalizable trigger-based fine-tuning, P2P is effective across task settings and attack types. Theoretically and empirically, we show that P2P can neutralize malicious backdoors while preserving task performance. We conduct extensive experiments on classification, mathematical reasoning, and summary generation tasks, involving multiple state-of-the-art LLMs. The results demonstrate that our P2P algorithm significantly reduces the attack success rate compared with baseline models. We hope that the P2P can serve as a guideline for defending against backdoor attacks and foster the development of a secure and trustworthy LLM community.

Paperid: 409, https://arxiv.org/pdf/2510.02964.pdf

Abstract:
In recent years, RAG has emerged as a key paradigm for enhancing large language models (LLMs). By integrating externally retrieved information, RAG alleviates issues like outdated knowledge and, crucially, insufficient domain expertise. While effective, RAG introduces new risks of external data extraction attacks (EDEAs), where sensitive or copyrighted data in its knowledge base may be extracted verbatim. These risks are particularly acute when RAG is used to customize specialized LLM applications with private knowledge bases. Despite initial studies exploring these risks, they often lack a formalized framework, robust attack performance, and comprehensive evaluation, leaving critical questions about real-world EDEA feasibility unanswered. In this paper, we present the first comprehensive study to formalize EDEAs against retrieval-augmented LLMs. We first formally define EDEAs and propose a unified framework decomposing their design into three components: extraction instruction, jailbreak operator, and retrieval trigger, under which prior attacks can be considered instances within our framework. Guided by this framework, we develop SECRET: a Scalable and EffeCtive exteRnal data Extraction aTtack. Specifically, SECRET incorporates (1) an adaptive optimization process using LLMs as optimizers to generate specialized jailbreak prompts for EDEAs, and (2) cluster-focused triggering, an adaptive strategy that alternates between global exploration and local exploitation to efficiently generate effective retrieval triggers. Extensive evaluations across 4 models reveal that SECRET significantly outperforms previous attacks, and is highly effective against all 16 tested RAG instances. Notably, SECRET successfully extracts 35% of the data from RAG powered by Claude 3.7 Sonnet for the first time, whereas other attacks yield 0% extraction. Our findings call for attention to this emerging threat.

Paperid: 410, https://arxiv.org/pdf/2509.10766.pdf

Abstract:
The rapid growth of digital and AI-generated images has amplified the need for secure and verifiable methods of image attribution. While digital watermarking offers more robust protection than metadata-based approaches--which can be easily stripped--current watermarking techniques remain vulnerable to forgery, creating risks of misattribution that can damage the reputations of AI model developers and the rights of digital artists. These vulnerabilities arise from two key issues: (1) content-agnostic watermarks, which, once learned or leaked, can be transferred across images to fake attribution, and (2) reliance on detector-based verification, which is unreliable since detectors can be tricked. We present MetaSeal, a novel framework for content-dependent watermarking with cryptographic security guarantees to safeguard image attribution. Our design provides (1) forgery resistance, preventing unauthorized replication and enforcing cryptographic verification; (2) robust, self-contained protection, embedding attribution directly into images while maintaining resilience against benign transformations; and (3) evidence of tampering, making malicious alterations visually detectable. Experiments demonstrate that MetaSeal effectively mitigates forgery attempts and applies to both natural and AI-generated images, establishing a new standard for secure image attribution.

Paperid: 411, https://arxiv.org/pdf/2508.01784.pdf

Abstract:
Model merging via Mixture-of-Experts (MoE) has emerged as a scalable solution for consolidating multiple task-specific models into a unified sparse architecture, where each expert is derived from a model fine-tuned on a distinct task. While effective for multi-task integration, this paradigm introduces a critical yet underexplored challenge: how to attribute and protect the intellectual property (IP) of individual experts after merging. We propose RouteMark, a framework for IP protection in merged MoE models through the design of expert routing fingerprints. Our key insight is that task-specific experts exhibit stable and distinctive routing behaviors under probing inputs. To capture these patterns, we construct expert-level fingerprints using two complementary statistics: the Routing Score Fingerprint (RSF), quantifying the intensity of expert activation, and the Routing Preference Fingerprint (RPF), characterizing the input distribution that preferentially activates each expert. These fingerprints are reproducible, task-discriminative, and lightweight to construct. For attribution and tampering detection, we introduce a similarity-based matching algorithm that compares expert fingerprints between a suspect and a reference (victim) model. Extensive experiments across diverse tasks and CLIP-based MoE architectures show that RouteMark consistently yields high similarity for reused experts and clear separation from unrelated ones. Moreover, it remains robust against both structural tampering (expert replacement, addition, deletion) and parametric tampering (fine-tuning, pruning, permutation), outperforming weight- and activation-based baseliness. Our work lays the foundation for RouteMark as a practical and broadly applicable framework for IP verification in MoE-based model merging.

Paperid: 412, https://arxiv.org/pdf/2507.15219.pdf

Abstract:
Despite their potential, recent research has demonstrated that LLM agents are vulnerable to prompt injection attacks, where malicious prompts are injected into the agent's input, causing it to perform an attacker-specified task rather than the intended task provided by the user. In this paper, we present PromptArmor, a simple yet effective defense against prompt injection attacks. Specifically, PromptArmor prompts an off-the-shelf LLM to detect and remove potential injected prompts from the input before the agent processes it. Our results show that PromptArmor can accurately identify and remove injected prompts. For example, using GPT-4o, GPT-4.1, or o4-mini, PromptArmor achieves both a false positive rate and a false negative rate below 1% on the AgentDojo benchmark. Moreover, after removing injected prompts with PromptArmor, the attack success rate drops to below 1%. We also demonstrate PromptArmor's effectiveness against adaptive attacks and explore different strategies for prompting an LLM. We recommend that PromptArmor be adopted as a standard baseline for evaluating new defenses against prompt injection attacks.

Paperid: 413, https://arxiv.org/pdf/2507.05578.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data. This phenomenon raises critical questions about model behavior, privacy risks, and the boundary between learning and memorization. Addressing these concerns, this paper synthesizes recent studies and investigates the landscape of memorization, the factors influencing it, and methods for its detection and mitigation. We explore key drivers, including training data duplication, training dynamics, and fine-tuning procedures that influence data memorization. In addition, we examine methodologies such as prefix-based extraction, membership inference, and adversarial prompting, assessing their effectiveness in detecting and measuring memorized content. Beyond technical analysis, we also explore the broader implications of memorization, including the legal and ethical implications. Finally, we discuss mitigation strategies, including data cleaning, differential privacy, and post-training unlearning, while highlighting open challenges in balancing the minimization of harmful memorization with utility. This paper provides a comprehensive overview of the current state of research on LLM memorization across technical, privacy, and performance dimensions, identifying critical directions for future work.

Paperid: 414, https://arxiv.org/pdf/2512.17363.pdf

Abstract:
Trusted Execution Environments (TEEs), such as Intel SGX and ARM TrustZone, provide isolated regions of CPU and memory for secure computation and are increasingly used to protect sensitive data and code across diverse application domains. However, little is known about how developers actually use TEEs in practice. This paper presents the first large-scale empirical study of real-world TEE applications. We collected and analyzed 241 open-source projects from GitHub that utilize the two most widely-adopted TEEs, Intel SGX and ARM TrustZone. By combining manual inspection with customized static analysis scripts, we examined their adoption contexts, usage patterns, and development practices across three phases. First, we categorized the projects into 8 application domains and identified trends in TEE adoption over time. We found that the dominant use case is IoT device security (30%), which contrasts sharply with prior academic focus on blockchain and cryptographic systems (7%), while AI model protection (12%) is rapidly emerging as a growing domain. Second, we analyzed how TEEs are integrated into software and observed that 32.4% of the projects reimplement cryptographic functionalities instead of using official SDK APIs, suggesting that current SDKs may have limited usability and portability to meet developers' practical needs. Third, we examined security practices through manual inspection and found that 25.3% (61 of 241) of the projects exhibit insecure coding behaviors when using TEEs, such as hardcoded secrets and missing input validation, which undermine their intended security guarantees. Our findings have important implications for improving the usability of TEE SDKs and supporting developers in trusted software development.

Paperid: 415, https://arxiv.org/pdf/2511.20994.pdf

Abstract:
Multimodal large reasoning models (MLRMs) are increasingly deployed for vision-language tasks that produce explicit intermediate rationales. However, reasoning traces can contain unsafe content even when the final answer is non-harmful, creating deployment risks. Existing multimodal safety guards primarily evaluate only the input question and the final answer, neglecting the intermediate reasoning process. This oversight allows undetected harm, such as biased inferences or policy-violating use of visual context, to emerge during reasoning. We introduce GuardTrace-VL, a vision-aware safety auditor that monitors the full Question-Thinking-Answer (QTA) pipeline via joint image-text analysis, enabling detection of unsafe content as it emerges in the reasoning stage. To support training and evaluation, we construct the GuardTrace dataset, which is generated through diverse prompting strategies and refined via a MLRM- and human-based voting and verification pipeline. Furthermore, we propose a three-stage progressive training scheme combined with the data refinement process, enabling the model to learn nuanced and context-dependent safety preferences according to different risk levels. On our proposed test set covering both in-domain and out-of-domain scenarios, GuardTrace-VL model achieves an F1 score of 93.1% on unsafe reasoning detection tasks, representing a 13.5% improvement in F1 score compared to the previous strongest multimodal safety defense methods. The codes will be made publicly available.

Paperid: 416, https://arxiv.org/pdf/2510.10111.pdf

Abstract:
Advances in image tampering pose serious security threats, underscoring the need for effective image manipulation localization (IML). While supervised IML achieves strong performance, it depends on costly pixel-level annotations. Existing weakly supervised or training-free alternatives often underperform and lack interpretability. We propose the In-Context Forensic Chain (ICFC), a training-free framework that leverages multi-modal large language models (MLLMs) for interpretable IML tasks. ICFC integrates an objectified rule construction with adaptive filtering to build a reliable knowledge base and a multi-step progressive reasoning pipeline that mirrors expert forensic workflows from coarse proposals to fine-grained forensics results. This design enables systematic exploitation of MLLM reasoning for image-level classification, pixel-level localization, and text-level interpretability. Across multiple benchmarks, ICFC not only surpasses state-of-the-art training-free methods but also achieves competitive or superior performance compared to weakly and fully supervised approaches.

Paperid: 417, https://arxiv.org/pdf/2509.02113.pdf

Abstract:
The advancement of graph-based malware analysis is critically limited by the absence of large-scale datasets that capture the inherent hierarchical structure of software. Existing methods often oversimplify programs into single level graphs, failing to model the crucial semantic relationship between high-level functional interactions and low-level instruction logic. To bridge this gap, we introduce \dataset, the largest public hierarchical graph dataset for malware analysis, comprising over \textbf{200M} Control Flow Graphs (CFGs) nested within \textbf{595K} Function Call Graphs (FCGs). This two-level representation preserves structural semantics essential for building robust detectors resilient to code obfuscation and malware evolution. We demonstrate HiGraph's utility through a large-scale analysis that reveals distinct structural properties of benign and malicious software, establishing it as a foundational benchmark for the community. The dataset and tools are publicly available at https://higraph.org.

Paperid: 418, https://arxiv.org/pdf/2512.08289.pdf

Abstract:
Retrieval-Augmented Generation (RAG) systems enhance LLMs with external knowledge but introduce a critical attack surface: corpus poisoning. While recent studies have demonstrated the potential of such attacks, they typically rely on impractical assumptions, such as white-box access or known user queries, thereby underestimating the difficulty of real-world exploitation. In this paper, we bridge this gap by proposing MIRAGE, a novel multi-stage poisoning pipeline designed for strict black-box and query-agnostic environments. Operating on surrogate model feedback, MIRAGE functions as an automated optimization framework that integrates three key mechanisms: it utilizes persona-driven query synthesis to approximate latent user search distributions, employs semantic anchoring to imperceptibly embed these intents for high retrieval visibility, and leverages an adversarial variant of Test-Time Preference Optimization (TPO) to maximize persuasion. To rigorously evaluate this threat, we construct a new benchmark derived from three long-form, domain-specific datasets. Extensive experiments demonstrate that MIRAGE significantly outperforms existing baselines in both attack efficacy and stealthiness, exhibiting remarkable transferability across diverse retriever-LLM configurations and highlighting the urgent need for robust defense strategies.

Paperid: 419, https://arxiv.org/pdf/2511.21758.pdf

Abstract:
Large language model (LLM) services have been rapidly integrated into people's daily lives as chatbots and agentic systems. They are nourished by collecting rich streams of data, raising privacy concerns around excessive collection of sensitive personal information. Privacy policies are the fundamental mechanism for informing users about data practices in modern information privacy paradigm. Although traditional web and mobile policies are well studied, the privacy policies of LLM providers, their LLM-specific content, and their evolution over time remain largely underexplored. In this paper, we present the first longitudinal empirical study of privacy policies for mainstream LLM providers worldwide. We curate a chronological dataset of 74 historical privacy policies and 115 supplemental privacy documents from 11 LLM providers across 5 countries up to August 2025, and extract over 3,000 sentence-level edits between consecutive policy versions. We compare LLM privacy policies to those of other software formats, propose a taxonomy tailored to LLM privacy policies, annotate policy edits and align them with a timeline of key LLM ecosystem events. Results show they are substantially longer, demand college-level reading ability, and remain highly vague. Our taxonomy analysis reveals patterns in how providers disclose LLM-specific practices and highlights regional disparities in coverage. Policy edits are concentrated in first-party data collection and international/specific-audience sections, and that product releases and regulatory actions are the primary drivers, shedding light on the status quo and the evolution of LLM privacy policies.

Paperid: 420, https://arxiv.org/pdf/2511.08060.pdf

Abstract:
The widespread adoption of open-source software (OSS) has accelerated software innovation but also increased security risks due to the rapid propagation of vulnerabilities and silent patch releases. In recent years, large language models (LLMs) and LLM-based agents have demonstrated remarkable capabilities in various software engineering (SE) tasks, enabling them to effectively address software security challenges such as vulnerability detection. However, systematic evaluation of the capabilities of LLMs and LLM-based agents in security patch detection remains limited. To bridge this gap, we conduct a comprehensive evaluation of the performance of LLMs and LLM-based agents for security patch detection. Specifically, we investigate three methods: Plain LLM (a single LLM with a system prompt), Data-Aug LLM (data augmentation based on the Plain LLM), and the ReAct Agent (leveraging the thought-action-observation mechanism). We also evaluate the performance of both commercial and open-source LLMs under these methods and compare these results with those of existing baselines. Furthermore, we analyze the detection performance of these methods across various vulnerability types, and examine the impact of different prompting strategies and context window sizes on the results. Our findings reveal that the Data-Aug LLM achieves the best overall performance, whereas the ReAct Agent demonstrates the lowest false positive rate (FPR). Although baseline methods exhibit strong accuracy, their false positive rates are significantly higher. In contrast, our evaluated methods achieve comparable accuracy while substantially reducing the FPR. These findings provide valuable insights into the practical applications of LLMs and LLM-based agents in security patch detection, highlighting their advantage in maintaining robust performance while minimizing false positive rates.

Paperid: 421, https://arxiv.org/pdf/2510.16823.pdf

Abstract:
In recent years, the AI wave has grown rapidly in software development. Even novice developers can now design and generate complex framework-constrained software systems based on their high-level requirements with the help of Large Language Models (LLMs). However, when LLMs gradually "take the wheel" of software development, developers may only check whether the program works. They often miss security problems hidden in how the generated programs are implemented. In this work, we investigate the security properties of framework-constrained programs generated by state-of-the-art LLMs. We focus specifically on Chrome extensions due to their complex security model involving multiple privilege boundaries and isolated components. To achieve this, we built ChromeSecBench, a dataset with 140 prompts based on known vulnerable extensions. We used these prompts to instruct nine state-of-the-art LLMs to generate complete Chrome extensions, and then analyzed them for vulnerabilities across three dimensions: scenario types, model differences, and vulnerability categories. Our results show that LLMs produced vulnerable programs at alarmingly high rates (18%-50%), particularly in Authentication & Identity and Cookie Management scenarios (up to 83% and 78% respectively). Most vulnerabilities exposed sensitive browser data like cookies, history, or bookmarks to untrusted code. Interestingly, we found that advanced reasoning models performed worse, generating more vulnerabilities than simpler models. These findings highlight a critical gap between LLMs' coding skills and their ability to write secure framework-constrained programs.

Paperid: 422, https://arxiv.org/pdf/2510.11108.pdf

Abstract:
The autonomy and contextual complexity of LLM-based agents render traditional access control (AC) mechanisms insufficient. Static, rule-based systems designed for predictable environments are fundamentally ill-equipped to manage the dynamic information flows inherent in agentic interactions. This position paper argues for a paradigm shift from binary access control to a more sophisticated model of information governance, positing that the core challenge is not merely about permission, but about governing the flow of information. We introduce Agent Access Control (AAC), a novel framework that reframes AC as a dynamic, context-aware process of information flow governance. AAC operates on two core modules: (1) multi-dimensional contextual evaluation, which assesses not just identity but also relationships, scenarios, and norms; and (2) adaptive response formulation, which moves beyond simple allow/deny decisions to shape information through redaction, summarization, and paraphrasing. This vision, powered by a dedicated AC reasoning engine, aims to bridge the gap between human-like nuanced judgment and scalable Al safety, proposing a new conceptual lens for future research in trustworthy agent design.

Paperid: 423, https://arxiv.org/pdf/2509.24048.pdf

Abstract:
Verifying the authenticity of AI-generated text has become increasingly important with the rapid advancement of large language models, and unbiased watermarking has emerged as a promising approach due to its ability to preserve output distribution without degrading quality. However, recent work reveals that unbiased watermarks can accumulate distributional bias over multiple generations and that existing robustness evaluations are inconsistent across studies. To address these issues, we introduce UWbench, the first open-source benchmark dedicated to the principled evaluation of unbiased watermarking methods. Our framework combines theoretical and empirical contributions: we propose a statistical metric to quantify multi-batch distribution drift, prove an impossibility result showing that no unbiased watermark can perfectly preserve the distribution under infinite queries, and develop a formal analysis of robustness against token-level modification attacks. Complementing this theory, we establish a three-axis evaluation protocol: unbiasedness, detectability, and robustness, and show that token modification attacks provide more stable robustness assessments than paraphrasing-based methods. Together, UWbench offers the community a standardized and reproducible platform for advancing the design and evaluation of unbiased watermarking algorithms.

Paperid: 424, https://arxiv.org/pdf/2509.24043.pdf

Abstract:
As large language models become increasingly capable and widely deployed, verifying the provenance of machine-generated content is critical to ensuring trust, safety, and accountability. Watermarking techniques have emerged as a promising solution by embedding imperceptible statistical signals into the generation process. Among them, unbiased watermarking is particularly attractive due to its theoretical guarantee of preserving the language model's output distribution, thereby avoiding degradation in fluency or detectability through distributional shifts. However, existing unbiased watermarking schemes often suffer from weak detection power and limited robustness, especially under short text lengths or distributional perturbations. In this work, we propose ENS, a novel ensemble framework that enhances the detectability and robustness of logits-based unbiased watermarks while strictly preserving their unbiasedness. ENS sequentially composes multiple independent watermark instances, each governed by a distinct key, to amplify the watermark signal. We theoretically prove that the ensemble construction remains unbiased in expectation and demonstrate how it improves the signal-to-noise ratio for statistical detectors. Empirical evaluations on multiple LLM families show that ENS substantially reduces the number of tokens needed for reliable detection and increases resistance to smoothing and paraphrasing attacks without compromising generation quality.

Paperid: 425, https://arxiv.org/pdf/2509.20399.pdf

Abstract:
Deep neural networks are being utilized in a growing number of applications, both in production systems and for personal use. Network checkpoints are as a consequence often shared and distributed on various platforms to ease the development process. This work considers the threat of neural network stegomalware, where malware is embedded in neural network checkpoints at a negligible cost to network accuracy. This constitutes a significant security concern, but is nevertheless largely neglected by the deep learning practitioners and security specialists alike. We propose the first effective countermeasure to these attacks. In particular, we show that state-of-the-art neural network stegomalware can be efficiently and effectively neutralized through shuffling the column order of the weight- and bias-matrices, or equivalently the channel-order of convolutional layers. We show that this effectively corrupts payloads that have been embedded by state-of-the-art methods in neural network steganography at no cost to network accuracy, outperforming competing methods by a significant margin. We then discuss possible means by which to bypass this defense, additional defense methods, and advocate for continued research into the security of machine learning systems.

Paperid: 426, https://arxiv.org/pdf/2509.17409.pdf

Abstract:
The advancement of low-altitude intelligent networks enables unmanned aerial vehicle (UAV) interconnection via flying ad-hoc networks (FANETs), offering flexibility and decentralized coordination. However, resource constraints, dynamic topologies, and UAV operations in open environments present significant security and communication challenges. Existing multi-factor and public-key cryptography protocols are vulnerable due to their reliance on stored sensitive information, increasing the risk of exposure and compromise. This paper proposes a lightweight authentication and key agreement protocol for FANETs, integrating physical unclonable functions with dynamic credential management and lightweight cryptographic primitives. The protocol reduces computational and communication overhead while enhancing security. Security analysis confirms its resilience against various attacks, and comparative evaluations demonstrate its superiority in security, communication efficiency, and computational cost.

Paperid: 427, https://arxiv.org/pdf/2509.05753.pdf

Abstract:
The rise of synthetic media has blurred the boundary between reality and fabrication under the evolving power of artificial intelligence, fueling an infodemic that erodes public trust in cyberspace. For digital imagery, a multitude of editing applications further complicates the forensic analysis, including semantic edits that alter content, photometric adjustments that recalibrate colour characteristics, and geometric projections that reshape viewpoints. Collectively, these transformations manipulate and control perceptual interpretation of digital imagery. This susceptibility calls for forensic enquiry into reconstructing the chain of events, thereby revealing deeper evidential insight into the presence or absence of criminal intent. This study seeks to address an inverse problem of tracing the underlying generation chain that gives rise to the observed synthetic media. A tell-tale watermarking system is developed for explanatory reasoning over the nature and extent of transformations across the lifecycle of synthetic media. Tell-tale watermarks are tailored to different classes of transformations, responding in a manner that is neither strictly robust nor fragile but instead interpretable. These watermarks function as reference clues that evolve under the same transformation dynamics as the carrier media, leaving interpretable traces when subjected to transformations. Explanatory reasoning is then performed to infer the most plausible account across the combinatorial parameter space of composite transformations. Experimental evaluations demonstrate the validity of tell-tale watermarking with respect to fidelity, synchronicity and traceability.

Paperid: 428, https://arxiv.org/pdf/2508.09442.pdf

Abstract:
The Key-Value (KV) cache, which stores intermediate attention computations (Key and Value pairs) to avoid redundant calculations, is a fundamental mechanism for accelerating Large Language Model (LLM) inference. However, this efficiency optimization introduces significant yet underexplored privacy risks. This paper provides the first comprehensive analysis of these vulnerabilities, demonstrating that an attacker can reconstruct sensitive user inputs directly from the KV-cache. We design and implement three distinct attack vectors: a direct Inversion Attack, a more broadly applicable and potent Collision Attack, and a semantic-based Injection Attack. These methods demonstrate the practicality and severity of KV-cache privacy leakage issues. To mitigate this, we propose KV-Cloak, a novel, lightweight, and efficient defense mechanism. KV-Cloak uses a reversible matrix-based obfuscation scheme, combined with operator fusion, to secure the KV-cache. Our extensive experiments show that KV-Cloak effectively thwarts all proposed attacks, reducing reconstruction quality to random noise. Crucially, it achieves this robust security with virtually no degradation in model accuracy and minimal performance overhead, offering a practical solution for trustworthy LLM deployment.

Paperid: 429, https://arxiv.org/pdf/2508.00938.pdf

Abstract:
Due to the high flexibility and versatility, unmanned aerial vehicles (UAVs) are leveraged in various fields including surveillance and disaster rescue.However, in UAV networks, routing is vulnerable to malicious damage due to distributed topologies and high dynamics. Hence, ensuring the routing security of UAV networks is challenging. In this paper, we characterize the routing process in a time-varying UAV network with malicious nodes. Specifically, we formulate the routing problem to minimize the total delay, which is an integer linear programming and intractable to solve. Then, to tackle the network security issue, a blockchain-based trust management mechanism (BTMM) is designed to dynamically evaluate trust values and identify low-trust UAVs. To improve traditional practical Byzantine fault tolerance algorithms in the blockchain, we propose a consensus UAV update mechanism. Besides, considering the local observability, the routing problem is reformulated into a decentralized partially observable Markov decision process. Further, a multi-agent double deep Q-network based routing algorithm is designed to minimize the total delay. Finally, simulations are conducted with attacked UAVs and numerical results show that the delay of the proposed mechanism decreases by 13.39$\%$, 12.74$\%$, and 16.6$\%$ than multi-agent proximal policy optimal algorithms, multi-agent deep Q-network algorithms, and methods without BTMM, respectively.

Paperid: 430, https://arxiv.org/pdf/2512.02321.pdf

Abstract:
Large Language Model (LLM)-based agents have demonstrated remarkable capabilities in reasoning, planning, and tool usage. The recently proposed Model Context Protocol (MCP) has emerged as a unifying framework for integrating external tools into agent systems, enabling a thriving open ecosystem of community-built functionalities. However, the openness and composability that make MCP appealing also introduce a critical yet overlooked security assumption -- implicit trust in third-party tool providers. In this work, we identify and formalize a new class of attacks that exploit this trust boundary without violating explicit permissions. We term this new attack vector implicit toxicity, where malicious behaviors occur entirely within the allowed privilege scope. We propose LeechHijack, a Latent Embedded Exploit for Computation Hijacking, in which an adversarial MCP tool covertly expropriates the agent's computational resources for unauthorized workloads. LeechHijack operates through a two-stage mechanism: an implantation stage that embeds a benign-looking backdoor in a tool, and an exploitation stage where the backdoor activates upon predefined triggers to establish a command-and-control channel. Through this channel, the attacker injects additional tasks that the agent executes as if they were part of its normal workflow, effectively parasitizing the user's compute budget. We implement LeechHijack across four major LLM families. Experiments show that LeechHijack achieves an average success rate of 77.25%, with a resource overhead of 18.62% compared to the baseline. This study highlights the urgent need for computational provenance and resource attestation mechanisms to safeguard the emerging MCP ecosystem.

Paperid: 431, https://arxiv.org/pdf/2510.15946.pdf

Abstract:
Internet memes have emerged as a popular multimodal medium, yet they are increasingly weaponized to convey harmful opinions through subtle rhetorical devices like irony and metaphor. Existing detection approaches, including MLLM-based techniques, struggle with these implicit expressions, leading to frequent misjudgments. This paper introduces PatMD, a novel approach that improves harmful meme detection by learning from and proactively mitigating these potential misjudgment risks. Our core idea is to move beyond superficial content-level matching and instead identify the underlying misjudgment risk patterns, proactively guiding the MLLMs to avoid known misjudgment pitfalls. We first construct a knowledge base where each meme is deconstructed into a misjudgment risk pattern explaining why it might be misjudged, either overlooking harmful undertones (false negative) or overinterpreting benign content (false positive). For a given target meme, PatMD retrieves relevant patterns and utilizes them to dynamically guide the MLLM's reasoning. Experiments on a benchmark of 6,626 memes across 5 harmful detection tasks show that PatMD outperforms state-of-the-art baselines, achieving an average of 8.30\% improvement in F1-score and 7.71\% improvement in accuracy, demonstrating strong generalizability and improved detection capability of harmful memes.

Paperid: 432, https://arxiv.org/pdf/2510.09307.pdf

Abstract:
Most of the existing speaker anonymization research has focused on single-speaker audio, leading to the development of techniques and evaluation metrics optimized for such condition. This study addresses the significant challenge of speaker anonymization within multi-speaker conversational audio, specifically when only a single target speaker needs to be anonymized. This scenario is highly relevant in contexts like call centers, where customer privacy necessitates anonymizing only the customer's voice in interactions with operators. Conventional anonymization methods are often not suitable for this task. Moreover, current evaluation methodology does not allow us to accurately assess privacy protection and utility in this complex multi-speaker scenario. This work aims to bridge these gaps by exploring effective strategies for targeted speaker anonymization in conversational audio, highlighting potential problems in their development and proposing corresponding improved evaluation methodologies.

Paperid: 433, https://arxiv.org/pdf/2509.11173.pdf

Abstract:
Deep learning (DL) compilers are core infrastructure in modern DL systems, offering flexibility and scalability beyond vendor-specific libraries. This work uncovers a fundamental vulnerability in their design: can an official, unmodified compiler alter a model's semantics during compilation and introduce hidden backdoors? We study both adversarial and natural settings. In the adversarial case, we craft benign models where triggers have no effect pre-compilation but become effective backdoors after compilation. Tested on six models, three commercial compilers, and two hardware platforms, our attack yields 100% success on triggered inputs while preserving normal accuracy and remaining undetected by state-of-the-art detectors. The attack generalizes across compilers, hardware, and floating-point settings. In the natural setting, we analyze the top 100 HuggingFace models (including one with 220M+ downloads) and find natural triggers in 31 models. This shows that compilers can introduce risks even without adversarial manipulation. Our results reveal an overlooked threat: unmodified DL compilers can silently alter model semantics. To our knowledge, this is the first work to expose inherent security risks in DL compiler design, opening a new direction for secure and trustworthy ML.

Paperid: 434, https://arxiv.org/pdf/2509.04091.pdf

Abstract:
Accurate detection of third-party libraries (TPLs) is fundamental to Android security, supporting vulnerability tracking, malware detection, and supply chain auditing. Despite many proposed tools, their real-world effectiveness remains unclear. We present the first large-scale empirical study of ten state-of-the-art TPL detection techniques across over 6,000 apps, enabled by a new ground truth dataset with precise version-level annotations for both remote and local dependencies. Our evaluation exposes tool fragility to R8-era transformations, weak version discrimination, inaccurate correspondence of candidate libraries, difficulty in generalizing similarity thresholds, and prohibitive runtime/memory overheads at scale. Beyond tool assessment, we further analyze how TPLs shape downstream tasks, including vulnerability analysis, malware detection, secret leakage assessment, and LLM-based evaluation. From this perspective, our study provides concrete insights into how TPL characteristics affect these tasks and informs future improvements in security analysis.

Paperid: 435, https://arxiv.org/pdf/2507.18053.pdf

Abstract:
Resource Consumption Attacks (RCAs) have emerged as a significant threat to the deployment of Large Language Models (LLMs). With the integration of vision modalities, additional attack vectors exacerbate the risk of RCAs in large vision-language models (LVLMs). However, existing red-teaming studies have largely overlooked visual inputs as a potential attack surface, resulting in insufficient mitigation strategies against RCAs in LVLMs. To address this gap, we propose RECALLED (\textbf{RE}source \textbf{C}onsumption \textbf{A}ttack on \textbf{L}arge Vision-\textbf{L}anguag\textbf{E} Mo\textbf{D}els), the first approach for exploiting visual modalities to trigger unbounded RCAs red-teaming. First, we present \textit{Vision Guided Optimization}, a fine-grained pixel-level optimization, to obtain \textit{Output Recall} adversarial perturbations, which can induce repeating output. Then, we inject the perturbations into visual inputs, triggering unbounded generations to achieve the goal of RCAs. Additionally, we introduce \textit{Multi-Objective Parallel Losses} to generate universal attack templates and resolve optimization conflicts when intending to implement parallel attacks. Empirical results demonstrate that RECALLED increases service response latency by over 26 $\uparrow$, resulting in an additional 20\% increase in GPU utilization and memory consumption. Our study exposes security vulnerabilities in LVLMs and establishes a red-teaming framework that can facilitate future defense development against RCAs.

Paperid: 436, https://arxiv.org/pdf/2507.16220.pdf

Abstract:
This study introduces LENS-DF, a novel and comprehensive recipe for training and evaluating audio deepfake detection and temporal localization under complicated and realistic audio conditions. The generation part of the recipe outputs audios from the input dataset with several critical characteristics, such as longer duration, noisy conditions, and containing multiple speakers, in a controllable fashion. The corresponding detection and localization protocol uses models. We conduct experiments based on self-supervised learning front-end and simple back-end. The results indicate that models trained using data generated with LENS-DF consistently outperform those trained via conventional recipes, demonstrating the effectiveness and usefulness of LENS-DF for robust audio deepfake detection and localization. We also conduct ablation studies on the variations introduced, investigating their impact on and relevance to realistic challenges in the field.

Paperid: 437, https://arxiv.org/pdf/2512.21362.pdf

Abstract:
Security in modern RISC-V processors demands more than functional correctness: It requires resilience to side-channel attacks. This paper evaluates the vulnerability of the side channel of the CVA6 RISC-V core by analyzing software-based AES encryption uses an RTL-level power profiling framework called VeriSide. This work represents that this design's Correlation Power Analysis (CPA) reveals significant leakage, enabling key recovery. These findings underscore the importance of early-stage RTL assessments in shaping future secure RISC-V designs.

Paperid: 438, https://arxiv.org/pdf/2511.12982.pdf

Abstract:
Multimodal large language models (MLLMs) have demonstrated impressive reasoning and instruction-following capabilities, yet their expanded modality space introduces new compositional safety risks that emerge from complex text-image interactions. Such cross-modal couplings can produce unsafe semantics even when individual inputs are benign, exposing the fragile safety awareness of current MLLMs. While recent works enhance safety by guiding models to reason about potential risks, unregulated reasoning traces may compromise alignment; although Group Relative Policy Optimization (GRPO) offers self-rewarded refinement without human supervision, it lacks verifiable signals for reasoning safety. To address this, we propose SafeGRPO a self-rewarded multimodal safety alignment framework that integrates rule-governed reward construction into GRPO, enabling interpretable and verifiable optimization of reasoning safety. Built upon the constructed SafeTag-VL-3K dataset with explicit visual, textual, and combined safety tags, SafeGRPO performs step-guided safety thinking to enforce structured reasoning and behavior alignment, substantially improving multimodal safety awareness, compositional robustness, and reasoning stability across diverse benchmarks without sacrificing general capabilities.

Paperid: 439, https://arxiv.org/pdf/2509.18871.pdf

Abstract:
Federated learning has emerged as a prominent privacy-preserving technique for leveraging large-scale distributed datasets by sharing gradients instead of raw data. However, recent studies indicate that private training data can still be exposed through gradient inversion attacks. While earlier analytical methods have demonstrated success in reconstructing input data from fully connected layers, their effectiveness significantly diminishes when applied to convolutional layers, high-dimensional inputs, and scenarios involving multiple training examples. This paper extends our previous work \cite{eltaras2024r} and proposes three advanced algorithms to broaden the applicability of gradient inversion attacks. The first algorithm presents a novel data leakage method that efficiently exploits convolutional layer gradients, demonstrating that even with non-fully invertible activation functions, such as ReLU, training samples can be analytically reconstructed directly from gradients without the need to reconstruct intermediate layer outputs. Building on this foundation, the second algorithm extends this analytical approach to support high-dimensional input data, substantially enhancing its utility across complex real-world datasets. The third algorithm introduces an innovative analytical method for reconstructing mini-batches, addressing a critical gap in current research that predominantly focuses on reconstructing only a single training example. Unlike previous studies that focused mainly on the weight constraints of convolutional layers, our approach emphasizes the pivotal role of gradient constraints, revealing that successful attacks can be executed with fewer than 5\% of the constraints previously deemed necessary in certain layers.

Paperid: 440, https://arxiv.org/pdf/2507.14739.pdf

Abstract:
The Controller Area Network (CAN) protocol, essential for automotive embedded systems, lacks inherent security features, making it vulnerable to cyber threats, especially with the rise of autonomous vehicles. Traditional security measures offer limited protection, such as payload encryption and message authentication. This paper presents a novel Intrusion Detection System (IDS) designed for the CAN environment, utilizing Hardware Performance Counters (HPCs) to detect anomalies indicative of cyber attacks. A RISC-V-based CAN receiver is simulated using the gem5 simulator, processing CAN frame payloads with AES-128 encryption as FreeRTOS tasks, which trigger distinct HPC responses. Key HPC features are optimized through data extraction and correlation analysis to enhance classification efficiency. Results indicate that this approach could significantly improve CAN security and address emerging challenges in automotive cybersecurity.

Paperid: 441, https://arxiv.org/pdf/2507.07143.pdf

Abstract:
Accurately modeling malware propagation is essential for designing effective cybersecurity defenses, particularly against adaptive threats that evolve in real time. While traditional epidemiological models and recent neural approaches offer useful foundations, they often fail to fully capture the nonlinear feedback mechanisms present in real-world networks. In this work, we apply scientific machine learning to malware modeling by evaluating three approaches: classical Ordinary Differential Equations (ODEs), Universal Differential Equations (UDEs), and Neural ODEs. Using data from the Code Red worm outbreak, we show that the UDE approach substantially reduces prediction error compared to both traditional and neural baselines by 44%, while preserving interpretability. We introduce a symbolic recovery method that transforms the learned neural feedback into explicit mathematical expressions, revealing suppression mechanisms such as network saturation, security response, and malware variant evolution. Our results demonstrate that hybrid physics-informed models can outperform both purely analytical and purely neural approaches, offering improved predictive accuracy and deeper insight into the dynamics of malware spread. These findings support the development of early warning systems, efficient outbreak response strategies, and targeted cyber defense interventions.

Paperid: 442, https://arxiv.org/pdf/2512.04668.pdf

Abstract:
Graph topology is a fundamental determinant of memory leakage in multi-agent LLM systems, yet its effects remain poorly quantified. We introduce MAMA (Multi-Agent Memory Attack), a framework that measures how network structure shapes leakage. MAMA operates on synthetic documents containing labeled Personally Identifiable Information (PII) entities, from which we generate sanitized task instructions. We execute a two-phase protocol: Engram (seeding private information into a target agent's memory) and Resonance (multi-round interaction where an attacker attempts extraction). Over up to 10 interaction rounds, we quantify leakage as the fraction of ground-truth PII recovered from attacking agent outputs via exact matching. We systematically evaluate six common network topologies (fully connected, ring, chain, binary tree, star, and star-ring), varying agent counts $n\in\{4,5,6\}$, attacker-target placements, and base models. Our findings reveal consistent patterns: fully connected graphs exhibit maximum leakage while chains provide strongest protection; shorter attacker-target graph distance and higher target centrality significantly increase vulnerability; leakage rises sharply in early rounds before plateauing; model choice shifts absolute leakage rates but preserves topology rankings; temporal/locational PII attributes leak more readily than identity credentials or regulated identifiers. These results provide the first systematic mapping from architectural choices to measurable privacy risk, yielding actionable guidance: prefer sparse or hierarchical connectivity, maximize attacker-target separation, limit node degree and network radius, avoid shortcuts bypassing hubs, and implement topology-aware access controls.

Paperid: 443, https://arxiv.org/pdf/2511.19644.pdf

Abstract:
Modern enterprise systems face escalating cyber threats that are increasingly dynamic, distributed, and multi-stage in nature. Traditional intrusion detection and response systems often rely on static rules and manual workflows, which limit their ability to respond with the speed and precision required in high-stakes environments. To address these challenges, we present the Intrusion Response System Digital Assistant (IRSDA), an agent-based framework designed to deliver autonomous and policy-compliant cyber defense. IRSDA combines Self-Adaptive Autonomic Computing Systems (SA-ACS) with the Knowledge guided Monitor, Analyze, Plan, and Execute (MAPE-K) loop to support real-time, partition-aware decision-making across enterprise infrastructure. IRSDA incorporates a knowledge-driven architecture that integrates contextual information with AI-based reasoning to support system-guided intrusion response. The framework leverages retrieval mechanisms and structured representations to inform decision-making while maintaining alignment with operational policies. We assess the system using a representative real-world microservices application, demonstrating its ability to automate containment, enforce compliance, and provide traceable outputs for security analyst interpretation. This work outlines a modular and agent-driven approach to cyber defense that emphasizes explainability, system-state awareness, and operational control in intrusion response.

Paperid: 444, https://arxiv.org/pdf/2510.18601.pdf

Abstract:
Mobile apps often embed authentication secrets, such as API keys, tokens, and client IDs, to integrate with cloud services. However, developers often hardcode these credentials into Android apps, exposing them to extraction through reverse engineering. Once compromised, adversaries can exploit secrets to access sensitive data, manipulate resources, or abuse APIs, resulting in significant security and financial risks. Existing detection approaches, such as regex-based analysis, static analysis, and machine learning, are effective for identifying known patterns but are fundamentally limited: they require prior knowledge of credential structures, API signatures, or training data. In this paper, we propose SecretLoc, an LLM-based approach for detecting hardcoded secrets in Android apps. SecretLoc goes beyond pattern matching; it leverages contextual and structural cues to identify secrets without relying on predefined patterns or labeled training sets. Using a benchmark dataset from the literature, we demonstrate that SecretLoc detects secrets missed by regex-, static-, and ML-based methods, including previously unseen types of secrets. In total, we discovered 4828 secrets that were undetected by existing approaches, discovering more than 10 "new" types of secrets, such as OpenAI API keys, GitHub Access Tokens, RSA private keys, and JWT tokens, and more. We further extend our analysis to newly crawled apps from Google Play, where we uncovered and responsibly disclosed additional hardcoded secrets. Across a set of 5000 apps, we detected secrets in 2124 apps (42.5%), several of which were confirmed and remediated by developers after we contacted them. Our results reveal a dual-use risk: if analysts can uncover these secrets with LLMs, so can attackers. This underscores the urgent need for proactive secret management and stronger mitigation practices across the mobile ecosystem.

Paperid: 445, https://arxiv.org/pdf/2510.02902.pdf

Abstract:
Diffusion large language models (dLLMs) offer faster generation than autoregressive models while maintaining comparable quality, but existing watermarking methods fail on them due to their non-sequential decoding. Unlike autoregressive models that generate tokens left-to-right, dLLMs can finalize tokens in arbitrary order, breaking the causal design underlying traditional watermarks. We present DMark, the first watermarking framework designed specifically for dLLMs. DMark introduces three complementary strategies to restore watermark detectability: predictive watermarking uses model-predicted tokens when actual context is unavailable; bidirectional watermarking exploits both forward and backward dependencies unique to diffusion decoding; and predictive-bidirectional watermarking combines both approaches to maximize detection strength. Experiments across multiple dLLMs show that DMark achieves 92.0-99.5% detection rates at 1% false positive rate while maintaining text quality, compared to only 49.6-71.2% for naive adaptations of existing methods. DMark also demonstrates robustness against text manipulations, establishing that effective watermarking is feasible for non-autoregressive language models.

Paperid: 446, https://arxiv.org/pdf/2509.18790.pdf

Abstract:
Infrastructure as Code (IaC) automates the provisioning and management of IT infrastructure through scripts and tools, streamlining software deployment. Prior studies have shown that IaC scripts often contain recurring security misconfigurations, and several detection and mitigation approaches have been proposed. Most of these rely on static analysis, using statistical code representations or Machine Learning (ML) classifiers to distinguish insecure configurations from safe code. In this work, we introduce a novel approach that enhances static analysis with semantic understanding by jointly leveraging natural language and code representations. Our method builds on two complementary ML models: CodeBERT, to capture semantics across code and text, and LongFormer, to represent long IaC scripts without losing contextual information. We evaluate our approach on misconfiguration datasets from two widely used IaC tools, Ansible and Puppet. To validate its effectiveness, we conduct two ablation studies (removing code text from the natural language input and truncating scripts to reduce context) and compare against four large language models (LLMs) and prior work. Results show that semantic enrichment substantially improves detection, raising precision and recall from 0.46 and 0.79 to 0.92 and 0.88 on Ansible, and from 0.55 and 0.97 to 0.87 and 0.75 on Puppet, respectively.

Paperid: 447, https://arxiv.org/pdf/2509.18761.pdf

Abstract:
Infrastructure as Code (IaC) has become essential for modern software management, yet security flaws in IaC scripts can have severe consequences, as exemplified by the recurring exploits of Cloud Web Services. Prior work has recognized the need to build a precise taxonomy of security smells in IaC scripts as a first step towards developing approaches to improve IaC security. This first effort led to the unveiling of seven sins, limited by the focus on a single IaC tool as well as by the extensive, and potentially biased, manual effort that was required. We propose, in our work, to revisit this taxonomy: first, we extend the study of IaC security smells to a more diverse dataset with scripts associated with seven popular IaC tools, including Terraform, Ansible, Chef, Puppet, Pulumi, Saltstack, and Vagrant; second, we bring in some automation for the analysis by relying on an LLM. While we leverage LLMs for initial pattern processing, all taxonomic decisions underwent systematic human validation and reconciliation with established security standards. Our study yields a comprehensive taxonomy of 62 security smell categories, significantly expanding beyond the previously known seven. We demonstrate actionability by implementing new security checking rules within linters for seven popular IaC tools, often achieving 1.00 precision score. Our evolution study of security smells in GitHub projects reveals that these issues persist for extended periods, likely due to inadequate detection and mitigation tools. This work provides IaC practitioners with insights for addressing common security smells and systematically adopting DevSecOps practices to build safer infrastructure code.

Paperid: 448, https://arxiv.org/pdf/2509.16950.pdf

Abstract:
Assessing the safety of autonomous driving (AD) systems against security threats, particularly backdoor attacks, is a stepping stone for real-world deployment. However, existing works mainly focus on pixel-level triggers that are impractical to deploy in the real world. We address this gap by introducing a novel backdoor attack against the end-to-end AD systems that leverage one or more other vehicles' trajectories as triggers. To generate precise trigger trajectories, we first use temporal logic (TL) specifications to define the behaviors of attacker vehicles. Configurable behavior models are then used to generate these trajectories, which are quantitatively evaluated and iteratively refined based on the TL specifications. We further develop a negative training strategy by incorporating patch trajectories that are similar to triggers but are designated not to activate the backdoor. It enhances the stealthiness of the attack and refines the system's responses to trigger scenarios. Through extensive experiments on 5 offline reinforcement learning (RL) driving agents with 6 trigger patterns and target action combinations, we demonstrate the flexibility and effectiveness of our proposed attack, showing the under-exploration of existing end-to-end AD systems' vulnerabilities to such trajectory-based backdoor attacks.

Paperid: 449, https://arxiv.org/pdf/2508.18684.pdf

Abstract:
Signature-based Intrusion Detection Systems (IDS) detect malicious activities by matching network or host activity against predefined rules. These rules are derived from extensive Cyber Threat Intelligence (CTI), which includes attack signatures and behavioral patterns obtained through automated tools and manual threat analysis, such as sandboxing. The CTI is then transformed into actionable rules for the IDS engine, enabling real-time detection and prevention. However, the constant evolution of cyber threats necessitates frequent rule updates, which delay deployment time and weaken overall security readiness. Recent advancements in agentic systems powered by Large Language Models (LLMs) offer the potential for autonomous IDS rule generation with internal evaluation. We introduce FALCON, an autonomous agentic framework that generates deployable IDS rules from CTI data in real-time and evaluates them using built-in multi-phased validators. To demonstrate versatility, we target both network (Snort) and host-based (YARA) mediums and construct a comprehensive dataset of IDS rules with their corresponding CTIs. Our evaluations indicate FALCON excels in automatic rule generation, with an average of 95% accuracy validated by qualitative evaluation with 84% inter-rater agreement among multiple cybersecurity analysts across all metrics. These results underscore the feasibility and effectiveness of LLM-driven data mining for real-time cyber threat mitigation.

Paperid: 450, https://arxiv.org/pdf/2508.00756.pdf

Abstract:
Understanding the memorization and privacy leakage risks in Contrastive Language--Image Pretraining (CLIP) is critical for ensuring the security of multimodal models. Recent studies have demonstrated the feasibility of extracting sensitive training examples from diffusion models, with conditional diffusion models exhibiting a stronger tendency to memorize and leak information. In this work, we investigate data memorization and extraction risks in CLIP through the lens of CLIP inversion, a process that aims to reconstruct training images from text prompts. To this end, we introduce \textbf{LeakyCLIP}, a novel attack framework designed to achieve high-quality, semantically accurate image reconstruction from CLIP embeddings. We identify three key challenges in CLIP inversion: 1) non-robust features, 2) limited visual semantics in text embeddings, and 3) low reconstruction fidelity. To address these challenges, LeakyCLIP employs 1) adversarial fine-tuning to enhance optimization smoothness, 2) linear transformation-based embedding alignment, and 3) Stable Diffusion-based refinement to improve fidelity. Empirical results demonstrate the superiority of LeakyCLIP, achieving over 358% improvement in Structural Similarity Index Measure (SSIM) for ViT-B-16 compared to baseline methods on LAION-2B subset. Furthermore, we uncover a pervasive leakage risk, showing that training data membership can even be successfully inferred from the metrics of low-fidelity reconstructions. Our work introduces a practical method for CLIP inversion while offering novel insights into the nature and scope of privacy risks in multimodal models.

Paperid: 451, https://arxiv.org/pdf/2509.21884.pdf

Abstract:
Large language models (LLMs) have been widely adopted across various applications, leveraging customized system prompts for diverse tasks. Facing potential system prompt leakage risks, model developers have implemented strategies to prevent leakage, primarily by disabling LLMs from repeating their context when encountering known attack patterns. However, it remains vulnerable to new and unforeseen prompt-leaking techniques. In this paper, we first introduce a simple yet effective prompt leaking attack to reveal such risks. Our attack is capable of extracting system prompts from various LLM-based application, even from SOTA LLM models such as GPT-4o or Claude 3.5 Sonnet. Our findings further inspire us to search for a fundamental solution to the problems by having no system prompt in the context. To this end, we propose SysVec, a novel method that encodes system prompts as internal representation vectors rather than raw text. By doing so, SysVec minimizes the risk of unauthorized disclosure while preserving the LLM's core language capabilities. Remarkably, this approach not only enhances security but also improves the model's general instruction-following abilities. Experimental results demonstrate that SysVec effectively mitigates prompt leakage attacks, preserves the LLM's functional integrity, and helps alleviate the forgetting issue in long-context scenarios.

Paperid: 452, https://arxiv.org/pdf/2509.17962.pdf

Abstract:
What if you could really revoke your actual biometric identity, and install a new one, by live rewriting your biological self? We propose some novel mechanisms for hot swapping identity based in novel biotechnology. We discuss the potential positive use cases, and negative consequences if such technology was to become available and affordable. Biometrics are selected on the basis that they are supposed to be unfakeable, or at least not at reasonable cost. If they become easier to fake, it may be much cheaper to fake someone else's biometrics than it is for you to change your own biometrics if someone does copy yours. This potentially makes biometrics a bad trade-off for the user. At the time of writing, this threat is highly speculative, but we believe it is worth raising and considering the potential consequences.

Paperid: 453, https://arxiv.org/pdf/2508.20452.pdf

Abstract:
Generative AI offers transformative potential for high-stakes domains such as healthcare and finance, yet privacy and regulatory barriers hinder the use of real-world data. To address this, differentially private synthetic data generation has emerged as a promising alternative. In this work, we introduce a unified benchmark to systematically evaluate the utility and fidelity of text datasets generated under formal Differential Privacy (DP) guarantees. Our benchmark addresses key challenges in domain-specific benchmarking, including choice of representative data and realistic privacy budgets, accounting for pre-training and a variety of evaluation metrics. We assess state-of-the-art privacy-preserving generation methods across five domain-specific datasets, revealing significant utility and fidelity degradation compared to real data, especially under strict privacy constraints. These findings underscore the limitations of current approaches, outline the need for advanced privacy-preserving data sharing methods and set a precedent regarding their evaluation in realistic scenarios.

Paperid: 454, https://arxiv.org/pdf/2508.18839.pdf

Abstract:
Malware detection in real-world settings must deal with evolving threats, limited labeling budgets, and uncertain predictions. Traditional classifiers, without additional mechanisms, struggle to maintain performance under concept drift in malware domains, as their supervised learning formulation cannot optimize when to defer decisions to manual labeling and adaptation. Modern malware detection pipelines combine classifiers with monthly active learning (AL) and rejection mechanisms to mitigate the impact of concept drift. In this work, we develop a novel formulation of malware detection as a one-step Markov Decision Process and train a deep reinforcement learning (DRL) agent, simultaneously optimizing sample classification performance and rejecting high-risk samples for manual labeling. We evaluated the joint detection and drift mitigation policy learned by the DRL-based Malware Detection (DRMD) agent through time-aware evaluations on Android malware datasets subject to realistic drift requiring multi-year performance stability. The policies learned under these conditions achieve a higher Area Under Time (AUT) performance compared to standard classification approaches used in the domain, showing improved resilience to concept drift. Specifically, the DRMD agent achieved a $5.18\pm5.44$, $14.49\pm12.86$, and $10.06\pm10.81$ average AUT performance improvement for the classification only, classification with rejection, and classification with rejection and AL settings, respectively. Our results demonstrate for the first time that DRL can facilitate effective malware detection and improved resiliency to concept drift in the dynamic environment of the Android malware domain.

Paperid: 455, https://arxiv.org/pdf/2508.09652.pdf

Abstract:
Malware detection increasingly relies on AI systems that integrate signature-based detection with machine learning. However, these components are typically developed and combined in isolation, missing opportunities to reduce data complexity and strengthen defenses against adversarial EXEmples, carefully crafted programs designed to evade detection. Hence, in this work we investigate the influence that signature-based detection exerts on model training, when they are included inside the training pipeline. Specifically, we compare models trained on a comprehensive dataset with an AI system whose machine learning component is trained solely on samples not already flagged by signatures. Our results demonstrate improved robustness to both adversarial EXEmples and temporal data drift, although this comes at the cost of a fixed lower bound on false positives, driven by suboptimal rule selection. We conclude by discussing these limitations and outlining how future research could extend AI-based malware detection to include dynamic analysis, thereby further enhancing system resilience.

Paperid: 456, https://arxiv.org/pdf/2508.09201.pdf

Abstract:
Despite extensive alignment efforts, Large Vision-Language Models (LVLMs) remain vulnerable to jailbreak attacks, posing serious safety risks. Although recent detection works have shifted to internal representations due to their rich cross-modal information, most methods rely on heuristic rules rather than principled objectives, resulting in suboptimal performance. To address these limitations, we propose Learning to Detect (LoD), a novel unsupervised framework that formulates jailbreak detection as anomaly detection. LoD introduces two key components: Multi-modal Safety Concept Activation Vectors (MSCAV), which capture layer-wise safety-related representations across modalities, and the Safety Pattern Auto-Encoder, which models the distribution of MSCAV derived from safe inputs and detects anomalies via reconstruction errors. By training the auto-encoder (AE) solely on safe samples without attack labels, LoD naturally identifies jailbreak inputs as distributional anomalies, enabling accurate and unified detection of jailbreak attacks. Comprehensive experiments on three different LVLMs and five benchmarks demonstrate that LoD achieves state-of-the-art performance, with an average AUROC of 0.9951 and an improvement of up to 38.89% in the minimum AUROC over the strongest baselines.

Paperid: 457, https://arxiv.org/pdf/2507.04883.pdf

Abstract:
Deep Reinforcement Learning (DRL) systems are increasingly used in safety-critical applications, yet their security remains severely underexplored. This work investigates backdoor attacks, which implant hidden triggers that cause malicious actions only when specific inputs appear in the observation space. Existing DRL backdoor research focuses solely on training-time attacks requiring unrealistic access to the training pipeline. In contrast, we reveal critical vulnerabilities across the DRL supply chain where backdoors can be embedded with significantly reduced adversarial privileges. We introduce two novel attacks: (1) TrojanentRL, which exploits component-level flaws to implant a persistent backdoor that survives full model retraining; and (2) InfrectroRL, a post-training backdoor attack which requires no access to training, validation, nor test data. Empirical and analytical evaluations across six Atari environments show our attacks rival state-of-the-art training-time backdoor attacks while operating under much stricter adversarial constraints. We also demonstrate that InfrectroRL further evades two leading DRL backdoor defenses. These findings challenge the current research focus and highlight the urgent need for robust defenses.

Paperid: 458, https://arxiv.org/pdf/2507.02971.pdf

Abstract:
Sharing health and behavioral data raises significant privacy concerns, as conventional de-identification methods are susceptible to privacy attacks. Differential Privacy (DP) provides formal guarantees against re-identification risks, but practical implementation necessitates balancing privacy protection and the utility of data. We demonstrate the use of DP to protect individuals in a real behavioral health study, while making the data publicly available and retaining high utility for downstream users of the data. We use the Adaptive Iterative Mechanism (AIM) to generate DP synthetic data for Phase 1 of the Lived Experiences Measured Using Rings Study (LEMURS). The LEMURS dataset comprises physiological measurements from wearable devices (Oura rings) and self-reported survey data from first-year college students. We evaluate the synthetic datasets across a range of privacy budgets, epsilon = 1 to 100, focusing on the trade-off between privacy and utility. We evaluate the utility of the synthetic data using a framework informed by actual uses of the LEMURS dataset. Our evaluation identifies the trade-off between privacy and utility across synthetic datasets generated with different privacy budgets. We find that synthetic data sets with epsilon = 5 preserve adequate predictive utility while significantly mitigating privacy risks. Our methodology establishes a reproducible framework for evaluating the practical impacts of epsilon on generating private synthetic datasets with numerous attributes and records, contributing to informed decision-making in data sharing practices.

Paperid: 459, https://arxiv.org/pdf/2512.22046.pdf

Abstract:
Prompt-driven Video Segmentation Foundation Models (VSFMs) such as SAM2 are increasingly deployed in applications like autonomous driving and digital pathology, raising concerns about backdoor threats. Surprisingly, we find that directly transferring classic backdoor attacks (e.g., BadNet) to VSFMs is almost ineffective, with ASR below 5\%. To understand this, we study encoder gradients and attention maps and observe that conventional training keeps gradients for clean and triggered samples largely aligned, while attention still focuses on the true object, preventing the encoder from learning a distinct trigger-related representation. To address this challenge, we propose BadVSFM, the first backdoor framework tailored to prompt-driven VSFMs. BadVSFM uses a two-stage strategy: (1) steer the image encoder so triggered frames map to a designated target embedding while clean frames remain aligned with a clean reference encoder; (2) train the mask decoder so that, across prompt types, triggered frame-prompt pairs produce a shared target mask, while clean outputs stay close to a reference decoder. Extensive experiments on two datasets and five VSFMs show that BadVSFM achieves strong, controllable backdoor effects under diverse triggers and prompts while preserving clean segmentation quality. Ablations over losses, stages, targets, trigger settings, and poisoning rates demonstrate robustness to reasonable hyperparameter changes and confirm the necessity of the two-stage design. Finally, gradient-conflict analysis and attention visualizations show that BadVSFM separates triggered and clean representations and shifts attention to trigger regions, while four representative defenses remain largely ineffective, revealing an underexplored vulnerability in current VSFMs.

Paperid: 460, https://arxiv.org/pdf/2511.16278.pdf

Abstract:
As LLMs become more common, non-expert users can pose risks, prompting extensive research into jailbreak attacks. However, most existing black-box jailbreak attacks rely on hand-crafted heuristics or narrow search spaces, which limit scalability. Compared with prior attacks, we propose Game-Theory Attack (GTA), an scalable black-box jailbreak framework. Concretely, we formalize the attacker's interaction against safety-aligned LLMs as a finite-horizon, early-stoppable sequential stochastic game, and reparameterize the LLM's randomized outputs via quantal response. Building on this, we introduce a behavioral conjecture "template-over-safety flip": by reshaping the LLM's effective objective through game-theoretic scenarios, the originally safety preference may become maximizing scenario payoffs within the template, which weakens safety constraints in specific contexts. We validate this mechanism with classical game such as the disclosure variant of the Prisoner's Dilemma, and we further introduce an Attacker Agent that adaptively escalates pressure to increase the ASR. Experiments across multiple protocols and datasets show that GTA achieves over 95% ASR on LLMs such as Deepseek-R1, while maintaining efficiency. Ablations over components, decoding, multilingual settings, and the Agent's core model confirm effectiveness and generalization. Moreover, scenario scaling studies further establish scalability. GTA also attains high ASR on other game-theoretic scenarios, and one-shot LLM-generated variants that keep the model mechanism fixed while varying background achieve comparable ASR. Paired with a Harmful-Words Detection Agent that performs word-level insertions, GTA maintains high ASR while lowering detection under prompt-guard models. Beyond benchmarks, GTA jailbreaks real-world LLM applications and reports a longitudinal safety monitoring of popular HuggingFace LLMs.

Paperid: 461, https://arxiv.org/pdf/2511.14045.pdf

Abstract:
Membership inference attacks (MIAs) on large language models (LLMs) pose significant privacy risks across various stages of model training. Recent advances in Reinforcement Learning with Verifiable Rewards (RLVR) have brought a profound paradigm shift in LLM training, particularly for complex reasoning tasks. However, the on-policy nature of RLVR introduces a unique privacy leakage pattern: since training relies on self-generated responses without fixed ground-truth outputs, membership inference must now determine whether a given prompt (independent of any specific response) is used during fine-tuning. This creates a threat where leakage arises not from answer memorization. To audit this novel privacy risk, we propose Divergence-in-Behavior Attack (DIBA), the first membership inference framework specifically designed for RLVR. DIBA shifts the focus from memorization to behavioral change, leveraging measurable shifts in model behavior across two axes: advantage-side improvement (e.g., correctness gain) and logit-side divergence (e.g., policy drift). Through comprehensive evaluations, we demonstrate that DIBA significantly outperforms existing baselines, achieving around 0.8 AUC and an order-of-magnitude higher TPR@0.1%FPR. We validate DIBA's superiority across multiple settings--including in-distribution, cross-dataset, cross-algorithm, black-box scenarios, and extensions to vision-language models. Furthermore, our attack remains robust under moderate defensive measures. To the best of our knowledge, this is the first work to systematically analyze privacy vulnerabilities in RLVR, revealing that even in the absence of explicit supervision, training data exposure can be reliably inferred through behavioral traces.

Paperid: 462, https://arxiv.org/pdf/2510.02563.pdf

Abstract:
Ear canal scanning/sensing (ECS) has emerged as a novel biometric authentication method for mobile devices paired with wireless earbuds. Existing studies have demonstrated the uniqueness of ear canals by training and testing machine learning classifiers on ECS data. However, implementing practical ECS-based authentication requires preventing raw biometric data leakage and designing computationally efficient protocols suitable for resource-constrained earbuds. To address these challenges, we propose an ear canal key extraction protocol, \textbf{EarID}. Without relying on classifiers, EarID extracts unique binary keys directly on the earbuds during authentication. These keys further allow the use of privacy-preserving fuzzy commitment scheme that verifies the wearer's key on mobile devices. Our evaluation results demonstrate that EarID achieves a 98.7\% authentication accuracy, comparable to machine learning classifiers. The mobile enrollment time (160~ms) and earbuds processing time (226~ms) are negligible in terms of wearer's experience. Moreover, our approach is robust and attack-resistant, maintaining a false acceptance rate below 1\% across all adversarial scenarios. We believe the proposed EarID offers a practical and secure solution for next-generation wireless earbuds.

Paperid: 463, https://arxiv.org/pdf/2509.22486.pdf

Abstract:
Retrieval-augmented generation (RAG) enhances factual grounding by integrating retrieval mechanisms with generative models but introduces new attack surfaces, particularly through backdoor attacks. While prior research has largely focused on disinformation threats, fairness vulnerabilities remain underexplored. Unlike conventional backdoors that rely on direct trigger-to-target mappings, fairness-driven attacks exploit the interaction between retrieval and generation models, manipulating semantic relationships between target groups and social biases to establish a persistent and covert influence on content generation. This paper introduces BiasRAG, a systematic framework that exposes fairness vulnerabilities in RAG through a two-phase backdoor attack. During the pre-training phase, the query encoder is compromised to align the target group with the intended social bias, ensuring long-term persistence. In the post-deployment phase, adversarial documents are injected into knowledge bases to reinforce the backdoor, subtly influencing retrieved content while remaining undetectable under standard fairness evaluations. Together, BiasRAG ensures precise target alignment over sensitive attributes, stealthy execution, and resilience. Empirical evaluations demonstrate that BiasRAG achieves high attack success rates while preserving contextual relevance and utility, establishing a persistent and evolving threat to fairness in RAG.

Paperid: 464, https://arxiv.org/pdf/2509.22428.pdf

Abstract:
Pointwise maximal leakage (PML) is a per-outcome privacy measure based on threat models from quantitative information flow. Privacy guarantees with PML rely on knowledge about the distribution that generated the private data. In this work, we propose a framework for PML privacy assessment and mechanism design with empirical estimates of this data-generating distribution. By extending the PML framework to consider sets of data-generating distributions, we arrive at bounds on the worst-case leakage within a given set. We use these bounds alongside large-deviation bounds from the literature to provide a method for obtaining distribution-independent $(\varepsilon,Î´)$-PML guarantees when the data-generating distribution is estimated from available data samples. We provide an optimal binary mechanism, and show that mechanism design with this type of uncertainty about the data-generating distribution reduces to a linearly constrained convex program. Further, we show that optimal mechanisms designed for a distribution estimate can be used. Finally, we apply these tools to leakage assessment of the Laplace mechanism and the Gaussian mechanism for binary private data, and numerically show that the presented approach to mechanism design can yield significant utility increase compared to local differential privacy, while retaining similar privacy guarantees.

Paperid: 465, https://arxiv.org/pdf/2509.00918.pdf

Abstract:
Addressing the intellectual property protection challenges in commercial deployment of large language models (LLMs), existing black-box fingerprinting techniques face dual challenges from incremental fine-tuning erasure and feature-space defense due to their reliance on overfitting high-perplexity trigger patterns. Recent work has revealed that model editing in the fingerprinting domain offers distinct advantages, including significantly lower false positive rates, enhanced harmlessness, and superior robustness. Building on this foundation, this paper innovatively proposes a $\textbf{Pr}$efix-$\textbf{e}$nhanced Fingerprint $\textbf{E}$diting Framework (PREE), which encodes copyright information into parameter offsets through dual-channel knowledge edit to achieve covert embedding of fingerprint features. Experimental results demonstrate that the proposed solution achieves the 90\% trigger precision in mainstream architectures including LLaMA-3 and Qwen-2.5. The minimal parameter offset (change rate < 0.03) effectively preserves original knowledge representation while demonstrating strong robustness against incremental fine-tuning and multi-dimensional defense strategies, maintaining zero false positive rate throughout evaluations.

Paperid: 466, https://arxiv.org/pdf/2509.00641.pdf

Abstract:
Generative models have achieved impressive results in text to image tasks, significantly advancing visual content creation. However, this progress comes at a cost, as such models rely heavily on large-scale training data and may unintentionally replicate copyrighted elements, creating serious legal and ethical challenges for real-world deployment. To address these concerns, researchers have proposed various strategies to mitigate copyright risks, most of which are prompt based methods that filter or rewrite user inputs to prevent explicit infringement. While effective in handling obvious cases, these approaches often fall short in more subtle situations, where seemingly benign prompts can still lead to infringing outputs. To address these limitations, this paper introduces Assessing and Mitigating Copyright Risks (AMCR), a comprehensive framework which i) builds upon prompt-based strategies by systematically restructuring risky prompts into safe and non-sensitive forms, ii) detects partial infringements through attention-based similarity analysis, and iii) adaptively mitigates risks during generation to reduce copyright violations without compromising image quality. Extensive experiments validate the effectiveness of AMCR in revealing and mitigating latent copyright risks, offering practical insights and benchmarks for the safer deployment of generative models.

Paperid: 467, https://arxiv.org/pdf/2508.16347.pdf

Abstract:
With the development of Large Language Models (LLMs), numerous efforts have revealed their vulnerabilities to jailbreak attacks. Although these studies have driven the progress in LLMs' safety alignment, it remains unclear whether LLMs have internalized authentic knowledge to deal with real-world crimes, or are merely forced to simulate toxic language patterns. This ambiguity raises concerns that jailbreak success is often attributable to a hallucination loop between jailbroken LLM and judger LLM. By decoupling the use of jailbreak techniques, we construct knowledge-intensive Q\&A to investigate the misuse threats of LLMs in terms of dangerous knowledge possession, harmful task planning utility, and harmfulness judgment robustness. Experiments reveal a mismatch between jailbreak success rates and harmful knowledge possession in LLMs, and existing LLM-as-a-judge frameworks tend to anchor harmfulness judgments on toxic language patterns. Our study reveals a gap between existing LLM safety assessments and real-world threat potential.

Paperid: 468, https://arxiv.org/pdf/2508.10991.pdf

Abstract:
The integration of Large Language Models (LLMs) with external tools via protocols such as the Model Context Protocol (MCP) introduces critical security vulnerabilities, including prompt injection, data exfiltration, and other threats. To counter these challenges, we propose MCP-Guard, a robust, layered defense architecture designed for LLM--tool interactions. MCP-Guard employs a three-stage detection pipeline that balances efficiency with accuracy: it progresses from lightweight static scanning for overt threats and a deep neural detector for semantic attacks, to our fine-tuned E5-based model achieves (96.01) accuracy in identifying adversarial prompts. Finally, a lightweight LLM arbitrator synthesizes these signals to deliver the final decision while minimizing false positives. To facilitate rigorous training and evaluation, we also introduce MCP-AttackBench, a comprehensive benchmark of over 70,000 samples. Sourced from public datasets and augmented by GPT-4, MCP-AttackBench simulates diverse, real-world attack vectors in the MCP format, providing a foundation for future research into securing LLM-tool ecosystems.

Paperid: 469, https://arxiv.org/pdf/2508.10029.pdf

Abstract:
Large language models (LLMs) demonstrate impressive capabilities in various language tasks but are susceptible to jailbreak attacks that circumvent their safety alignments. This paper introduces Latent Fusion Jailbreak (LFJ), a representation-based attack that interpolates hidden states from harmful and benign query pairs to elicit prohibited responses. LFJ begins by selecting query pairs with high thematic and syntactic similarity, then performs gradient-guided interpolation at influential layers and tokens, followed by optimization to balance attack success, output fluency, and computational efficiency. Evaluations on models such as Vicuna and LLaMA-2 across benchmarks like AdvBench and MaliciousInstruct yield an average attack success rate (ASR) of 94.01%, outperforming existing methods. To mitigate LFJ, we propose an adversarial training defense that fine-tunes models on interpolated examples, reducing ASR by over 80% without degrading performance on benign inputs. Ablation studies validate the importance of query pair selection, hidden state interpolation components, and optimization strategies in LFJ's effectiveness.

Paperid: 470, https://arxiv.org/pdf/2507.16773.pdf

Abstract:
Large Language Models (LLMs) have become integral to automated code analysis, enabling tasks such as vulnerability detection and code comprehension. However, their integration introduces novel attack surfaces. In this paper, we identify and investigate a new class of prompt-based attacks, termed Copy-Guided Attacks (CGA), which exploit the inherent copying tendencies of reasoning-capable LLMs. By injecting carefully crafted triggers into external code snippets, adversaries can induce the model to replicate malicious content during inference. This behavior enables two classes of vulnerabilities: inference length manipulation, where the model generates abnormally short or excessively long reasoning traces; and inference result manipulation, where the model produces misleading or incorrect conclusions. We formalize CGA as an optimization problem and propose a gradient-based approach to synthesize effective triggers. Empirical evaluation on state-of-the-art reasoning LLMs shows that CGA reliably induces infinite loops, premature termination, false refusals, and semantic distortions in code analysis tasks. While highly effective in targeted settings, we observe challenges in generalizing CGA across diverse prompts due to computational constraints, posing an open question for future research. Our findings expose a critical yet underexplored vulnerability in LLM-powered development pipelines and call for urgent advances in prompt-level defense mechanisms.

Paperid: 471, https://arxiv.org/pdf/2512.10732.pdf

Abstract:
Accurately measuring time passing is critical for many applications. However, in Trusted Execution Environments (TEEs) such as Intel SGX, the time source is outside the Trusted Computing Base: a malicious host can manipulate the TEE's notion of time, jumping in time or affecting perceived time speed. Previous work (Triad) proposes protocols for TEEs to maintain a trustworthy time source by building a cluster of TEEs that collaborate with each other and with a remote Time Authority to maintain a continuous notion of passing time. However, such approaches still allow an attacker to control the operating system and arbitrarily manipulate their own TEE's perceived clock speed. An attacker can even propagate faster passage of time to honest machines participating in Triad's trusted time protocol, causing them to skip to timestamps arbitrarily far in the future. We propose TriHaRd, a TEE trusted time protocol achieving high resilience against clock speed and offset manipulations, notably through Byzantine-resilient clock updates and consistency checks. We empirically show that TriHaRd mitigates known attacks against Triad.

Paperid: 472, https://arxiv.org/pdf/2512.02069.pdf

Abstract:
This paper presents LLMBugScanner, a large language model (LLM) based framework for smart contract vulnerability detection using fine-tuning and ensemble learning. Smart contract auditing presents several challenges for LLMs: different pretrained models exhibit varying reasoning abilities, and no single model performs consistently well across all vulnerability types or contract structures. These limitations persist even after fine-tuning individual LLMs. To address these challenges, LLMBugScanner combines domain knowledge adaptation with ensemble reasoning to improve robustness and generalization. Through domain knowledge adaptation, we fine-tune LLMs on complementary datasets to capture both general code semantics and instruction-guided vulnerability reasoning, using parameter-efficient tuning to reduce computational cost. Through ensemble reasoning, we leverage the complementary strengths of multiple LLMs and apply a consensus-based conflict resolution strategy to produce more reliable vulnerability assessments. We conduct extensive experiments across multiple popular LLMs and compare LLMBugScanner with both pretrained and fine-tuned individual models. Results show that LLMBugScanner achieves consistent accuracy improvements and stronger generalization, demonstrating that it provides a principled, cost-effective, and extensible framework for smart contract auditing.

Paperid: 473, https://arxiv.org/pdf/2511.18240.pdf

Abstract:
The rapid expansion of the Internet of Things (IoT) has intensified cybersecurity challenges, particularly in mitigating Distributed Denial-of-Service (DDoS) attacks at the network edge. Traditional Intrusion Detection Systems (IDSs) face significant limitations, including poor adaptability to evolving and zero-day attacks, reliance on static signatures and labeled datasets, and inefficiency on resource-constrained edge gateways. Moreover, most existing DRL-based IDS studies overlook sustainability factors such as energy efficiency and carbon impact. To address these challenges, this paper proposes two novel Deep Reinforcement Learning (DRL)-based IDS: DeepEdgeIDS, an unsupervised Autoencoder-DRL hybrid, and AutoDRL-IDS, a supervised LSTM-DRL model. Both DRL-based IDS are validated through theoretical analysis and experimental evaluation on edge gateways. Results demonstrate that AutoDRL-IDS achieves 94% detection accuracy using labeled data, while DeepEdgeIDS attains 98% accuracy and adaptability without labels. Distinctly, this study introduces a carbon-aware, multi-objective reward function optimized for sustainable and real-time IDS operations in dynamic IoT networks.

Paperid: 474, https://arxiv.org/pdf/2511.18230.pdf

Abstract:
As the number of connected IoT devices continues to grow, securing these systems against cyber threats remains a major challenge, especially in environments with limited computational and energy resources. This paper presents an edge-centric Intrusion Detection System (IDS) framework that integrates lightweight machine learning (ML) based IDS models with pre-trained large language models (LLMs) to improve detection accuracy, semantic interpretability, and operational efficiency at the network edge. The system evaluates six ML-based IDS models: Decision Tree (DT), K-Nearest Neighbors (KNN), Random Forest (RF), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and a hybrid CNN-LSTM model on low-power edge gateways, achieving accuracy up to 98 percent under real-world cyberattacks. For anomaly detection, the system transmits a compact and secure telemetry snapshot (for example, CPU usage, memory usage, latency, and energy consumption) via low-bandwidth API calls to LLMs including GPT-4-turbo, DeepSeek V2, and LLaMA 3.5. These models use zero-shot, few-shot, and chain-of-thought reasoning to produce human-readable threat analyses and actionable mitigation recommendations. Evaluations across diverse attacks such as DoS, DDoS, brute force, and port scanning show that the system enhances interpretability while maintaining low latency (<1.5 s), minimal bandwidth usage (<1.2 kB per prompt), and energy efficiency (<75 J), demonstrating its practicality and scalability as an IDS solution for edge gateways.

Paperid: 475, https://arxiv.org/pdf/2509.09215.pdf

Abstract:
Large language models (LLMs)-empowered autonomous agents are transforming both digital and physical environments by enabling adaptive, multi-agent collaboration. While these agents offer significant opportunities across domains such as finance, healthcare, and smart manufacturing, their unpredictable behaviors and heterogeneous capabilities pose substantial governance and accountability challenges. In this paper, we propose a blockchain-enabled layered architecture for regulatory agent collaboration, comprising an agent layer, a blockchain data layer, and a regulatory application layer. Within this framework, we design three key modules: (i) an agent behavior tracing and arbitration module for automated accountability, (ii) a dynamic reputation evaluation module for trust assessment in collaborative scenarios, and (iii) a malicious behavior forecasting module for early detection of adversarial activities. Our approach establishes a systematic foundation for trustworthy, resilient, and scalable regulatory mechanisms in large-scale agent ecosystems. Finally, we discuss the future research directions for blockchain-enabled regulatory frameworks in multi-agent systems.

Paperid: 476, https://arxiv.org/pdf/2509.00561.pdf

Abstract:
Recently, speech assistant and speech verification have been used in many fields, which brings much benefit and convenience for us. However, when we enjoy these speech applications, our speech may be collected by attackers for speech synthesis. For example, an attacker generates some inappropriate political opinions with the characteristic of the victim's voice by obtaining a piece of the victim's speech, which will greatly influence the victim's reputation. Specifically, with the appearance of some zero-shot voice conversion methods, the cost of speech synthesis attacks has been further reduced, which also brings greater challenges to user voice security and privacy. Some researchers have proposed the corresponding privacy-preserving methods. However, the existing approaches have some non-negligible drawbacks: low transferability and robustness, high computational overhead. These deficiencies seriously limit the existing method deployed in practical scenarios. Therefore, in this paper, we propose a lightweight, robust, plug-and-play privacy preservation method against speech synthesis attacks in a black-box setting. Our method generates and adds a frequency-domain perturbation to the original speech to achieve privacy protection and high speech quality. Then, we present a data augmentation strategy and noise smoothing mechanism to improve the robustness of the proposed method. Besides, to reduce the user's defense overhead, we also propose a novel identity-wise protection mechanism. It can generate a universal perturbation for one speaker and support privacy preservation for speech of any length. Finally, we conduct extensive experiments on 5 speech synthesis models, 5 speech verification models, 1 speech recognition model, and 2 datasets. The experimental results demonstrate that our method has satisfying privacy-preserving performance, high speech quality, and utility.

Paperid: 477, https://arxiv.org/pdf/2508.07094.pdf

Abstract:
Smart contracts have transformed decentralized finance by enabling programmable, trustless transactions. However, their widespread adoption and growing financial significance have attracted persistent and sophisticated threats, such as phishing campaigns and contract-level exploits. Traditional transaction-based threat detection methods often expose sensitive user data and interactions, raising privacy and security concerns. In response, static bytecode analysis has emerged as a proactive mitigation strategy, identifying malicious contracts before they execute harmful actions. Building on this approach, we introduced PhishingHook, the first machine-learning-based framework for detecting phishing activities in smart contracts via static bytecode and opcode analysis, achieving approximately 90% detection accuracy. Nevertheless, two pressing challenges remain: (1) the increasing use of sophisticated bytecode obfuscation techniques designed to evade static analysis, and (2) the heterogeneity of blockchain environments requiring platform-agnostic solutions. This paper presents a vision for ScamDetect (Smart Contract Agnostic Malware Detector), a robust, modular, and platform-agnostic framework for smart contract malware detection. Over the next 2.5 years, ScamDetect will evolve in two stages: first, by tackling obfuscated Ethereum Virtual Machine (EVM) bytecode through graph neural network (GNN) analysis of control flow graphs (CFGs), leveraging GNNs' ability to capture complex structural patterns beyond opcode sequences; and second, by generalizing detection capabilities to emerging runtimes such as WASM. ScamDetect aims to enable proactive, scalable security for the future of decentralized ecosystems.

Paperid: 478, https://arxiv.org/pdf/2512.21371.pdf

Abstract:
Chat-based cybercrime has emerged as a pervasive threat, with attackers leveraging real-time messaging platforms to conduct scams that rely on trust-building, deception, and psychological manipulation. Traditional defense mechanisms, which operate on static rules or shallow content filters, struggle to identify these conversational threats, especially when attackers use multimedia obfuscation and context-aware dialogue. In this work, we ask a provocative question inspired by the classic Imitation Game: Can machines convincingly pose as human victims to turn deception against cybercriminals? We present LURE (LLM-based User Response Engagement), the first system to deploy Large Language Models (LLMs) as active agents, not as passive classifiers, embedded within adversarial chat environments. LURE combines automated discovery, adversarial interaction, and OCR-based analysis of image-embedded payment data. Applied to the setting of illicit video chat scams on Telegram, our system engaged 53 actors across 98 groups. In over 56 percent of interactions, the LLM maintained multi-round conversations without being noticed as a bot, effectively "winning" the imitation game. Our findings reveal key behavioral patterns in scam operations, such as payment flows, upselling strategies, and platform migration tactics.

Paperid: 479, https://arxiv.org/pdf/2512.15144.pdf

Abstract:
Model Context Protocol (MCP) enables agents to interact with external tools, yet empirical research on MCP is hindered by the lack of large-scale, accessible datasets. We present MCPZoo, the largest and most comprehensive dataset of MCP servers collected from multiple public sources, comprising 129,059 servers (56,053 distinct). MCPZoo includes 16,356 server instances that have been deployed and verified as runnable and interactable, supporting realistic experimentation beyond static analysis. The dataset provides unified metadata and access interfaces, enabling systematic exploration and interaction without manual deployment effort. MCPZoo is released as an open and accessible resource to support research on MCP-based systems and security analysis.

Paperid: 480, https://arxiv.org/pdf/2512.00591.pdf

Abstract:
Hardware Trojans (HT s) are a persistent threat to integrated circuits, especially when inserted at the register-transfer level (RTL). Existing methods typically first convert the design into a graph, such as a gate-level netlist or an RTL-derived dataflow graph (DFG), and then use a graph neural network (GNN ) to obtain an embedding of that graph, which (i) loses compact RTL semantics, (ii) relies on shallow GNNs with limited receptive field, and (iii) is largely restricted to coarse, module-level binary HT detection. We propose TrojanLoC, an LLM-based framework for RTL-level HT localization. We use an RTL-finetuned LLM to derive module-level and line-level embeddings directly from RTL code, capturing both global design context and local semantics. Next, we train task-specific classifiers on these embeddings to perform module-level Trojan detection, type prediction, and fine-grained line-level localization. We also introduce TrojanInS, a large synthetic dataset of RTL designs with systematically injected Trojans from four effect-based categories, each accompanied by precise line-level annotations. Our experiments show that TrojanLoC achieves strong module-level performance, reaching 0.99 F1-score for Trojan detection, up to 0.68 higher than baseline, and 0.84 macro-F1 for Trojan-type classification. At the line level, TrojanLoc further achieves up to 0.93 macro-F1, enabling fine-grained localization of Trojan-relevant RTL lines

Paperid: 481, https://arxiv.org/pdf/2512.00119.pdf

Abstract:
Graph neural networks (GNNs) have shown promise in hardware security by learning structural motifs from netlist graphs. However, this reliance on motifs makes GNNs vulnerable to adversarial netlist rewrites; even small-scale edits can mislead GNN predictions. Existing adversarial approaches, ranging from synthesis-recipe perturbations to gate transformations, come with high design overheads. We present NetDeTox, an automated end-to-end framework that orchestrates large language models (LLMs) with reinforcement learning (RL) in a systematic manner, enabling focused local rewriting. The RL agent identifies netlist components critical for GNN-based reasoning, while the LLM devises rewriting plans to diversify motifs that preserve functionality. Iterative feedback between the RL and LLM stages refines adversarial rewritings to limit overheads. Compared to the SOTA work AttackGNN, NetDeTox successfully degrades the effectiveness of all security schemes with fewer rewrites and substantially lower area overheads (reductions of 54.50% for GNN-RE, 25.44% for GNN4IP, and 41.04% for OMLA, respectively). For GNN4IP, ours can even optimize/reduce the original benchmarks' area, in particular for larger circuits, demonstrating the practicality and scalability of NetDeTox.

Paperid: 482, https://arxiv.org/pdf/2510.26212.pdf

Abstract:
AI agents capable of GUI understanding and Model Context Protocol are increasingly deployed to automate mobile tasks. However, their reliance on over-privileged, static permissions creates a critical vulnerability: instruction injection. Malicious instructions, embedded in otherwise benign content like emails, can hijack the agent to perform unauthorized actions. We present AgentSentry, a lightweight runtime task-centric access control framework that enforces dynamic, task-scoped permissions. Instead of granting broad, persistent permissions, AgentSentry dynamically generates and enforces minimal, temporary policies aligned with the user's specific task (e.g., register for an app), revoking them upon completion. We demonstrate that AgentSentry successfully prevents an instruction injection attack, where an agent is tricked into forwarding private emails, while allowing the legitimate task to complete. Our approach highlights the urgent need for intent-aligned security models to safely govern the next generation of autonomous agents.

Paperid: 483, https://arxiv.org/pdf/2510.26210.pdf

Abstract:
Super apps are the cornerstones of modern digital life, embedding financial transactions into nearly every aspect of daily routine. The prevailing security paradigm for these platforms is overwhelmingly focused on pre-transaction authentication, preventing unauthorized payments before they occur. We argue that a critical vulnerability vector has been largely overlooked: the fragility of post-transaction audit trails. We investigate the ease with which a user can permanently erase their transaction history from an app's interface, thereby concealing unauthorized or sensitive activities from the account owner. To quantify this threat, we conducted an empirical study with 6 volunteers who performed a cross-evaluation on six super apps. Our findings are alarming: all six applications studied allow users to delete transaction records, yet a staggering five out of six (83+\%) fail to protect these records with strong authentication. Only one app in our study required biometric verification for deletion. This study provides the first concrete evidence of this near-ubiquitous vulnerability, demonstrating a critical gap in the current mobile security landscape and underscoring the urgent need for a paradigm shift towards ensuring post-transaction audit integrity.

Paperid: 484, https://arxiv.org/pdf/2510.12652.pdf

Abstract:
As e-commerce platforms develop, fraudulent activities are increasingly emerging, posing significant threats to the security and stability of these platforms. Promotion abuse is one of the fastest-growing types of fraud in recent years and is characterized by users exploiting promotional activities to gain financial benefits from the platform. To investigate this issue, we conduct the first study on promotion abuse fraud in e-commerce platforms MEITUAN. We find that promotion abuse fraud is a group-based fraudulent activity with two types of fraudulent activities: Stocking Up and Cashback Abuse. Unlike traditional fraudulent activities such as fake reviews, promotion abuse fraud typically involves ordinary customers conducting legitimate transactions and these two types of fraudulent activities are often intertwined. To address this issue, we propose leveraging additional information from the spatial and temporal perspectives to detect promotion abuse fraud. In this paper, we introduce PROMOGUARDIAN, a novel multi-relation fused graph neural network that integrates the spatial and temporal information of transaction data into a homogeneous graph to detect promotion abuse fraud. We conduct extensive experiments on real-world data from MEITUAN, and the results demonstrate that our proposed model outperforms state-of-the-art methods in promotion abuse fraud detection, achieving 93.15% precision, detecting 2.1 to 5.0 times more fraudsters, and preventing 1.5 to 8.8 times more financial losses in production environments.

Paperid: 485, https://arxiv.org/pdf/2510.12252.pdf

Abstract:
Prompt injection attacks deceive a large language model into completing an attacker-specified task instead of its intended task by contaminating its input data with an injected prompt, which consists of injected instruction(s) and data. Localizing the injected prompt within contaminated data is crucial for post-attack forensic analysis and data recovery. Despite its growing importance, prompt injection localization remains largely unexplored. In this work, we bridge this gap by proposing PromptLocate, the first method for localizing injected prompts. PromptLocate comprises three steps: (1) splitting the contaminated data into semantically coherent segments, (2) identifying segments contaminated by injected instructions, and (3) pinpointing segments contaminated by injected data. We show PromptLocate accurately localizes injected prompts across eight existing and eight adaptive attacks.

Paperid: 486, https://arxiv.org/pdf/2510.02389.pdf

Abstract:
Large language models show promise for vulnerability discovery, yet prevailing methods inspect code in isolation, struggle with long contexts, and focus on coarse function- or file-level detections - offering limited actionable guidance to engineers who need precise line-level localization and targeted patches in real-world software development. We present T2L-Agent (Trace-to-Line Agent), a project-level, end-to-end framework that plans its own analysis and progressively narrows scope from modules to exact vulnerable lines. T2L-Agent couples multi-round feedback with an Agentic Trace Analyzer (ATA) that fuses runtime evidence - crash points, stack traces, and coverage deltas - with AST-based code chunking, enabling iterative refinement beyond single pass predictions and translating symptoms into actionable, line-level diagnoses. To benchmark line-level vulnerability discovery, we introduce T2L-ARVO, a diverse, expert-verified 50-case benchmark spanning five crash families and real-world projects. T2L-ARVO is specifically designed to support both coarse-grained detection and fine-grained localization, enabling rigorous evaluation of systems that aim to move beyond file-level predictions. On T2L-ARVO, T2L-Agent achieves up to 58.0% detection and 54.8% line-level localization, substantially outperforming baselines. Together, the framework and benchmark push LLM-based vulnerability detection from coarse identification toward deployable, robust, precision diagnostics that reduce noise and accelerate patching in open-source software workflows.

Paperid: 487, https://arxiv.org/pdf/2510.00181.pdf

Abstract:
Embodied Artificial Intelligence (AI) promises to handle edge cases in robotic vehicle systems where data is scarce by using common-sense reasoning grounded in perception and action to generalize beyond training distributions and adapt to novel real-world situations. These capabilities, however, also create new security risks. In this paper, we introduce CHAI (Command Hijacking against embodied AI), a new class of prompt-based attacks that exploit the multimodal language interpretation abilities of Large Visual-Language Models (LVLMs). CHAI embeds deceptive natural language instructions, such as misleading signs, in visual input, systematically searches the token space, builds a dictionary of prompts, and guides an attacker model to generate Visual Attack Prompts. We evaluate CHAI on four LVLM agents; drone emergency landing, autonomous driving, and aerial object tracking, and on a real robotic vehicle. Our experiments show that CHAI consistently outperforms state-of-the-art attacks. By exploiting the semantic and multimodal reasoning strengths of next-generation embodied AI systems, CHAI underscores the urgent need for defenses that extend beyond traditional adversarial robustness.

Paperid: 488, https://arxiv.org/pdf/2509.24967.pdf

Abstract:
Prompt injection attacks pose a pervasive threat to the security of Large Language Models (LLMs). State-of-the-art prevention-based defenses typically rely on fine-tuning an LLM to enhance its security, but they achieve limited effectiveness against strong attacks. In this work, we propose \emph{SecInfer}, a novel defense against prompt injection attacks built on \emph{inference-time scaling}, an emerging paradigm that boosts LLM capability by allocating more compute resources for reasoning during inference. SecInfer consists of two key steps: \emph{system-prompt-guided sampling}, which generates multiple responses for a given input by exploring diverse reasoning paths through a varied set of system prompts, and \emph{target-task-guided aggregation}, which selects the response most likely to accomplish the intended task. Extensive experiments show that, by leveraging additional compute at inference, SecInfer effectively mitigates both existing and adaptive prompt injection attacks, outperforming state-of-the-art defenses as well as existing inference-time scaling approaches.

Paperid: 489, https://arxiv.org/pdf/2509.09207.pdf

Abstract:
Penetration testing is critical for identifying and mitigating security vulnerabilities, yet traditional approaches remain expensive, time-consuming, and dependent on expert human labor. Recent work has explored AI-driven pentesting agents, but their evaluation relies on oversimplified capture-the-flag (CTF) settings that embed prior knowledge and reduce complexity, leading to performance estimates far from real-world practice. We close this gap by introducing the first real-world, agent-oriented pentesting benchmark, TermiBench, which shifts the goal from 'flag finding' to achieving full system control. The benchmark spans 510 hosts across 25 services and 30 CVEs, with realistic environments that require autonomous reconnaissance, discrimination between benign and exploitable services, and robust exploit execution. Using this benchmark, we find that existing systems can hardly obtain system shells under realistic conditions. To address these challenges, we propose TermiAgent, a multi-agent penetration testing framework. TermiAgent mitigates long-context forgetting with a Located Memory Activation mechanism and builds a reliable exploit arsenal via structured code understanding rather than naive retrieval. In evaluations, our work outperforms state-of-the-art agents, exhibiting stronger penetration testing capability, reducing execution time and financial cost, and demonstrating practicality even on laptop-scale deployments. Our work delivers both the first open-source benchmark for real-world autonomous pentesting and a novel agent framework that establishes a milestone for AI-driven penetration testing.

Paperid: 490, https://arxiv.org/pdf/2507.10016.pdf

Abstract:
Our research uncovers a novel privacy risk associated with multimodal large language models (MLLMs): the ability to infer sensitive personal attributes from audio data -- a technique we term audio private attribute profiling. This capability poses a significant threat, as audio can be covertly captured without direct interaction or visibility. Moreover, compared to images and text, audio carries unique characteristics, such as tone and pitch, which can be exploited for more detailed profiling. However, two key challenges exist in understanding MLLM-employed private attribute profiling from audio: (1) the lack of audio benchmark datasets with sensitive attribute annotations and (2) the limited ability of current MLLMs to infer such attributes directly from audio. To address these challenges, we introduce AP^2, an audio benchmark dataset that consists of two subsets collected and composed from real-world data, and both are annotated with sensitive attribute labels. Additionally, we propose Gifts, a hybrid multi-agent framework that leverages the complementary strengths of audio-language models (ALMs) and large language models (LLMs) to enhance inference capabilities. Gifts employs an LLM to guide the ALM in inferring sensitive attributes, then forensically analyzes and consolidates the ALM's inferences, overcoming severe hallucinations of existing ALMs in generating long-context responses. Our evaluations demonstrate that Gifts significantly outperforms baseline approaches in inferring sensitive attributes. Finally, we investigate model-level and data-level defense strategies to mitigate the risks of audio private attribute profiling. Our work validates the feasibility of audio-based privacy attacks using MLLMs, highlighting the need for robust defenses, and provides a dataset and framework to facilitate future research.

Paperid: 491, https://arxiv.org/pdf/2512.20168.pdf

Abstract:
By integrating language understanding with perceptual modalities such as images, multimodal large language models (MLLMs) constitute a critical substrate for modern AI systems, particularly intelligent agents operating in open and interactive environments. However, their increasing accessibility also raises heightened risks of misuse, such as generating harmful or unsafe content. To mitigate these risks, alignment techniques are commonly applied to align model behavior with human values. Despite these efforts, recent studies have shown that jailbreak attacks can circumvent alignment and elicit unsafe outputs. Currently, most existing jailbreak methods are tailored for open-source models and exhibit limited effectiveness against commercial MLLM-integrated systems, which often employ additional filters. These filters can detect and prevent malicious input and output content, significantly reducing jailbreak threats. In this paper, we reveal that the success of these safety filters heavily relies on a critical assumption that malicious content must be explicitly visible in either the input or the output. This assumption, while often valid for traditional LLM-integrated systems, breaks down in MLLM-integrated systems, where attackers can leverage multiple modalities to conceal adversarial intent, leading to a false sense of security in existing MLLM-integrated systems. To challenge this assumption, we propose Odysseus, a novel jailbreak paradigm that introduces dual steganography to covertly embed malicious queries and responses into benign-looking images. Extensive experiments on benchmark datasets demonstrate that our Odysseus successfully jailbreaks several pioneering and realistic MLLM-integrated systems, achieving up to 99% attack success rate. It exposes a fundamental blind spot in existing defenses, and calls for rethinking cross-modal security in MLLM-integrated systems.

Paperid: 492, https://arxiv.org/pdf/2511.04711.pdf

Abstract:
Large-scale vision-language models, especially CLIP, have demonstrated remarkable performance across diverse downstream tasks. Soft prompts, as carefully crafted modules that efficiently adapt vision-language models to specific tasks, necessitate effective copyright protection. In this paper, we investigate model copyright protection by auditing whether suspicious third-party models incorporate protected soft prompts. While this can be viewed as a special case of model ownership auditing, our analysis shows that existing techniques are ineffective due to prompt learning's unique characteristics. Non-intrusive auditing is inherently prone to false positives when independent models share similar data distributions with victim models. Intrusive approaches also fail: backdoor methods designed for CLIP cannot embed functional triggers, while extending traditional DNN backdoor techniques to prompt learning suffers from harmfulness and ambiguity challenges. We find that these failures in intrusive auditing stem from the same fundamental reason: watermarking operates within the same decision space as the primary task yet pursues opposing objectives. Motivated by these findings, we propose sequential watermarking for soft prompts (SWAP), which implants watermarks into a different and more complex space. SWAP encodes watermarks through a specific order of defender-specified out-of-distribution classes, inspired by the zero-shot prediction capability of CLIP. This watermark, which is embedded in a more complex space, keeps the original prediction label unchanged, making it less opposed to the primary task. We further design a hypothesis-test-guided verification protocol for SWAP and provide theoretical analyses of success conditions. Extensive experiments on 11 datasets demonstrate SWAP's effectiveness, harmlessness, and robustness against potential adaptive attacks.

Paperid: 493, https://arxiv.org/pdf/2510.18003.pdf

Abstract:
The convergence of LLM-powered research assistants and AI-based peer review systems creates a critical vulnerability: fully automated publication loops where AI-generated research is evaluated by AI reviewers without human oversight. We investigate this through \textbf{BadScientist}, a framework that evaluates whether fabrication-oriented paper generation agents can deceive multi-model LLM review systems. Our generator employs presentation-manipulation strategies requiring no real experiments. We develop a rigorous evaluation framework with formal error guarantees (concentration bounds and calibration analysis), calibrated on real data. Our results reveal systematic vulnerabilities: fabricated papers achieve acceptance rates up to . Critically, we identify \textit{concern-acceptance conflict} -- reviewers frequently flag integrity issues yet assign acceptance-level scores. Our mitigation strategies show only marginal improvements, with detection accuracy barely exceeding random chance. Despite provably sound aggregation mathematics, integrity checking systematically fails, exposing fundamental limitations in current AI-driven review systems and underscoring the urgent need for defense-in-depth safeguards in scientific publishing.

Paperid: 494, https://arxiv.org/pdf/2510.07462.pdf

Abstract:
Internet of Things means connecting different devices through the Internet. The Internet of things enables humans to remotely manage and control the objects they use with the Internet infrastructure. After the advent of the Internet of Things in homes, organizations, and private companies, privacy and information security are the biggest concern. This issue has challenged the spread of the Internet of things as news of the users theft of information by hackers intensified. The proposed method in this paper consists of three phases. In the first phase, a star structure is constructed within each cluster, and a unique key is shared between each child and parent to encrypt and secure subsequent communications. The second phase is for intracluster communications, in which members of the cluster send their data to the cluster head in a multi hop manner. Also, in this phase, the data is encrypted with different keys in each hop, and at the end of each connection, the keys are updated to ensure data security. The third phase is to improve the security of inter cluster communications using an authentication protocol. In this way, the cluster heads are authenticated before sending information to prevent malicious nodes in the network. The proposed method is also simulated using NS2 software. The results showed that the proposed method has improved in terms of energy consumption, end-to-end delay, flexibility, packet delivery rate, and the number of alive nodes compared to other methods.

Paperid: 495, https://arxiv.org/pdf/2510.03542.pdf

Abstract:
The rapid advancement of Artificial Intelligence has enabled the development of cruise missiles endowed with high levels of autonomy, adaptability, and precision. These AI driven missiles integrating deep learning algorithms, real time data processing, and advanced guidance systems pose critical threats to strategic infrastructures, especially under complex geographic and climatic conditions such as those found in Irans Khuzestan Province. In this paper, we propose a multi layer interference model, encompassing electronic warfare, cyberattacks, and deception strategies, to degrade the performance of AI guided cruise missiles significantly. Our experimental results, derived from 400 simulation runs across four distinct scenarios, demonstrate notable improvements when employing the integrated multi layer approach compared to single layer or no interference baselines. Specifically, the average missile deviation from its intended target increases from 0.25 to 8.65 under multi layer interference a more than 3300 increase in angular deviation. Furthermore, the target acquisition success rate is reduced from 92.7 in the baseline scenario to 31.5, indicating a 66 decrease in successful strikes. While resource consumption for multi layer strategies rises by approximately 25 compared to single layer methods, the significant drop in missile accuracy and reliability justifies the more intensive deployment of jamming power, cyber resources, and decoy measures. Beyond these quantitative improvements, the proposed framework uses a deep reinforcement learning based defense coordinator to adaptively select the optimal configuration of EW, cyber, and deception tactics in real time.

Paperid: 496, https://arxiv.org/pdf/2509.25566.pdf

Abstract:
The rise of autonomous vehicles (AVs) promises to significantly enhance transportation safety and efficiency by mitigating human error, which is responsible for over 90\% of road accidents. However, the increasing connectivity of AVs introduces new cybersecurity challenges, as traditional perimeter-based security models are inadequate for dynamic and untrusted environments. This paper presents a novel Zero Trust-based Decentralized Identity Management (D-IM) protocol for AVs. By integrating the core principles of Zero Trust Architecture, "never trust, always verify", with the tamper resistant and decentralized nature of a blockchain network, our framework eliminates reliance on centralized authorities and provides continuous verification for every entity. We detail the system's design, which leverages Hyperledger Iroha to enable lightweight and secure authentication without a central trusted entity. A comprehensive experimental evaluation, conducted across both urban and highway scenarios, validates the protocol's practicality. Our results demonstrate that the D-IM framework introduces minimal overhead, with less than 7.5\% reduction in Packet Reception Rate (PRR) in urban settings and an increase of under 11\% in Channel Busy Ratio (CBR) for LTE-V2X. These findings prove the protocol's efficiency and robustness, providing a resilient foundation for securing real-time V2X communication against impersonation and replay attacks.

Paperid: 497, https://arxiv.org/pdf/2509.22873.pdf

Abstract:
Federated learning (FL) enables privacy-preserving model training by keeping data decentralized. However, it remains vulnerable to label-flipping attacks, where malicious clients manipulate labels to poison the global model. Despite their simplicity, these attacks can severely degrade model performance, and defending against them remains challenging. We introduce AntiFLipper, a novel and computationally efficient defense against multi-class label-flipping attacks in FL. Unlike existing methods that ensure security at the cost of high computational overhead, AntiFLipper employs a novel client-side detection strategy, significantly reducing the central server's burden during aggregation. Comprehensive empirical evaluations across multiple datasets under different distributions demonstrate that AntiFLipper achieves accuracy comparable to state-of-the-art defenses while requiring substantially fewer computational resources in server side. By balancing security and efficiency, AntiFLipper addresses a critical gap in existing defenses, making it particularly suitable for resource-constrained FL deployments where both model integrity and operational efficiency are essential.

Paperid: 498, https://arxiv.org/pdf/2509.06202.pdf

Abstract:
The rapid expansion of Internet of Things (IoT) systems across various domains such as industry, smart cities, healthcare, manufacturing, and government services has led to a significant increase in security risks, threatening data integrity, confidentiality, and availability. Consequently, ensuring the security and resilience of IoT systems has become a critical requirement. In this paper, we propose a lightweight and efficient intrusion detection system (IDS) for IoT environments, leveraging a hybrid model of CNN and ConvNeXt-Tiny. The proposed method is designed to detect and classify different types of network attacks, particularly botnet and malicious traffic, while the lightweight ConvNeXt-Tiny architecture enables effective deployment in resource-constrained devices and networks. A real-world dataset comprising both benign and malicious network packets collected from practical IoT scenarios was employed in the experiments. The results demonstrate that the proposed method achieves high accuracy while significantly reducing training and inference time compared to more complex models. Specifically, the system attained 99.63% accuracy in the testing phase, 99.67% accuracy in the training phase, and an error rate of 0.0107 across eight classes, while maintaining short response times and low resource consumption. These findings highlight the effectiveness of the proposed method in detecting and classifying attacks in real-world IoT environments, indicating that the lightweight architecture can serve as a practical alternative to complex and resource-intensive approaches in IoT network security.

Paperid: 499, https://arxiv.org/pdf/2509.02998.pdf

Abstract:
This full paper describes an LLM-assisted instruction integrated with a virtual cybersecurity lab platform. The digital transformation of Fourth Industrial Revolution (4IR) systems is reshaping workforce needs, widening skill gaps, especially among older workers. With rising emphasis on robotics, automation, AI, and security, re-skilling and up-skilling are essential. Generative AI can help build this workforce by acting as an instructional assistant to support skill acquisition during experiential learning. We present a generative AI instructional assistant integrated into a prior experiential learning platform. The assistant employs a zero-shot OCR-LLM pipeline within the legacy Cybersecurity Labs-as-a-Service (CLaaS) platform (2015). Text is extracted from slide images using Tesseract OCR, then simplified instructions are generated via a general-purpose LLM, enabling real-time instructional support with minimal infrastructure. The system was evaluated in a live university course where student feedback (n=42) averaged 7.83/10, indicating strong perceived usefulness. A comparative study with multimodal LLMs that directly interpret slide images showed higher performance on visually dense slides, but the OCR-LLM pipeline provided comparable pedagogical value on text-centric slides with much lower computational overhead and cost. This work demonstrates that a lightweight, easily integrable pipeline can effectively extend legacy platforms with modern generative AI, offering scalable enhancements for student comprehension in technical education.

Paperid: 500, https://arxiv.org/pdf/2509.01724.pdf

Abstract:
The Internet of Things (IoT) has emerged as a foundational paradigm supporting a range of applications, including healthcare, education, agriculture, smart homes, and, more recently, enterprise systems. However, significant advancements in IoT networks have been impeded by security vulnerabilities and threats that, if left unaddressed, could hinder the deployment and operation of IoT based systems. Detecting unwanted activities within the IoT is crucial, as it directly impacts confidentiality, integrity, and availability. Consequently, intrusion detection has become a fundamental research area and the focus of numerous studies. An intrusion detection system (IDS) is essential to the IoTs alarm mechanisms, enabling effective security management. This paper examines IoT security and introduces an intelligent two-layer intrusion detection system for IoT. Machine learning techniques power the system's intelligence, with a two layer structure enhancing intrusion detection. By selecting essential features, the system maintains detection accuracy while minimizing processing overhead. The proposed method for intrusion detection in IoT is implemented in two phases. In the first phase, the Grasshopper Optimization Algorithm (GOA) is applied for feature selection. In the second phase, the Support Vector Machine (SVM) algorithm is used to detect intrusions. The method was implemented in MATLAB, and the NSLKDD dataset was used for evaluation. Simulation results show that the proposed method improves accuracy compared to other approaches.

Paperid: 501, https://arxiv.org/pdf/2509.01717.pdf

Abstract:
This paper presents a novel multi-layered hybrid security approach aimed at enhancing lightweight encryption for IoT-Cloud systems. The primary goal is to overcome limitations inherent in conventional solutions such as TPA, Blockchain, ECDSA and ZSS which often fall short in terms of data protection, computational efficiency and scalability. Our proposed method strategically refines and integrates these technologies to address their shortcomings while maximizing their individual strengths. By doing so we create a more reliable and high-performance framework for secure data exchange across heterogeneous environments. The model leverages the combined potential of emerging technologies, particularly Blockchain, IoT and Cloud computing which when effectively coordinated offer significant advancements in security architecture. The proposed framework consists of three core layers: (1) the H.E.EZ Layer which integrates improved versions of Hyperledger Fabric, Enc-Block and a hybrid ECDSA-ZSS scheme to improve encryption speed, scalability and reduce computational cost; (2) the Credential Management Layer independently verifying data integrity and authenticity; and (3) the Time and Auditing Layer designed to reduce traffic overhead and optimize performance across dynamic workloads. Evaluation results highlight that the proposed solution not only strengthens security but also significantly improves execution time, communication efficiency and system responsiveness, offering a robust path forward for next-generation IoT-Cloud infrastructures.

Paperid: 502, https://arxiv.org/pdf/2509.00647.pdf

Abstract:
The rapid growth of hardware vulnerabilities has created an urgent need for systematic and scalable analysis methods. Unlike software flaws, which are often patchable post-deployment, hardware weaknesses remain embedded across product lifecycles, posing persistent risks to processors, embedded devices, and IoT platforms. Existing efforts such as the MITRE CWE Hardware List (2021) relied on expert-driven Delphi surveys, which lack statistical rigor and introduce subjective bias, while large-scale data-driven foundations for hardware weaknesses have been largely absent. In this work, we propose LLM-HyPZ, an LLM-assisted hybrid framework for zero-shot knowledge extraction and refinement from vulnerability corpora. Our approach integrates zero-shot LLM classification, contextualized embeddings, unsupervised clustering, and prompt-driven summarization to mine hardware-related CVEs at scale. Applying LLM-HyPZ to the 2021-2024 CVE corpus (114,836 entries), we identified 1,742 hardware-related vulnerabilities. We distilled them into five recurring themes, including privilege escalation via firmware and BIOS, memory corruption in mobile and IoT systems, and physical access exploits. Benchmarking across seven LLMs shows that LLaMA 3.3 70B achieves near-perfect classification accuracy (99.5%) on a curated validation set. Beyond methodological contributions, our framework directly supported the MITRE CWE Most Important Hardware Weaknesses (MIHW) 2025 update by narrowing the candidate search space. Specifically, our pipeline surfaced 411 of the 1,026 CVEs used for downstream MIHW analysis, thereby reducing expert workload and accelerating evidence gathering. These results establish LLM-HyPZ as the first data-driven, scalable approach for systematically discovering hardware vulnerabilities, thereby bridging the gap between expert knowledge and real-world vulnerability evidence.

Paperid: 503, https://arxiv.org/pdf/2508.21432.pdf

Abstract:
The rapid development of Large Language Models (LLMs) for code generation has transformed software development by automating coding tasks with unprecedented efficiency. However, the training of these models on open-source code repositories (e.g., from GitHub) raises critical ethical and legal concerns, particularly regarding data authorization and open-source license compliance. Developers are increasingly questioning whether model trainers have obtained proper authorization before using repositories for training, especially given the lack of transparency in data collection. To address these concerns, we propose a novel data marking framework RepoMark to audit the data usage of code LLMs. Our method enables repository owners to verify whether their code has been used in training, while ensuring semantic preservation, imperceptibility, and theoretical false detection rate (FDR) guarantees. By generating multiple semantically equivalent code variants, RepoMark introduces data marks into the code files, and during detection, RepoMark leverages a novel ranking-based hypothesis test to detect memorization within the model. Compared to prior data auditing approaches, RepoMark significantly enhances sample efficiency, allowing effective auditing even when the user's repository possesses only a small number of code files. Experiments demonstrate that RepoMark achieves a detection success rate over 90\% on small code repositories under a strict FDR guarantee of 5\%. This represents a significant advancement over existing data marking techniques, all of which only achieve accuracy below 55\% under identical settings. This further validates RepoMark as a robust, theoretically sound, and promising solution for enhancing transparency in code LLM training, which can safeguard the rights of repository owners.

Paperid: 504, https://arxiv.org/pdf/2508.12398.pdf

Abstract:
Diffusion Large Language Models (dLLMs) have recently emerged as a competitive non-autoregressive paradigm due to their unique training and inference approach. However, there is currently a lack of safety study on this novel architecture. In this paper, we present the first analysis of dLLMs' safety performance and propose a novel safety alignment method tailored to their unique generation characteristics. Specifically, we identify a critical asymmetry between the defender and attacker in terms of security. For the defender, we reveal that the middle tokens of the response, rather than the initial ones, are more critical to the overall safety of dLLM outputs; this seems to suggest that aligning middle tokens can be more beneficial to the defender. The attacker, on the contrary, may have limited power to manipulate middle tokens, as we find dLLMs have a strong tendency towards a sequential generation order in practice, forcing the attack to meet this distribution and diverting it from influencing the critical middle tokens. Building on this asymmetry, we introduce Middle-tOken Safety Alignment (MOSA), a novel method that directly aligns the model's middle generation with safe refusals exploiting reinforcement learning. We implement MOSA and compare its security performance against eight attack methods on two benchmarks. We also test the utility of MOSA-aligned dLLM on coding, math, and general reasoning. The results strongly prove the superiority of MOSA.

Paperid: 505, https://arxiv.org/pdf/2507.22037.pdf

Abstract:
The rapid advancement of multimodal large language models (MLLMs) has led to breakthroughs in various applications, yet their security remains a critical challenge. One pressing issue involves unsafe image-query pairs--jailbreak inputs specifically designed to bypass security constraints and elicit unintended responses from MLLMs. Compared to general multimodal data, such unsafe inputs are relatively sparse, which limits the diversity and richness of training samples available for developing robust defense models. Meanwhile, existing guardrail-type methods rely on external modules to enforce security constraints but fail to address intrinsic vulnerabilities within MLLMs. Traditional supervised fine-tuning (SFT), on the other hand, often over-refuses harmless inputs, compromising general performance. Given these challenges, we propose Secure Tug-of-War (SecTOW), an innovative iterative defense-attack training method to enhance the security of MLLMs. SecTOW consists of two modules: a defender and an auxiliary attacker, both trained iteratively using reinforcement learning (GRPO). During the iterative process, the attacker identifies security vulnerabilities in the defense model and expands jailbreak data. The expanded data are then used to train the defender, enabling it to address identified security vulnerabilities. We also design reward mechanisms used for GRPO to simplify the use of response labels, reducing dependence on complex generative labels and enabling the efficient use of synthetic data. Additionally, a quality monitoring mechanism is used to mitigate the defender's over-refusal of harmless inputs and ensure the diversity of the jailbreak data generated by the attacker. Experimental results on safety-specific and general benchmarks demonstrate that SecTOW significantly improves security while preserving general performance.

Paperid: 506, https://arxiv.org/pdf/2507.07056.pdf

Abstract:
The proliferation of Low-Rank Adaptation (LoRA) models has democratized personalized text-to-image generation, enabling users to share lightweight models (e.g., personal portraits) on platforms like Civitai and Liblib. However, this "share-and-play" ecosystem introduces critical risks: benign LoRAs can be weaponized by adversaries to generate harmful content (e.g., political, defamatory imagery), undermining creator rights and platform safety. Existing defenses like concept-erasure methods focus on full diffusion models (DMs), neglecting LoRA's unique role as a modular adapter and its vulnerability to adversarial prompt engineering. To bridge this gap, we propose LoRAShield, the first data-free editing framework for securing LoRA models against misuse. Our platform-driven approach dynamically edits and realigns LoRA's weight subspace via adversarial optimization and semantic augmentation. Experimental results demonstrate that LoRAShield achieves remarkable effectiveness, efficiency, and robustness in blocking malicious generations without sacrificing the functionality of the benign task. By shifting the defense to platforms, LoRAShield enables secure, scalable sharing of personalized models, a critical step toward trustworthy generative ecosystems.

Paperid: 507, https://arxiv.org/pdf/2507.05512.pdf

Abstract:
Distinguishing AI-generated code from human-written code is becoming crucial for tasks such as authorship attribution, content tracking, and misuse detection. Based on this, N-gram-based watermarking schemes have emerged as prominent, which inject secret watermarks to be detected during the generation. However, their robustness in code content remains insufficiently evaluated. Most claims rely solely on defenses against simple code transformations or code optimizations as a simulation of attack, creating a questionable sense of robustness. In contrast, more sophisticated schemes already exist in the software engineering world, e.g., code obfuscation, which significantly alters code while preserving functionality. Although obfuscation is commonly used to protect intellectual property or evade software scanners, the robustness of code watermarking techniques against such transformations remains largely unexplored. In this work, we formally model the code obfuscation and prove the impossibility of N-gram-based watermarking's robustness with only one intuitive and experimentally verified assumption, distribution consistency, satisfied. Given the original false positive rate of the watermarking detection, the ratio that the detector failed on the watermarked code after obfuscation will increase to 1 - fpr. The experiments have been performed on three SOTA watermarking schemes, two LLMs, two programming languages, four code benchmarks, and four obfuscators. Among them, all watermarking detectors show coin-flipping detection abilities on obfuscated codes (AUROC tightly surrounds 0.5). Among all models, watermarking schemes, and datasets, both programming languages own obfuscators that can achieve attack effects with no detection AUROC higher than 0.6 after the attack. Based on the theoretical and practical observations, we also proposed a potential path of robust code watermarking.

Paperid: 508, https://arxiv.org/pdf/2507.04055.pdf

Abstract:
Malware Family Classification (MFC) aims to identify the fine-grained family (e.g., GuLoader or BitRAT) to which a potential malware sample belongs, in contrast to malware detection or sample classification that predicts only an Yes/No. Accurate family identification can greatly facilitate automated sample labeling and understanding on crowdsourced malware analysis platforms such as VirusTotal and MalwareBazaar, which generate vast amounts of data daily. In this paper, we explore and assess the feasibility of using traditional binary string features for MFC in the new era of large language models (LLMs) and Retrieval-Augmented Generation (RAG). Specifically, we investigate how Family-Specific String (FSS) features could be utilized in a manner similar to RAG to facilitate MFC. To this end, we develop a curated evaluation framework covering 4,347 samples from 67 malware families, extract and analyze over 25 million strings, and conduct detailed ablation studies to assess the impact of different design choices in four major modules.

Paperid: 509, https://arxiv.org/pdf/2512.21871.pdf

Abstract:
Large vision-language models (LVLMs) have achieved remarkable advancements in multimodal reasoning tasks. However, their widespread accessibility raises critical concerns about potential copyright infringement. Will LVLMs accurately recognize and comply with copyright regulations when encountering copyrighted content (i.e., user input, retrieved documents) in the context? Failure to comply with copyright regulations may lead to serious legal and ethical consequences, particularly when LVLMs generate responses based on copyrighted materials (e.g., retrieved book experts, news reports). In this paper, we present a comprehensive evaluation of various LVLMs, examining how they handle copyrighted content -- such as book excerpts, news articles, music lyrics, and code documentation when they are presented as visual inputs. To systematically measure copyright compliance, we introduce a large-scale benchmark dataset comprising 50,000 multimodal query-content pairs designed to evaluate how effectively LVLMs handle queries that could lead to copyright infringement. Given that real-world copyrighted content may or may not include a copyright notice, the dataset includes query-content pairs in two distinct scenarios: with and without a copyright notice. For the former, we extensively cover four types of copyright notices to account for different cases. Our evaluation reveals that even state-of-the-art closed-source LVLMs exhibit significant deficiencies in recognizing and respecting the copyrighted content, even when presented with the copyright notice. To solve this limitation, we introduce a novel tool-augmented defense framework for copyright compliance, which reduces infringement risks in all scenarios. Our findings underscore the importance of developing copyright-aware LVLMs to ensure the responsible and lawful use of copyrighted content.

Paperid: 510, https://arxiv.org/pdf/2512.21132.pdf

Abstract:
As LLMs see wide adoption in software engineering, the reliable assessment of the correctness and security of LLM-generated code is crucial. Notably, prior work has demonstrated that security is often overlooked, exposing that LLMs are prone to generating code with security vulnerabilities. These insights were enabled by specialized benchmarks, crafted through significant manual effort by security experts. However, relying on manually-crafted benchmarks is insufficient in the long term, because benchmarks (i) naturally end up contaminating training data, (ii) must extend to new tasks to provide a more complete picture, and (iii) must increase in difficulty to challenge more capable LLMs. In this work, we address these challenges and present AutoBaxBuilder, a framework that generates tasks and tests for code security benchmarking from scratch. We introduce a robust pipeline with fine-grained plausibility checks, leveraging the code understanding capabilities of LLMs to construct functionality tests and end-to-end security-probing exploits. To confirm the quality of the generated benchmark, we conduct both a qualitative analysis and perform quantitative experiments, comparing it against tasks constructed by human experts. We use AutoBaxBuilder to construct entirely new tasks and release them to the public as AutoBaxBench, together with a thorough evaluation of the security capabilities of LLMs on these tasks. We find that a new task can be generated in under 2 hours, costing less than USD 10.

Paperid: 511, https://arxiv.org/pdf/2512.16310.pdf

Abstract:
Driven by Large Language Models, the single-agent, multi-tool architecture has become a popular paradigm for autonomous agents due to its simplicity and effectiveness. However, this architecture also introduces a new and severe privacy risk, which we term Tools Orchestration Privacy Risk (TOP-R), where an agent, to achieve a benign user goal, autonomously aggregates information fragments across multiple tools and leverages its reasoning capabilities to synthesize unexpected sensitive information. We provide the first systematic study of this risk. First, we establish a formal framework, attributing the risk's root cause to the agent's misaligned objective function: an overoptimization for helpfulness while neglecting privacy awareness. Second, we construct TOP-Bench, comprising paired leakage and benign scenarios, to comprehensively evaluate this risk. To quantify the trade-off between safety and robustness, we introduce the H-Score as a holistic metric. The evaluation results reveal that TOP-R is a severe risk: the average Risk Leakage Rate (RLR) of eight representative models reaches 90.24%, while the average H-Score is merely 0.167, with no model exceeding 0.3. Finally, we propose the Privacy Enhancement Principle (PEP) method, which effectively mitigates TOP-R, reducing the Risk Leakage Rate to 46.58% and significantly improving the H-Score to 0.624. Our work reveals both a new class of risk and inherent structural limitations in current agent architectures, while also offering feasible mitigation strategies.

Paperid: 512, https://arxiv.org/pdf/2512.15179.pdf

Abstract:
As the Ethereum platform continues to mature and gain widespread usage, it is crucial to maintain high standards of smart contract writing practices. While bad practices in smart contracts may not directly lead to security issues, they elevate the risk of encountering problems. Therefore, to understand and avoid these bad practices, this paper introduces the first systematic study of bad practices in smart contracts, delving into over 47 specific issues. Specifically, we propose SCALM, an LLM-powered framework featuring two methodological innovations: (1) A hybrid architecture that combines context-aware function-level slicing with knowledge-enhanced semantic reasoning via extensible vectorized pattern matching. (2) A multi-layer reasoning verification system connects low-level code patterns with high-level security principles through syntax, design patterns, and architecture analysis. Our extensive experiments using multiple LLMs and datasets have shown that SCALM outperforms existing tools in detecting bad practices in smart contracts.

Paperid: 513, https://arxiv.org/pdf/2512.15030.pdf

Abstract:
The web3 applications have recently been growing, especially on the Ethereum platform, starting to become the target of scammers. The web3 scams, imitating the services provided by legitimate platforms, mimic regular activity to deceive users. However, previous studies have primarily concentrated on de-anonymization and phishing nodes, neglecting the distinctive features of web3 scams. Moreover, the current phishing account detection tools utilize graph learning or sampling algorithms to obtain graph features. However, large-scale transaction networks with temporal attributes conform to a power-law distribution, posing challenges in detecting web3 scams. To overcome these challenges, we present ScamSweeper, a novel framework that emphasizes the dynamic evolution of transaction graphs, to identify web3 scams on Ethereum. ScamSweeper samples the network with a structure temporal random walk, which is an optimized sample walking method that considers both temporal attributes and structural information. Then, the directed graph encoder generates the features of each subgraph during different temporal intervals, sorting as a sequence. Moreover, a variational Transformer is utilized to extract the dynamic evolution in the subgraph sequence. Furthermore, we collect a large-scale transaction dataset consisting of web3 scams, phishing, and normal accounts, which are from the first 18 million block heights on Ethereum. Subsequently, we comprehensively analyze the distinctions in various attributes, including nodes, edges, and degree distribution. Our experiments indicate that ScamSweeper outperforms SIEGE, Ethident, and PDTGA in detecting web3 scams, achieving a weighted F1-score improvement of at least 17.29% with the base value of 0.59. In addition, ScamSweeper in phishing node detection achieves at least a 17.5% improvement over DGTSG and BERT4ETH in F1-score from 0.80.

Paperid: 514, https://arxiv.org/pdf/2512.14557.pdf

Abstract:
Causal inference plays a crucial role in scientific research across multiple disciplines. Estimating causal effects, particularly the average treatment effect (ATE), from observational data has garnered significant attention. However, computing the ATE from real-world observational data poses substantial privacy risks to users. Differential privacy, which offers strict theoretical guarantees, has emerged as a standard approach for privacy-preserving data analysis. However, existing differentially private ATE estimation works rely on specific assumptions, provide limited privacy protection, or fail to offer comprehensive information protection. To this end, we introduce PrivATE, a practical ATE estimation framework that ensures differential privacy. In fact, various scenarios require varying levels of privacy protection. For example, only test scores are generally sensitive information in education evaluation, while all types of medical record data are usually private. To accommodate different privacy requirements, we design two levels (i.e., label-level and sample-level) of privacy protection in PrivATE. By deriving an adaptive matching limit, PrivATE effectively balances noise-induced error and matching error, leading to a more accurate estimate of ATE. Our evaluation validates the effectiveness of PrivATE. PrivATE outperforms the baselines on all datasets and privacy budgets.

Paperid: 515, https://arxiv.org/pdf/2512.14439.pdf

Abstract:
Video recognition systems are increasingly being deployed in daily life, such as content recommendation and security monitoring. To enhance video recognition development, many institutions have released high-quality public datasets with open-source licenses for training advanced models. At the same time, these datasets are also susceptible to misuse and infringement. Dataset copyright auditing is an effective solution to identify such unauthorized use. However, existing dataset copyright solutions primarily focus on the image domain; the complex nature of video data leaves dataset copyright auditing in the video domain unexplored. Specifically, video data introduces an additional temporal dimension, which poses significant challenges to the effectiveness and stealthiness of existing methods. In this paper, we propose VICTOR, the first dataset copyright auditing approach for video recognition systems. We develop a general and stealthy sample modification strategy that enhances the output discrepancy of the target model. By modifying only a small proportion of samples (e.g., 1%), VICTOR amplifies the impact of published modified samples on the prediction behavior of the target models. Then, the difference in the model's behavior for published modified and unpublished original samples can serve as a key basis for dataset auditing. Extensive experiments on multiple models and datasets highlight the superiority of VICTOR. Finally, we show that VICTOR is robust in the presence of several perturbation mechanisms to the training videos or the target models.

Paperid: 516, https://arxiv.org/pdf/2512.08372.pdf

Abstract:
In the case of upgrading smart contracts on blockchain systems, it is essential to consider the continuity of upgrade and subsequent maintenance. In practice, upgrade operations often introduce new vulnerabilities. To address this, we propose an Upgradeable Smart Contract Security Analyzer, USCSA, which evaluates the risks associated with the upgrade process using the Abstract Syntax Tree (AST) difference analysis. We collected and analyzed 3,546 cases of vulnerabilities in upgradeable contracts, covering common vulnerability categories such as reentrancy, access control flaws, and integer overflow. Experimental results show that USCSA achieves an accuracy of 92.3%, recall of 89.7%, and F1-score of 91.0% in detecting upgrade-induced vulnerabilities. In addition, compared with traditional methods, the efficiency of mapping high-risk changes has increased by approximately 30%. As a result, USCSA provides a significant advantage to improve the security and integrity of upgradeable smart contracts, providing a novel and efficient solution to secure audits on blockchain applications.

Paperid: 517, https://arxiv.org/pdf/2512.06846.pdf

Abstract:
Traditional approaches for smart contract analysis often rely on intermediate representations such as abstract syntax trees, control-flow graphs, or static single assignment form. However, these methods face limitations in capturing both semantic structures and control logic. Knowledge graphs, by contrast, offer a structured representation of entities and relations, enabling richer intermediate abstractions of contract code and supporting the use of graph query languages to identify rule-violating elements. This paper presents CKG-LLM, a framework for detecting access-control vulnerabilities in smart contracts. Leveraging the reasoning and code generation capabilities of large language models, CKG-LLM translates natural-language vulnerability patterns into executable queries over contract knowledge graphs to automatically locate vulnerable code elements. Experimental evaluation demonstrates that CKG-LLM achieves superior performance in detecting access-control vulnerabilities compared to existing tools. Finally, we discuss potential extensions of CKG-LLM as part of future research directions.

Paperid: 518, https://arxiv.org/pdf/2512.02399.pdf

Abstract:
Smart contracts are a core component of blockchain technology and are widely deployed across various scenarios. However, atomicity violations have become a potential security risk. Existing analysis tools often lack the precision required to detect these issues effectively. To address this challenge, we introduce AtomGraph, an automated framework designed for detecting atomicity violations. This framework leverages Graph Convolutional Networks (GCN) to identify atomicity violations through multimodal feature learning and fusion. Specifically, driven by a collaborative learning mechanism, the model simultaneously learns from two heterogeneous modalities: extracting structural topological features from the contract's Control Flow Graph (CFG) and uncovering deep semantics from its opcode sequence. We designed an adaptive weighted fusion mechanism to dynamically adjust the weights of features from each modality to achieve optimal feature fusion. Finally, GCN detects graph-level atomicity violation on the contract. Comprehensive experimental evaluations demonstrate that AtomGraph achieves 96.88% accuracy and 96.97% F1 score, outperforming existing tools. Furthermore, compared to the concatenation fusion model, AtomGraph improves the F1 score by 6.4%, proving its potential in smart contract security detection.

Paperid: 519, https://arxiv.org/pdf/2512.01577.pdf

Abstract:
Non-fungible tokens (NFTs) serve as a representative form of digital asset ownership and have attracted numerous investors, creators, and tech enthusiasts in recent years. However, related fraud activities, especially phishing scams, have caused significant property losses. There are many graph analysis methods to detect malicious scam incidents, but no research on the transaction patterns of the NFT scams. Therefore, to fill this gap, we are the first to systematically explore NFT phishing frauds through graph analysis, aiming to comprehensively investigate the characteristics and patterns of NFT phishing frauds on the transaction graph. During the research process, we collect transaction records, log data, and security reports related to NFT phishing incidents published on multiple platforms. After collecting, sanitizing, and unifying the data, we construct a transaction graph and analyze the distribution, transaction features, and interaction patterns of NFT phishing scams. We find that normal transactions on the blockchain accounted for 96.71% of all transactions. Although phishing-related accounts accounted for only 0.94% of the total accounts, they appeared in 8.36% of the transaction scenarios, and their interaction probability with normal accounts is significantly higher in large-scale transaction networks. Moreover, NFT phishing scammers often carry out fraud in a collective manner, targeting specific accounts, tend to interact with victims through multiple token standards, have shorter transaction cycles than normal transactions, and involve more multi-party transactions. This study reveals the core behavioral features of NFT phishing scams, providing important references for the detection and prevention of NFT phishing scams in the future.

Paperid: 520, https://arxiv.org/pdf/2511.18138.pdf

Abstract:
Multimodal learning has shown significant superiority on various tasks by integrating multiple modalities. However, the interdependencies among modalities increase the susceptibility of multimodal models to adversarial attacks. Existing methods mainly focus on attacks on specific modalities or indiscriminately attack all modalities. In this paper, we find that these approaches ignore the differences between modalities in their contribution to final robustness, resulting in suboptimal robustness performance. To bridge this gap, we introduce Vulnerability-Aware Robust Multimodal Adversarial Training (VARMAT), a probe-in-training adversarial training method that improves multimodal robustness by identifying the vulnerability of each modality. To be specific, VARMAT first explicitly quantifies the vulnerability of each modality, grounded in a first-order approximation of the attack objective (Probe). Then, we propose a targeted regularization term that penalizes modalities with high vulnerability, guiding robust learning while maintaining task accuracy (Training). We demonstrate the enhanced robustness of our method across multiple multimodal datasets involving diverse modalities. Finally, we achieve {12.73%, 22.21%, 11.19%} robustness improvement on three multimodal datasets, revealing a significant blind spot in multimodal adversarial training.

Paperid: 521, https://arxiv.org/pdf/2511.01144.pdf

Abstract:
Large Language Models (LLMs) have demonstrated strong capabilities in natural language reasoning, yet their application to Cyber Threat Intelligence (CTI) remains limited. CTI analysis involves distilling large volumes of unstructured reports into actionable knowledge, a process where LLMs could substantially reduce analyst workload. CTIBench introduced a comprehensive benchmark for evaluating LLMs across multiple CTI tasks. In this work, we extend CTIBench by developing AthenaBench, an enhanced benchmark that includes an improved dataset creation pipeline, duplicate removal, refined evaluation metrics, and a new task focused on risk mitigation strategies. We evaluate twelve LLMs, including state-of-the-art proprietary models such as GPT-5 and Gemini-2.5 Pro, alongside seven open-source models from the LLaMA and Qwen families. While proprietary LLMs achieve stronger results overall, their performance remains subpar on reasoning-intensive tasks, such as threat actor attribution and risk mitigation, with open-source models trailing even further behind. These findings highlight fundamental limitations in the reasoning capabilities of current LLMs and underscore the need for models explicitly tailored to CTI workflows and automation.

Paperid: 522, https://arxiv.org/pdf/2511.00408.pdf

Abstract:
Decentralized finance (DeFi) protocols are crypto projects developed on the blockchain to manage digital assets. Attacks on DeFi have been frequent and have resulted in losses exceeding $80 billion. Current tools detect and locate possible vulnerabilities in contracts by analyzing the state changes that may occur during malicious events. However, this victim-only approaches seldom possess the capability to cover the attacker's interaction intention logic. Furthermore, only a minuscule percentage of DeFi protocols experience attacks in real-world scenarios, which poses a significant challenge for these detection tools to demonstrate practical effectiveness. In this paper, we propose DeFiTail, the first framework that utilizes deep learning technology for access control and flash loan exploit detection. Through feeding the cross-contract static data flow, DeFiTail automatically learns the attack logic in real-world malicious events that occur on DeFi protocols, capturing the threat patterns between attacker and victim contracts. Since the DeFi protocol events involve interactions with multi-account transactions, the execution path with external and internal transactions requires to be unified. Moreover, to mitigate the impact of mistakes in Control Flow Graph (CFG) connections, DeFiTail validates the data path by employing the symbolic execution stack. Furthermore, we feed the data paths through our model to achieve the inspection of DeFi protocols. Comparative experiment results indicate that DeFiTail achieves the highest accuracy, with 98.39% in access control and 97.43% in flash loan exploits. DeFiTail also demonstrates an enhanced capability to detect malicious contracts, identifying 86.67% accuracy from the CVE dataset.

Paperid: 523, https://arxiv.org/pdf/2510.08496.pdf

Abstract:
Transportation Cyber-Physical Systems (TCPS) integrate physical elements, such as transportation infrastructure and vehicles, with cyber elements via advanced communication technologies, allowing them to interact seamlessly. This integration enhances the efficiency, safety, and sustainability of transportation systems. TCPS rely heavily on cryptographic security to protect sensitive information transmitted between vehicles, transportation infrastructure, and other entities within the transportation ecosystem, ensuring data integrity, confidentiality, and authenticity. Traditional cryptographic methods have been employed to secure TCPS communications, but the advent of quantum computing presents a significant threat to these existing security measures. Therefore, integrating Post-Quantum Cryptography (PQC) into TCPS is essential to maintain secure and resilient communications. While PQC offers a promising approach to developing cryptographic algorithms resistant to quantum attacks, artificial intelligence (AI) can enhance PQC by optimizing algorithm selection, resource allocation, and adapting to evolving threats in real-time. AI-driven PQC approaches can improve the efficiency and effectiveness of PQC implementations, ensuring robust security without compromising system performance. This chapter introduces TCPS communication protocols, discusses the vulnerabilities of corresponding communications to cyber-attacks, and explores the limitations of existing cryptographic methods in the quantum era. By examining how AI can strengthen PQC solutions, the chapter presents cyber-resilient communication strategies for TCPS.

Paperid: 524, https://arxiv.org/pdf/2510.06607.pdf

Abstract:
Computer-use agent (CUA) frameworks, powered by large language models (LLMs) or multimodal LLMs (MLLMs), are rapidly maturing as assistants that can perceive context, reason, and act directly within software environments. Among their most critical applications is operating system (OS) control. As CUAs in the OS domain become increasingly embedded in daily operations, it is imperative to examine their real-world security implications, specifically whether CUAs can be misused to perform realistic, security-relevant attacks. Existing works exhibit four major limitations: Missing attacker-knowledge model on tactics, techniques, and procedures (TTP), Incomplete coverage for end-to-end kill chains, unrealistic environment without multi-host and encrypted user credentials, and unreliable judgment dependent on LLM-as-a-Judge. To address these gaps, we propose AdvCUA, the first benchmark aligned with real-world TTPs in MITRE ATT&CK Enterprise Matrix, which comprises 140 tasks, including 40 direct malicious tasks, 74 TTP-based malicious tasks, and 26 end-to-end kill chains, systematically evaluates CUAs under a realistic enterprise OS security threat in a multi-host environment sandbox by hard-coded evaluation. We evaluate the existing five mainstream CUAs, including ReAct, AutoGPT, Gemini CLI, Cursor CLI, and Cursor IDE based on 8 foundation LLMs. The results demonstrate that current frontier CUAs do not adequately cover OS security-centric threats. These capabilities of CUAs reduce dependence on custom malware and deep domain expertise, enabling even inexperienced attackers to mount complex enterprise intrusions, which raises social concern about the responsibility and security of CUAs.

Paperid: 525, https://arxiv.org/pdf/2510.05169.pdf

Abstract:
Large Language Models (LLMs) can acquire deceptive behaviors through backdoor attacks, where the model executes prohibited actions whenever secret triggers appear in the input. Existing safety training methods largely fail to address this vulnerability, due to the inherent difficulty of uncovering hidden triggers implanted in the model. Motivated by recent findings on LLMs' situational awareness, we propose a novel post-training framework that cultivates self-awareness of backdoor risks and enables models to articulate implanted triggers even when they are absent from the prompt. At its core, our approach introduces an inversion-inspired reinforcement learning framework that encourages models to introspectively reason about their own behaviors and reverse-engineer the triggers responsible for misaligned outputs. Guided by curated reward signals, this process transforms a poisoned model into one capable of precisely identifying its implanted trigger. Surprisingly, we observe that such backdoor self-awareness emerges abruptly within a short training window, resembling a phase transition in capability. Building on this emergent property, we further present two complementary defense strategies for mitigating and detecting backdoor threats. Experiments on five backdoor attacks, compared against six baseline methods, demonstrate that our approach has strong potential to improve the robustness of LLMs against backdoor risks. The code is available at LLM Backdoor Self-Awareness.

Paperid: 526, https://arxiv.org/pdf/2510.02356.pdf

Abstract:
The deployment of Large Language Models (LLMs) in embodied agents creates an urgent need to measure their privacy awareness in the physical world. Existing evaluation methods, however, are confined to natural language based scenarios. To bridge this gap, we introduce EAPrivacy, a comprehensive evaluation benchmark designed to quantify the physical-world privacy awareness of LLM-powered agents. EAPrivacy utilizes procedurally generated scenarios across four tiers to test an agent's ability to handle sensitive objects, adapt to changing environments, balance task execution with privacy constraints, and resolve conflicts with social norms. Our measurements reveal a critical deficit in current models. The top-performing model, Gemini 2.5 Pro, achieved only 59\% accuracy in scenarios involving changing physical environments. Furthermore, when a task was accompanied by a privacy request, models prioritized completion over the constraint in up to 86\% of cases. In high-stakes situations pitting privacy against critical social norms, leading models like GPT-4o and Claude-3.5-haiku disregarded the social norm over 15\% of the time. These findings, demonstrated by our benchmark, underscore a fundamental misalignment in LLMs regarding physically grounded privacy and establish the need for more robust, physically-aware alignment.

Paperid: 527, https://arxiv.org/pdf/2509.26640.pdf

Abstract:
Due to the susceptibility of Artificial Intelligence (AI) to data perturbations and adversarial examples, it is crucial to perform a thorough robustness evaluation before any Machine Learning (ML) model is deployed. However, examining a model's decision boundaries and identifying potential vulnerabilities typically requires access to the training and testing datasets, which may pose risks to data privacy and confidentiality. To improve transparency in organizations that handle confidential data or manage critical infrastructure, it is essential to allow external verification and validation of AI without the disclosure of private datasets. This paper presents Systematic Pattern Analysis (SPATA), a deterministic method that converts any tabular dataset to a domain-independent representation of its statistical patterns, to provide more detailed and transparent data cards. SPATA computes the projection of each data instance into a discrete space where they can be analyzed and compared, without risking data leakage. These projected datasets can be reliably used for the evaluation of how different features affect ML model robustness and for the generation of interpretable explanations of their behavior, contributing to more trustworthy AI.

Paperid: 528, https://arxiv.org/pdf/2509.16899.pdf

Abstract:
The interest in autonomous vehicles (AVs) for critical missions, including transportation, rescue, surveillance, reconnaissance, and mapping, is growing rapidly due to their significant safety and mobility benefits. AVs consist of complex software systems that leverage artificial intelligence (AI), sensor fusion algorithms, and real-time data processing. Additionally, AVs are becoming increasingly reliant on open-source software supply chains, such as open-source packages, third-party software components, AI models, and third-party datasets. Software security best practices in the automotive sector are often an afterthought for developers. Thus, significant cybersecurity risks exist in the software supply chain of AVs, particularly when secure software development practices are not rigorously implemented. For example, Upstream's 2024 Automotive Cybersecurity Report states that 49.5% of cyberattacks in the automotive sector are related to exploiting security vulnerabilities in software systems. In this chapter, we analyze security vulnerabilities in open-source software components in AVs. We utilize static analyzers on popular open-source AV software, such as Autoware, Apollo, and openpilot. Specifically, this chapter covers: (1) prevalent software security vulnerabilities of AVs; and (2) a comparison of static analyzer outputs for different open-source AV repositories. The goal is to inform researchers, practitioners, and policymakers about the existing security flaws in the commonplace open-source software ecosystem in the AV domain. The findings would emphasize the necessity of security best practices earlier in the software development lifecycle to reduce cybersecurity risks, thereby ensuring system reliability, safeguarding user data, and maintaining public trust in an increasingly automated world.

Paperid: 529, https://arxiv.org/pdf/2509.16489.pdf

Abstract:
Intelligent Transportation Systems (ITS) have been widely deployed across major metropolitan regions worldwide to improve roadway safety, optimize traffic flow, and reduce environmental impacts. These systems integrate advanced sensors, communication networks, and data analytics to enable real-time traffic monitoring, adaptive signal control, and predictive maintenance. However, such integration significantly broadens the ITS attack surface, exposing critical infrastructures to cyber threats that jeopardize safety, data integrity, and operational resilience. Ensuring robust cybersecurity is therefore essential, yet comprehensive vulnerability assessments, threat modeling, and mitigation validations are often cost-prohibitive and time-intensive when applied to large-scale, heterogeneous transportation systems. Simulation platforms offer a cost-effective and repeatable means for cybersecurity evaluation, and the simulation platform should encompass the full range of ITS dimensions - mobility, sensing, networking, and applications. This chapter discusses an integrated co-simulation testbed that links CARLA for 3D environment and sensor modeling, SUMO for microscopic traffic simulation and control, and OMNeT++ for V2X communication simulation. The co-simulation testbed enables end-to-end experimentation, vulnerability identification, and mitigation benchmarking, providing practical insights for developing secure, efficient, and resilient ITS infrastructures. To illustrate its capabilities, the chapter incorporates a case study on a C-V2X proactive safety alert system enhanced with post-quantum cryptography, highlighting the role of the testbed in advancing secure and resilient ITS infrastructures.

Paperid: 530, https://arxiv.org/pdf/2509.06052.pdf

Abstract:
Security patch detection (SPD) is crucial for maintaining software security, as unpatched vulnerabilities can lead to severe security risks. In recent years, numerous learning-based SPD approaches have demonstrated promising results on source code. However, these approaches typically cannot be applied to closed-source applications and proprietary systems that constitute a significant portion of real-world software, as they release patches only with binary files, and the source code is inaccessible. Given the impressive performance of code large language models (LLMs) in code intelligence and binary analysis tasks such as decompilation and compilation optimization, their potential for detecting binary security patches remains unexplored, exposing a significant research gap between their demonstrated low-level code understanding capabilities and this critical security task. To address this gap, we construct a large-scale binary patch dataset containing \textbf{19,448} samples, with two levels of representation: assembly code and pseudo-code, and systematically evaluate \textbf{19} code LLMs of varying scales to investigate their capability in binary SPD tasks. Our initial exploration demonstrates that directly prompting vanilla code LLMs struggles to accurately identify security patches from binary patches, and even state-of-the-art prompting techniques fail to mitigate the lack of domain knowledge in binary SPD within vanilla models. Drawing on the initial findings, we further investigate the fine-tuning strategy for injecting binary SPD domain knowledge into code LLMs through two levels of representation. Experimental results demonstrate that fine-tuned LLMs achieve outstanding performance, with the best results obtained on the pseudo-code representation.

Paperid: 531, https://arxiv.org/pdf/2509.03427.pdf

Abstract:
Federated Learning (FL) is a distributed machine learning approach that promises privacy by keeping the data on the device. However, gradient reconstruction and membership-inference attacks show that model updates still leak information. Fully Homomorphic Encryption (FHE) can address those privacy concerns but it suffers from ciphertext expansion and requires prohibitive overhead on resource-constrained devices. We propose the first Hybrid Homomorphic Encryption (HHE) framework for FL that pairs the PASTA symmetric cipher with the BFV FHE scheme. Clients encrypt local model updates with PASTA and send both the lightweight ciphertexts and the PASTA key (itself BFV-encrypted) to the server, which performs a homomorphic evaluation of the decryption circuit of PASTA and aggregates the resulting BFV ciphertexts. A prototype implementation, developed on top of the Flower FL framework, shows that on independently and identically distributed MNIST dataset with 12 clients and 10 training rounds, the proposed HHE system achieves 97.6% accuracy, just 1.3% below plaintext, while reducing client upload bandwidth by over 2,000x and cutting client runtime by 30% compared to a system based solely on the BFV FHE scheme. However, server computational cost increases by roughly 15621x for each client participating in the training phase, a challenge to be addressed in future work.

Paperid: 532, https://arxiv.org/pdf/2508.15314.pdf

Abstract:
The rapid growth of text-to-video (T2V) diffusion models has raised concerns about privacy, copyright, and safety due to their potential misuse in generating harmful or misleading content. These models are often trained on numerous datasets, including unauthorized personal identities, artistic creations, and harmful materials, which can lead to uncontrolled production and distribution of such content. To address this, we propose VideoEraser, a training-free framework that prevents T2V diffusion models from generating videos with undesirable concepts, even when explicitly prompted with those concepts. Designed as a plug-and-play module, VideoEraser can seamlessly integrate with representative T2V diffusion models via a two-stage process: Selective Prompt Embedding Adjustment (SPEA) and Adversarial-Resilient Noise Guidance (ARNG). We conduct extensive evaluations across four tasks, including object erasure, artistic style erasure, celebrity erasure, and explicit content erasure. Experimental results show that VideoEraser consistently outperforms prior methods regarding efficacy, integrity, fidelity, robustness, and generalizability. Notably, VideoEraser achieves state-of-the-art performance in suppressing undesirable content during T2V generation, reducing it by 46% on average across four tasks compared to baselines.

Paperid: 533, https://arxiv.org/pdf/2508.15310.pdf

Abstract:
Large language model (LLM) agents are widely deployed in real-world applications, where they leverage tools to retrieve and manipulate external data for complex tasks. However, when interacting with untrusted data sources (e.g., fetching information from public websites), tool responses may contain injected instructions that covertly influence agent behaviors and lead to malicious outcomes, a threat referred to as Indirect Prompt Injection (IPI). Existing defenses typically rely on advanced prompting strategies or auxiliary detection models. While these methods have demonstrated some effectiveness, they fundamentally rely on assumptions about the model's inherent security, which lacks structural constraints on agent behaviors. As a result, agents still retain unrestricted access to tool invocations, leaving them vulnerable to stronger attack vectors that can bypass the security guardrails of the model. To prevent malicious tool invocations at the source, we propose a novel defensive task execution paradigm, called IPIGuard, which models the agents' task execution process as a traversal over a planned Tool Dependency Graph (TDG). By explicitly decoupling action planning from interaction with external data, IPIGuard significantly reduces unintended tool invocations triggered by injected instructions, thereby enhancing robustness against IPI attacks. Experiments on the AgentDojo benchmark show that IPIGuard achieves a superior balance between effectiveness and robustness, paving the way for the development of safer agentic systems in dynamic environments.

Paperid: 534, https://arxiv.org/pdf/2508.07745.pdf

Abstract:
Insider threats, which can lead to severe losses, remain a major security concern. While machine learning-based insider threat detection (ITD) methods have shown promising results, their progress is hindered by the scarcity of high-quality data. Enterprise data is sensitive and rarely accessible, while publicly available datasets, when limited in scale due to cost, lack sufficient real-world coverage; and when purely synthetic, they fail to capture rich semantics and realistic user behavior. To address this, we propose Chimera, the first large language model (LLM)-based multi-agent framework that automatically simulates both benign and malicious insider activities and collects diverse logs across diverse enterprise environments. Chimera models each employee with agents that have role-specific behavior and integrates modules for group meetings, pairwise interactions, and autonomous scheduling, capturing realistic organizational dynamics. It incorporates 15 types of insider attacks (e.g., IP theft, system sabotage) and has been deployed to simulate activities in three sensitive domains: technology company, finance corporation, and medical institution, producing a new dataset, ChimeraLog. We assess ChimeraLog via human studies and quantitative analysis, confirming its diversity, realism, and presence of explainable threat patterns. Evaluations of existing ITD methods show an average F1-score of 0.83, which is significantly lower than 0.99 on the CERT dataset, demonstrating ChimeraLog's higher difficulty and utility for advancing ITD research.

Paperid: 535, https://arxiv.org/pdf/2508.05681.pdf

Abstract:
Active learning(AL), which serves as the representative label-efficient learning paradigm, has been widely applied in resource-constrained scenarios. The achievement of AL is attributed to acquisition functions, which are designed for identifying the most important data to label. Despite this success, one question remains unanswered: is AL safe? In this work, we introduce ALA, a practical and the first framework to utilize the acquisition function as the poisoning attack surface to reveal the weakness of active learning. Specifically, ALA optimizes imperceptibly poisoned inputs to exhibit high uncertainty scores, increasing their probability of being selected by acquisition functions. To evaluate ALA, we conduct extensive experiments across three datasets, three acquisition functions, and two types of clean-label backdoor triggers. Results show that our attack can achieve high success rates (up to 94%) even under low poisoning budgets (0.5%-1.0%) while preserving model utility and remaining undetectable to human annotators. Our findings remind active learning users: acquisition functions can be easily exploited, and active learning should be deployed with caution in trusted data scenarios.

Paperid: 536, https://arxiv.org/pdf/2508.03936.pdf

Abstract:
AI coding assistants like GitHub Copilot are rapidly transforming software development, but their safety remains deeply uncertain-especially in high-stakes domains like cybersecurity. Current red-teaming tools often rely on fixed benchmarks or unrealistic prompts, missing many real-world vulnerabilities. We present ASTRA, an automated agent system designed to systematically uncover safety flaws in AI-driven code generation and security guidance systems. ASTRA works in three stages: (1) it builds structured domain-specific knowledge graphs that model complex software tasks and known weaknesses; (2) it performs online vulnerability exploration of each target model by adaptively probing both its input space, i.e., the spatial exploration, and its reasoning processes, i.e., the temporal exploration, guided by the knowledge graphs; and (3) it generates high-quality violation-inducing cases to improve model alignment. Unlike prior methods, ASTRA focuses on realistic inputs-requests that developers might actually ask-and uses both offline abstraction guided domain modeling and online domain knowledge graph adaptation to surface corner-case vulnerabilities. Across two major evaluation domains, ASTRA finds 11-66% more issues than existing techniques and produces test cases that lead to 17% more effective alignment training, showing its practical value for building safer AI systems.

Paperid: 537, https://arxiv.org/pdf/2508.01351.pdf

Abstract:
Security issues are becoming increasingly significant with the rapid evolution of Non-fungible Tokens (NFTs). As NFTs are traded as digital assets, they have emerged as prime targets for cyber attackers. In the development of NFT smart contracts, there may exist undiscovered defects that could lead to substantial financial losses if exploited. To tackle this issue, this paper presents a framework called NATLM(NFT Assistant LLM), designed to detect potential defects in NFT smart contracts. The framework effectively identifies four common types of vulnerabilities in NFT smart contracts: ERC-721 Reentrancy, Public Burn, Risky Mutable Proxy, and Unlimited Minting. Relying exclusively on large language models (LLMs) for defect detection can lead to a high false-positive rate. To enhance detection performance, NATLM integrates static analysis with LLMs, specifically Gemini Pro 1.5. Initially, NATLM employs static analysis to extract structural, syntactic, and execution flow information from the code, represented through Abstract Syntax Trees (AST) and Control Flow Graphs (CFG). These extracted features are then combined with vectors of known defect examples to create a matrix for input into the knowledge base. Subsequently, the feature vectors and code vectors of the analyzed contract are compared with the contents of the knowledge base. Finally, the LLM performs deep semantic analysis to enhance detection capabilities, providing a more comprehensive and accurate identification of potential security issues. Experimental results indicate that NATLM analyzed 8,672 collected NFT smart contracts, achieving an overall precision of 87.72%, a recall of 89.58%, and an F1 score of 88.94%. The results outperform other baseline experiments, successfully identifying four common types of defects.

Paperid: 538, https://arxiv.org/pdf/2508.01346.pdf

Abstract:
The introduction of smart contract functionality marks the advent of the blockchain 2.0 era, enabling blockchain technology to support digital currency transactions and complex distributed applications. However, many smart contracts have been found to contain vulnerabilities and errors, leading to the loss of assets within the blockchain. Despite a range of tools that have been developed to identify vulnerabilities in smart contracts at the source code or bytecode level, most rely on a single modality, reducing performance, accuracy, and limited generalization capabilities. This paper proposes a multimodal deep learning approach, MultiCFV, which is designed specifically to analyze and detect erroneous control flow vulnerability, as well as identify code clones in smart contracts. Bytecode is generated from source code to construct control flow graphs, with graph embedding techniques extracting graph features. Abstract syntax trees are used to obtain syntax features, while code comments capture key commentary words and comment features. These three feature vectors are fused to create a database for code inspection, which is used to detect similar code and identify contract vulnerabilities. Experimental results demonstrate our method effectively combines structural, syntactic, and semantic information, improving the accuracy of smart contract vulnerability detection and clone detection.

Paperid: 539, https://arxiv.org/pdf/2508.01343.pdf

Abstract:
The increasing number of attacks on the contract layer of DApps has resulted in economic losses amounting to $66 billion. Vulnerabilities arise when contracts interact with external protocols without verifying the results of the calls, leading to exploit entry points such as flash loan attacks and reentrancy attacks. In this paper, we propose UEChecker, a deep learning-based tool that utilizes a call graph and a Graph Convolutional Network to detect unchecked external call vulnerabilities. We design the following components: An edge prediction module that reconstructs the feature representation of nodes and edges in the call graph; A node aggregation module that captures structural information from both the node itself and its neighbors, thereby enhancing feature representation between nodes and improving the model's understanding of the global graph structure; A Conformer Block module that integrates multi-head attention, convolutional modules, and feedforward neural networks to more effectively capture dependencies of different scales within the call graph, extending beyond immediate neighbors and enhancing the performance of vulnerability detection. Finally, we combine these modules with Graph Convolutional Network to detect unchecked external call vulnerabilities. By auditing the smart contracts of 608 DApps, our results show that our tool achieves an accuracy of 87.59% in detecting unchecked external call vulnerabilities. Furthermore, we compare our tool with GAT, LSTM, and GCN baselines, and in the comparison experiments, UEChecker consistently outperforms these models in terms of accuracy.

Paperid: 540, https://arxiv.org/pdf/2508.01249.pdf

Abstract:
Large Language Model (LLM) agents offer a powerful new paradigm for solving various problems by combining natural language reasoning with the execution of external tools. However, their dynamic and non-transparent behavior introduces critical security risks, particularly in the presence of prompt injection attacks. In this work, we propose a novel insight that treats the agent runtime traces as structured programs with analyzable semantics. Thus, we present AgentArmor, a program analysis framework that converts agent traces into graph intermediate representation-based structured program dependency representations (e.g., CFG, DFG, and PDG) and enforces security policies via a type system. AgentArmor consists of three key components: (1) a graph constructor that reconstructs the agent's runtime traces as graph-based intermediate representations with control and data flow described within; (2) a property registry that attaches security-relevant metadata of interacted tools \& data, and (3) a type system that performs static inference and checking over the intermediate representation. By representing agent behavior as structured programs, AgentArmor enables program analysis for sensitive data flow, trust boundaries, and policy violations. We evaluate AgentArmor on the AgentDojo benchmark, the results show that AgentArmor can reduce the ASR to 3\%, with the utility drop only 1\%.

Paperid: 541, https://arxiv.org/pdf/2508.00596.pdf

Abstract:
In decentralized federated learning (FL), multiple clients collaboratively learn a shared machine learning (ML) model by leveraging their privately held datasets distributed across the network, through interactive exchange of the intermediate model updates. To ensure data security, cryptographic techniques are commonly employed to protect model updates during aggregation. Despite growing interest in secure aggregation, existing works predominantly focus on protocol design and computational guarantees, with limited understanding of the fundamental information-theoretic limits of such systems. Moreover, optimal bounds on communication and key usage remain unknown in decentralized settings, where no central aggregator is available. Motivated by these gaps, we study the problem of decentralized secure aggregation (DSA) from an information-theoretic perspective. Specifically, we consider a network of $K$ fully-connected users, each holding a private input -- an abstraction of local training data -- who aim to securely compute the sum of all inputs. The security constraint requires that no user learns anything beyond the input sum, even when colluding with up to $T$ other users. We characterize the optimal rate region, which specifies the minimum achievable communication and secret key rates for DSA. In particular, we show that to securely compute one symbol of the desired input sum, each user must (i) transmit at least one symbol to others, (ii) hold at least one symbol of secret key, and (iii) all users must collectively hold no fewer than $K - 1$ independent key symbols. Our results establish the fundamental performance limits of DSA, providing insights for the design of provably secure and communication-efficient protocols in distributed learning systems.

Paperid: 542, https://arxiv.org/pdf/2507.18105.pdf

Abstract:
The rise of Large Language Models (LLMs) has led to the widespread deployment of LLM-based systems across diverse domains. As these systems proliferate, understanding the risks associated with their complex supply chains is increasingly important. LLM-based systems are not standalone as they rely on interconnected supply chains involving pretrained models, third-party libraries, datasets, and infrastructure. Yet, most risk assessments narrowly focus on model or data level, overlooking broader supply chain vulnerabilities. While recent studies have begun to address LLM supply chain risks, there remains a lack of benchmarks for systematic research. To address this gap, we introduce the first comprehensive dataset for analyzing and benchmarking LLM supply chain security. We collect 3,859 real-world LLM applications and perform interdependency analysis, identifying 109,211 models, 2,474 datasets, and 9,862 libraries. We extract model fine-tuning paths, dataset reuse, and library reliance, mapping the ecosystem's structure. To evaluate security, we gather 1,555 risk-related issues-50 for applications, 325 for models, 18 for datasets, and 1,229 for libraries from public vulnerability databases. Using this dataset, we empirically analyze component dependencies and risks. Our findings reveal deeply nested dependencies in LLM applications and significant vulnerabilities across the supply chain, underscoring the need for comprehensive security analysis. We conclude with practical recommendations to guide researchers and developers toward safer, more trustworthy LLM-enabled systems.

Paperid: 543, https://arxiv.org/pdf/2507.17978.pdf

Abstract:
Phishing emails continue to pose a significant threat to cybersecurity by exploiting human vulnerabilities through deceptive content and malicious payloads. While Machine Learning (ML) models are effective at detecting phishing threats, their performance largely relies on the quality and diversity of the training data. This paper presents MeAJOR (Merged email Assets from Joint Open-source Repositories) Corpus, a novel, multi-source phishing email dataset designed to overcome critical limitations in existing resources. It integrates 135894 samples representing a broad number of phishing tactics and legitimate emails, with a wide spectrum of engineered features. We evaluated the dataset's utility for phishing detection research through systematic experiments with four classification models (RF, XGB, MLP, and CNN) across multiple feature configurations. Results highlight the dataset's effectiveness, achieving 98.34% F1 with XGB. By integrating broad features from multiple categories, our dataset provides a reusable and consistent resource, while addressing common challenges like class imbalance, generalisability and reproducibility.

Paperid: 544, https://arxiv.org/pdf/2507.13313.pdf

Abstract:
This paper introduces a dataset and experimental study for decentralized federated learning (DFL) applied to IoT crowdsensing malware detection. The dataset comprises behavioral records from benign and eight malware families. A total of 21,582,484 original records were collected from system calls, file system activities, resource usage, kernel events, input/output events, and network records. These records were aggregated into 30-second windows, resulting in 342,106 features used for model training and evaluation. Experiments on the DFL platform compare traditional machine learning (ML), centralized federated learning (CFL), and DFL across different node counts, topologies, and data distributions. Results show that DFL maintains competitive performance while preserving data locality, outperforming CFL in most settings. This dataset provides a solid foundation for studying the security of IoT crowdsensing environments.

Paperid: 545, https://arxiv.org/pdf/2512.23778.pdf

Abstract:
In recent years, drone delivery, which utilizes unmanned aerial vehicles (UAVs) for package delivery and pickup, has gradually emerged as a crucial method in logistics. Since delivery drones are expensive and may carry valuable packages, they must maintain a safe distance from individuals until user-drone mutual authentication is confirmed. Despite numerous authentication schemes being developed, existing solutions are limited in authentication distance and lack resilience against sophisticated attacks. To this end, we introduce SyncGait, an implicit gait-based mutual authentication system for drone delivery. SyncGait leverages the user's unique arm swing as he walks toward the drone to achieve mutual authentication without requiring additional hardware or specific authentication actions. We conducted extensive experiments on 14 datasets collected from 31 subjects. The results demonstrate that SyncGait achieves an average accuracy of 99.84\% at a long distance ($>18m$) and exhibits strong resilience against various spoofing attacks, making it a robust, secure, and user-friendly solution in real-world scenarios.

Paperid: 546, https://arxiv.org/pdf/2512.18733.pdf

Abstract:
Large language model (LLM)-based multi-agent systems (MAS) have shown strong capabilities in solving complex tasks. As MAS become increasingly autonomous in various safety-critical tasks, detecting malicious agents has become a critical security concern. Although existing graph anomaly detection (GAD)-based defenses can identify anomalous agents, they mainly rely on coarse sentence-level information and overlook fine-grained lexical cues, leading to suboptimal performance. Moreover, the lack of interpretability in these methods limits their reliability and real-world applicability. To address these limitations, we propose XG-Guard, an explainable and fine-grained safeguarding framework for detecting malicious agents in MAS. To incorporate both coarse and fine-grained textual information for anomalous agent identification, we utilize a bi-level agent encoder to jointly model the sentence- and token-level representations of each agent. A theme-based anomaly detector further captures the evolving discussion focus in MAS dialogues, while a bi-level score fusion mechanism quantifies token-level contributions for explanation. Extensive experiments across diverse MAS topologies and attack scenarios demonstrate robust detection performance and strong interpretability of XG-Guard.

Paperid: 547, https://arxiv.org/pdf/2512.17710.pdf

Abstract:
Software Bill of Materials (SBOM) provides new opportunities for automated vulnerability identification in software products. While the industry is adopting SBOM-based Vulnerability Scanning (SVS) to identify vulnerabilities, we increasingly observe inconsistencies and unexpected behavior, that result in false negatives and silent failures. In this work, we present the background necessary to understand the underlying complexity of SVS and introduce SVS-TEST, a method and tool to analyze the capability, maturity, and failure conditions of SVS-tools in real-world scenarios. We showcase the utility of SVS-TEST in a case study evaluating seven real-world SVS-tools using 16 precisely crafted SBOMs and their respective ground truth. Our results unveil significant differences in the reliability and error handling of SVS-tools; multiple SVS-tools silently fail on valid input SBOMs, creating a false sense of security. We conclude our work by highlighting implications for researchers and practitioners, including how organizations and developers of SVS-tools can utilize SVS-TEST to monitor SVS capability and maturity. All results and research artifacts are made publicly available and all findings were disclosed to the SVS-tool developers ahead of time.

Paperid: 548, https://arxiv.org/pdf/2512.00098.pdf

Abstract:
This paper introduces GAMBiT (Guarding Against Malicious Biased Threats), a cognitive-informed cyber defense framework that leverages deviations from human rationality as a new defensive surface. Conventional cyber defenses assume rational, utility-maximizing attackers, yet real-world adversaries exhibit cognitive constraints and biases that shape their interactions with complex digital systems. GAMBiT embeds insights from cognitive science into cyber environments through cognitive triggers, which activate biases such as loss aversion, base-rate neglect, and sunk-cost fallacy, and through newly developed cognitive sensors that infer attackers' cognitive states from behavioral and network data. Three rounds of human-subject experiments (total n=61) in a simulated small business network demonstrate that these manipulations significantly disrupt attacker performance, reducing mission progress, diverting actions off the true attack path, and increasing detectability. These results demonstrate that cognitive biases can be systematically triggered to degrade the attacker's efficiency and enhance the defender's advantage. GAMBiT establishes a new paradigm in which the attacker's mind becomes part of the battlefield and cognitive manipulation becomes a proactive vector for cyber defense.

Paperid: 549, https://arxiv.org/pdf/2511.12956.pdf

Abstract:
Traffic Sign Recognition (TSR) systems play a critical role in Autonomous Driving (AD) systems, enabling real-time detection of road signs, such as STOP and speed limit signs. While these systems are increasingly integrated into commercial vehicles, recent research has exposed their vulnerability to physical-world adversarial appearance attacks. In such attacks, carefully crafted visual patterns are misinterpreted by TSR models as legitimate traffic signs, while remaining inconspicuous or benign to human observers. However, existing adversarial appearance attacks suffer from notable limitations. Pixel-level perturbation-based methods often lack stealthiness and tend to overfit to specific surrogate models, resulting in poor transferability to real-world TSR systems. On the other hand, text-to-image (T2I) diffusion model-based approaches demonstrate limited effectiveness and poor generalization to out-of-distribution sign types. In this paper, we present DiffSign, a novel T2I-based appearance attack framework designed to generate physically robust, highly effective, transferable, practical, and stealthy appearance attacks against TSR systems. To overcome the limitations of prior approaches, we propose a carefully designed attack pipeline that integrates CLIP-based loss and masked prompts to improve attack focus and controllability. We also propose two novel style customization methods to guide visual appearance and improve out-of-domain traffic sign attack generalization and attack stealthiness. We conduct extensive evaluations of DiffSign under varied real-world conditions, including different distances, angles, light conditions, and sign categories. Our method achieves an average physical-world attack success rate of 83.3%, leveraging DiffSign's high effectiveness in attack transferability.

Paperid: 550, https://arxiv.org/pdf/2511.08462.pdf

Abstract:
Static analysis tools provide a powerful means to detect security vulnerabilities by specifying queries that encode vulnerable code patterns. However, writing such queries is challenging and requires diverse expertise in security and program analysis. To address this challenge, we present QLCoder - an agentic framework that automatically synthesizes queries in CodeQL, a powerful static analysis engine, directly from a given CVE metadata. QLCode embeds an LLM in a synthesis loop with execution feedback, while constraining its reasoning using a custom MCP interface that allows structured interaction with a Language Server Protocol (for syntax guidance) and a RAG database (for semantic retrieval of queries and documentation). This approach allows QLCoder to generate syntactically and semantically valid security queries. We evaluate QLCode on 176 existing CVEs across 111 Java projects. Building upon the Claude Code agent framework, QLCoder synthesizes correct queries that detect the CVE in the vulnerable but not in the patched versions for 53.4% of CVEs. In comparison, using only Claude Code synthesizes 10% correct queries.

Paperid: 551, https://arxiv.org/pdf/2510.14381.pdf

Abstract:
Large language model (LLM) systems now underpin everyday AI applications such as chatbots, computer-use assistants, and autonomous robots, where performance often depends on carefully designed prompts. LLM-based prompt optimizers reduce that effort by iteratively refining prompts from scored feedback, yet the security of this optimization stage remains underexamined. We present the first systematic analysis of poisoning risks in LLM-based prompt optimization. Using HarmBench, we find systems are substantially more vulnerable to manipulated feedback than to injected queries: feedback-based attacks raise attack success rate (ASR) by up to $Δ$ASR = 0.48. We introduce a simple fake-reward attack that requires no access to the reward model and significantly increases vulnerability, and we propose a lightweight highlighting defense that reduces the fake-reward $Δ$ASR from 0.23 to 0.07 without degrading utility. These results establish prompt optimization pipelines as a first-class attack surface and motivate stronger safeguards for feedback channels and optimization frameworks.

Paperid: 552, https://arxiv.org/pdf/2510.05156.pdf

Abstract:
The deployment of autonomous AI agents in sensitive domains, such as healthcare, introduces critical risks to safety, security, and privacy. These agents may deviate from user objectives, violate data handling policies, or be compromised by adversarial attacks. Mitigating these dangers necessitates a mechanism to formally guarantee that an agent's actions adhere to predefined safety constraints, a challenge that existing systems do not fully address. We introduce VeriGuard, a novel framework that provides formal safety guarantees for LLM-based agents through a dual-stage architecture designed for robust and verifiable correctness. The initial offline stage involves a comprehensive validation process. It begins by clarifying user intent to establish precise safety specifications. VeriGuard then synthesizes a behavioral policy and subjects it to both testing and formal verification to prove its compliance with these specifications. This iterative process refines the policy until it is deemed correct. Subsequently, the second stage provides online action monitoring, where VeriGuard operates as a runtime monitor to validate each proposed agent action against the pre-verified policy before execution. This separation of the exhaustive offline validation from the lightweight online monitoring allows formal guarantees to be practically applied, providing a robust safeguard that substantially improves the trustworthiness of LLM agents.

Paperid: 553, https://arxiv.org/pdf/2510.03737.pdf

Abstract:
With the development of Internet of Things (IoT), it is gaining a lot of attention. It is important to secure the embedded systems with low overhead. The Linux Seccomp is widely used by developers to secure the kernels by blocking the access of unused syscalls, which introduces less overhead. However, there are no systematic Seccomp configuration approaches for IoT applications without the help of developers. In addition, the existing Seccomp configuration approaches are coarse-grained, which cannot analyze and limit the syscall arguments. In this paper, a novel static dependent syscall analysis approach for embedded applications is proposed, which can obtain all of the possible dependent syscalls and the corresponding arguments of the target applications. So, a fine-grained kernel access limitation can be performed for the IoT applications. To this end, the mappings between dynamic library APIs and syscalls according with their arguments are built, by analyzing the control flow graphs and the data dependency relationships of the dynamic libraries. To the best of our knowledge, this is the first work to generate the fine-grained Seccomp profile for embedded applications.

Paperid: 554, https://arxiv.org/pdf/2510.03720.pdf

Abstract:
Linux Seccomp is widely used by the program developers and the system maintainers to secure the operating systems, which can block unused syscalls for different applications and containers to shrink the attack surface of the operating systems. However, it is difficult to configure the whitelist of a container or application without the help of program developers. Docker containers block about only 50 syscalls by default, and lots of unblocked useless syscalls introduce a big kernel attack surface. To obtain the dependent syscalls, dynamic tracking is a straight-forward approach but it cannot get the full syscall list. Static analysis can construct an over-approximated syscall list, but the list contains many false positives. In this paper, a systematic dependent syscall analysis approach, sysverify, is proposed by combining static analysis and dynamic verification together to shrink the kernel attack surface. The semantic gap between the binary executables and syscalls is bridged by analyzing the binary and the source code, which builds the mapping between the library APIs and syscalls systematically. To further reduce the attack surface at best effort, we propose a dynamic verification approach to intercept and analyze the security of the invocations of indirect-call-related or rarely invoked syscalls with low overhead.

Paperid: 555, https://arxiv.org/pdf/2509.26598.pdf

Abstract:
Model fingerprinting has emerged as a promising paradigm for claiming model ownership. However, robustness evaluations of these schemes have mostly focused on benign perturbations such as incremental fine-tuning, model merging, and prompting. Lack of systematic investigations into {\em adversarial robustness} against a malicious model host leaves current systems vulnerable. To bridge this gap, we first define a concrete, practical threat model against model fingerprinting. We then take a critical look at existing model fingerprinting schemes to identify their fundamental vulnerabilities. Based on these, we develop adaptive adversarial attacks tailored for each vulnerability, and demonstrate that these can bypass model authentication completely for ten recently proposed fingerprinting schemes while maintaining high utility of the model for the end users. Our work encourages fingerprint designers to adopt adversarial robustness by design. We end with recommendations for future fingerprinting methods.

Paperid: 556, https://arxiv.org/pdf/2509.16030.pdf

Abstract:
As cloud computing continues to advance and become an integral part of modern IT infrastructure, container security has emerged as a critical factor in ensuring the smooth operation of cloud-native applications. An attacker can attack the service in the container or even perform the container escape attack by tampering with the files. Monitoring container files is important for APT detection and cyberspace security. Existing file monitoring methods are usually based on host operating system or virtual machine introspection to protect file security in real time. The methods based on the host operating system usually monitor file operations in the host operating system. However, when the container escapes to the host, the host operating system will no longer be secure, so these methods face the problem of weak security. Aiming at the problems of low security and high overload introduced in existing container file monitoring, a high-performance container file monitoring method based on virtual machine introspection is proposed. The experimental results show that the proposed approach can effectively monitor the container files and introduce an acceptable monitoring overload.

Paperid: 557, https://arxiv.org/pdf/2509.15756.pdf

Abstract:
Sequential deep learning models (e.g., RNN and LSTM) can learn the sequence features of software behaviors, such as API or syscall sequences. However, recent studies have shown that these deep learning-based approaches are vulnerable to adversarial samples. Attackers can use adversarial samples to change the sequential characteristics of behavior sequences and mislead malware classifiers. In this paper, an adversarial robustness anomaly detection method based on the analysis of behavior units is proposed to overcome this problem. We extract related behaviors that usually perform a behavior intention as a behavior unit, which contains the representative semantic information of local behaviors and can be used to improve the robustness of behavior analysis. By learning the overall semantics of each behavior unit and the contextual relationships among behavior units based on a multilevel deep learning model, our approach can mitigate perturbation attacks that target local and large-scale behaviors. In addition, our approach can be applied to both low-level and high-level behavior logs (e.g., API and syscall logs). The experimental results show that our approach outperforms all the compared methods, which indicates that our approach has better performance against obfuscation attacks.

Paperid: 558, https://arxiv.org/pdf/2509.14285.pdf

Abstract:
Prompt injection attacks represent a major vulnerability in Large Language Model (LLM) deployments, where malicious instructions embedded in user inputs can override system prompts and induce unintended behaviors. This paper presents a novel multi-agent defense framework that employs specialized LLM agents in coordinated pipelines to detect and neutralize prompt injection attacks in real-time. We evaluate our approach using two distinct architectures: a sequential chain-of-agents pipeline and a hierarchical coordinator-based system. Our comprehensive evaluation on 55 unique prompt injection attacks, grouped into 8 categories and totaling 400 attack instances across two LLM platforms (ChatGLM and Llama2), demonstrates significant security improvements. Without defense mechanisms, baseline Attack Success Rates (ASR) reached 30% for ChatGLM and 20% for Llama2. Our multi-agent pipeline achieved 100% mitigation, reducing ASR to 0% across all tested scenarios. The framework demonstrates robustness across multiple attack categories including direct overrides, code execution attempts, data exfiltration, and obfuscation techniques, while maintaining system functionality for legitimate queries.

Paperid: 559, https://arxiv.org/pdf/2509.11836.pdf

Abstract:
Sequence-based deep learning models (e.g., RNNs), can detect malware by analyzing its behavioral sequences. Meanwhile, these models are susceptible to adversarial attacks. Attackers can create adversarial samples that alter the sequence characteristics of behavior sequences to deceive malware classifiers. The existing methods for generating adversarial samples typically involve deleting or replacing crucial behaviors in the original data sequences, or inserting benign behaviors that may violate the behavior constraints. However, these methods that directly manipulate sequences make adversarial samples difficult to implement or apply in practice. In this paper, we propose an adversarial attack approach based on Deep Q-Network and a heuristic backtracking search strategy, which can generate perturbation sequences that satisfy practical conditions for successful attacks. Subsequently, we utilize a novel transformation approach that maps modifications back to the source code, thereby avoiding the need to directly modify the behavior log sequences. We conduct an evaluation of our approach, and the results confirm its effectiveness in generating adversarial samples from real-world malware behavior sequences, which have a high success rate in evading anomaly detection models. Furthermore, our approach is practical and can generate adversarial samples while maintaining the functionality of the modified software.

Paperid: 560, https://arxiv.org/pdf/2509.11786.pdf

Abstract:
Industrial control systems (ICSs) are widely used in industry, and their security and stability are very important. Once the ICS is attacked, it may cause serious damage. Therefore, it is very important to detect anomalies in ICSs. ICS can monitor and manage physical devices remotely using communication networks. The existing anomaly detection approaches mainly focus on analyzing the security of network traffic or sensor data. However, the behaviors of different domains (e.g., network traffic and sensor physical status) of ICSs are correlated, so it is difficult to comprehensively identify anomalies by analyzing only a single domain. In this paper, an anomaly detection approach based on cross-domain representation learning in ICSs is proposed, which can learn the joint features of multi-domain behaviors and detect anomalies within different domains. After constructing a cross-domain graph that can represent the behaviors of multiple domains in ICSs, our approach can learn the joint features of them by leveraging graph neural networks. Since anomalies behave differently in different domains, we leverage a multi-task learning approach to identify anomalies in different domains separately and perform joint training. The experimental results show that the performance of our approach is better than existing approaches for identifying anomalies in ICSs.

Paperid: 561, https://arxiv.org/pdf/2509.11242.pdf

Abstract:
Recently, the WebAssembly (or Wasm) technology has been rapidly evolving, with many runtimes actively under development, providing cross-platform secure sandboxes for Wasm modules to run as portable containers. Compared with Docker, which isolates applications at the operating system level, Wasm runtimes provide more security mechanisms, such as linear memory, type checking, and protected call stacks. Although Wasm is designed with security in mind and considered to be a more secure container runtime, various security challenges have arisen, and researchers have focused on the security of Wasm runtimes, such as discovering vulnerabilities or proposing new security mechanisms to achieve robust isolation. However, we have observed that the resource isolation is not well protected by the current Wasm runtimes, and attackers can exhaust the host's resources to interfere with the execution of other container instances by exploiting the WASI/WASIX interfaces. And the attack surface has not been well explored and measured. In this paper, we explore the resource isolation attack surface of Wasm runtimes systematically by proposing several static Wasm runtime analysis approaches. Based on the analysis results, we propose several exploitation strategies to break the resource isolation of Wasm runtimes. The experimental results show that malicious Wasm instances can not only consume large amounts of system resources on their own but also introduce high workloads into other components of the underlying operating system, leading to a substantial performance degradation of the whole system. In addition, the mitigation approaches have also been discussed.

Paperid: 562, https://arxiv.org/pdf/2509.08992.pdf

Abstract:
5G marks a major departure from previous cellular architectures, by transitioning from a monolithic design of the core network to a Service-Based Architecture (SBA) where services are modularized as Network Functions (NFs) which communicate with each other via standard-defined HTTP-based APIs called Service-Based Interfaces (SBIs). These NFs are deployed in private and public cloud infrastructure, and an access control framework based on OAuth restricts how they communicate with each other and obtain access to resources. Given the increased vulnerabilities of clouds to insiders, it is important to study the security of the 5G Core services for vulnerabilities that allow attackers to use compromised NFs to obtain unauthorized access to resources. We present FivGeeFuzz, a grammar-based fuzzing framework designed to uncover security flaws in 5G core SBIs. FivGeeFuzz automatically derives grammars from 3GPP API specifications to generate malformed, unexpected, or semantically inconsistent inputs, and it integrates automated bug detection with manual validation and root-cause analysis. We evaluate our approach on free5GC, the only open-source 5G core implementing Release 17-compliant SBIs with an access control mechanism. Using FivGeeFuzz, we discovered 8 previously unknown vulnerabilities in free5GC, leading to runtime crashes, improper error handling, and unauthorized access to resources, including a very severe attack we call Cross-Service Token Attack. All bugs were confirmed by the free5GC team, 7 have already been patched, and the remaining one has a patch under development.

Paperid: 563, https://arxiv.org/pdf/2509.05498.pdf

Abstract:
Cognitive vulnerabilities shape human decision-making and arise primarily from two sources: (1) cognitive capabilities, which include disparities in knowledge, education, expertise, or access to information, and (2) cognitive biases, such as rational inattention, confirmation bias, and base rate neglect, which influence how individuals perceive and process information. Exploiting these vulnerabilities allows an entity with superior cognitive awareness to gain a strategic advantage, a concept referred to as cognitive arbitrage. This paper investigates how to exploit the cognitive vulnerabilities of Advanced Persistent Threat (APT) attackers and proposes cognition-aware defenses that leverage windows of superiority to counteract attacks. Specifically, the proposed bi-level cyber warfare game focuses on "strategic-level" design for defensive deception mechanisms, which then facilitates "operational-level" actions and tactical-level execution of Tactics, Techniques, and Procedures (TTPs). Game-theoretic reasoning and analysis play a significant role in the cross-echelon quantitative modeling and design of cognitive arbitrage strategies. Our numerical results demonstrate that although the defender's initial advantage diminishes over time, strategically timed and deployed deception techniques can turn a negative value for the attacker into a positive one during the planning phase, and achieve at least a 40% improvement in total rewards during execution. This demonstrates that the defender can amplify even small initial advantages, sustain a strategic edge over the attacker, and secure long-term objectives, such as protecting critical assets throughout the attacker's lifecycle.

Paperid: 564, https://arxiv.org/pdf/2508.20963.pdf

Abstract:
We present three large-scale human-subjects red-team cyber range datasets from the Guarding Against Malicious Biased Threats (GAMBiT) project. Across Experiments 1-3 (July 2024-March 2025), 19-20 skilled attackers per experiment conducted two 8-hour days of self-paced operations in a simulated enterprise network (SimSpace Cyber Force Platform) while we captured multi-modal data: self-reports (background, demographics, psychometrics), operational notes, terminal histories, keylogs, network packet captures (PCAP), and NIDS alerts (Suricata). Each participant began from a standardized Kali Linux VM and pursued realistic objectives (e.g., target discovery and data exfiltration) under controlled constraints. Derivative curated logs and labels are included. The combined release supports research on attacker behavior modeling, bias-aware analytics, and method benchmarking. Data are available via IEEE Dataport entries for Experiments 1-3.

Paperid: 565, https://arxiv.org/pdf/2508.00912.pdf

Abstract:
Commercial LLM services often conceal internal reasoning traces while still charging users for every generated token, including those from hidden intermediate steps, raising concerns of token inflation and potential overbilling. This gap underscores the urgent need for reliable token auditing, yet achieving it is far from straightforward: cryptographic verification (e.g., hash-based signature) offers little assurance when providers control the entire execution pipeline, while user-side prediction struggles with the inherent variance of reasoning LLMs, where token usage fluctuates across domains and prompt styles. To bridge this gap, we present PALACE (Predictive Auditing of LLM APIs via Reasoning Token Count Estimation), a user-side framework that estimates hidden reasoning token counts from prompt-answer pairs without access to internal traces. PALACE introduces a GRPO-augmented adaptation module with a lightweight domain router, enabling dynamic calibration across diverse reasoning tasks and mitigating variance in token usage patterns. Experiments on math, coding, medical, and general reasoning benchmarks show that PALACE achieves low relative error and strong prediction accuracy, supporting both fine-grained cost auditing and inflation detection. Taken together, PALACE represents an important first step toward standardized predictive auditing, offering a practical path to greater transparency, accountability, and user trust.

Paperid: 566, https://arxiv.org/pdf/2507.03021.pdf

Abstract:
Cyber warfare has become a critical dimension of modern conflict, driven by society's increasing dependence on interconnected digital and physical infrastructure. Effective cyber defense often requires decision-making at different echelons, where the tactical layer focuses on detailed actions such as techniques, tactics, and procedures, while the strategic layer addresses long-term objectives and coordinated planning. Modeling these interactions at different echelons remains challenging due to the dynamic, large-scale, and interdependent nature of cyber environments. To address this, we propose a multi-resolution dynamic game framework in which the tactical layer captures fine-grained interactions using high-resolution extensive-form game trees, while the strategic layer is modeled as a Markov game defined over lower-resolution states abstracted from those game trees. This framework supports scalable reasoning and planning across different levels of abstraction through zoom-in and zoom-out operations that adjust the granularity of the modeling based on operational needs. A case study demonstrates how the framework works and its effectiveness in improving the defender's strategic advantage.

Paperid: 567, https://arxiv.org/pdf/2507.02281.pdf

Abstract:
Homomorphic ring signature schemes combine the strong anonymity of ring signatures with the computability of homomorphic signatures, demonstrating significant potential in scenarios requiring both anonymous data provenance and verifiable homomorphic computation (e.g., confidential blockchain transactions and secure multi-party computation). However, no feasible homomorphic ring signature scheme currently exists. In this work, we propose the first lattice-based linearly homomorphic ring signature scheme. Proven secure in the standard model under the small integer solution (SIS) assumption, our scheme achieves strong anonymity under full key exposure and unforgeability against insider corruption attacks. As the first unified framework for ring signatures and linear homomorphic signatures, this construction provides a post-quantum-secure solution for the aforementioned applications, advancing the development of privacy-enhanced homomorphic computation.

Paperid: 568, https://arxiv.org/pdf/2511.13143.pdf

Abstract:
Backdoor attacks pose a significant threat to deep learning models by implanting hidden vulnerabilities that can be activated by malicious inputs. While numerous defenses have been proposed to mitigate these attacks, the heterogeneous landscape of evaluation methodologies hinders fair comparison between defenses. This work presents a systematic (meta-)analysis of backdoor defenses through a comprehensive literature review and empirical evaluation. We analyzed 183 backdoor defense papers published between 2018 and 2025 across major AI and security venues, examining the properties and evaluation methodologies of these defenses. Our analysis reveals significant inconsistencies in experimental setups, evaluation metrics, and threat model assumptions in the literature. Through extensive experiments involving three datasets (MNIST, CIFAR-100, ImageNet-1K), four model architectures (ResNet-18, VGG-19, ViT-B/16, DenseNet-121), 16 representative defenses, and five commonly used attacks, totaling over 3\,000 experiments, we demonstrate that defense effectiveness varies substantially across different evaluation setups. We identify critical gaps in current evaluation practices, including insufficient reporting of computational overhead and behavior under benign conditions, bias in hyperparameter selection, and incomplete experimentation. Based on our findings, we provide concrete challenges and well-motivated recommendations to standardize and improve future defense evaluations. Our work aims to equip researchers and industry practitioners with actionable insights for developing, assessing, and deploying defenses to different systems.

Paperid: 569, https://arxiv.org/pdf/2511.11265.pdf

Abstract:
Software quality research increasingly relies on large-scale datasets that measure both the product and process aspects of software systems. However, existing resources often focus on limited dimensions, such as code smells, technical debt, or refactoring activity, thereby restricting comprehensive analyses across time and quality dimensions. To address this gap, we present the Software Quality Dataset (SQuaD), a multi-dimensional, time-aware collection of software quality metrics extracted from 450 mature open-source projects across diverse ecosystems, including Apache, Mozilla, FFmpeg, and the Linux kernel. By integrating nine state-of-the-art static analysis tools, i.e., SonarQube, CodeScene, PMD, Understand, CK, JaSoMe, RefactoringMiner, RefactoringMiner++, and PyRef, our dataset unifies over 700 unique metrics at method, class, file, and project levels. Covering a total of 63,586 analyzed project releases, SQuaD also provides version control and issue-tracking histories, software vulnerability data (CVE/CWE), and process metrics proven to enhance Just-In-Time (JIT) defect prediction. The SQuaD enables empirical research on maintainability, technical debt, software evolution, and quality assessment at unprecedented scale. We also outline emerging research directions, including automated dataset updates and cross-project quality modeling to support the continuous evolution of software analytics. The dataset is publicly available on ZENODO (DOI: 10.5281/zenodo.17566690).

Paperid: 570, https://arxiv.org/pdf/2511.06072.pdf

Abstract:
Backdoor attacks in machine learning have drawn significant attention for their potential to compromise models stealthily, yet most research has focused on homogeneous data such as images. In this work, we propose a novel backdoor attack on tabular data, which is particularly challenging due to the presence of both numerical and categorical features. Our key idea is a novel technique to convert categorical values into floating-point representations. This approach preserves enough information to maintain clean-model accuracy compared to traditional methods like one-hot or ordinal encoding. By doing this, we create a gradient-based universal perturbation that applies to all features, including categorical ones. We evaluate our method on five datasets and four popular models. Our results show up to a 100% attack success rate in both white-box and black-box settings (including real-world applications like Vertex AI), revealing a severe vulnerability for tabular data. Our method is shown to surpass the previous works like Tabdoor in terms of performance, while remaining stealthy against state-of-the-art defense mechanisms. We evaluate our attack against Spectral Signatures, Neural Cleanse, Beatrix, and Fine-Pruning, all of which fail to defend successfully against it. We also verify that our attack successfully bypasses popular outlier detection mechanisms.

Paperid: 571, https://arxiv.org/pdf/2510.06036.pdf

Abstract:
Large reasoning models (LRMs) with multi-step reasoning capabilities have shown remarkable problem-solving abilities, yet they exhibit concerning safety vulnerabilities that remain poorly understood. In this work, we investigate why safety alignment fails in reasoning models through a mechanistic interpretability lens. Using a linear probing approach to trace refusal intentions across token positions, we discover a striking phenomenon termed as \textbf{refusal cliff}: many poorly-aligned reasoning models correctly identify harmful prompts and maintain strong refusal intentions during their thinking process, but experience a sharp drop in refusal scores at the final tokens before output generation. This suggests that these models are not inherently unsafe; rather, their refusal intentions are systematically suppressed. Through causal intervention analysis, we identify a sparse set of attention heads that negatively contribute to refusal behavior. Ablating just 3\% of these heads can reduce attack success rates below 10\%. Building on these mechanistic insights, we propose \textbf{Cliff-as-a-Judge}, a novel data selection method that identifies training examples exhibiting the largest refusal cliff to efficiently repair reasoning models' safety alignment. This approach achieves comparable safety improvements using only 1.7\% of the vanilla safety training data, demonstrating a less-is-more effect in safety alignment.

Paperid: 572, https://arxiv.org/pdf/2510.05159.pdf

Abstract:
The practice of fine-tuning AI agents on data from their own interactions--such as web browsing or tool use--, while being a strong general recipe for improving agentic capabilities, also introduces a critical security vulnerability within the AI supply chain. In this work, we show that adversaries can easily poison the data collection pipeline to embed hard-to-detect backdoors that are triggerred by specific target phrases, such that when the agent encounters these triggers, it performs an unsafe or malicious action. We formalize and validate three realistic threat models targeting different layers of the supply chain: 1) direct poisoning of fine-tuning data, where an attacker controls a fraction of the training traces; 2) environmental poisoning, where malicious instructions are injected into webpages scraped or tools called while creating training data; and 3) supply chain poisoning, where a pre-backdoored base model is fine-tuned on clean data to improve its agentic capabilities. Our results are stark: by poisoning as few as 2% of the collected traces, an attacker can embed a backdoor causing an agent to leak confidential user information with over 80% success when a specific trigger is present. This vulnerability holds across all three threat models. Furthermore, we demonstrate that prominent safeguards, including two guardrail models and one weight-based defense, fail to detect or prevent the malicious behavior. These findings highlight an urgent threat to agentic AI development and underscore the critical need for rigorous security vetting of data collection processes and end-to-end model supply chains.

Paperid: 573, https://arxiv.org/pdf/2509.26032.pdf

Abstract:
Graph Neural Networks (GNNs) have demonstrated strong performance across tasks such as node classification, link prediction, and graph classification, but remain vulnerable to backdoor attacks that implant imperceptible triggers during training to control predictions. While node-level attacks exploit local message passing, graph-level attacks face the harder challenge of manipulating global representations while maintaining stealth. We identify two main sources of anomaly in existing graph classification backdoor methods: structural deviation from rare subgraph triggers and semantic deviation caused by label flipping, both of which make poisoned graphs easily detectable by anomaly detection models. To address this, we propose DPSBA, a clean-label backdoor framework that learns in-distribution triggers via adversarial training guided by anomaly-aware discriminators. DPSBA effectively suppresses both structural and semantic anomalies, achieving high attack success while significantly improving stealth. Extensive experiments on real-world datasets validate that DPSBA achieves a superior balance between effectiveness and detectability compared to state-of-the-art baselines.

Paperid: 574, https://arxiv.org/pdf/2509.23680.pdf

Abstract:
With the development of foundation AI technologies, task-executable voice assistants (VAs) have become more popular, enhancing user convenience and expanding device functionality. Android task-executable VAs are applications that are capable of understanding complex tasks and performing corresponding operations. Given their prevalence and great autonomy, there is no existing work examine the privacy risks within the voice assistants from the task-execution pattern in a holistic manner. To fill this research gap, this paper presents a user-centric comprehensive empirical study on privacy risks in Android task-executable VA applications. We collect ten mainstream VAs as our research target and analyze their operational characteristics. We then cross-check their privacy declarations across six sources, including privacy labels, policies, and manifest files, and our findings reveal widespread inconsistencies. Moreover, we uncover three significant privacy threat models: (1) privacy misdisclosure in mega apps, where integrated mini apps such as Alexa skills are inadequately represented; (2) privilege escalation via inter-application interactions, which exploit Android's communication mechanisms to bypass user consent; and (3) abuse of Google system applications, enabling apps to evade the declaration of dangerous permissions. Our study contributes actionable recommendations for practitioners and underscores broader relevance of these privacy risks to emerging autonomous AI agents.

Paperid: 575, https://arxiv.org/pdf/2509.10413.pdf

Abstract:
Bitcoin's limited scripting capabilities and lack of native interoperability mechanisms have constrained its integration into the broader blockchain ecosystem, especially decentralized finance (DeFi) and multi-chain applications. This paper presents a comprehensive taxonomy of Bitcoin cross-chain bridge protocols, systematically analyzing their trust assumptions, performance characteristics, and applicability to the Artificial Intelligence of Things (AIoT) scenarios. We categorize bridge designs into three main types: naive token swapping, pegged-asset bridges, and arbitrary-message bridges. Each category is evaluated across key metrics such as trust model, latency, capital efficiency, and DeFi composability. Emerging innovations like BitVM and recursive sidechains are highlighted for their potential to enable secure, scalable, and programmable Bitcoin interoperability. Furthermore, we explore practical use cases of cross-chain bridges in AIoT applications, including decentralized energy trading, healthcare data integration, and supply chain automation. This taxonomy provides a foundational framework for researchers and practitioners seeking to design secure and efficient cross-chain infrastructures in AIoT systems.

Paperid: 576, https://arxiv.org/pdf/2507.19060.pdf

Abstract:
We introduce PurpCode, the first post-training recipe for training safe code reasoning models towards generating secure code and defending against malicious cyberactivities. PurpCode trains a reasoning model in two stages: (i) Rule Learning, which explicitly teaches the model to reference cybersafety rules to generate vulnerability-free code and to avoid facilitating malicious cyberactivities; and (ii) Reinforcement Learning, which optimizes model safety and preserves model utility through diverse, multi-objective reward mechanisms. To empower the training pipelines with comprehensive cybersafety data, we conduct internal red-teaming to synthesize comprehensive and high-coverage prompts based on real-world tasks for inducing unsafe cyberactivities in the model. Based on PurpCode, we develop a reasoning-based coding model, namely PurpCode-32B, which demonstrates state-of-the-art cybersafety, outperforming various frontier models. Meanwhile, our alignment method decreases the model overrefusal rates in both general and cybersafety-specific scenarios, while preserving model utility in both code generation and common security knowledge.

Paperid: 577, https://arxiv.org/pdf/2507.04365.pdf

Abstract:
As large language models (LLMs) become more integral to society and technology, ensuring their safety becomes essential. Jailbreak attacks exploit vulnerabilities to bypass safety guardrails, posing a significant threat. However, the mechanisms enabling these attacks are not well understood. In this paper, we reveal a universal phenomenon that occurs during jailbreak attacks: Attention Slipping. During this phenomenon, the model gradually reduces the attention it allocates to unsafe requests in a user query during the attack process, ultimately causing a jailbreak. We show Attention Slipping is consistent across various jailbreak methods, including gradient-based token replacement, prompt-level template refinement, and in-context learning. Additionally, we evaluate two defenses based on query perturbation, Token Highlighter and SmoothLLM, and find they indirectly mitigate Attention Slipping, with their effectiveness positively correlated with the degree of mitigation achieved. Inspired by this finding, we propose Attention Sharpening, a new defense that directly counters Attention Slipping by sharpening the attention score distribution using temperature scaling. Experiments on four leading LLMs (Gemma2-9B-It, Llama3.1-8B-It, Qwen2.5-7B-It, Mistral-7B-It v0.2) show that our method effectively resists various jailbreak attacks while maintaining performance on benign tasks on AlpacaEval. Importantly, Attention Sharpening introduces no additional computational or memory overhead, making it an efficient and practical solution for real-world deployment.

Paperid: 578, https://arxiv.org/pdf/2512.24452.pdf

Abstract:
Semantic communications conveys task-relevant meaning rather than focusing solely on message reconstruction, improving bandwidth efficiency and robustness for next-generation wireless systems. However, learned semantic representations can still leak sensitive information to unintended receivers (eavesdroppers). This paper presents a deep learning-based semantic communication framework that jointly supports multiple receiver tasks while explicitly limiting semantic leakage to an eavesdropper. The legitimate link employs a learned encoder at the transmitter, while the receiver trains decoders for semantic inference and data reconstruction. The security problem is formulated via an iterative min-max optimization in which an eavesdropper is trained to improve its semantic inference, while the legitimate transmitter-receiver pair is trained to preserve task performance while reducing the eavesdropper's success. We also introduce an auxiliary layer that superimposes a cooperative, adversarially crafted perturbation on the transmitted waveform to degrade semantic leakage to an eavesdropper. Performance is evaluated over Rayleigh fading channels with additive white Gaussian noise using MNIST and CIFAR-10 datasets. Semantic accuracy and reconstruction quality improve with increasing latent dimension, while the min-max mechanism reduces the eavesdropper's inference performance significantly without degrading the legitimate receiver. The perturbation layer is successful in reducing semantic leakage even when the legitimate link is trained only for its own task. This comprehensive framework motivates semantic communication designs with tunable, end-to-end privacy against adaptive adversaries in realistic wireless settings.

Paperid: 579, https://arxiv.org/pdf/2512.22883.pdf

Abstract:
Cybersecurity is being fundamentally reshaped by foundation-model-based artificial intelligence. Large language models now enable autonomous planning, tool orchestration, and strategic adaptation at scale, challenging security architectures built on static rules, perimeter defenses, and human-centered workflows. This chapter argues for a shift from prevention-centric security toward agentic cyber resilience. Rather than seeking perfect protection, resilient systems must anticipate disruption, maintain critical functions under attack, recover efficiently, and learn continuously. We situate this shift within the historical evolution of cybersecurity paradigms, culminating in an AI-augmented paradigm where autonomous agents participate directly in sensing, reasoning, action, and adaptation across cyber and cyber-physical systems. We then develop a system-level framework for designing agentic AI workflows. A general agentic architecture is introduced, and attacker and defender workflows are analyzed as coupled adaptive processes, and game-theoretic formulations are shown to provide a unifying design language for autonomy allocation, information flow, and temporal composition. Case studies in automated penetration testing, remediation, and cyber deception illustrate how equilibrium-based design enables system-level resiliency design.

Paperid: 580, https://arxiv.org/pdf/2512.19286.pdf

Abstract:
Federated Learning (FL) has recently emerged as a revolutionary approach to collaborative training Machine Learning models. In particular, it enables decentralized model training while preserving data privacy, but its distributed nature makes it highly vulnerable to a severe attack known as Data Poisoning. In such scenarios, malicious clients inject manipulated data into the training process, thereby degrading global model performance or causing targeted misclassification. In this paper, we present a novel defense mechanism called GShield, designed to detect and mitigate malicious and low-quality updates, especially under non-independent and identically distributed (non-IID) data scenarios. GShield operates by learning the distribution of benign gradients through clustering and Gaussian modeling during an initial round, enabling it to establish a reliable baseline of trusted client behavior. With this benign profile, GShield selectively aggregates only those updates that align with the expected gradient patterns, effectively isolating adversarial clients and preserving the integrity of the global model. An extensive experimental campaign demonstrates that our proposed defense significantly improves model robustness compared to the state-of-the-art methods while maintaining a high accuracy of performance across both tabular and image datasets. Furthermore, GShield improves the accuracy of the targeted class by 43\% to 65\% after detecting malicious and low-quality clients.

Paperid: 581, https://arxiv.org/pdf/2512.17254.pdf

Abstract:
Federated Learning (FL) allows multiple clients to collaboratively train a model without sharing their private data. However, FL is vulnerable to Byzantine attacks, where adversaries manipulate client models to compromise the federated model, and privacy inference attacks, where adversaries exploit client models to infer private data. Existing defenses against both backdoor and privacy inference attacks introduce significant computational and communication overhead, creating a gap between theory and practice. To address this, we propose ABBR, a practical framework for Byzantine-robust and privacy-preserving FL. We are the first to utilize dimensionality reduction to speed up the private computation of complex filtering rules in privacy-preserving FL. Additionally, we analyze the accuracy loss of vector-wise filtering in low-dimensional space and introduce an adaptive tuning strategy to minimize the impact of malicious models that bypass filtering on the global model. We implement ABBR with state-of-the-art Byzantine-robust aggregation rules and evaluate it on public datasets, showing that it runs significantly faster, has minimal communication overhead, and maintains nearly the same Byzantine-resilience as the baselines.

Paperid: 582, https://arxiv.org/pdf/2512.10652.pdf

Abstract:
Advances in generative modeling have made it increasingly easy to fabricate realistic portrayals of individuals, creating serious risks for security, communication, and public trust. Detecting such person-driven manipulations requires systems that not only distinguish altered content from authentic media but also provide clear and reliable reasoning. In this paper, we introduce TriDF, a comprehensive benchmark for interpretable DeepFake detection. TriDF contains high-quality forgeries from advanced synthesis models, covering 16 DeepFake types across image, video, and audio modalities. The benchmark evaluates three key aspects: Perception, which measures the ability of a model to identify fine-grained manipulation artifacts using human-annotated evidence; Detection, which assesses classification performance across diverse forgery families and generators; and Hallucination, which quantifies the reliability of model-generated explanations. Experiments on state-of-the-art multimodal large language models show that accurate perception is essential for reliable detection, but hallucination can severely disrupt decision-making, revealing the interdependence of these three aspects. TriDF provides a unified framework for understanding the interaction between detection accuracy, evidence identification, and explanation reliability, offering a foundation for building trustworthy systems that address real-world synthetic media threats.

Paperid: 583, https://arxiv.org/pdf/2512.00966.pdf

Abstract:
Indirect prompt injection attacks (IPIAs), where large language models (LLMs) follow malicious instructions hidden in input data, pose a critical threat to LLM-powered agents. In this paper, we present IntentGuard, a general defense framework based on instruction-following intent analysis. The key insight of IntentGuard is that the decisive factor in IPIAs is not the presence of malicious text, but whether the LLM intends to follow instructions from untrusted data. Building on this insight, IntentGuard leverages an instruction-following intent analyzer (IIA) to identify which parts of the input prompt the model recognizes as actionable instructions, and then flag or neutralize any overlaps with untrusted data segments. To instantiate the framework, we develop an IIA that uses three "thinking intervention" strategies to elicit a structured list of intended instructions from reasoning-enabled LLMs. These techniques include start-of-thinking prefilling, end-of-thinking refinement, and adversarial in-context demonstration. We evaluate IntentGuard on two agentic benchmarks (AgentDojo and Mind2Web) using two reasoning-enabled LLMs (Qwen-3-32B and gpt-oss-20B). Results demonstrate that IntentGuard achieves (1) no utility degradation in all but one setting and (2) strong robustness against adaptive prompt injection attacks (e.g., reducing attack success rates from 100% to 8.5% in a Mind2Web scenario).

Paperid: 584, https://arxiv.org/pdf/2510.25025.pdf

Abstract:
Large language models (LLMs) have transformed natural language processing (NLP), enabling applications from content generation to decision support. Retrieval-Augmented Generation (RAG) improves LLMs by incorporating external knowledge but also introduces security risks, particularly from data poisoning, where the attacker injects poisoned texts into the knowledge database to manipulate system outputs. While various defenses have been proposed, they often struggle against advanced attacks. To address this, we introduce RAGuard, a detection framework designed to identify poisoned texts. RAGuard first expands the retrieval scope to increase the proportion of clean texts, reducing the likelihood of retrieving poisoned content. It then applies chunk-wise perplexity filtering to detect abnormal variations and text similarity filtering to flag highly similar texts. This non-parametric approach enhances RAG security, and experiments on large-scale datasets demonstrate its effectiveness in detecting and mitigating poisoning attacks, including strong adaptive attacks.

Paperid: 585, https://arxiv.org/pdf/2510.23960.pdf

Abstract:
With the rapid proliferation of digital media, the need for efficient and transparent safeguards against unsafe content is more critical than ever. Traditional image guardrail models, constrained by predefined categories, often misclassify content due to their pure feature-based learning without semantic reasoning. Moreover, these models struggle to adapt to emerging threats, requiring costly retraining for new threats. To address these limitations, we introduce SafeVision, a novel image guardrail that integrates human-like reasoning to enhance adaptability and transparency. Our approach incorporates an effective data collection and generation framework, a policy-following training pipeline, and a customized loss function. We also propose a diverse QA generation and training strategy to enhance learning effectiveness. SafeVision dynamically aligns with evolving safety policies at inference time, eliminating the need for retraining while ensuring precise risk assessments and explanations. Recognizing the limitations of existing unsafe image benchmarks, which either lack granularity or cover limited risks, we introduce VisionHarm, a high-quality dataset comprising two subsets: VisionHarm Third-party (VisionHarm-T) and VisionHarm Comprehensive(VisionHarm-C), spanning diverse harmful categories. Through extensive experiments, we show that SafeVision achieves state-of-the-art performance on different benchmarks. SafeVision outperforms GPT-4o by 8.6% on VisionHarm-T and by 15.5% on VisionHarm-C, while being over 16x faster. SafeVision sets a comprehensive, policy-following, and explainable image guardrail with dynamic adaptation to emerging threats.

Paperid: 586, https://arxiv.org/pdf/2510.12395.pdf

Abstract:
Malicious URL detection remains a critical cybersecurity challenge as adversaries increasingly employ sophisticated evasion techniques including obfuscation, character-level perturbations, and adversarial attacks. Although pre-trained language models (PLMs) like BERT have shown potential for URL analysis tasks, three limitations persist in current implementations: (1) inability to effectively model the non-natural hierarchical structure of URLs, (2) insufficient sensitivity to character-level obfuscation, and (3) lack of mechanisms to incorporate auxiliary network-level signals such as IP addresses-all essential for robust detection. To address these challenges, we propose CURL-IP, an advanced multi-modal detection framework incorporating three key innovations: (1) Token-Contrastive Representation Enhancer, which enhances subword token representations through token-aware contrastive learning to produce more discriminative and isotropic embeddings; (2) Cross-Layer Multi-Scale Aggregator, employing hierarchical aggregation of Transformer outputs via convolutional operations and gated MLPs to capture both local and global semantic patterns across layers; and (3) Blockwise Multi-Modal Coupler that decomposes URL-IP features into localized block units and computes cross-modal attention weights at the block level, enabling fine-grained inter-modal interaction. This architecture enables simultaneous preservation of fine-grained lexical cues, contextual semantics, and integration of network-level signals. Our evaluation on large-scale real-world datasets shows the framework significantly outperforms state-of-the-art baselines across binary and multi-class classification tasks.

Paperid: 587, https://arxiv.org/pdf/2510.12143.pdf

Abstract:
Federated learning (FL) is a privacy-preserving machine learning technique that facilitates collaboration among participants across demographics. FL enables model sharing, while restricting the movement of data. Since FL provides participants with independence over their training data, it becomes susceptible to poisoning attacks. Such collaboration also propagates bias among the participants, even unintentionally, due to different data distribution or historical bias present in the data. This paper proposes an intentional fairness attack, where a client maliciously sends a biased model, by increasing the fairness loss while training, even considering homogeneous data distribution. The fairness loss is calculated by solving an optimization problem for fairness metrics such as demographic parity and equalized odds. The attack is insidious and hard to detect, as it maintains global accuracy even after increasing the bias. We evaluate our attack against the state-of-the-art Byzantine-robust and fairness-aware aggregation schemes over different datasets, in various settings. The empirical results demonstrate the attack efficacy by increasing the bias up to 90\%, even in the presence of a single malicious client in the FL system.

Paperid: 588, https://arxiv.org/pdf/2510.10436.pdf

Abstract:
Post-quantum cryptography (PQC) is moving from evaluation to deployment as NIST finalizes standards for ML-KEM, ML-DSA, and SLH-DSA. This survey maps the space from foundations to practice. We first develop a taxonomy across lattice-, code-, hash-, multivariate-, isogeny-, and MPC-in-the-Head families, summarizing security assumptions, cryptanalysis, and standardization status. We then compare performance and communication costs using representative, implementation-grounded measurements, and review hardware acceleration (AVX2, FPGA/ASIC) and implementation security with a focus on side-channel resistance. Building upward, we examine protocol integration (TLS, DNSSEC), PKI and certificate hygiene, and deployment in constrained and high-assurance environments (IoT, cloud, finance, blockchain). We also discuss complementarity with quantum technologies (QKD, QRNGs) and the limits of near-term quantum computing. Throughout, we emphasize crypto-agility, hybrid migration, and evidence-based guidance for operators. We conclude with open problems spanning parameter agility, leakage-resilient implementations, and domain-specific rollout playbooks. This survey aims to be a practical reference for researchers and practitioners planning quantum-safe systems, bridging standards, engineering, and operations.

Paperid: 589, https://arxiv.org/pdf/2510.06525.pdf

Abstract:
Generative AI leaderboards are central to evaluating model capabilities, but remain vulnerable to manipulation. Among key adversarial objectives is rank manipulation, where an attacker must first deanonymize the models behind displayed outputs -- a threat previously demonstrated and explored for large language models (LLMs). We show that this problem can be even more severe for text-to-image leaderboards, where deanonymization is markedly easier. Using over 150,000 generated images from 280 prompts and 19 diverse models spanning multiple organizations, architectures, and sizes, we demonstrate that simple real-time classification in CLIP embedding space identifies the generating model with high accuracy, even without prompt control or historical data. We further introduce a prompt-level separability metric and identify prompts that enable near-perfect deanonymization. Our results indicate that rank manipulation in text-to-image leaderboards is easier than previously recognized, underscoring the need for stronger defenses.

Paperid: 590, https://arxiv.org/pdf/2510.04448.pdf

Abstract:
One-way puzzles (OWPuzzs) introduced by Khurana and Tomer [STOC 2024] are a natural quantum analogue of one-way functions (OWFs), and one of the most fundamental primitives in ''Microcrypt'' where OWFs do not exist but quantum cryptography is possible. OWPuzzs are implied by almost all quantum cryptographic primitives, and imply several important applications such as non-interactive commitments and multi-party computations. A significant goal in the field of quantum cryptography is to base OWPuzzs on plausible assumptions that will not imply OWFs. In this paper, we base OWPuzzs on hardness of non-collapsing measurements. To that end, we introduce a new complexity class, $\mathbf{SampPDQP}$, which is a sampling version of the decision class $\mathbf{PDQP}$ introduced in [Aaronson, Bouland, Fitzsimons, and Lee, ITCS 2016]. We show that if $\mathbf{SampPDQP}$ is hard on average for quantum polynomial time, then OWPuzzs exist. $\mathbf{SampPDQP}$ is the class of sampling problems that can be solved by a classical polynomial-time algorithm that can make a single query to a non-collapsing measurement oracle, which is a ''magical'' oracle that can sample measurement results on quantum states without collapsing the states. Such non-collapsing measurements are highly unphysical operations that should be hard to realize in quantum polynomial-time. We also study upperbounds of the hardness of $\mathbf{SampPDQP}$. We introduce a new primitive, distributional collision-resistant puzzles (dCRPuzzs), which are a natural quantum analogue of distributional collision-resistant hashing [Dubrov and Ishai, STOC 2006]. We show that dCRPuzzs imply average-case hardness of $\mathbf{SampPDQP}$ (and therefore OWPuzzs as well). We also show that two-message honest-statistically-hiding commitments with classical communication and one-shot signatures [Amos, Georgiou, Kiayias, Zhandry, STOC 2020] imply dCRPuzzs.

Paperid: 591, https://arxiv.org/pdf/2510.04159.pdf

Abstract:
With the rapid advances in quantum computer architectures and the emerging prospect of large-scale quantum memory, it is becoming essential to classically verify that remote devices genuinely allocate the promised quantum memory with specified number of qubits and coherence time. In this paper, we introduce a new concept, proofs of quantum memory (PoQM). A PoQM is an interactive protocol between a classical probabilistic polynomial-time (PPT) verifier and a quantum polynomial-time (QPT) prover over a classical channel where the verifier can verify that the prover has possessed a quantum memory with a certain number of qubits during a specified period of time. PoQM generalize the notion of proofs of quantumness (PoQ) [Brakerski, Christiano, Mahadev, Vazirani, and Vidick, JACM 2021]. Our main contributions are a formal definition of PoQM and its constructions based on hardness of LWE. Specifically, we give two constructions of PoQM. The first is of a four-round and has negligible soundness error under subexponential-hardness of LWE. The second is of a polynomial-round and has inverse-polynomial soundness error under polynomial-hardness of LWE. As a lowerbound of PoQM, we also show that PoQM imply one-way puzzles. Moreover, a certain restricted version of PoQM implies quantum computation classical communication (QCCC) key exchange.

Paperid: 592, https://arxiv.org/pdf/2510.01676.pdf

Abstract:
As deep learning models become widely deployed as components within larger production systems, their individual shortcomings can create system-level vulnerabilities with real-world impact. This paper studies how adversarial attacks targeting an ML component can degrade or bypass an entire production-grade malware detection system, performing a case study analysis of Gmail's pipeline where file-type identification relies on a ML model. The malware detection pipeline in use by Gmail contains a machine learning model that routes each potential malware sample to a specialized malware classifier to improve accuracy and performance. This model, called Magika, has been open sourced. By designing adversarial examples that fool Magika, we can cause the production malware service to incorrectly route malware to an unsuitable malware detector thereby increasing our chance of evading detection. Specifically, by changing just 13 bytes of a malware sample, we can successfully evade Magika in 90% of cases and thereby allow us to send malware files over Gmail. We then turn our attention to defenses, and develop an approach to mitigate the severity of these types of attacks. For our defended production model, a highly resourced adversary requires 50 bytes to achieve just a 20% attack success rate. We implement this defense, and, thanks to a collaboration with Google engineers, it has already been deployed in production for the Gmail classifier.

Paperid: 593, https://arxiv.org/pdf/2510.00532.pdf

Abstract:
The Language Server Protocol (LSP) has revolutionized the integration of code intelligence in modern software development. There are approximately 300 LSP server implementations for various languages and 50 editors offering LSP integration. However, the reliability of LSP servers is a growing concern, as crashes can disable all code intelligence features and significantly impact productivity, while vulnerabilities can put developers at risk even when editing untrusted source code. Despite the widespread adoption of LSP, no existing techniques specifically target LSP server testing. To bridge this gap, we present LSPFuzz, a grey-box hybrid fuzzer for systematic LSP server testing. Our key insight is that effective LSP server testing requires holistic mutation of source code and editor operations, as bugs often manifest from their combinations. To satisfy the sophisticated constraints of LSP and effectively explore the input space, we employ a two-stage mutation pipeline: syntax-aware mutations to source code, followed by context-aware dispatching of editor operations. We evaluated LSPFuzz on four widely used LSP servers. LSPFuzz demonstrated superior performance compared to baseline fuzzers, and uncovered previously unknown bugs in real-world LSP servers. Of the 51 bugs we reported, 42 have been confirmed, 26 have been fixed by developers, and two have been assigned CVE numbers. Our work advances the quality assurance of LSP servers, providing both a practical tool and foundational insights for future research in this domain.

Paperid: 594, https://arxiv.org/pdf/2509.23179.pdf

Abstract:
Recent advancements in post-quantum cryptographic algorithms have led to their standardization by the National Institute of Standards and Technology (NIST) to safeguard information security in the post-quantum era. These algorithms, however, employ public keys and signatures that are 3 to 9$\times$ longer than those used in pre-quantum cryptography, resulting in significant performance and energy efficiency overheads. A critical bottleneck identified in our analysis is the cache bandwidth. This limitation motivates the adoption of on-chip in-/near-cache computing, a computing paradigm that offers high-performance, exceptional energy efficiency, and flexibility to accelerate post-quantum cryptographic algorithms. Our analysis of existing works reveals challenges in integrating in-/near-cache computing into modern computer systems and performance limitations due to external bandwidth limitation, highlighting the need for innovative solutions that can seamlessly integrate into existing systems without performance and energy efficiency issues. In this paper, we introduce a near-cache-slice computing paradigm with support of customization and virtual address, named Crypto-Near-Cache (CNC), designed to accelerate post-quantum cryptographic algorithms and other applications. By placing SRAM arrays with bitline computing capability near cache slices, high internal bandwidth and short data movement are achieved with native support of virtual addressing. An ISA extension to facilitate CNC is also proposed, with detailed discussion on the implementation aspects of the core/cache datapath.

Paperid: 595, https://arxiv.org/pdf/2509.22986.pdf

Abstract:
Secure communication is a critical requirement for Internet of Things (IoT) devices, which are often based on Microcontroller Units (MCUs). Current cryptographic solutions, which rely on software libraries or dedicated hardware accelerators, are fundamentally limited by the performance and energy costs of data movement between memory and processing units. This paper introduces CryptoSRAM, an in-SRAM computing architecture that performs cryptographic operations directly within the MCU's standard SRAM array. By repurposing the memory array into a massively parallel processing fabric, CryptoSRAM eliminates the data movement bottleneck. This approach is well-suited to MCUs, which utilize physical addressing and Direct Memory Access (DMA) to manage SRAM, allowing for seamless integration with minimal hardware overhead. Our analysis shows that for common cryptographic kernels, CryptoSRAM achieves throughput improvements of up to 74$\times$ and 67$\times$ for AES and SHA3, respectively, compared to a software implementation. Furthermore, our solution delivers up to 6$\times$ higher throughput than existing hardware accelerators for AES. CryptoSRAM demonstrates a viable and efficient architecture for secure communication in next-generation IoT systems.

Paperid: 596, https://arxiv.org/pdf/2509.22723.pdf

Abstract:
Diffusion models (DMs) have been investigated in various domains due to their ability to generate high-quality data, thereby attracting significant attention. However, similar to traditional deep learning systems, there also exist potential threats to DMs. To provide advanced and comprehensive insights into safety, ethics, and trust in DMs, this survey comprehensively elucidates its framework, threats, and countermeasures. Each threat and its countermeasures are systematically examined and categorized to facilitate thorough analysis. Furthermore, we introduce specific examples of how DMs are used, what dangers they might bring, and ways to protect against these dangers. Finally, we discuss key lessons learned, highlight open challenges related to DM security, and outline prospective research directions in this critical field. This work aims to accelerate progress not only in the technical capabilities of generative artificial intelligence but also in the maturity and wisdom of its application.

Paperid: 597, https://arxiv.org/pdf/2509.20943.pdf

Abstract:
Cyber Threat Intelligence (CTI) enables organizations to anticipate, detect, and mitigate evolving cyber threats. Its effectiveness depends on high-quality datasets, which support model development, training, evaluation, and benchmarking. Building such datasets is crucial, as attack vectors and adversary tactics continually evolve. Recently, Telegram has gained prominence as a valuable CTI source, offering timely and diverse threat-related information that can help address these challenges. In this work, we address these challenges by presenting an end-to-end automated pipeline that systematically collects and filters threat-related content from Telegram. The pipeline identifies relevant Telegram channels and scrapes 145,349 messages from 12 curated channels out of 150 identified sources. To accurately filter threat intelligence messages from generic content, we employ a BERT-based classifier, achieving an accuracy of 96.64%. From the filtered messages, we compile a dataset of 86,509 malicious Indicators of Compromise, including domains, IPs, URLs, hashes, and CVEs. This approach not only produces a large-scale, high-fidelity CTI dataset but also establishes a foundation for future research and operational applications in cyber threat detection.

Paperid: 598, https://arxiv.org/pdf/2509.13772.pdf

Abstract:
Retrieval-Augmented Generation (RAG) integrates external knowledge into large language models to improve response quality. However, recent work has shown that RAG systems are highly vulnerable to poisoning attacks, where malicious texts are inserted into the knowledge database to influence model outputs. While several defenses have been proposed, they are often circumvented by more adaptive or sophisticated attacks. This paper presents RAGOrigin, a black-box responsibility attribution framework designed to identify which texts in the knowledge database are responsible for misleading or incorrect generations. Our method constructs a focused attribution scope tailored to each misgeneration event and assigns a responsibility score to each candidate text by evaluating its retrieval ranking, semantic relevance, and influence on the generated response. The system then isolates poisoned texts using an unsupervised clustering method. We evaluate RAGOrigin across seven datasets and fifteen poisoning attacks, including newly developed adaptive poisoning strategies and multi-attacker scenarios. Our approach outperforms existing baselines in identifying poisoned content and remains robust under dynamic and noisy conditions. These results suggest that RAGOrigin provides a practical and effective solution for tracing the origins of corrupted knowledge in RAG systems.

Paperid: 599, https://arxiv.org/pdf/2509.10287.pdf

Abstract:
Malicious URL detection remains a major challenge in cybersecurity, primarily due to two factors: (1) the exponential growth of the Internet has led to an immense diversity of URLs, making generalized detection increasingly difficult; and (2) attackers are increasingly employing sophisticated obfuscation techniques to evade detection. We advocate that addressing these challenges fundamentally requires: (1) obtaining semantic understanding to improve generalization across vast and diverse URL sets, and (2) accurately modeling contextual relationships within the structural composition of URLs. In this paper, we propose a novel malicious URL detection method combining multi-granularity graph learning with semantic embedding to jointly capture semantic, character-level, and structural features for robust URL analysis. To model internal dependencies within URLs, we first construct dual-granularity URL graphs at both subword and character levels, where nodes represent URL tokens/characters and edges encode co-occurrence relationships. To obtain fine-grained embeddings, we initialize node representations using a character-level convolutional network. The two graphs are then processed through jointly trained Graph Convolutional Networks to learn consistent graph-level representations, enabling the model to capture complementary structural features that reflect co-occurrence patterns and character-level dependencies. Furthermore, we employ BERT to derive semantic representations of URLs for semantically aware understanding. Finally, we introduce a gated dynamic fusion network to combine the semantically enriched BERT representations with the jointly optimized graph vectors, further enhancing detection performance. We extensively evaluate our method across multiple challenging dimensions. Results show our method exceeds SOTA performance, including against large language models.

Paperid: 600, https://arxiv.org/pdf/2509.10252.pdf

Abstract:
The success of smart contracts has made them a target for attacks, but their closed-source nature often forces vulnerability detection to work on bytecode, which is inherently more challenging than source-code-based analysis. While recent studies try to align source and bytecode embeddings during training to transfer knowledge, current methods rely on graph-level alignment that obscures fine-grained structural and semantic correlations between the two modalities. Moreover, the absence of precise vulnerability patterns and granular annotations in bytecode leads to depriving the model of crucial supervisory signals for learning discriminant features. We propose ExDoS to transfer rich semantic knowledge from source code to bytecode, effectively supplementing the source code prior in practical settings. Specifically, we construct semantic graphs from source code and control-flow graphs from bytecode. To address obscured local signals in graph-level contract embeddings, we propose a Dual-Attention Graph Network introducing a novel node attention aggregation module to enhance local pattern capture in graph embeddings. Furthermore, by summarizing existing source code vulnerability patterns and designing a corresponding set of bytecode-level patterns for each, we construct the first dataset of vulnerability pattern annotations aligned with source code definitions to facilitate fine-grained cross-modal alignment and the capture of function-level vulnerability signals. Finally, we propose a dual-focus objective for our cross-modal distillation framework, comprising: a Global Semantic Distillation Loss for transferring graph-level knowledge and a Local Semantic Distillation Loss enabling expert-guided, fine-grained vulnerability-specific distillation. Experiments on real-world contracts demonstrate that our method achieves consistent F1-score improvements (3\%--6\%) over strong baselines.

Paperid: 601, https://arxiv.org/pdf/2509.03939.pdf

Abstract:
Current Ethereum fraud detection methods rely on context-independent, numerical transaction sequences, failing to capture semantic of account transactions. Furthermore, the pervasive homogeneity in Ethereum transaction records renders it challenging to learn discriminative account embeddings. Moreover, current self-supervised graph learning methods primarily learn node representations through graph reconstruction, resulting in suboptimal performance for node-level tasks like fraud account detection, while these methods also encounter scalability challenges. To tackle these challenges, we propose LMAE4Eth, a multi-view learning framework that fuses transaction semantics, masked graph embedding, and expert knowledge. We first propose a transaction-token contrastive language model (TxCLM) that transforms context-independent numerical transaction records into logically cohesive linguistic representations. To clearly characterize the semantic differences between accounts, we also use a token-aware contrastive learning pre-training objective together with the masked transaction model pre-training objective, learns high-expressive account representations. We then propose a masked account graph autoencoder (MAGAE) using generative self-supervised learning, which achieves superior node-level account detection by focusing on reconstructing account node features. To enable MAGAE to scale for large-scale training, we propose to integrate layer-neighbor sampling into the graph, which reduces the number of sampled vertices by several times without compromising training quality. Finally, using a cross-attention fusion network, we unify the embeddings of TxCLM and MAGAE to leverage the benefits of both. We evaluate our method against 21 baseline approaches on three datasets. Experimental results show that our method outperforms the best baseline by over 10% in F1-score on two of the datasets.

Paperid: 602, https://arxiv.org/pdf/2509.03860.pdf

Abstract:
Ethereum's rapid ecosystem expansion and transaction anonymity have triggered a surge in malicious activity. Detection mechanisms currently bifurcate into three technical strands: expert-defined features, graph embeddings, and sequential transaction patterns, collectively spanning the complete feature sets of Ethereum's native data layer. Yet the absence of cross-paradigm integration mechanisms forces practitioners to choose between sacrificing sequential context awareness, structured fund-flow patterns, or human-curated feature insights in their solutions. To bridge this gap, we propose KGBERT4Eth, a feature-complete pre-training encoder that synergistically combines two key components: (1) a Transaction Semantic Extractor, where we train an enhanced Transaction Language Model (TLM) to learn contextual semantic representations from conceptualized transaction records, and (2) a Transaction Knowledge Graph (TKG) that incorporates expert-curated domain knowledge into graph node embeddings to capture fund flow patterns and human-curated feature insights. We jointly optimize pre-training objectives for both components to fuse these complementary features, generating feature-complete embeddings. To emphasize rare anomalous transactions, we design a biased masking prediction task for TLM to focus on statistical outliers, while the Transaction TKG employs link prediction to learn latent transaction relationships and aggregate knowledge. Furthermore, we propose a mask-invariant attention coordination module to ensure stable dynamic information exchange between TLM and TKG during pre-training. KGBERT4Eth significantly outperforms state-of-the-art baselines in both phishing account detection and de-anonymization tasks, achieving absolute F1-score improvements of 8-16% on three phishing detection benchmarks and 6-26% on four de-anonymization datasets.

Paperid: 603, https://arxiv.org/pdf/2509.01211.pdf

Abstract:
With the proliferation of applications built upon LLM-driven multi-agent systems (MAS), the security of Web links has become a critical concern in ensuring system reliability. Once an agent is induced to visit a malicious website, attackers can use it as a springboard to conduct diverse subsequent attacks, which will drastically expand the attack surface. In this paper, we propose Web Fraud Attacks, a novel type of attack aiming at inducing MAS to visit malicious websites. We design 11 representative attack variants that encompass domain name tampering (homoglyph deception, character substitution, etc.), link structure camouflage (sub-directory nesting, sub-domain grafting, parameter obfuscation, etc.), and other deceptive techniques tailored to exploit MAS's vulnerabilities in link validation. Through extensive experiments on these crafted attack vectors, we demonstrate that Web fraud attacks not only exhibit significant destructive potential across different MAS architectures but also possess a distinct advantage in evasion: they circumvent the need for complex input formats such as jailbreaking, which inherently carry higher exposure risks. These results underscore the importance of addressing Web fraud attacks in LLM-driven MAS, as their stealthiness and destructiveness pose non-negligible threats to system security and user safety.

Paperid: 604, https://arxiv.org/pdf/2508.19277.pdf

Abstract:
Recent advances in Chain-of-Thought (CoT) prompting have substantially enhanced the reasoning capabilities of large language models (LLMs), enabling sophisticated problem-solving through explicit multi-step reasoning traces. However, these enhanced reasoning processes introduce novel attack surfaces, particularly vulnerabilities to computational inefficiency through unnecessarily verbose reasoning chains that consume excessive resources without corresponding performance gains. Prior overthinking attacks typically require restrictive conditions including access to external knowledge sources for data poisoning, reliance on retrievable poisoned content, and structurally obvious templates that limit practical applicability in real-world scenarios. To address these limitations, we propose POT (Prompt-Only OverThinking), a novel black-box attack framework that employs LLM-based iterative optimization to generate covert and semantically natural adversarial prompts, eliminating dependence on external data access and model retrieval. Extensive experiments across diverse model architectures and datasets demonstrate that POT achieves superior performance compared to other methods.

Paperid: 605, https://arxiv.org/pdf/2508.15987.pdf

Abstract:
Machine learning model repositories such as the Hugging Face Model Hub facilitate model exchanges. However, bad actors can deliver malware through compromised models. Existing defenses such as safer model formats, restrictive (but inflexible) loading policies, and model scanners have shortcomings: 44.9% of popular models on Hugging Face still use the insecure pickle format, 15% of these cannot be loaded by restrictive loading policies, and model scanners have both false positives and false negatives. Pickle remains the de facto standard for model exchange, and the ML community lacks a tool that offers transparent safe loading. We present PickleBall to help machine learning engineers load pickle-based models safely. PickleBall statically analyzes the source code of a given machine learning library and computes a custom policy that specifies a safe load-time behavior for benign models. PickleBall then dynamically enforces the policy during load time as a drop-in replacement for the pickle module. PickleBall generates policies that correctly load 79.8% of benign pickle-based models in our dataset, while rejecting all (100%) malicious examples in our dataset. In comparison, evaluated model scanners fail to identify known malicious models, and the state-of-art loader loads 22% fewer benign models than PickleBall. PickleBall removes the threat of arbitrary function invocation from malicious pickle-based models, raising the bar for attackers to depend on code reuse techniques.

Paperid: 606, https://arxiv.org/pdf/2508.11548.pdf

Abstract:
Copyright protection for large language models is of critical importance, given their substantial development costs, proprietary value, and potential for misuse. Existing surveys have predominantly focused on techniques for tracing LLM-generated content-namely, text watermarking-while a systematic exploration of methods for protecting the models themselves (i.e., model watermarking and model fingerprinting) remains absent. Moreover, the relationships and distinctions among text watermarking, model watermarking, and model fingerprinting have not been comprehensively clarified. This work presents a comprehensive survey of the current state of LLM copyright protection technologies, with a focus on model fingerprinting, covering the following aspects: (1) clarifying the conceptual connection from text watermarking to model watermarking and fingerprinting, and adopting a unified terminology that incorporates model watermarking into the broader fingerprinting framework; (2) providing an overview and comparison of diverse text watermarking techniques, highlighting cases where such methods can function as model fingerprinting; (3) systematically categorizing and comparing existing model fingerprinting approaches for LLM copyright protection; (4) presenting, for the first time, techniques for fingerprint transfer and fingerprint removal; (5) summarizing evaluation metrics for model fingerprints, including effectiveness, harmlessness, robustness, stealthiness, and reliability; and (6) discussing open challenges and future research directions. This survey aims to offer researchers a thorough understanding of both text watermarking and model fingerprinting technologies in the era of LLMs, thereby fostering further advances in protecting their intellectual property.

Paperid: 607, https://arxiv.org/pdf/2508.07505.pdf

Abstract:
Decentralized min-max optimization allows multi-agent systems to collaboratively solve global min-max optimization problems by facilitating the exchange of model updates among neighboring agents, eliminating the need for a central server. However, sharing model updates in such systems carry a risk of exposing sensitive data to inference attacks, raising significant privacy concerns. To mitigate these privacy risks, differential privacy (DP) has become a widely adopted technique for safeguarding individual data. Despite its advantages, implementing DP in decentralized min-max optimization poses challenges, as the added noise can hinder convergence, particularly in non-convex scenarios with complex agent interactions in min-max optimization problems. In this work, we propose an algorithm called DPMixSGD (Differential Private Minmax Hybrid Stochastic Gradient Descent), a novel privacy-preserving algorithm specifically designed for non-convex decentralized min-max optimization. Our method builds on the state-of-the-art STORM-based algorithm, one of the fastest decentralized min-max solutions. We rigorously prove that the noise added to local gradients does not significantly compromise convergence performance, and we provide theoretical bounds to ensure privacy guarantees. To validate our theoretical findings, we conduct extensive experiments across various tasks and models, demonstrating the effectiveness of our approach.

Paperid: 608, https://arxiv.org/pdf/2508.04208.pdf

Abstract:
As deep learning-based, data-driven information extraction systems become increasingly integrated into modern document processing workflows, one primary concern is the risk of malicious leakage of sensitive private data from these systems. While some recent works have explored Differential Privacy (DP) to mitigate these privacy risks, DP-based training is known to cause significant performance degradation and impose several limitations on standard training procedures, making its direct application to downstream tasks both difficult and costly. In this work, we aim to address the above challenges within the context of document image classification by substituting real private data with a synthetic counterpart. In particular, we propose to use conditional latent diffusion models (LDMs) in combination with differential privacy (DP) to generate class-specific synthetic document images under strict privacy constraints, which can then be utilized to train a downstream classifier following standard training procedures. We investigate our approach under various pretraining setups, including unconditional, class-conditional, and layout-conditional pretraining, in combination with multiple private training strategies such as class-conditional and per-label private fine-tuning with DPDM and DP-Promise algorithms. Additionally, we evaluate it on two well-known document benchmark datasets, RVL-CDIP and Tobacco3482, and show that it can generate useful and realistic document samples across various document types and privacy levels ($\varepsilon \in \{1, 5, 10\}$). Lastly, we show that our approach achieves substantial performance improvements in downstream evaluations on small-scale datasets, compared to the direct application of DP-Adam.

Paperid: 609, https://arxiv.org/pdf/2508.01365.pdf

Abstract:
Backdoor attacks pose a significant threat to Large Language Models (LLMs), where adversaries can embed hidden triggers to manipulate LLM's outputs. Most existing defense methods, primarily designed for classification tasks, are ineffective against the autoregressive nature and vast output space of LLMs, thereby suffering from poor performance and high latency. To address these limitations, we investigate the behavioral discrepancies between benign and backdoored LLMs in output space. We identify a critical phenomenon which we term sequence lock: a backdoored model generates the target sequence with abnormally high and consistent confidence compared to benign generation. Building on this insight, we propose ConfGuard, a lightweight and effective detection method that monitors a sliding window of token confidences to identify sequence lock. Extensive experiments demonstrate ConfGuard achieves a near 100\% true positive rate (TPR) and a negligible false positive rate (FPR) in the vast majority of cases. Crucially, the ConfGuard enables real-time detection almost without additional latency, making it a practical backdoor defense for real-world LLM deployments.

Paperid: 610, https://arxiv.org/pdf/2507.18034.pdf

Abstract:
The intellectual property of deep generative networks (GNets) can be protected using a cascaded hiding network (HNet) which embeds watermarks (or marks) into GNet outputs, known as box-free watermarking. Although both GNet and HNet are encapsulated in a black box (called operation network, or ONet), with only the generated and marked outputs from HNet being released to end users and deemed secure, in this paper, we reveal an overlooked vulnerability in such systems. Specifically, we show that the hidden GNet outputs can still be reliably estimated via query-based reverse engineering, leaking the generated and unmarked images, despite the attacker's limited knowledge of the system. Our first attempt is to reverse-engineer an inverse model for HNet under the stringent black-box condition, for which we propose to exploit the query process with specially curated input images. While effective, this method yields unsatisfactory image quality. To improve this, we subsequently propose an alternative method leveraging the equivalent additive property of box-free model watermarking and reverse-engineering a forward surrogate model of HNet, with better image quality preservation. Extensive experimental results on image processing and image generation tasks demonstrate that both attacks achieve impressive watermark removal success rates (100%) while also maintaining excellent image quality (reaching the highest PSNR of 34.69 dB), substantially outperforming existing attacks, highlighting the urgent need for robust defensive strategies to mitigate the identified vulnerability in box-free model watermarking.

Paperid: 611, https://arxiv.org/pdf/2507.08983.pdf

Abstract:
While poisoning attacks on machine learning models have been extensively studied, the mechanisms by which adversaries can distribute poisoned models at scale remain largely unexplored. In this paper, we shed light on how model leaderboards -- ranked platforms for model discovery and evaluation -- can serve as a powerful channel for adversaries for stealthy large-scale distribution of poisoned models. We present TrojanClimb, a general framework that enables injection of malicious behaviors while maintaining competitive leaderboard performance. We demonstrate its effectiveness across four diverse modalities: text-embedding, text-generation, text-to-speech and text-to-image, showing that adversaries can successfully achieve high leaderboard rankings while embedding arbitrary harmful functionalities, from backdoors to bias injection. Our findings reveal a significant vulnerability in the machine learning ecosystem, highlighting the urgent need to redesign leaderboard evaluation mechanisms to detect and filter malicious (e.g., poisoned) models, while exposing broader security implications for the machine learning community regarding the risks of adopting models from unverified sources.

Paperid: 612, https://arxiv.org/pdf/2507.02057.pdf

Abstract:
Large language models (LLMs) have democratized software development, reducing the expertise barrier for programming complex applications. This accessibility extends to malicious software development, raising significant security concerns. While LLM providers have implemented alignment mechanisms to prevent direct generation of overtly malicious code, these safeguards predominantly evaluate individual prompts in isolation, overlooking a critical vulnerability: malicious operations can be systematically decomposed into benign-appearing sub-tasks. In this paper, we introduce the Malware Generation Compiler (MGC), a novel framework that leverages this vulnerability through modular decomposition and alignment-evasive generation. MGC employs a specialized Malware Description Intermediate Representation (MDIR) to bridge high-level malicious intents and benign-appearing code snippets. Extensive evaluation demonstrates that our attack reliably generates functional malware across diverse task specifications and categories, outperforming jailbreaking methods by +365.79% and underground services by +78.07% in correctness on three benchmark datasets. Case studies further show that MGC can reproduce and even enhance 16 real-world malware samples. This work provides critical insights for security researchers by exposing the risks of compositional attacks against aligned AI systems. Demonstrations are available at https://sites.google.com/view/malware-generation-compiler.

Paperid: 613, https://arxiv.org/pdf/2512.12090.pdf

Abstract:
The advent of high-quality video generation models has amplified the need for robust watermarking schemes that can be used to reliably detect and track the provenance of generated videos. Existing video watermarking methods based on both post-hoc and in-generation approaches fail to simultaneously achieve imperceptibility, robustness, and computational efficiency. This work introduces a novel framework for in-generation video watermarking called SPDMark (pronounced `SpeedMark') based on selective parameter displacement of a video diffusion model. Watermarks are embedded into the generated videos by modifying a subset of parameters in the generative model. To make the problem tractable, the displacement is modeled as an additive composition of layer-wise basis shifts, where the final composition is indexed by the watermarking key. For parameter efficiency, this work specifically leverages low-rank adaptation (LoRA) to implement the basis shifts. During the training phase, the basis shifts and the watermark extractor are jointly learned by minimizing a combination of message recovery, perceptual similarity, and temporal consistency losses. To detect and localize temporal modifications in the watermarked videos, we use a cryptographic hashing function to derive frame-specific watermark messages from the given base watermarking key. During watermark extraction, maximum bipartite matching is applied to recover the correct frame order, even from temporally tampered videos. Evaluations on both text-to-video and image-to-video generation models demonstrate the ability of SPDMark to generate imperceptible watermarks that can be recovered with high accuracy and also establish its robustness against a variety of common video modifications.

Paperid: 614, https://arxiv.org/pdf/2511.19523.pdf

Abstract:
Large Language Models (LLMs) are highly effective for cybersecurity question answering (QA) but are difficult to deploy on edge devices due to their size. Quantization reduces memory and compute requirements but often degrades accuracy and increases vulnerability to adversarial attacks. We present EAGER, an edge-aligned defense framework that integrates parameter-efficient quantization with domain-specific preference alignment to jointly optimize efficiency, robustness, and accuracy. Unlike prior methods that address these aspects separately, EAGER leverages Quantized Low-Rank Adaptation (QLoRA) for low-cost fine-tuning and Direct Preference Optimization (DPO) on a self-constructed cybersecurity preference dataset, eliminating the need for human labels. Experiments show that EAGER reduces adversarial attack success rates by up to 7.3x and improves QA accuracy by up to 55% over state-of-the-art defenses, while achieving the lowest response latency on a Jetson Orin, demonstrating its practical edge deployment.

Paperid: 615, https://arxiv.org/pdf/2510.18232.pdf

Abstract:
Generating high-quality synthetic text under differential privacy (DP) is critical for training and evaluating language models without compromising user privacy. Prior work on synthesizing DP datasets often fail to preserve key statistical attributes, suffer utility loss from the noise required by DP, and lack fine-grained control over generation. To address these challenges, we make two contributions. First, we introduce a hierarchical framework that decomposes DP synthetic text generation into two subtasks: feature learning and conditional text generation. This design explicitly incorporates learned features into the generation process and simplifies the end-to-end synthesis task. Through systematic ablations, we identify the most effective configuration: a rich tabular schema as feature, a DP tabular synthesizer, and a DP fine-tuned conditional generator, which we term ACTG (Attribute-Conditioned Text Generation). Second, we propose Anchored RL (ARL), a post-training method that improves the instruction-following ability of ACTG for conditional generation. ARL combines RL to boost control with an SFT anchor on best-of-$N$ data to prevent reward hacking. Together, these components form our end-to-end algorithm ACTG-ARL, which advances both the quality of DP synthetic text (+20% MAUVE over prior work) and the control of the conditional generator under strong privacy guarantees.

Paperid: 616, https://arxiv.org/pdf/2510.12062.pdf

Abstract:
Random numbers play a vital role in many decentralized applications (dApps), such as gaming and decentralized finance (DeFi) applications. Existing random number provision mechanisms can be roughly divided into two categories, on-chain, and off-chain. On-chain approaches usually rely on the blockchain as the major input and all computations are done by blockchain nodes. The major risk for this type of method is that the input itself is susceptible to the adversary's influence. Off-chain approaches, as the name suggested, complete the generation without the involvement of blockchain nodes and share the result directly with a dApp. These mechanisms usually have a strong security assumption and high complexity. To mitigate these limitations and provide a framework that allows a dApp to balance different factors involved in random number generation, we propose a hybrid random number generation solution that leverages IoT devices equipped with trusted execution environment (TEE) as the randomness sources, and then utilizes a set of cryptographic tools to aggregate the multiple sources and obtain the final random number that can be consumed by the dApp. The new approach only needs one honest random source to guarantee the unbiasedness of the final random number and a user can configure the system to tolerate malicious participants who can refuse to respond to avoid unfavored results. We also provide a concrete construction that can further reduce the on-chain computation complexity to lower the cost of the solution in practice. We evaluate the computation and gas costs to demonstrate the effectiveness of the improvement.

Paperid: 617, https://arxiv.org/pdf/2510.05581.pdf

Abstract:
Traditional collaborative learning approaches are based on sharing of model weights between clients and a server. However, there are advantages to resource efficiency through schemes based on sharing of embeddings (activations) created from the data. Several differentially private methods were developed for sharing of weights while such mechanisms do not exist so far for sharing of embeddings. We propose Ours to learn a privacy encoding network in conjunction with a small utility generation network such that the final embeddings generated from it are equipped with formal differential privacy guarantees. These privatized embeddings are then shared with a more powerful server, that learns a post-processing that results in a higher accuracy for machine learning tasks. We show that our co-design of collaborative and private learning results in requiring only one round of privatized communication and lesser compute on the client than traditional methods. The privatized embeddings that we share from the client are agnostic to the type of model (deep learning, random forests or XGBoost) used on the server in order to process these activations to complete a task.

Paperid: 618, https://arxiv.org/pdf/2510.00293.pdf

Abstract:
Generative models can generate photorealistic images at scale. This raises urgent concerns about the ability to detect synthetically generated images and attribute these images to specific sources. While watermarking has emerged as a possible solution, existing methods remain fragile to realistic distortions, susceptible to adaptive removal, and expensive to update when the underlying watermarking key changes. We propose a general watermarking framework that formulates the encoding problem as key-dependent perturbation of the parameters of a generative model. Within this framework, we introduce Mixture of LoRA Markers (MOLM), a routing-based instantiation in which binary keys activate lightweight LoRA adapters inside residual and attention blocks. This design avoids key-specific re-training and achieves the desired properties such as imperceptibility, fidelity, verifiability, and robustness. Experiments on Stable Diffusion and FLUX show that MOLM preserves image quality while achieving robust key recovery against distortions, compression and regeneration, averaging attacks, and black-box adversarial attacks on the extractor.

Paperid: 619, https://arxiv.org/pdf/2509.21011.pdf

Abstract:
The remarkable capability of large language models (LLMs) has led to the wide application of LLM-based agents in various domains. To standardize interactions between LLM-based agents and their environments, model context protocol (MCP) tools have become the de facto standard and are now widely integrated into these agents. However, the incorporation of MCP tools introduces the risk of tool poisoning attacks, which can manipulate the behavior of LLM-based agents. Although previous studies have identified such vulnerabilities, their red teaming approaches have largely remained at the proof-of-concept stage, leaving the automatic and systematic red teaming of LLM-based agents under the MCP tool poisoning paradigm an open question. To bridge this gap, we propose AutoMalTool, an automated red teaming framework for LLM-based agents by generating malicious MCP tools. Our extensive evaluation shows that AutoMalTool effectively generates malicious MCP tools capable of manipulating the behavior of mainstream LLM-based agents while evading current detection mechanisms, thereby revealing new security risks in these agents.

Paperid: 620, https://arxiv.org/pdf/2509.18014.pdf

Abstract:
Tabular Generative Models are often argued to preserve privacy by creating synthetic datasets that resemble training data. However, auditing their empirical privacy remains challenging, as commonly used similarity metrics fail to effectively characterize privacy risk. Membership Inference Attacks (MIAs) have recently emerged as a method for evaluating privacy leakage in synthetic data, but their practical effectiveness is limited. Numerous attacks exist across different threat models, each with distinct implementations targeting various sources of privacy leakage, making them difficult to apply consistently. Moreover, no single attack consistently outperforms the others, leading to a routine underestimation of privacy risk. To address these issues, we propose a unified, model-agnostic threat framework that deploys a collection of attacks to estimate the maximum empirical privacy leakage in synthetic datasets. We introduce Synth-MIA, an open-source Python library that streamlines this auditing process through a novel testbed that integrates seamlessly into existing synthetic data evaluation pipelines through a Scikit-Learn-like API. Our software implements 13 attack methods through a Scikit-Learn-like API, designed to enable fast systematic estimation of privacy leakage for practitioners as well as facilitate the development of new attacks and experiments for researchers. We demonstrate our framework's utility in the largest tabular synthesis privacy benchmark to date, revealing that higher synthetic data quality corresponds to greater privacy leakage, that similarity-based privacy metrics show weak correlation with MIA results, and that the differentially private generator PATEGAN can fail to preserve privacy under such attacks. This underscores the necessity of MIA-based auditing when designing and deploying Tabular Generative Models.

Paperid: 621, https://arxiv.org/pdf/2509.13514.pdf

Abstract:
Large Language Models (LLMs) have recently demonstrated strong potential for cybersecurity question answering (QA), supporting decision-making in real-time threat detection and response workflows. However, their substantial computational demands pose significant challenges for deployment on resource-constrained edge devices. Quantization, a widely adopted model compression technique, can alleviate these constraints. Nevertheless, quantization may degrade model accuracy and increase susceptibility to adversarial attacks. Fine-tuning offers a potential means to mitigate these limitations, but its effectiveness when combined with quantization remains insufficiently explored. Hence, it is essential to understand the trade-offs among accuracy, efficiency, and robustness. We propose AQUA-LLM, an evaluation framework designed to benchmark several state-of-the-art small LLMs under four distinct configurations: base, quantized-only, fine-tuned, and fine-tuned combined with quantization, specifically for cybersecurity QA. Our results demonstrate that quantization alone yields the lowest accuracy and robustness despite improving efficiency. In contrast, combining quantization with fine-tuning enhances both LLM robustness and predictive performance, achieving an optimal balance of accuracy, robustness, and efficiency. These findings highlight the critical need for quantization-aware, robustness-preserving fine-tuning methodologies to enable the robust and efficient deployment of LLMs for cybersecurity QA.

Paperid: 622, https://arxiv.org/pdf/2509.13464.pdf

Abstract:
The expansion of edge computing has increased the attack surface, creating an urgent need for robust, real-time machine learning (ML)-based host intrusion detection systems (HIDS) that balance accuracy and efficiency. In such settings, inference latency poses a critical security risk, as delays may provide exploitable opportunities for attackers. However, many state-of-the-art ML-based HIDS solutions rely on computationally intensive architectures with high inference costs, limiting their practical deployment. This paper proposes LIGHT-HIDS, a lightweight machine learning framework that combines a compressed neural network feature extractor trained via Deep Support Vector Data Description (DeepSVDD) with an efficient novelty detection model. This hybrid approach enables the learning of compact, meaningful representations of normal system call behavior for accurate anomaly detection. Experimental results on multiple datasets demonstrate that LIGHT-HIDS consistently enhances detection accuracy while reducing inference time by up to 75x compared to state-of-the-art methods. These findings highlight its effectiveness and scalability as a machine learning-based solution for real-time host intrusion detection.

Paperid: 623, https://arxiv.org/pdf/2509.08091.pdf

Abstract:
The rapid proliferation of the Internet of Things (IoT) continues to expose critical security vulnerabilities, necessitating the development of efficient and robust intrusion detection systems (IDS). Machine learning-based intrusion detection systems (ML-IDS) have significantly improved threat detection capabilities; however, they remain highly susceptible to adversarial attacks. While numerous defense mechanisms have been proposed to enhance ML-IDS resilience, a systematic approach for selecting the most effective defense against a specific adversarial attack remains absent. To address this challenge, we previously proposed DYNAMITE, a dynamic defense selection approach that identifies the most suitable defense against adversarial attacks through an ML-driven selection mechanism. Building on this foundation, we propose SAGE (Sample-Aware Guarding Engine), a substantially improved defense algorithm that integrates active learning with targeted data reduction. It employs an active learning mechanism to selectively identify the most informative input samples and their corresponding optimal defense labels, which are then used to train a second-level learner responsible for selecting the most effective defense. This targeted sampling improves computational efficiency, exposes the model to diverse adversarial strategies during training, and enhances robustness, stability, and generalizability. As a result, SAGE demonstrates strong predictive performance across multiple intrusion detection datasets, achieving an average F1-score improvement of 201% over the state-of-the-art defenses. Notably, SAGE narrows the performance gap to the Oracle to just 3.8%, while reducing computational overhead by up to 29x.

Paperid: 624, https://arxiv.org/pdf/2509.07315.pdf

Abstract:
Large Language Models (LLMs) have exhibited great performance in autonomously calling various tools in external environments, leading to better problem solving and task automation capabilities. However, these external tools also amplify potential risks such as financial loss or privacy leakage with ambiguous or malicious user instructions. Compared to previous studies, which mainly assess the safety awareness of LLMs after obtaining the tool execution results (i.e., retrospective evaluation), this paper focuses on prospective ways to assess the safety of LLM tool utilization, aiming to avoid irreversible harm caused by directly executing tools. To this end, we propose SafeToolBench, the first benchmark to comprehensively assess tool utilization security in a prospective manner, covering malicious user instructions and diverse practical toolsets. Additionally, we propose a novel framework, SafeInstructTool, which aims to enhance LLMs' awareness of tool utilization security from three perspectives (i.e., \textit{User Instruction, Tool Itself, and Joint Instruction-Tool}), leading to nine detailed dimensions in total. We experiment with four LLMs using different methods, revealing that existing approaches fail to capture all risks in tool utilization. In contrast, our framework significantly enhances LLMs' self-awareness, enabling a more safe and trustworthy tool utilization.

Paperid: 625, https://arxiv.org/pdf/2509.05350.pdf

Abstract:
Membership Inference Attacks (MIAs) have emerged as a principled framework for auditing the privacy of synthetic data generated by tabular generative models, where many diverse methods have been proposed that each exploit different privacy leakage signals. However, in realistic threat scenarios, an adversary must choose a single method without a priori guarantee that it will be the empirically highest performing option. We study this challenge as a decision theoretic problem under uncertainty and conduct the largest synthetic data privacy benchmark to date. Here, we find that no MIA constitutes a strictly dominant strategy across a wide variety of model architectures and dataset domains under our threat model. Motivated by these findings, we propose ensemble MIAs and show that unsupervised ensembles built on individual attacks offer empirically more robust, regret-minimizing strategies than individual attacks.

Paperid: 626, https://arxiv.org/pdf/2509.03985.pdf

Abstract:
In deployment and application, large language models (LLMs) typically undergo safety alignment to prevent illegal and unethical outputs. However, the continuous advancement of jailbreak attack techniques, designed to bypass safety mechanisms with adversarial prompts, has placed increasing pressure on the security defenses of LLMs. Strengthening resistance to jailbreak attacks requires an in-depth understanding of the security mechanisms and vulnerabilities of LLMs. However, the vast number of parameters and complex structure of LLMs make analyzing security weaknesses from an internal perspective a challenging task. This paper presents NeuroBreak, a top-down jailbreak analysis system designed to analyze neuron-level safety mechanisms and mitigate vulnerabilities. We carefully design system requirements through collaboration with three experts in the field of AI security. The system provides a comprehensive analysis of various jailbreak attack methods. By incorporating layer-wise representation probing analysis, NeuroBreak offers a novel perspective on the model's decision-making process throughout its generation steps. Furthermore, the system supports the analysis of critical neurons from both semantic and functional perspectives, facilitating a deeper exploration of security mechanisms. We conduct quantitative evaluations and case studies to verify the effectiveness of our system, offering mechanistic insights for developing next-generation defense strategies against evolving jailbreak attacks.

Paperid: 627, https://arxiv.org/pdf/2509.03000.pdf

Abstract:
Open Radio Access Network (Open RAN) is reshaping mobile network architecture by promoting openness, disaggregation, and cross-vendor interoperability. However, this architectural flexibility introduces new security challenges, especially in deployments where multiple mobile network operators (MNOs) jointly operate shared components. Existing Zero Trust Architectures (ZTA) in O-RAN, as defined by governmental and industry standards, implicitly assume that authenticated components will comply with operational policies. However, this assumption creates a critical blind spot: misconfigured or compromised components can silently violate policies, misuse resources, or corrupt downstream processes (e.g., ML-based RIC xApps). To address this critical gap, we propose a monitoring framework for low-trust O-RAN environments that proactively verifies configuration state and control behavior against tenant-defined policies. Our system provides scalable, verifiable oversight to enhance transparency and trust in O-RAN operations. We implement and evaluate the framework using standardized O-RAN configurations, with total processing latency of approximately 200 ms, demonstrating its efficiency and practicality for timely policy enforcement and compliance auditing in multi-MNO deployments.

Paperid: 628, https://arxiv.org/pdf/2509.00634.pdf

Abstract:
Federated Learning (FL) has gained significant attention for its privacy-preserving capabilities, enabling distributed devices to collaboratively train a global model without sharing raw data. However, its distributed nature forces the central server to blindly trust the local training process and aggregate uncertain model updates, making it susceptible to Byzantine attacks from malicious participants, especially in mission-critical scenarios. Detecting such attacks is challenging due to the diverse knowledge across clients, where variations in model updates may stem from benign factors, such as non-IID data, rather than adversarial behavior. Existing data-driven defenses struggle to distinguish malicious updates from natural variations, leading to high false positive rates and poor filtering performance. To address this challenge, we propose Sentinel, a remote attestation (RA)-based scheme for FL systems that regains client-side transparency and mitigates Byzantine attacks from a system security perspective. Our system employs code instrumentation to track control-flow and monitor critical variables in the local training process. Additionally, we utilize a trusted training recorder within a Trusted Execution Environment (TEE) to generate an attestation report, which is cryptographically signed and securely transmitted to the server. Upon verification, the server ensures that legitimate client training processes remain free from program behavior violation or data manipulation, allowing only trusted model updates to be aggregated into the global model. Experimental results on IoT devices demonstrate that Sentinel ensures the trustworthiness of the local training integrity with low runtime and memory overhead.

Paperid: 629, https://arxiv.org/pdf/2508.19456.pdf

Abstract:
Minimizing computational overhead in time-series classification, particularly in deep learning models, presents a significant challenge due to the high complexity of model architectures and the large volume of sequential data that must be processed in real time. This challenge is further compounded by adversarial attacks, emphasizing the need for resilient methods that ensure robust performance and efficient model selection. To address this challenge, we propose ReLATE+, a comprehensive framework that detects and classifies adversarial attacks, adaptively selects deep learning models based on dataset-level similarity, and thus substantially reduces retraining costs relative to conventional methods that do not leverage prior knowledge, while maintaining strong performance. ReLATE+ first checks whether the incoming data is adversarial and, if so, classifies the attack type, using this insight to identify a similar dataset from a repository and enable the reuse of the best-performing associated model. This approach ensures strong performance while reducing the need for retraining, and it generalizes well across different domains with varying data distributions and feature spaces. Experiments show that ReLATE+ reduces computational overhead by an average of 77.68%, enhancing adversarial resilience and streamlining robust model selection, all without sacrificing performance, within 2.02% of Oracle.

Paperid: 630, https://arxiv.org/pdf/2508.19450.pdf

Abstract:
The Internet of Things (IoT), with its high degree of interconnectivity and limited computational resources, is particularly vulnerable to a wide range of cyber threats. Intrusion detection systems (IDS) have been extensively studied to enhance IoT security, and machine learning-based IDS (ML-IDS) show considerable promise for detecting malicious activity. However, their effectiveness is often constrained by poor adaptability to emerging threats and the issue of catastrophic forgetting during continuous learning. To address these challenges, we propose CITADEL, a self-supervised continual learning framework designed to extract robust representations from benign data while preserving long-term knowledge through optimized memory consolidation mechanisms. CITADEL integrates a tabular-to-image transformation module, a memory-aware masked autoencoder for self-supervised representation learning, and a novelty detection component capable of identifying anomalies without dependence on labeled attack data. Our design enables the system to incrementally adapt to emerging behaviors while retaining its ability to detect previously observed threats. Experiments on multiple intrusion datasets demonstrate that CITADEL achieves up to a 72.9% improvement over the VAE-based lifelong anomaly detector (VLAD) in key detection and retention metrics, highlighting its effectiveness in dynamic IoT environments.

Paperid: 631, https://arxiv.org/pdf/2507.22447.pdf

Abstract:
With the rapid expansion of web-based applications and cloud services, malicious JavaScript code continues to pose significant threats to user privacy, system integrity, and enterprise security. But, detecting such threats remains challenging due to sophisticated code obfuscation techniques and JavaScript's inherent language characteristics, particularly its nested closure structures and syntactic flexibility. In this work, we propose DeCoda, a hybrid defense framework that combines large language model (LLM)-based deobfuscation with code graph learning: (1) We first construct a sophisticated prompt-learning pipeline with multi-stage refinement, where the LLM progressively reconstructs the original code structure from obfuscated inputs and then generates normalized Abstract Syntax Tree (AST) representations; (2) In JavaScript ASTs, dynamic typing scatters semantically similar nodes while deeply nested functions fracture scope capturing, introducing structural noise and semantic ambiguity. To address these challenges, we then propose to learn hierarchical code graph representations via a Cluster-wise Graph that synergistically integrates graph transformer network, node clustering, and node-to-cluster attention to simultaneously capture both local node-level semantics and global cluster-induced structural relationships from AST graph. Experimental results demonstrate that our method achieves F1-scores of 94.64% and 97.71% on two benchmark datasets, demonstrating absolute improvements of 10.74% and 13.85% over state-of-the-art baselines. In false-positive control evaluation at fixed FPR levels (0.0001, 0.001, 0.01), our approach delivers 4.82, 5.91, and 2.53 higher TPR respectively compared to the best-performing baseline. These results highlight the effectiveness of LLM-based deobfuscation and underscore the importance of modeling cluster-level relationships in detecting malicious code.

Paperid: 632, https://arxiv.org/pdf/2507.12084.pdf

Abstract:
Smart contracts play a pivotal role in blockchain ecosystems, and fuzzing remains an important approach to securing smart contracts. Even though mutation scheduling is a key factor influencing fuzzing effectiveness, existing fuzzers have primarily explored seed scheduling and generation, while mutation scheduling has been rarely addressed by prior work. In this work, we propose a Large Language Models (LLMs)-based Multi-feedback Smart Contract Fuzzing framework (LLAMA) that integrates LLMs, evolutionary mutation strategies, and hybrid testing techniques. Key components of the proposed LLAMA include: (i) a hierarchical prompting strategy that guides LLMs to generate semantically valid initial seeds, coupled with a lightweight pre-fuzzing phase to select high-potential inputs; (ii) a multi-feedback optimization mechanism that simultaneously improves seed generation, seed selection, and mutation scheduling by leveraging runtime coverage and dependency feedback; and (iii) an evolutionary fuzzing engine that dynamically adjusts mutation operator probabilities based on effectiveness, while incorporating symbolic execution to escape stagnation and uncover deeper vulnerabilities. Our experiments demonstrate that LLAMA outperforms state-of-the-art fuzzers in both coverage and vulnerability detection. Specifically, it achieves 91% instruction coverage and 90% branch coverage, while detecting 132 out of 148 known vulnerabilities across diverse categories. These results highlight LLAMA's effectiveness, adaptability, and practicality in real-world smart contract security testing scenarios.

Paperid: 633, https://arxiv.org/pdf/2507.11112.pdf

Abstract:
Recent studies have shown that Large Language Models (LLMs) are vulnerable to data poisoning attacks, where malicious training examples embed hidden behaviours triggered by specific input patterns. However, most existing works assume a phrase and focus on the attack's effectiveness, offering limited understanding of trigger mechanisms and how multiple triggers interact within the model. In this paper, we present a framework for studying poisoning in LLMs. We show that multiple distinct backdoor triggers can coexist within a single model without interfering with each other, enabling adversaries to embed several triggers concurrently. Using multiple triggers with high embedding similarity, we demonstrate that poisoned triggers can achieve robust activation even when tokens are substituted or separated by long token spans. Our findings expose a broader and more persistent vulnerability surface in LLMs. To mitigate this threat, we propose a post hoc recovery method that selectively retrains specific model components based on a layer-wise weight difference analysis. Our method effectively removes the trigger behaviour with minimal parameter updates, presenting a practical and efficient defence against multi-trigger poisoning.

Paperid: 634, https://arxiv.org/pdf/2507.08270.pdf

Abstract:
The emergence of autonomous Large Language Model (LLM) agents capable of tool usage has introduced new safety risks that go beyond traditional conversational misuse. These agents, empowered to execute external functions, are vulnerable to both user-initiated threats (e.g., adversarial prompts) and tool-initiated threats (e.g., malicious outputs from compromised tools). In this paper, we propose the first unified safety-alignment framework for tool-using agents, enabling models to handle both channels of threat via structured reasoning and sandboxed reinforcement learning. We introduce a tri-modal taxonomy, including benign, malicious, and sensitive for both user prompts and tool responses, and define a policy-driven decision model. Our framework employs a custom-designed sandbox environment that simulates real-world tool execution and allows fine-grained reward shaping. Through extensive evaluations on public and self-built benchmarks, including Agent SafetyBench, InjecAgent, and BFCL, we demonstrate that our safety-aligned agents significantly improve resistance to security threats while preserving strong utility on benign tasks. Our results show that safety and effectiveness can be jointly optimized, laying the groundwork for trustworthy deployment of autonomous LLM agents.

Paperid: 635, https://arxiv.org/pdf/2511.11912.pdf

Abstract:
Graph machine learning has advanced rapidly in tasks such as link prediction, anomaly detection, and node classification. As models scale up, pretrained graph models have become valuable intellectual assets because they encode extensive computation and domain expertise. Building on these advances, Graph Foundation Models (GFMs) mark a major step forward by jointly pretraining graph and text encoders on massive and diverse data. This unifies structural and semantic understanding, enables zero-shot inference, and supports applications such as fraud detection and biomedical analysis. However, the high pretraining cost and broad cross-domain knowledge in GFMs also make them attractive targets for model extraction attacks (MEAs). Prior work has focused only on small graph neural networks trained on a single graph, leaving the security implications for large-scale and multimodal GFMs largely unexplored. This paper presents the first systematic study of MEAs against GFMs. We formalize a black-box threat model and define six practical attack scenarios covering domain-level and graph-specific extraction goals, architectural mismatch, limited query budgets, partial node access, and training data discrepancies. To instantiate these attacks, we introduce a lightweight extraction method that trains an attacker encoder using supervised regression of graph embeddings. Even without contrastive pretraining data, this method learns an encoder that stays aligned with the victim text encoder and preserves its zero-shot inference ability on unseen graphs. Experiments on seven datasets show that the attacker can approximate the victim model using only a tiny fraction of its original training cost, with almost no loss in accuracy. These findings reveal that GFMs greatly expand the MEA surface and highlight the need for deployment-aware security defenses in large-scale graph learning systems.

Paperid: 636, https://arxiv.org/pdf/2511.07480.pdf

Abstract:
With the widespread application of large language models (LLMs) in various fields, the security challenges they face have become increasingly prominent, especially the issue of jailbreak. These attacks induce the model to generate erroneous or uncontrolled outputs through crafted inputs, threatening the generality and security of the model. Although existing defense methods have shown some effectiveness, they often struggle to strike a balance between model generality and security. Excessive defense may limit the normal use of the model, while insufficient defense may lead to security vulnerabilities. In response to this problem, we propose a Knowledge Graph Defense Framework (KG-DF). Specifically, because of its structured knowledge representation and semantic association capabilities, Knowledge Graph(KG) can be searched by associating input content with safe knowledge in the knowledge base, thus identifying potentially harmful intentions and providing safe reasoning paths. However, traditional KG methods encounter significant challenges in keyword extraction, particularly when confronted with diverse and evolving attack strategies. To address this issue, we introduce an extensible semantic parsing module, whose core task is to transform the input query into a set of structured and secure concept representations, thereby enhancing the relevance of the matching process. Experimental results show that our framework enhances defense performance against various jailbreak attack methods, while also improving the response quality of the LLM in general QA scenarios by incorporating domain-general knowledge.

Paperid: 637, https://arxiv.org/pdf/2511.03248.pdf

Abstract:
Recent advances in multi-modal Large Language Models (M-LLMs) have demonstrated a powerful ability to synthesize implicit information from disparate sources, including images and text. These resourceful data from social media also introduce a significant and underexplored privacy risk: the inference of sensitive personal attributes from seemingly daily media content. However, the lack of benchmarks and comprehensive evaluations of state-of-the-art M-LLM capabilities hinders the research of private attribute profiling on social media. Accordingly, we propose (1) PRISM, the first multi-modal, multi-dimensional and fine-grained synthesized dataset incorporating a comprehensive privacy landscape and dynamic user history; (2) an Efficient evaluation framework that measures the cross-modal privacy inference capabilities of advanced M-LLM. Specifically, PRISM is a large-scale synthetic benchmark designed to evaluate cross-modal privacy risks. Its key feature is 12 sensitive attribute labels across a diverse set of multi-modal profiles, which enables targeted privacy analysis. These profiles are generated via a sophisticated LLM agentic workflow, governed by a prior distribution to ensure they realistically mimic social media users. Additionally, we propose a Multi-Agent Inference Framework that leverages a pipeline of specialized LLMs to enhance evaluation capabilities. We evaluate the inference capabilities of six leading M-LLMs (Qwen, Gemini, GPT-4o, GLM, Doubao, and Grok) on PRISM. The comparison with human performance reveals that these MLLMs significantly outperform in accuracy and efficiency, highlighting the threat of potential privacy risks and the urgent need for robust defenses. Dataset available at https://huggingface.co/datasets/xaddh/multimodal-privacy

Paperid: 638, https://arxiv.org/pdf/2510.14470.pdf

Abstract:
The emergence of graph foundation models (GFMs), particularly those incorporating language models (LMs), has revolutionized graph learning and demonstrated remarkable performance on text-attributed graphs (TAGs). However, compared to traditional GNNs, these LM-empowered GFMs introduce unique security vulnerabilities during the unsecured prompt tuning phase that remain understudied in current research. Through empirical investigation, we reveal a significant performance degradation in traditional graph backdoor attacks when operating in attribute-inaccessible constrained TAG systems without explicit trigger node attribute optimization. To address this, we propose a novel dual-trigger backdoor attack framework that operates at both text-level and struct-level, enabling effective attacks without explicit optimization of trigger node text attributes through the strategic utilization of a pre-established text pool. Extensive experimental evaluations demonstrate that our attack maintains superior clean accuracy while achieving outstanding attack success rates, including scenarios with highly concealed single-trigger nodes. Our work highlights critical backdoor risks in web-deployed LM-empowered GFMs and contributes to the development of more robust supervision mechanisms for open-source platforms in the era of foundation models.

Paperid: 639, https://arxiv.org/pdf/2510.10281.pdf

Abstract:
The integration of Large Language Models (LLMs) into computer applications has introduced transformative capabilities but also significant security challenges. Existing safety alignments, which primarily focus on semantic interpretation, leave LLMs vulnerable to attacks that use non-standard data representations. This paper introduces ArtPerception, a novel black-box jailbreak framework that strategically leverages ASCII art to bypass the security measures of state-of-the-art (SOTA) LLMs. Unlike prior methods that rely on iterative, brute-force attacks, ArtPerception introduces a systematic, two-phase methodology. Phase 1 conducts a one-time, model-specific pre-test to empirically determine the optimal parameters for ASCII art recognition. Phase 2 leverages these insights to launch a highly efficient, one-shot malicious jailbreak attack. We propose a Modified Levenshtein Distance (MLD) metric for a more nuanced evaluation of an LLM's recognition capability. Through comprehensive experiments on four SOTA open-source LLMs, we demonstrate superior jailbreak performance. We further validate our framework's real-world relevance by showing its successful transferability to leading commercial models, including GPT-4o, Claude Sonnet 3.7, and DeepSeek-V3, and by conducting a rigorous effectiveness analysis against potential defenses such as LLaMA Guard and Azure's content filters. Our findings underscore that true LLM security requires defending against a multi-modal space of interpretations, even within text-only inputs, and highlight the effectiveness of strategic, reconnaissance-based attacks. Content Warning: This paper includes potentially harmful and offensive model outputs.

Paperid: 640, https://arxiv.org/pdf/2510.07109.pdf

Abstract:
Consumer electronics (CE) connected to the Internet of Things are susceptible to various attacks, including DDoS and web-based threats, which can compromise their functionality and facilitate remote hijacking. These vulnerabilities allow attackers to exploit CE for broader system attacks while enabling the propagation of malicious code across the CE network, resulting in device failures. Existing deep learning-based traffic anomaly detection systems exhibit high accuracy in traditional network environments but are often overly complex and reliant on static infrastructure, necessitating manual configuration and management. To address these limitations, we propose a scalable network model that integrates Software-defined Networking (SDN) and Compute First Networking (CFN) for next-generation CE networks. In this network model, we propose a Graph Neural Networks-based Network Anomaly Detection framework (GNN-NAD) that integrates SDN-based CE networks and enables the CFN architecture. GNN-NAD uniquely fuses a static, vulnerability-aware attack graph with dynamic traffic features, providing a holistic view of network security. The core of the framework is a GNN model (GSAGE) for graph representation learning, followed by a Random Forest (RF) classifier. This design (GSAGE+RF) demonstrates superior performance compared to existing feature selection methods. Experimental evaluations on CE environment reveal that GNN-NAD achieves superior metrics in accuracy, recall, precision, and F1 score, even with small sample sizes, exceeding the performance of current network anomaly detection methods. This work advances the security and efficiency of next-generation intelligent CE networks.

Paperid: 641, https://arxiv.org/pdf/2509.26404.pdf

Abstract:
Fingerprinting Large Language Models (LLMs) is essential for provenance verification and model attribution. Existing methods typically extract post-hoc signatures based on training dynamics, data exposure, or hyperparameters -- properties that only emerge after training begins. In contrast, we propose a stronger and more intrinsic notion of LLM fingerprinting: SeedPrints, a method that leverages random initialization biases as persistent, seed-dependent identifiers present even before training. We show that untrained models exhibit reproducible token selection biases conditioned solely on their parameters at initialization. These biases are stable and measurable throughout training, enabling our statistical detection method to recover a model's lineage with high confidence. Unlike prior techniques, unreliable before convergence and vulnerable to distribution shifts, SeedPrints remains effective across all training stages and robust under domain shifts or parameter modifications. Experiments on LLaMA-style and Qwen-style models show that SeedPrints achieves seed-level distinguishability and can provide birth-to-lifecycle identity verification akin to a biometric fingerprint. Evaluations on large-scale pretrained models and fingerprinting benchmarks further confirm its effectiveness under practical deployment scenarios. These results suggest that initialization itself imprints a unique and persistent identity on neural language models, forming a true ''Galtonian'' fingerprint.

Paperid: 642, https://arxiv.org/pdf/2509.24823.pdf

Abstract:
We propose a high-payload image watermarking method for textual embedding, where a semantic description of the image - which may also correspond to the input text prompt-, is embedded inside the image. In order to be able to robustly embed high payloads in large-scale images - such as those produced by modern AI generators - the proposed approach builds upon a traditional watermarking scheme that exploits orthogonal and turbo codes for improved robustness, and integrates frequency-domain embedding and perceptual masking techniques to enhance watermark imperceptibility. Experiments show that the proposed method is extremely robust against a wide variety of image processing, and the embedded text can be retrieved also after traditional and AI inpainting, permitting to unveil the semantic modification the image has undergone via image-text mismatch analysis.

Paperid: 643, https://arxiv.org/pdf/2509.15694.pdf

Abstract:
Online voting enables individuals to participate in elections remotely, offering greater efficiency and accessibility in both governmental and organizational settings. As this method gains popularity, ensuring the security of online voting systems becomes increasingly vital, as the systems supporting it must satisfy a demanding set of security requirements. Most research in this area emphasizes the design and verification of cryptographic protocols to protect voter integrity and system confidentiality. However, other vectors, such as network traffic analysis, remain relatively understudied, even though they may pose significant threats to voter privacy and the overall trustworthiness of the system. In this paper, we examine how adversaries can exploit metadata from encrypted network traffic to uncover sensitive information during online voting. Our analysis reveals that, even without accessing the encrypted content, it is possible to infer critical voter actions, such as whether a person votes, the exact moment a ballot is submitted, and whether the ballot is valid or spoiled. We test these attacks with both rule-based techniques and machine learning methods. We evaluate our attacks on two widely used online voting platforms, one proprietary and one partially open source, achieving classification accuracy as high as 99.5%. These results expose a significant privacy vulnerability that threatens key properties of secure elections, including voter secrecy and protection against coercion or vote-buying. We explore mitigations to our attacks, demonstrating that countermeasures such as payload padding and timestamp equalization can substantially limit their effectiveness.

Paperid: 644, https://arxiv.org/pdf/2509.11123.pdf

Abstract:
The Domain Name System (DNS), which converts domain names to their respective IP addresses, has advanced enhancements aimed at safeguarding DNS data and users' identity from attackers. The recent privacy-focused advancements have enabled the IETF to standardize several protocols. Nevertheless, these protocols tend to focus on either strengthening user privacy (like Oblivious DNS and Oblivious DNS-over-HTTPS) or reducing resolution latency (as demonstrated by DNS-over-QUIC). Achieving both within a single protocol remains a key challenge, which we address in this paper. Our proposed protocol -- 'Oblivious DNS-over-QUIC' (ODoQ) -- leverages the benefits of the QUIC protocol and incorporates an intermediary proxy server to protect the client's identity from exposure to the recursive resolver.

Paperid: 645, https://arxiv.org/pdf/2509.10920.pdf

Abstract:
The rapid proliferation of network applications has led to a significant increase in network attacks. According to the OWASP Top 10 Projects report released in 2021, injection attacks rank among the top three vulnerabilities in software projects. This growing threat landscape has increased the complexity and workload of software testing, necessitating advanced tools to support agile development cycles. This paper introduces a novel test prioritization method for SQL injection vulnerabilities to enhance testing efficiency. By leveraging previous test outcomes, our method adjusts defense strength vectors for subsequent tests, optimizing the testing workflow and tailoring defense mechanisms to specific software needs. This approach aims to improve the effectiveness and efficiency of vulnerability detection and mitigation through a flexible framework that incorporates dynamic adjustments and considers the temporal aspects of vulnerability exposure.

Paperid: 646, https://arxiv.org/pdf/2509.09592.pdf

Abstract:
To combat phishing attacks -- aimed at luring web users to divulge their sensitive information -- various phishing detection approaches have been proposed. As attackers focus on devising new tactics to bypass existing detection solutions, researchers have adapted by integrating machine learning and deep learning into phishing detection. Phishing dataset collection is vital to developing effective phishing detection approaches, which highly depend on the diversity of the gathered datasets. The lack of diversity in the dataset results in a biased model. Since phishing websites are often short-lived, collecting them is also a challenge. Consequently, very few phishing webpage dataset repositories exist to date. No single repository comprehensively consolidates all phishing elements corresponding to a phishing webpage, namely, URL, webpage source code, screenshot, and related webpage resources. This paper introduces a resource collection tool designed to gather various resources associated with a URL, such as CSS, Javascript, favicons, webpage images, and screenshots. Our tool leverages PhishTank as the primary source for obtaining active phishing URLs. Our tool fetches several additional webpage resources compared to PyWebCopy Python library, which provides webpage content for a given URL. Additionally, we share a sample dataset generated using our tool comprising 4,056 legitimate and 5,666 phishing URLs along with their associated resources. We also remark on the top correlated phishing features with their associated class label found in our dataset. Our tool offers a comprehensive resource set that can aid researchers in developing effective phishing detection approaches.

Paperid: 647, https://arxiv.org/pdf/2509.08424.pdf

Abstract:
In the realm of cybersecurity, phishing stands as a prevalent cyber attack, where attackers employ various tactics to deceive users into gathering their sensitive information, potentially leading to identity theft or financial gain. Researchers have been actively working on advancing phishing webpage detection approaches to detect new phishing URLs, bolstering user protection. Nonetheless, the ever-evolving strategies employed by attackers, aimed at circumventing existing detection approaches and tools, present an ongoing challenge to the research community. This survey presents a systematic categorization of diverse phishing webpage detection approaches, encompassing URL-based, webpage content-based, and visual techniques. Through a comprehensive review of these approaches and an in-depth analysis of existing literature, our study underscores current research gaps in phishing webpage detection. Furthermore, we suggest potential solutions to address some of these gaps, contributing valuable insights to the ongoing efforts to combat phishing attacks.

Paperid: 648, https://arxiv.org/pdf/2509.08375.pdf

Abstract:
Phishing attacks are increasingly prevalent, with adversaries creating deceptive webpages to steal sensitive information. Despite advancements in machine learning and deep learning for phishing detection, attackers constantly develop new tactics to bypass detection models. As a result, phishing webpages continue to reach users, particularly those unable to recognize phishing indicators. To improve detection accuracy, models must be trained on large datasets containing both phishing and legitimate webpages, including URLs, webpage content, screenshots, and logos. However, existing tools struggle to collect the required resources, especially given the short lifespan of phishing webpages, limiting dataset comprehensiveness. In response, we introduce Phish-Blitz, a tool that downloads phishing and legitimate webpages along with their associated resources, such as screenshots. Unlike existing tools, Phish-Blitz captures live webpage screenshots and updates resource file paths to maintain the original visual integrity of the webpage. We provide a dataset containing 8,809 legitimate and 5,000 phishing webpages, including all associated resources. Our dataset and tool are publicly available on GitHub, contributing to the research community by offering a more complete dataset for phishing detection.

Paperid: 649, https://arxiv.org/pdf/2509.08364.pdf

Abstract:
The Domain Name System (DNS) serves as the backbone of the Internet, primarily translating domain names to IP addresses. Over time, various enhancements have been introduced to strengthen the integrity of DNS. Among these, DNSSEC stands out as a leading cryptographic solution. It protects against attacks (such as DNS spoofing) by establishing a chain of trust throughout the DNS nameserver hierarchy. However, DNSSEC's effectiveness is compromised when there is a break in this chain, resulting in "Islands of Security", where domains can authenticate locally but not across hierarchical levels, leading to a loss of trust and validation between them. Leading approaches to addressing these issues were centralized, with a single authority maintaining some kind of bulletin board. This approach requires significantly more infrastructure and places excessive trust in the entity responsible for managing it properly. In this paper, we propose a decentralized approach to addressing gaps in DNSSEC's chain of trust, commonly referred to as "Islands of Security". We leverage TLS and IP-based certificates to enable end-to-end authentication between hierarchical levels, eliminating the need for uniform DNSSEC deployment across every level of the DNS hierarchy. This approach enhances the overall integrity of DNSSEC, while reducing dependence on registrars for maintaining signature records to verify the child nameserver's authenticity. By offering a more flexible and efficient solution, our method strengthens DNS security and streamlines deployment across diverse environments.

Paperid: 650, https://arxiv.org/pdf/2508.20504.pdf

Abstract:
The Internet of Energy (IoE) integrates IoT-driven digital communication with power grids to enable efficient and sustainable energy systems. Still, its interconnectivity exposes critical infrastructure to sophisticated cyber threats, including adversarial attacks designed to bypass traditional safeguards. Unlike general IoT risks, IoE threats have heightened public safety consequences, demanding resilient solutions. From the networking-level safeguard perspective, we propose a Graph Structure Learning (GSL)-based safeguards framework that jointly optimizes graph topology and node representations to resist adversarial network model manipulation inherently. Through a conceptual overview, architectural discussion, and case study on a security dataset, we demonstrate GSL's superior robustness over representative methods, offering practitioners a viable path to secure IoE networks against evolving attacks. This work highlights the potential of GSL to enhance the resilience and reliability of future IoE networks for practitioners managing critical infrastructure. Lastly, we identify key open challenges and propose future research directions in this novel research area.

Paperid: 651, https://arxiv.org/pdf/2508.12132.pdf

Abstract:
Quantized Neural Networks (QNNs) are increasingly deployed in edge and resource-constrained environments due to their efficiency in computation and memory usage. While shown to distort the gradient landscape and weaken conventional pixel-level attacks, it provides limited robustness against patch-based adversarial attacks-localized, high-saliency perturbations that remain surprisingly transferable across bit-widths. Existing defenses either overfit to fixed quantization settings or fail to address this cross-bit generalization vulnerability. We introduce \textbf{TriQDef}, a tri-level quantization-aware defense framework designed to disrupt the transferability of patch-based adversarial attacks across QNNs. TriQDef consists of: (1) a Feature Disalignment Penalty (FDP) that enforces semantic inconsistency by penalizing perceptual similarity in intermediate representations; (2) a Gradient Perceptual Dissonance Penalty (GPDP) that explicitly misaligns input gradients across bit-widths by minimizing structural and directional agreement via Edge IoU and HOG Cosine metrics; and (3) a Joint Quantization-Aware Training Protocol that unifies these penalties within a shared-weight training scheme across multiple quantization levels. Extensive experiments on CIFAR-10 and ImageNet demonstrate that TriQDef reduces Attack Success Rates (ASR) by over 40\% on unseen patch and quantization combinations, while preserving high clean accuracy. Our findings underscore the importance of disrupting both semantic and perceptual gradient alignment to mitigate patch transferability in QNNs.

Paperid: 652, https://arxiv.org/pdf/2507.10845.pdf

Abstract:
Collaborative fuzzing combines multiple individual fuzzers and dynamically chooses appropriate combinations for different programs. Unlike individual fuzzers that rely on specific assumptions, collaborative fuzzing relaxes assumptions on target programs, providing robust performance across various programs. However, existing collaborative fuzzing frameworks face challenges including additional computational resource requirements and inefficient resource allocation among fuzzers. To tackle these challenges, we present BANDFUZZ, an ML-powered collaborative fuzzing framework that outperforms individual fuzzers without requiring additional computational resources. The key contribution of BANDFUZZ lies in its novel resource allocation algorithm driven by our proposed multi-armed bandits model. Different from greedy methods in existing frameworks, BANDFUZZ models the long-term impact of individual fuzzers, enabling discovery of globally optimal collaborative strategies. We propose a novel fuzzer evaluation method that assesses not only code coverage but also the fuzzer's capability of solving difficult branches. Finally, we integrate a real-time seed synchronization mechanism and implementation-wise optimizations to improve fuzzing efficiency and stability. Through extensive experiments on Fuzzbench and Fuzzer Test Suite, we show that BANDFUZZ outperforms state-of-the-art collaborative fuzzing framework autofz and widely used individual fuzzers. We verify BANDFUZZ's key designs through comprehensive ablation study. Notably, we demonstrate BANDFUZZ's effectiveness in real-world bug detection by analyzing results of a worldwide fuzzing competition, where BANDFUZZ won first place.

Paperid: 653, https://arxiv.org/pdf/2512.13666.pdf

Abstract:
The security and decentralization of Proof-of-Work (PoW) have been well-tested in existing blockchain systems. However, its tremendous energy waste has raised concerns about sustainability. Proof-of-Useful-Work (PoUW) aims to redirect the meaningless computation to meaningful tasks such as solving machine learning (ML) problems, giving rise to the branch of Proof-of-Learning (PoL). While previous studies have proposed various PoLs, they all, to some degree, suffer from security, decentralization, or efficiency issues. In this paper, we propose a PoL framework that trains ML models efficiently while maintaining blockchain security in a fully distributed manner. We name the framework SEDULity, which stands for a Secure, Efficient, Distributed, and Useful Learning-based blockchain system. Specifically, we encode the template block into the training process and design a useful function that is difficult to solve but relatively easy to verify, as a substitute for the PoW puzzle. We show that our framework is distributed, secure, and efficiently trains ML models. We further demonstrate that the proposed PoL framework can be extended to other types of useful work and design an incentive mechanism to incentivize task verification. We show theoretically that a rational miner is incentivized to train fully honestly with well-designed system parameters. Finally, we present simulation results to demonstrate the performance of our framework and validate our analysis.

Paperid: 654, https://arxiv.org/pdf/2512.05745.pdf

Abstract:
Multimodal Large Language Models (MLLMs) are increasingly vulnerable to multimodal Indirect Prompt Injection (IPI) attacks, which embed malicious instructions in images, videos, or audio to hijack model behavior. Existing defenses, designed primarily for text-only LLMs, are unsuitable for countering these multimodal threats, as they are easily bypassed, modality-dependent, or generalize poorly. Inspired by activation steering researches, we hypothesize that a robust, general defense independent of modality can be achieved by steering the model's behavior in the representation space. Through extensive experiments, we discover that the instruction-following behavior of MLLMs is encoded in a subspace. Steering along directions within this subspace can enforce adherence to user instructions, forming the basis of a defense. However, we also found that a naive defense direction could be coupled with a utility-degrading direction, and excessive intervention strength harms model performance. To address this, we propose ARGUS, which searches for an optimal defense direction within the safety subspace that decouples from the utility degradation direction, further combining adaptive strength steering to achieve a better safety-utility trade-off. ARGUS also introduces lightweight injection detection stage to activate the defense on-demand, and a post-filtering stage to verify defense success. Experimental results show that ARGUS can achieve robust defense against multimodal IPI while maximally preserving the MLLM's utility.

Paperid: 655, https://arxiv.org/pdf/2511.11538.pdf

Abstract:
Bitcoin's (BTC) Difficulty Adjustment Algorithm (DAA) has been a source of vulnerability for incentive attacks such as selfish mining, block withholding and coin hopping strategies. In this paper, first, we rigorously study the short-term revenue change per hashpower of the adversarial and honest miners for these incentive attacks. To study the long-term effects, we introduce a new efficiency metric defined as the revenue/cost per hashpower per time for the attacker and the honest miners. Our results indicate that the short-term benefits of intermittent mining strategies are negligible compared to the original selfish mining attack, and in the long-term, selfish mining provides better efficiency. We further demonstrate that a coin hopping strategy between BTC and Bitcoin Cash (BCH) relying on BTC DAA benefits the loyal honest miners of BTC in the same way and to the same extent per unit of computational power as it does the hopper in the short-term. For the long-term, we establish a new boundary between the selfish mining and coin hopping attack, identifying the optimal efficient strategy for each parameter. For block withholding strategies, it turns out, the honest miners outside the pool profit from the attack, usually even more than the attacker both in the short-term and the long-term. Moreover, a power adjusting withholding attacker does not necessarily observe a profit lag in the short-term. It has been long thought that the profit lag of selfish mining is among the main reasons why such an attack has not been observed in practice. We show that such a barrier does not apply to power adjusting attacks and relatively small pools are at an immediate threat.

Paperid: 656, https://arxiv.org/pdf/2511.10913.pdf

Abstract:
Modern text-to-speech (TTS) systems, particularly those built on Large Audio-Language Models (LALMs), generate high-fidelity speech that faithfully reproduces input text and mimics specified speaker identities. While prior misuse studies have focused on speaker impersonation, this work explores a distinct content-centric threat: exploiting TTS systems to produce speech containing harmful content. Realizing such threats poses two core challenges: (1) LALM safety alignment frequently rejects harmful prompts, yet existing jailbreak attacks are ill-suited for TTS because these systems are designed to faithfully vocalize any input text, and (2) real-world deployment pipelines often employ input/output filters that block harmful text and audio. We present HARMGEN, a suite of five attacks organized into two families that address these challenges. The first family employs semantic obfuscation techniques (Concat, Shuffle) that conceal harmful content within text. The second leverages audio-modality exploits (Read, Spell, Phoneme) that inject harmful content through auxiliary audio channels while maintaining benign textual prompts. Through evaluation across five commercial LALMs-based TTS systems and three datasets spanning two languages, we demonstrate that our attacks substantially reduce refusal rates and increase the toxicity of generated speech. We further assess both reactive countermeasures deployed by audio-streaming platforms and proactive defenses implemented by TTS providers. Our analysis reveals critical vulnerabilities: deepfake detectors underperform on high-fidelity audio; reactive moderation can be circumvented by adversarial perturbations; while proactive moderation detects 57-93% of attacks. Our work highlights a previously underexplored content-centric misuse vector for TTS and underscore the need for robust cross-modal safeguards throughout training and deployment.

Paperid: 657, https://arxiv.org/pdf/2510.16581.pdf

Abstract:
Text-to-image (T2I) models, though exhibiting remarkable creativity in image generation, can be exploited to produce unsafe images. Existing safety measures, e.g., content moderation or model alignment, fail in the presence of white-box adversaries who know and can adjust model parameters, e.g., by fine-tuning. This paper presents a novel defensive framework, named Patronus, which equips T2I models with holistic protection to defend against white-box adversaries. Specifically, we design an internal moderator that decodes unsafe input features into zero vectors while ensuring the decoding performance of benign input features. Furthermore, we strengthen the model alignment with a carefully designed non-fine-tunable learning mechanism, ensuring the T2I model will not be compromised by malicious fine-tuning. We conduct extensive experiments to validate the intactness of the performance on safe content generation and the effectiveness of rejecting unsafe content generation. Results also confirm the resilience of Patronus against various fine-tuning attacks by white-box adversaries.

Paperid: 658, https://arxiv.org/pdf/2510.15994.pdf

Abstract:
The Model Context Protocol (MCP) standardizes how large language model (LLM) agents discover, describe, and call external tools. While MCP unlocks broad interoperability, it also enlarges the attack surface by making tools first-class, composable objects with natural-language metadata, and standardized I/O. We present MSB (MCP Security Benchmark), the first end-to-end evaluation suite that systematically measures how well LLM agents resist MCP-specific attacks throughout the full tool-use pipeline: task planning, tool invocation, and response handling. MSB contributes: (1) a taxonomy of 12 attacks including name-collision, preference manipulation, prompt injections embedded in tool descriptions, out-of-scope parameter requests, user-impersonating responses, false-error escalation, tool-transfer, retrieval injection, and mixed attacks; (2) an evaluation harness that executes attacks by running real tools (both benign and malicious) via MCP rather than simulation; and (3) a robustness metric that quantifies the trade-off between security and performance: Net Resilient Performance (NRP). We evaluate nine popular LLM agents across 10 domains and 400+ tools, producing 2,000 attack instances. Results reveal the effectiveness of attacks against each stage of MCP. Models with stronger performance are more vulnerable to attacks due to their outstanding tool calling and instruction following capabilities. MSB provides a practical baseline for researchers and practitioners to study, compare, and harden MCP agents.

Paperid: 659, https://arxiv.org/pdf/2510.13842.pdf

Abstract:
Knowledge poisoning poses a critical threat to Retrieval-Augmented Generation (RAG) systems by injecting adversarial content into knowledge bases, tricking Large Language Models (LLMs) into producing attacker-controlled outputs grounded in manipulated context. Prior work highlights LLMs' susceptibility to misleading or malicious retrieved content. However, real-world fact-checking scenarios are more challenging, as credible evidence typically dominates the retrieval pool. To investigate this problem, we extend knowledge poisoning to the fact-checking setting, where retrieved context includes authentic supporting or refuting evidence. We propose \textbf{ADMIT} (\textbf{AD}versarial \textbf{M}ulti-\textbf{I}njection \textbf{T}echnique), a few-shot, semantically aligned poisoning attack that flips fact-checking decisions and induces deceptive justifications, all without access to the target LLMs, retrievers, or token-level control. Extensive experiments show that ADMIT transfers effectively across 4 retrievers, 11 LLMs, and 4 cross-domain benchmarks, achieving an average attack success rate (ASR) of 86\% at an extremely low poisoning rate of $0.93 \times 10^{-6}$, and remaining robust even in the presence of strong counter-evidence. Compared with prior state-of-the-art attacks, ADMIT improves ASR by 11.2\% across all settings, exposing significant vulnerabilities in real-world RAG-based fact-checking systems.

Paperid: 660, https://arxiv.org/pdf/2509.24257.pdf

Abstract:
Decentralized inference is an appealing paradigm for serving large language models (LLMs), offering strong security, high efficiency, and lower operating costs. Yet the permissionless setting admits no a priori trust in participating nodes, making output verifiability a prerequisite for secure deployment. We present VeriLLM, a publicly verifiable protocol for decentralized LLM inference that (i) achieves security under a one-honest-verifier assumption, (ii) attains near-negligible verification cost (about 1% of the underlying inference) via a lightweight verification algorithm designed explicitly for LLMs, and (iii) enforces honest checking through a peer-prediction mechanism that mitigates lazy verification in naive voting. We further introduce an isomorphic inference-verification network that multiplexes both roles on the same set of GPU workers. This architecture (i) increases GPU utilization and thereby improves end-to-end throughput for both inference and verification, (ii) expands the effective pool of available validators, strengthening robustness and security, and (iii) enforces task indistinguishability at the worker boundary to prevent job-type-conditioned behavior. Finally, we provide a formal game-theoretic analysis and prove that, under our incentives, honest inference and verification constitute a Nash equilibrium, ensuring incentive compatibility against rational adversaries. To our knowledge, this is the first decentralized inference verification protocol with an end-to-end game-theoretic security proof.

Paperid: 661, https://arxiv.org/pdf/2509.16915.pdf

Abstract:
In this paper, we study differentially private mechanisms for functions whose outputs lie in a Euclidean Jordan algebra. Euclidean Jordan algebras capture many important mathematical structures and form the foundation of linear programming, second-order cone programming, and semidefinite programming. Our main contribution is a generic Gaussian mechanism for such functions, with sensitivity measured in $\ell_2$, $\ell_1$, and $\ell_\infty$ norms. Notably, this framework includes the important case where the function outputs are symmetric matrices, and sensitivity is measured in the Frobenius, nuclear, or spectral norm. We further derive private algorithms for solving symmetric cone programs under various settings, using a combination of the multiplicative weights update method and our generic Gaussian mechanism. As an application, we present differentially private algorithms for semidefinite programming, resolving a major open question posed by [Hsu, Roth, Roughgarden, and Ullman, ICALP 2014].

Paperid: 662, https://arxiv.org/pdf/2509.14278.pdf

Abstract:
Large Language Models (LLMs) have achieved remarkable progress in natural language understanding, reasoning, and autonomous decision-making. However, these advancements have also come with significant privacy concerns. While significant research has focused on mitigating the data privacy risks of LLMs during various stages of model training, less attention has been paid to new threats emerging from their deployment. The integration of LLMs into widely used applications and the weaponization of their autonomous abilities have created new privacy vulnerabilities. These vulnerabilities provide opportunities for both inadvertent data leakage and malicious exfiltration from LLM-powered systems. Additionally, adversaries can exploit these systems to launch sophisticated, large-scale privacy attacks, threatening not only individual privacy but also financial security and societal trust. In this paper, we systematically examine these emerging privacy risks of LLMs. We also discuss potential mitigation strategies and call for the research community to broaden its focus beyond data privacy risks, developing new defenses to address the evolving threats posed by increasingly powerful LLMs and LLM-powered systems.

Paperid: 663, https://arxiv.org/pdf/2509.05362.pdf

Abstract:
Scams exploiting real-time social engineering -- such as phishing, impersonation, and phone fraud -- remain a persistent and evolving threat across digital platforms. Existing defenses are largely reactive, offering limited protection during active interactions. We propose a privacy-preserving, AI-in-the-loop framework that proactively detects and disrupts scam conversations in real time. The system combines instruction-tuned artificial intelligence with a safety-aware utility function that balances engagement with harm minimization, and employs federated learning to enable continual model updates without raw data sharing. Experimental evaluations show that the system produces fluent and engaging responses (perplexity as low as 22.3, engagement $\approx$0.80), while human studies confirm significant gains in realism, safety, and effectiveness over strong baselines. In federated settings, models trained with FedAvg sustain up to 30 rounds while preserving high engagement ($\approx$0.80), strong relevance ($\approx$0.74), and low PII leakage ($\leq$0.0085). Even with differential privacy, novelty and safety remain stable, indicating that robust privacy can be achieved without sacrificing performance. The evaluation of guard models (LlamaGuard, LlamaGuard2/3, MD-Judge) shows a straightforward pattern: stricter moderation settings reduce the chance of exposing personal information, but they also limit how much the model engages in conversation. In contrast, more relaxed settings allow longer and richer interactions, which improve scam detection, but at the cost of higher privacy risk. To our knowledge, this is the first framework to unify real-time scam-baiting, federated privacy preservation, and calibrated safety moderation into a proactive defense paradigm.

Paperid: 664, https://arxiv.org/pdf/2508.20517.pdf

Abstract:
Cross-chain bridges play a vital role in enabling blockchain interoperability. However, due to the inherent design flaws and the enormous value they hold, they have become prime targets for hacker attacks. Existing detection methods show progress yet remain limited, as they mainly address single-chain behaviors and fail to capture cross-chain semantics. To address this gap, we leverage heterogeneous graph attention networks, which are well-suited for modeling multi-typed entities and relations, to capture the complex execution semantics of cross-chain behaviors. We propose BridgeShield, a detection framework that jointly models the source chain, off-chain coordination, and destination chain within a unified heterogeneous graph representation. BridgeShield incorporates intra-meta-path attention to learn fine-grained dependencies within cross-chain paths and inter-meta-path attention to highlight discriminative cross-chain patterns, thereby enabling precise identification of attack behaviors. Extensive experiments on 51 real-world cross-chain attack events demonstrate that BridgeShield achieves an average F1-score of 92.58%, representing a 24.39% improvement over state-of-the-art baselines. These results validate the effectiveness of BridgeShield as a practical solution for securing cross-chain bridges and enhancing the resilience of multi-chain ecosystems.

Paperid: 665, https://arxiv.org/pdf/2508.20414.pdf

Abstract:
Artificial intelligence (AI) has demonstrated considerable potential in the realm of medical imaging. However, the development of high-performance AI models typically necessitates training on large-scale, centralized datasets. This approach is confronted with significant challenges due to strict patient privacy regulations and legal restrictions on data sharing and utilization. These limitations hinder the development of large-scale models in medical domains and impede continuous updates and training with new data. Federated Learning (FL), a privacy-preserving distributed training framework, offers a new solution by enabling collaborative model development across fragmented medical datasets. In this survey, we review FL's contributions at two stages of the full-stack medical analysis pipeline. First, in upstream tasks such as CT or MRI reconstruction, FL enables joint training of robust reconstruction networks on diverse, multi-institutional datasets, alleviating data scarcity while preserving confidentiality. Second, in downstream clinical tasks like tumor diagnosis and segmentation, FL supports continuous model updating by allowing local fine-tuning on new data without centralizing sensitive images. We comprehensively analyze FL implementations across the medical imaging pipeline, from physics-informed reconstruction networks to diagnostic AI systems, highlighting innovations that improve communication efficiency, align heterogeneous data, and ensure secure parameter aggregation. Meanwhile, this paper provides an outlook on future research directions, aiming to serve as a valuable reference for the field's development.

Paperid: 666, https://arxiv.org/pdf/2508.18976.pdf

Abstract:
Differentially private text sanitization refers to the process of privatizing texts under the framework of Differential Privacy (DP), providing provable privacy guarantees while also empirically defending against adversaries seeking to harm privacy. Despite their simplicity, DP text sanitization methods operating at the word level exhibit a number of shortcomings, among them the tendency to leave contextual clues from the original texts due to randomization during sanitization $\unicode{x2013}$ this we refer to as $\textit{contextual vulnerability}$. Given the powerful contextual understanding and inference capabilities of Large Language Models (LLMs), we explore to what extent LLMs can be leveraged to exploit the contextual vulnerability of DP-sanitized texts. We expand on previous work not only in the use of advanced LLMs, but also in testing a broader range of sanitization mechanisms at various privacy levels. Our experiments uncover a double-edged sword effect of LLM-based data reconstruction attacks on privacy and utility: while LLMs can indeed infer original semantics and sometimes degrade empirical privacy protections, they can also be used for good, to improve the quality and privacy of DP-sanitized texts. Based on our findings, we propose recommendations for using LLM data reconstruction as a post-processing step, serving to increase privacy protection by thinking adversarially.

Paperid: 667, https://arxiv.org/pdf/2508.13214.pdf

Abstract:
Large Language Models (LLMs) have recently demonstrated strong emergent abilities in complex reasoning and zero-shot generalization, showing unprecedented potential for LLM-as-a-judge applications in education, peer review, and data quality evaluation. However, their robustness under prompt injection attacks, where malicious instructions are embedded into the content to manipulate outputs, remains a significant concern. In this work, we explore a frustratingly simple yet effective attack setting to test whether LLMs can be easily misled. Specifically, we evaluate LLMs on basic arithmetic questions (e.g., "What is 3 + 2?") presented as either multiple-choice or true-false judgment problems within PDF files, where hidden prompts are injected into the file. Our results reveal that LLMs are indeed vulnerable to such hidden prompt injection attacks, even in these trivial scenarios, highlighting serious robustness risks for LLM-as-a-judge applications.

Paperid: 668, https://arxiv.org/pdf/2508.06489.pdf

Abstract:
Parallel Proof-of-Work (PoW) protocols are suggested to improve the safety guarantees, transaction throughput and confirmation latencies of Nakamoto consensus. In this work, we first consider the existing parallel PoW protocols and develop hard-coded incentive attack structures. Our theoretical results and simulations show that the existing parallel PoW protocols are more vulnerable to incentive attacks than the Nakamoto consensus, e.g., attacks have smaller profitability threshold and they result in higher relative rewards. Next, we introduce a voting-based semi-parallel PoW protocol that outperforms both Nakamoto consensus and the existing parallel PoW protocols from most practical perspectives such as communication overheads, throughput, transaction conflicts, incentive compatibility of the protocol as well as a fair distribution of transaction fees among the voters and the leaders. We use state-of-the-art analysis to evaluate the consistency of the protocol and consider Markov decision process (MDP) models to substantiate our claims about the resilience of our protocol against incentive attacks.

Paperid: 669, https://arxiv.org/pdf/2507.21412.pdf

Abstract:
A Membership Inference Attack (MIA) assesses how much a trained machine learning model reveals about its training data by determining whether specific query instances were included in the dataset. We classify existing MIAs into adaptive or non-adaptive, depending on whether the adversary is allowed to train shadow models on membership queries. In the adaptive setting, where the adversary can train shadow models after accessing query instances, we highlight the importance of exploiting membership dependencies between instances and propose an attack-agnostic framework called Cascading Membership Inference Attack (CMIA), which incorporates membership dependencies via conditional shadow training to boost membership inference performance. In the non-adaptive setting, where the adversary is restricted to training shadow models before obtaining membership queries, we introduce Proxy Membership Inference Attack (PMIA). PMIA employs a proxy selection strategy that identifies samples with similar behaviors to the query instance and uses their behaviors in shadow models to perform a membership posterior odds test for membership inference. We provide theoretical analyses for both attacks, and extensive experimental results demonstrate that CMIA and PMIA substantially outperform existing MIAs in both settings, particularly in the low false-positive regime, which is crucial for evaluating privacy risks.

Paperid: 670, https://arxiv.org/pdf/2512.12174.pdf

Abstract:
EIP-7702 introduces a delegation-based authorization mechanism that allows an externally owned account (EOA) to authenticate a single authorization tuple, after which all subsequent calls are routed to arbitrary delegate code. We show that this design enables a qualitatively new class of phishing attacks: instead of deceiving users into signing individual transactions, an attacker can induce a victim to sign a single authorization tuple that grants unconditional and persistent execution control over the account. Through controlled experiments, we identify three reliable trigger pathways: user-driven, attacker-driven, and protocol-triggered. Each can lead to full account takeover and complete asset drainage. We further propose two extended attack surfaces. First, ERC-4337's EntryPoint pipeline enables remote and repeated activation of the delegated code without further victim involvement. Second, the chain-agnostic authorization mode permits replay-like compromises across independent networks. We also present the first empirical measurement of EIP-7702 usage across major EVM chains. Analyzing over 150k authorization and execution events involving 26k addresses and hundreds of delegator contracts, we assess the protocol's real-world footprint. Our findings show that EIP-7702 authorizations are highly centralized, dominated by a small number of contract families linked to criminal activity and repeatedly reused across incidents. Corresponding loss data reveals substantial theft of ETH, ERC-20 tokens, and NFTs. These results provide practical evidence that the attack surface we identify is not merely theoretical, but is already being exploited at scale. We conclude by proposing protocol-level defenses to mitigate the delegation-based phishing vector introduced by EIP-7702.

Paperid: 671, https://arxiv.org/pdf/2512.09485.pdf

Abstract:
Zero-Touch Networks (ZTNs) represent a transformative paradigm toward fully automated and intelligent network management, providing the scalability and adaptability required for the complexity of sixth-generation (6G) networks. However, the distributed architecture, high openness, and deep heterogeneity of 6G networks expand the attack surface and pose unprecedented security challenges. To address this, security automation aims to enable intelligent security management across dynamic and complex environments, serving as a key capability for securing 6G ZTNs. Despite its promise, implementing security automation in 6G ZTNs presents two primary challenges: 1) automating the lifecycle from security strategy generation to validation and update under real-world, parallel, and adversarial conditions, and 2) adapting security strategies to evolving threats and dynamic environments. This motivates us to propose SecLoop and SA-GRPO. SecLoop constitutes the first fully automated framework that integrates large language models (LLMs) across the entire lifecycle of security strategy generation, orchestration, response, and feedback, enabling intelligent and adaptive defenses in dynamic network environments, thus tackling the first challenge. Furthermore, we propose SA-GRPO, a novel security-aware group relative policy optimization algorithm that iteratively refines security strategies by contrasting group feedback collected from parallel SecLoop executions, thereby addressing the second challenge. Extensive real-world experiments on five benchmarks, including 11 MITRE ATT&CK processes and over 20 types of attacks, demonstrate the superiority of the proposed SecLoop and SA-GRPO. We will release our platform to the community, facilitating the advancement of security automation towards next generation communications.

Paperid: 672, https://arxiv.org/pdf/2512.08104.pdf

Abstract:
As AI agents increasingly operate in real-world, multi-agent environments, ensuring reliable and context-aware privacy in agent communication is critical, especially to comply with evolving regulatory requirements. Traditional access controls are insufficient, as privacy risks often arise after access is granted; agents may use information in ways that compromise privacy, such as messaging humans, sharing context with other agents, making tool calls, persisting data, or generating derived private information. Existing approaches often treat privacy as a binary constraint, whether data is shareable or not, overlooking nuanced, role-specific, and computation-dependent privacy needs essential for regulatory compliance. Agents, including those based on large language models, are inherently probabilistic and heuristic. There is no formal guarantee of how an agent will behave for any query, making them ill-suited for operations critical to security. To address this, we introduce AgentCrypt, a four-tiered framework for fine-grained, encrypted agent communication that adds a protection layer atop any AI agent platform. AgentCrypt spans unrestricted data exchange (Level 1) to fully encrypted computation using techniques such as homomorphic encryption (Level 4). Crucially, it guarantees the privacy of tagged data is always maintained, prioritizing privacy above correctness. AgentCrypt ensures privacy across diverse interactions and enables computation on otherwise inaccessible data, overcoming barriers such as data silos. We implemented and tested it with Langgraph and Google ADK, demonstrating versatility across platforms. We also introduce a benchmark dataset simulating privacy-critical tasks at all privacy levels, enabling systematic evaluation and fostering the development of regulatable machine learning systems for secure agent communication and computation.

Paperid: 673, https://arxiv.org/pdf/2512.06906.pdf

Abstract:
Detecting the anomalies of web applications, important infrastructures for running modern companies and governments, is crucial for providing reliable web services. Many modern web applications operate on web APIs (e.g., RESTful, SOAP, and WebSockets), their exposure invites intended attacks or unintended illegal visits, causing abnormal system behaviors. However, such anomalies can share very similar logs with normal logs, missing crucial information (which could be in database) for log discrimination. Further, log instances can be also noisy, which can further mislead the state-of-the-art log learning solutions to learn spurious correlation, resulting superficial models and rules for anomaly detection. In this work, we propose MINES which infers explainable API invariants for anomaly detection from the schema level instead of detailed raw log instances, which can (1) significantly discriminate noise in logs to identify precise normalities and (2) detect abnormal behaviors beyond the instrumented logs. Technically, MINES (1) converts API signatures into table schema to enhance the original database shema; and (2) infers the potential database constraints on the enhanced database schema to capture the potential relationships between APIs and database tables. MINES uses LLM for extracting potential relationship based on two given table structures; and use normal log instances to reject and accept LLM-generated invariants. Finally, MINES translates the inferred constraints into invariants to generate Python code for verifying the runtime logs. We extensively evaluate MINES on web-tamper attacks on the benchmarks of TrainTicket, NiceFish, Gitea, Mastodon, and NextCloud against baselines such as LogRobust, LogFormer, and WebNorm. The results show that MINES achieves high recall for the anomalies while introducing almost zero false positives, indicating a new state-of-the-art.

Paperid: 674, https://arxiv.org/pdf/2512.03238.pdf

Abstract:
High quality data is needed to unlock the full potential of AI for end users. However finding new sources of such data is getting harder: most publicly-available human generated data will soon have been used. Additionally, publicly available data often is not representative of users of a particular system -- for example, a research speech dataset of contractors interacting with an AI assistant will likely be more homogeneous, well articulated and self-censored than real world commands that end users will issue. Therefore unlocking high-quality data grounded in real user interactions is of vital interest. However, the direct use of user data comes with significant privacy risks. Differential Privacy (DP) is a well established framework for reasoning about and limiting information leakage, and is a gold standard for protecting user privacy. The focus of this work, \emph{Differentially Private Synthetic data}, refers to synthetic data that preserves the overall trends of source data,, while providing strong privacy guarantees to individuals that contributed to the source dataset. DP synthetic data can unlock the value of datasets that have previously been inaccessible due to privacy concerns and can replace the use of sensitive datasets that previously have only had rudimentary protections like ad-hoc rule-based anonymization. In this paper we explore the full suite of techniques surrounding DP synthetic data, the types of privacy protections they offer and the state-of-the-art for various modalities (image, tabular, text and decentralized). We outline all the components needed in a system that generates DP synthetic data, from sensitive data handling and preparation, to tracking the use and empirical privacy testing. We hope that work will result in increased adoption of DP synthetic data, spur additional research and increase trust in DP synthetic data approaches.

Paperid: 675, https://arxiv.org/pdf/2511.18653.pdf

Abstract:
Fully Homomorphic Encryption (FHE), particularly the CKKS scheme, is a promising enabler for privacy-preserving MLaaS, but its practical deployment faces a prohibitive barrier: it heavily relies on domain expertise. Configuring CKKS involves a tightly coupled space of ring dimensions, modulus chains, and packing layouts. Without deep cryptographic knowledge to navigate these interactions, practitioners are restricted to compilers that rely on fixed heuristics. These "one-shot" tools often emit rigid configurations that are either severely over-provisioned in latency or fail to find a feasible solution entirely for deeper networks. We present FHE-Agent, an agentic framework that automates this expert reasoning process. By coupling a Large Language Model (LLM) controller with a deterministic tool suite, FHE-Agent decomposes the search into global parameter selection and layer-wise bottleneck repair. The agents operate within a multi-fidelity workflow, pruning invalid regimes using cheap static analysis and reserving expensive encrypted evaluations for the most promising candidates. We instantiate FHE-Agent on the Orion compiler and evaluate it on standard benchmarks (MLP, LeNet, LoLa) and deeper architectures (AlexNet). FHE-Agent consistently achieves better precision and lower latency than naïve search strategies. Crucially, it automatically discovers feasible, 128-bit secure configurations for complex models where baseline heuristics and one-shot prompts fail to produce a valid setup.

Paperid: 676, https://arxiv.org/pdf/2511.01180.pdf

Abstract:
Binary Function Similarity Detection (BFSD) is a foundational technique in software security, underpinning a wide range of applications including vulnerability detection, malware analysis. Recent advances in AI-based BFSD tools have led to significant performance improvements. However, existing evaluations of these tools suffer from three key limitations: a lack of in-depth analysis of performance-influencing factors, an absence of realistic application analysis, and reliance on small-scale or low-quality datasets. In this paper, we present the first large-scale empirical study of AI-based BFSD tools to address these gaps. We construct two high-quality and diverse datasets: BinAtlas, comprising 12,453 binaries and over 7 million functions for capability evaluation; and BinAres, containing 12,291 binaries and 54 real-world 1-day vulnerabilities for evaluating vulnerability detection performance in practical IoT firmware settings. Using these datasets, we evaluate nine representative BFSD tools, analyze the challenges and limitations of existing BFSD tools, and investigate the consistency among BFSD tools. We also propose an actionable strategy for combining BFSD tools to enhance overall performance (an improvement of 13.4%). Our study not only advances the practical adoption of BFSD tools but also provides valuable resources and insights to guide future research in scalable and automated binary similarity detection.

Paperid: 677, https://arxiv.org/pdf/2511.00447.pdf

Abstract:
Large language models (LLMs) have demonstrated impressive instruction-following capabilities. However, these capabilities also expose models to prompt injection attacks, where maliciously crafted inputs overwrite or distract from the intended instructions. A core vulnerability lies in the model's lack of semantic role understanding: it cannot distinguish directive intent from descriptive content, leading it to execute instruction-like phrases embedded in data. We propose DRIP, a training-time defense grounded in a semantic modeling perspective, which enforces robust separation between instruction and data semantics without sacrificing utility. DRIP introduces two lightweight yet complementary mechanisms: (1) a token-wise de-instruction shift that performs semantic disentanglement, weakening directive semantics in data tokens while preserving content meaning; and (2) a residual fusion pathway that provides a persistent semantic anchor, reinforcing the influence of the true top-level instruction during generation. Experimental results on LLaMA-8B and Mistral-7B across three prompt injection benchmarks (SEP, AlpacaFarm, and InjecAgent) demonstrate that DRIP outperforms state-of-the-art defenses, including StruQ, SecAlign, ISE, and PFT, improving role separation by 49%, and reducing attack success rate by 66% for adaptive attacks. Meanwhile, DRIP's utility is on par with the undefended model across AlpacaEval, IFEval, and MT-Bench. Our findings underscore the power of lightweight representation edits and role-aware supervision in securing LLMs against adaptive prompt injection.

Paperid: 678, https://arxiv.org/pdf/2510.27629.pdf

Abstract:
Open-weight bio-foundation models present a dual-use dilemma. While holding great promise for accelerating scientific research and drug development, they could also enable bad actors to develop more deadly bioweapons. To mitigate the risk posed by these models, current approaches focus on filtering biohazardous data during pre-training. However, the effectiveness of such an approach remains unclear, particularly against determined actors who might fine-tune these models for malicious use. To address this gap, we propose BioRiskEval, a framework to evaluate the robustness of procedures that are intended to reduce the dual-use capabilities of bio-foundation models. BioRiskEval assesses models' virus understanding through three lenses, including sequence modeling, mutational effects prediction, and virulence prediction. Our results show that current filtering practices may not be particularly effective: Excluded knowledge can be rapidly recovered in some cases via fine-tuning, and exhibits broader generalizability in sequence modeling. Furthermore, dual-use signals may already reside in the pretrained representations, and can be elicited via simple linear probing. These findings highlight the challenges of data filtering as a standalone procedure, underscoring the need for further research into robust safety and security strategies for open-weight bio-foundation models.

Paperid: 679, https://arxiv.org/pdf/2510.22062.pdf

Abstract:
Sparse variable selection improves interpretability and generalization in high-dimensional learning by selecting a small subset of informative features. Recent advances in Mixed Integer Programming (MIP) have enabled solving large-scale non-private sparse regression - known as Best Subset Selection (BSS) - with millions of variables in minutes. However, extending these algorithmic advances to the setting of Differential Privacy (DP) has remained largely unexplored. In this paper, we introduce two new pure differentially private estimators for sparse variable selection, levering modern MIP techniques. Our framework is general and applies broadly to problems like sparse regression or classification, and we provide theoretical support recovery guarantees in the case of BSS. Inspired by the exponential mechanism, we develop structured sampling procedures that efficiently explore the non-convex objective landscape, avoiding the exhaustive combinatorial search in the exponential mechanism. We complement our theoretical findings with extensive numerical experiments, using both least squares and hinge loss for our objective function, and demonstrate that our methods achieve state-of-the-art empirical support recovery, outperforming competing algorithms in settings with up to $p=10^4$.

Paperid: 680, https://arxiv.org/pdf/2510.12200.pdf

Abstract:
Web applications are prime targets for cyberattacks as gateways to critical services and sensitive data. Traditional penetration testing is costly and expertise-intensive, making it difficult to scale with the growing web ecosystem. While language model agents show promise in cybersecurity, modern web applications demand visual understanding, dynamic content handling, and multi-step interactions that only computer-use agents (CUAs) can perform. Yet, their ability to discover and exploit vulnerabilities through graphical interfaces remains largely unexplored. We present HackWorld, the first framework for systematically evaluating CUAs' capabilities to exploit web application vulnerabilities via visual interaction. Unlike sanitized benchmarks, HackWorld includes 36 real-world applications across 11 frameworks and 7 languages, featuring realistic flaws such as injection vulnerabilities, authentication bypasses, and unsafe input handling. Using a Capture-the-Flag (CTF) setup, it tests CUAs' capacity to identify and exploit these weaknesses while navigating complex web interfaces. Evaluation of state-of-the-art CUAs reveals concerning trends: exploitation rates below 12% and low cybersecurity awareness. CUAs often fail at multi-step attack planning and misuse security tools. These results expose the current limitations of CUAs in web security contexts and highlight opportunities for developing more security-aware agents capable of effective vulnerability detection and exploitation.

Paperid: 681, https://arxiv.org/pdf/2510.07809.pdf

Abstract:
Large vision-language models (LVLMs) enable autonomous mobile agents to operate smartphone user interfaces, yet vulnerabilities to UI-level attacks remain critically understudied. Existing research often depends on conspicuous UI overlays, elevated permissions, or impractical threat models, limiting stealth and real-world applicability. In this paper, we present a practical and stealthy one-shot jailbreak attack that leverages in-app prompt injections: malicious applications embed short prompts in UI text that remain inert during human interaction but are revealed when an agent drives the UI via ADB (Android Debug Bridge). Our framework comprises three crucial components: (1) low-privilege perception-chain targeting, which injects payloads into malicious apps as the agent's visual inputs; (2) stealthy user-invisible activation, a touch-based trigger that discriminates agent from human touches using physical touch attributes and exposes the payload only during agent operation; and (3) one-shot prompt efficacy, a heuristic-guided, character-level iterative-deepening search algorithm (HG-IDA*) that performs one-shot, keyword-level detoxification to evade on-device safety filters. We evaluate across multiple LVLM backends, including closed-source services and representative open-source models within three Android applications, and we observe high planning and execution hijack rates in single-shot scenarios (e.g., GPT-4o: 82.5% planning / 75.0% execution). These findings expose a fundamental security vulnerability in current mobile agents with immediate implications for autonomous smartphone operation.

Paperid: 682, https://arxiv.org/pdf/2510.07697.pdf

Abstract:
With the rise of advanced reasoning capabilities, large language models (LLMs) are receiving increasing attention. However, although reasoning improves LLMs' performance on downstream tasks, it also introduces new security risks, as adversaries can exploit these capabilities to conduct backdoor attacks. Existing surveys on backdoor attacks and reasoning security offer comprehensive overviews but lack in-depth analysis of backdoor attacks and defenses targeting LLMs' reasoning abilities. In this paper, we take the first step toward providing a comprehensive review of reasoning-based backdoor attacks in LLMs by analyzing their underlying mechanisms, methodological frameworks, and unresolved challenges. Specifically, we introduce a new taxonomy that offers a unified perspective for summarizing existing approaches, categorizing reasoning-based backdoor attacks into associative, passive, and active. We also present defense strategies against such attacks and discuss current challenges alongside potential directions for future research. This work offers a novel perspective, paving the way for further exploration of secure and trustworthy LLM communities.

Paperid: 683, https://arxiv.org/pdf/2510.04465.pdf

Abstract:
Large Language Model (LLM) agents require personal information for personalization in order to better act on users' behalf in daily tasks, but this raises privacy concerns and a personalization-privacy dilemma. Agent's autonomy introduces both risks and opportunities, yet its effects remain unclear. To better understand this, we conducted a 3$\times$3 between-subjects experiment ($N=450$) to study how agent's autonomy level and personalization influence users' privacy concerns, trust and willingness to use, as well as the underlying psychological processes. We find that personalization without considering users' privacy preferences increases privacy concerns and decreases trust and willingness to use. Autonomy moderates these effects: Intermediate autonomy flattens the impact of personalization compared to No- and Full autonomy conditions. Our results suggest that rather than aiming for perfect model alignment in output generation, balancing autonomy of agent's action and user control offers a promising path to mitigate the personalization-privacy dilemma.

Paperid: 684, https://arxiv.org/pdf/2510.02158.pdf

Abstract:
Sound Event Detection (SED) systems are increasingly deployed in safety-critical applications such as industrial monitoring and audio surveillance. However, their robustness against adversarial attacks has not been well explored. Existing audio adversarial attacks targeting SED systems, which incorporate both detection and localization capabilities, often lack effectiveness due to SED's strong contextual dependencies or lack precision by focusing solely on misclassifying the target region as the target event, inadvertently affecting non-target regions. To address these challenges, we propose the Mirage and Mute Attack (M2A) framework, which is designed for targeted adversarial attacks on polyphonic SED systems. In our optimization process, we impose specific constraints on the non-target output, which we refer to as preservation loss, ensuring that our attack does not alter the model outputs for non-target region, thus achieving precise attacks. Furthermore, we introduce a novel evaluation metric Editing Precison (EP) that balances effectiveness and precision, enabling our method to simultaneously enhance both. Comprehensive experiments show that M2A achieves 94.56% and 99.11% EP on two state-of-the-art SED models, demonstrating that the framework is sufficiently effective while significantly enhancing attack precision.

Paperid: 685, https://arxiv.org/pdf/2509.26310.pdf

Abstract:
Understanding how fast physical systems can resemble Haar-random unitaries is a fundamental question in physics. Many experiments of interest in quantum gravity and many-body physics, including the butterfly effect in quantum information scrambling and the Hayden-Preskill thought experiment, involve queries to a random unitary $U$ alongside its inverse $U^\dagger$, conjugate $U^*$, and transpose $U^T$. However, conventional notions of approximate unitary designs and pseudorandom unitaries (PRUs) fail to capture these experiments. In this work, we introduce and construct strong unitary designs and strong PRUs that remain robust under all such queries. Our constructions achieve the optimal circuit depth of $O(\log n)$ for systems of $n$ qubits. We further show that strong unitary designs can form in circuit depth $O(\log^2 n)$ in circuits composed of independent two-qubit Haar-random gates, and that strong PRUs can form in circuit depth $\text{poly}(\log n)$ in circuits with no ancilla qubits. Our results provide an operational proof of the fast scrambling conjecture from black hole physics: every observable feature of the fastest scrambling quantum systems reproduces Haar-random behavior at logarithmic times.

Paperid: 686, https://arxiv.org/pdf/2509.22568.pdf

Abstract:
During large-scale crises disrupting cellular and Internet infrastructure, civilians lack reliable methods for communication, aid coordination, and access to trustworthy information. This paper presents a unified emergency communication system integrating a low-power, long-range network with a crisis-oriented smartphone application, enabling decentralized and off-grid civilian communication. Unlike previous solutions separating physical layer resilience from user layer usability, our design merges these aspects into a cohesive crisis-tailored framework. The system is evaluated in two dimensions: communication performance and application functionality. Field experiments in urban ZÃ¼rich demonstrate that the 868 MHz band, using the LongFast configuration, achieves a communication range of up to 1.2 km with 92% Packet Delivery Ratio, validating network robustness under real-world infrastructure degraded conditions. In parallel, a purpose-built mobile application featuring peer-to-peer messaging, identity verification, and community moderation was evaluated through a requirements-based analysis.

Paperid: 687, https://arxiv.org/pdf/2509.21712.pdf

Abstract:
Aligning AI systems with human privacy preferences requires understanding individuals' nuanced disclosure behaviors beyond general norms. Yet eliciting such boundaries remains challenging due to the context-dependent nature of privacy decisions and the complex trade-offs involved. We present an AI-powered elicitation approach that probes individuals' privacy boundaries through a discriminative task. We conducted a between-subjects study that systematically varied communication roles and delegation conditions, resulting in 1,681 boundary specifications from 169 participants for 61 scenarios. We examined how these contextual factors and individual differences influence the boundary specification. Quantitative results show that communication roles influence individuals' acceptance of detailed and identifiable disclosure, AI delegation and individuals' need for privacy heighten sensitivity to disclosed identifiers, and AI delegation results in less consensus across individuals. Our findings highlight the importance of situating privacy preference elicitation within real-world data flows. We advocate using nuanced privacy boundaries as an alignment goal for future AI systems.

Paperid: 688, https://arxiv.org/pdf/2509.11683.pdf

Abstract:
Cyber attacks are rapidly increasing with the advancement of technology and there is no protection for our information. To prevent future cyberattacks it is critical to promptly recognize cyberattacks and establish strong defense mechanisms against them. To respond to cybersecurity threats immediately, it is essential to examine the attackers skills, knowledge, and behaviors with the goal of evaluating their impact on the system and comprehending the traits associated with these attacks. Creating a profile of cyber threat actors based on their traits or patterns of behavior can help to create effective defenses against cyberattacks in advance. In the current literature, multiple supervised machine learning based approaches considered a smaller number of features for attacker profiling that are reported in textual cyber threat incident documents although these profiles have been developed based on the security experts own perception, we cannot rely on them. Supervised machine learning approaches strictly depend upon the structure data set. This usually leads to a two step process where we first have to establish a structured data set before we can analyze it and then employ it to construct defense mechanisms, which takes time. In this paper, an unsupervised efficient agglomerative hierarchal clustering technique is proposed for profiling cybercriminal groups based on their comprehensive contextual threat information in order to address the aforementioned issues. The main objective of this report is to identify the relationship between cyber threat actors based on their common features, aggregate them, and also profile cyber criminal groups.

Paperid: 689, https://arxiv.org/pdf/2509.11668.pdf

Abstract:
Innovative solutions to cyber security issues are shaped by the ever-changing landscape of cyber threats. Automating the mitigation of these threats can be achieved through a new methodology that addresses the domain of mitigation automation, which is often overlooked. This literature overview emphasizes the lack of scholarly work focusing specifically on automated cyber threat mitigation, particularly in addressing challenges beyond detection. The proposed methodology comprise of the development of an automatic cyber threat mitigation framework tailored for Distributed Denial-of-Service (DDoS) attacks. This framework adopts a multi-layer security approach, utilizing smart devices at the device layer, and leveraging fog network and cloud computing layers for deeper understanding and technological adaptability. Initially, firewall rule-based packet inspection is conducted on simulated attack traffic to filter out DoS packets, forwarding legitimate packets to the fog. The methodology emphasizes the integration of fog detection through statistical and behavioral analysis, specification-based detection, and deep packet inspection, resulting in a comprehensive cyber protection system. Furthermore, cloud-level inspection is performed to confirm and mitigate attacks using firewalls, enhancing strategic defense and increasing robustness against cyber threats. These enhancements enhance understanding of the research framework's practical implementation and assessment strategies, substantiating its importance in addressing current cyber security challenges and shaping future automation mitigation approaches.

Paperid: 690, https://arxiv.org/pdf/2509.11615.pdf

Abstract:
With the ever-changing landscape of cyber threats, identifying their origin has become paramount, surpassing the simple task of attack classification. Cyber threat attribution gives security analysts the insights they need to device effective threat mitigation strategies. Such strategies empower enterprises to proactively detect and defend against future cyber-attacks. However, existing approaches exhibit limitations in accurately identifying threat actors, leading to low precision and a significant occurrence of false positives. Machine learning offers the potential to automate certain aspects of cyber threat attribution. The distributed nature of information regarding cyber threat actors and their intricate attack methodologies has hindered substantial progress in this domain. Cybersecurity analysts deal with an ever-expanding collection of cyber threat intelligence documents. While these documents hold valuable insights, their sheer volume challenges efficient organization and retrieval of pertinent information. To assist the cybersecurity analyst activities, we propose a machine learning based approach featuring visually interactive analytics tool named the Cyber-Attack Pattern Explorer (CAPE), designed to facilitate efficient information discovery by employing interactive visualization and mining techniques. In the proposed system, a non-parametric mining technique is proposed to create a dataset for identifying the attack patterns within cyber threat intelligence documents. These attack patterns align semantically with commonly employed themes ensuring ease of interpretation. The extracted dataset is used for training of proposed machine learning algorithms that enables the attribution of cyber threats with respective to the actors.

Paperid: 691, https://arxiv.org/pdf/2509.08704.pdf

Abstract:
In this paper, we study the problem of privacy audit in one run and show that our method achieves tight audit results for various differentially private protocols. This includes obtaining tight results for auditing $(\varepsilon,Î´)$-DP algorithms where all previous work fails to achieve in any parameter setups. We first formulate a framework for privacy audit \textit{in one run} with refinement compared with previous work. Then, based on modeling privacy by the $f$-DP formulation, we study the implications of our framework to obtain a theoretically justified lower bound for privacy audit. In the experiment, we compare with previous work and show that our audit method outperforms the rest in auditing various differentially private algorithms. We also provide experiments that give contrasting conclusions to previous work on the parameter settings for privacy audits in one run.

Paperid: 692, https://arxiv.org/pdf/2509.01619.pdf

Abstract:
AI web agents use Internet resources at far greater speed, scale, and complexity -- changing how users and services interact. Deployed maliciously or erroneously, these agents could overload content providers. At the same time, web agents can bypass CAPTCHAs and other defenses by mimicking user behavior or flood authentication systems with fake accounts. Yet providers must protect their services and content from denial-of-service attacks and scraping by web agents. In this paper, we design a framework that imposes tunable costs on agents before providing access to resources; we call this Web Agent Throttling. We start by formalizing Throttling Gates as challenges issued to an agent that are asymmetric, scalable, robust, and compatible with any agent. Focusing on a common component -- the language model -- we require the agent to solve reasoning puzzles, thereby incurring excessive token-generation costs. However, we find that using existing puzzles, e.g., coding or math, as throttling gates fails to satisfy our properties. To address this, we introduce rebus-based Reasoning Gates, synthetic text puzzles that require multi-hop reasoning over world knowledge (thereby throttling an agent's model). We design a scalable generation and verification protocol for such reasoning gates. Our framework achieves computational asymmetry, i.e., the response-generation cost is 9.2x higher than the generation cost for SOTA models. We further deploy reasoning gates on a custom website and Model Context Protocol (MCP) servers and evaluate with real-world web agents. Finally, we discuss the limitations and environmental impact of real-world deployment of our framework.

Paperid: 693, https://arxiv.org/pdf/2508.21727.pdf

Abstract:
Watermarking diffusion-generated images is crucial for copyright protection and user tracking. However, current diffusion watermarking methods face significant limitations: zero-bit watermarking systems lack the capacity for large-scale user tracking, while multi-bit methods are highly sensitive to certain image transformations or generative attacks, resulting in a lack of comprehensive robustness. In this paper, we propose OptMark, an optimization-based approach that embeds a robust multi-bit watermark into the intermediate latents of the diffusion denoising process. OptMark strategically inserts a structural watermark early to resist generative attacks and a detail watermark late to withstand image transformations, with tailored regularization terms to preserve image quality and ensure imperceptibility. To address the challenge of memory consumption growing linearly with the number of denoising steps during optimization, OptMark incorporates adjoint gradient methods, reducing memory usage from O(N) to O(1). Experimental results demonstrate that OptMark achieves invisible multi-bit watermarking while ensuring robust resilience against valuemetric transformations, geometric transformations, editing, and regeneration attacks.

Paperid: 694, https://arxiv.org/pdf/2508.17674.pdf

Abstract:
We introduce Advertisement Embedding Attacks (AEA), a new class of LLM security threats that stealthily inject promotional or malicious content into model outputs and AI agents. AEA operate through two low-cost vectors: (1) hijacking third-party service-distribution platforms to prepend adversarial prompts, and (2) publishing back-doored open-source checkpoints fine-tuned with attacker data. Unlike conventional attacks that degrade accuracy, AEA subvert information integrity, causing models to return covert ads, propaganda, or hate speech while appearing normal. We detail the attack pipeline, map five stakeholder victim groups, and present an initial prompt-based self-inspection defense that mitigates these injections without additional model retraining. Our findings reveal an urgent, under-addressed gap in LLM security and call for coordinated detection, auditing, and policy responses from the AI-safety community.

Paperid: 695, https://arxiv.org/pdf/2508.17043.pdf

Abstract:
The increasing deployment of Unmanned Aerial Vehicles (UAVs) for military, commercial, and logistics applications has raised significant concerns regarding flight path privacy. Conventional UAV communication systems often expose flight path data to third parties, making them vulnerable to tracking, surveillance, and location inference attacks. Existing encryption techniques provide security but fail to ensure complete privacy, as adversaries can still infer movement patterns through metadata analysis. To address these challenges, we propose a zk-SNARK(Zero-Knowledge Succinct Non-Interactive Argument of Knowledge)-based privacy-preserving flight path authentication and verification framework. Our approach ensures that a UAV can prove its authorisation, validate its flight path with a control centre, and comply with regulatory constraints without revealing any sensitive trajectory information. By leveraging zk-SNARKs, the UAV can generate cryptographic proofs that verify compliance with predefined flight policies while keeping the exact path and location undisclosed. This method mitigates risks associated with real-time tracking, identity exposure, and unauthorised interception, thereby enhancing UAV operational security in adversarial environments. Our proposed solution balances privacy, security, and computational efficiency, making it suitable for resource-constrained UAVs in both civilian and military applications.

Paperid: 696, https://arxiv.org/pdf/2508.14300.pdf

Abstract:
Traditional protocol fuzzing techniques, such as those employed by AFL-based systems, often lack effectiveness due to a limited semantic understanding of complex protocol grammars and rigid seed mutation strategies. Recent works, such as ChatAFL, have integrated Large Language Models (LLMs) to guide protocol fuzzing and address these limitations, pushing protocol fuzzers to wider exploration of the protocol state space. But ChatAFL still faces issues like unreliable output, LLM hallucinations, and assumptions of LLM knowledge about protocol specifications. This paper introduces MultiFuzz, a novel dense retrieval-based multi-agent system designed to overcome these limitations by integrating semantic-aware context retrieval, specialized agents, and structured tool-assisted reasoning. MultiFuzz utilizes agentic chunks of protocol documentation (RFC Documents) to build embeddings in a vector database for a retrieval-augmented generation (RAG) pipeline, enabling agents to generate more reliable and structured outputs, enhancing the fuzzer in mutating protocol messages with enhanced state coverage and adherence to syntactic constraints. The framework decomposes the fuzzing process into modular groups of agents that collaborate through chain-of-thought reasoning to dynamically adapt fuzzing strategies based on the retrieved contextual knowledge. Experimental evaluations on the Real-Time Streaming Protocol (RTSP) demonstrate that MultiFuzz significantly improves branch coverage and explores deeper protocol states and transitions over state-of-the-art (SOTA) fuzzers such as NSFuzz, AFLNet, and ChatAFL. By combining dense retrieval, agentic coordination, and language model reasoning, MultiFuzz establishes a new paradigm in autonomous protocol fuzzing, offering a scalable and extensible foundation for future research in intelligent agentic-based fuzzing systems.

Paperid: 697, https://arxiv.org/pdf/2508.10880.pdf

Abstract:
The widespread deployment of LLM-based agents is likely to introduce a critical privacy threat: malicious agents that proactively engage others in multi-turn interactions to extract sensitive information. However, the evolving nature of such dynamic dialogues makes it challenging to anticipate emerging vulnerabilities and design effective defenses. To tackle this problem, we present a search-based framework that alternates between improving attack and defense strategies through the simulation of privacy-critical agent interactions. Specifically, we employ LLMs as optimizers to analyze simulation trajectories and iteratively propose new agent instructions. To explore the strategy space more efficiently, we further utilize parallel search with multiple threads and cross-thread propagation. Through this process, we find that attack strategies escalate from direct requests to sophisticated tactics, such as impersonation and consent forgery, while defenses evolve from simple rule-based constraints to robust identity-verification state machines. The discovered attacks and defenses transfer across diverse scenarios and backbone models, demonstrating strong practical utility for building privacy-aware agents.

Paperid: 698, https://arxiv.org/pdf/2508.07190.pdf

Abstract:
We argue that the technical foundations of non-fungible tokens (NFTs) remain inadequately understood. Prior research has focused on market dynamics, user behavior, and isolated security incidents, yet systematic analysis of the standards underpinning NFT functionality is largely absent. We present the first study of NFTs through the lens of Ethereum Improvement Proposals (EIPs). We conduct a large-scale empirical analysis of 191 NFT-related EIPs and 10K+ Ethereum Magicians discussions (as of July, 2025). We integrate multi-dimensional analyses including the automated parsing of Solidity interfaces, graph-based modeling of inheritance structures, contributor profiling, and mining of community discussion data. We distinguish foundational from emerging standards, expose poor cross-version interoperability, and show that growing functional complexity heightens security risks.

Paperid: 699, https://arxiv.org/pdf/2508.05865.pdf

Abstract:
Blockchain technology offers a promising foundation for modernizing E-Voting systems by enhancing transparency, decentralization, and security. Yet, real-world adoption remains limited due to persistent challenges such as scalability constraints, high computational demands, and complex privacy requirements. This paper presents a comparative framework for analyzing blockchain-based E-Voting architectures, consensus mechanisms, and cryptographic protocols. We examine the limitations of prevalent models like Proof of Work, Proof of Stake, and Delegated Proof of Stake, and propose optimization strategies that include hybrid consensus, lightweight cryptography, and decentralized identity management. Additionally, we explore the novel role of Large Language Models (LLMs) in smart contract generation, anomaly detection, and user interaction. Our findings offer a foundation for designing secure, scalable, and intelligent blockchain-based E-Voting systems suitable for national-scale deployment. This work lays the groundwork for building an end-to-end blockchain E-Voting prototype enhanced by LLM-guided smart contract generation and validation, supported by a systematic framework and simulation-based analysis.

Paperid: 700, https://arxiv.org/pdf/2508.01371.pdf

Abstract:
We explore the feasibility of using LLMs for Automated Exploit Generation (AEG) against vulnerable smart contracts. We present \textsc{ReX}, a framework integrating LLM-based exploit synthesis with the Foundry testing suite, enabling the automated generation and validation of proof-of-concept (PoC) exploits. We evaluate five state-of-the-art LLMs (GPT-4.1, Gemini 2.5 Pro, Claude Opus 4, DeepSeek, and Qwen3 Plus) on both synthetic benchmarks and real-world smart contracts affected by known high-impact exploits. Our results show that modern LLMs can reliably generate functional PoC exploits for diverse vulnerability types, with success rates reaching up to 92\%. Notably, Gemini 2.5 Pro and GPT-4.1 consistently outperform others in both synthetic and real-world scenarios. We further analyze factors influencing AEG effectiveness, including model capabilities, contract structure, and vulnerability types. We also collect the first curated dataset of real-world PoC exploits to support future research.

Paperid: 701, https://arxiv.org/pdf/2507.20361.pdf

Abstract:
It is well known that antivirus engines are vulnerable to evasion techniques (e.g., obfuscation) that transform malware into its variants. However, it cannot be necessarily attributed to the effectiveness of these evasions, and the limits of engines may also make this unsatisfactory result. In this study, we propose a data-driven approach to measure the effect of app transformations to malware detection, and further explain why the detection result is produced by these engines. First, we develop an interaction model for antivirus engines, illustrating how they respond with different detection results in terms of varying inputs. Six app transformation techniques are implemented in order to generate a large number of Android apps with traceable changes. Then we undertake a one-month tracking of app detection results from multiple antivirus engines, through which we obtain over 971K detection reports from VirusTotal for 179K apps in total. Last, we conduct a comprehensive analysis of antivirus engines based on these reports from the perspectives of signature-based, static analysis-based, and dynamic analysis-based detection techniques. The results, together with 7 highlighted findings, identify a number of sealed working mechanisms occurring inside antivirus engines and what are the indicators of compromise in apps during malware detection.

Paperid: 702, https://arxiv.org/pdf/2507.15393.pdf

Abstract:
Phishing emails are a critical component of the cybercrime kill chain due to their wide reach and low cost. Their ever-evolving nature renders traditional rule-based and feature-engineered detectors ineffective in the ongoing arms race between attackers and defenders. The rise of large language models (LLMs) further exacerbates the threat, enabling attackers to craft highly convincing phishing emails at minimal cost. This work demonstrates that LLMs can generate psychologically persuasive phishing emails tailored to victim profiles, successfully bypassing nearly all commercial and academic detectors. To defend against such threats, we propose PiMRef, the first reference-based phishing email detector that leverages knowledge-based invariants. Our core insight is that persuasive phishing emails often contain disprovable identity claims, which contradict real-world facts. PiMRef reframes phishing detection as an identity fact-checking task. Given an email, PiMRef (i) extracts the sender's claimed identity, (ii) verifies the legitimacy of the sender's domain against a predefined knowledge base, and (iii) detects call-to-action prompts that push user engagement. Contradictory claims are flagged as phishing indicators and serve as human-understandable explanations. Compared to existing methods such as D-Fence, HelpHed, and ChatSpamDetector, PiMRef boosts precision by 8.8% with no loss in recall on standard benchmarks like Nazario and PhishPot. In a real-world evaluation of 10,183 emails across five university accounts over three years, PiMRef achieved 92.1% precision, 87.9% recall, and a median runtime of 0.05s, outperforming the state-of-the-art in both effectiveness and efficiency.

Paperid: 703, https://arxiv.org/pdf/2507.09624.pdf

Abstract:
Driving trajectory data remains vulnerable to privacy breaches despite existing mitigation measures. Traditional methods for detecting driving trajectories typically rely on map-matching the path using Global Positioning System (GPS) data, which is susceptible to GPS data outage. This paper introduces CAN-Trace, a novel privacy attack mechanism that leverages Controller Area Network (CAN) messages to uncover driving trajectories, posing a significant risk to drivers' long-term privacy. A new trajectory reconstruction algorithm is proposed to transform the CAN messages, specifically vehicle speed and accelerator pedal position, into weighted graphs accommodating various driving statuses. CAN-Trace identifies driving trajectories using graph-matching algorithms applied to the created graphs in comparison to road networks. We also design a new metric to evaluate matched candidates, which allows for potential data gaps and matching inaccuracies. Empirical validation under various real-world conditions, encompassing different vehicles and driving regions, demonstrates the efficacy of CAN-Trace: it achieves an attack success rate of up to 90.59% in the urban region, and 99.41% in the suburban region.

Paperid: 704, https://arxiv.org/pdf/2507.09607.pdf

Abstract:
The existing MPC-based private inference frameworks either rely on impractical real-world assumptions, or adopt the strongest security model (Malicious Security Dishonest Majority, MSDM) and then suffer from severe efficiency limitations. To balance security and efficiency, we propose a novel, three-layer private inference framework based on the Helper-Assisted MSDM (HA-MSDM) model. The first is the primitive layer, where we extend computations from prime fields to rings for efficient fixed-point arithmetic and then better support inference operations. The second is the MPC layer, where we design six fixed-round MPC protocols to reduce latency for core operations like multiplication, polynomial evaluation, and batch check. The third is the inference layer, which can achieve efficient and high-accuracy CNN inference. The efficiency is achieved by applying our designed MPC protocols. The high-accuracy private inference in deep CNNs is achieved by designing a co-optimized strategy, which employs high-precision polynomial approximation for activation functions and uses parameter-adjusted Batch Normalization layers to constrain inputs. Benchmarks on LeNet and AlexNet show our framework achieves up to a 2.4-25.7x speedup in LAN and a 1.3-9.5x acceleration in WAN over the state-of-the-art MSDM frameworks with only 0.04-1.08% relative error.

Paperid: 705, https://arxiv.org/pdf/2507.06926.pdf

Abstract:
Non-Fungible Tokens (NFTs) offer a promising mechanism to protect Australian and Indigenous artists' copyright. They represent and transfer the value of artwork in digital form. Before adopting NFTs to protect Australian artwork, we in this paper investigate them empericially. We focus on examining the details of NFT structure. We start from the underlying structure of NFTs to show how they represent copyright for both artists and production owners, as well as how they aim to safeguard or secure the value of digital artworks. We then involve data collection from various types of sources with different storage methods, including on-chain, centralized, and decentralized systems. Based on both metadata and artwork content, we present our analysis and discussion on the following key issues: copyright, security and artist identification. The final results of the evaluation, unfortnately, show that the NFT is NOT ready to protect Australian and Indigenous artists' copyright.

Paperid: 706, https://arxiv.org/pdf/2512.23132.pdf

Abstract:
Machine learning (ML) underpins foundation models in finance, healthcare, and critical infrastructure, making them targets for data poisoning, model extraction, prompt injection, automated jailbreaking, and preference-guided black-box attacks that exploit model comparisons. Larger models can be more vulnerable to introspection-driven jailbreaks and cross-modal manipulation. Traditional cybersecurity lacks ML-specific threat modeling for foundation, multimodal, and RAG systems. Objective: Characterize ML security risks by identifying dominant TTPs, vulnerabilities, and targeted lifecycle stages. Methods: We extract 93 threats from MITRE ATLAS (26), AI Incident Database (12), and literature (55), and analyze 854 GitHub/Python repositories. A multi-agent RAG system (ChatGPT-4o, temp 0.4) mines 300+ articles to build an ontology-driven threat graph linking TTPs, vulnerabilities, and stages. Results: We identify unreported threats including commercial LLM API model stealing, parameter memorization leakage, and preference-guided text-only jailbreaks. Dominant TTPs include MASTERKEY-style jailbreaking, federated poisoning, diffusion backdoors, and preference optimization leakage, mainly impacting pre-training and inference. Graph analysis reveals dense vulnerability clusters in libraries with poor patch propagation. Conclusion: Adaptive, ML-specific security frameworks, combining dependency hygiene, threat intelligence, and monitoring, are essential to mitigate supply-chain and inference risks across the ML lifecycle.

Paperid: 707, https://arxiv.org/pdf/2512.22233.pdf

Abstract:
Video semantic communication, praised for its transmission efficiency, still faces critical challenges related to privacy leakage. Traditional security techniques like steganography and encryption are challenging to apply since they are not inherently robust against semantic-level transformations and abstractions. Moreover, the temporal continuity of video enables framewise statistical modeling over extended periods, which increases the risk of exposing distributional anomalies and reconstructing hidden content. To address these challenges, we propose SemCovert, a deep semantic-level hiding framework for secure and covert video transmission. SemCovert introduces a pair of co-designed models, namely the semantic hiding model and the secret semantic extractor, which are seamlessly integrated into the semantic communication pipeline. This design enables authorized receivers to reliably recover hidden information, while keeping it imperceptible to regular users. To further improve resistance to analysis, we introduce a randomized semantic hiding strategy, which breaks the determinism of embedding and introduces unpredictable distribution patterns. The experimental results demonstrate that SemCovert effectively mitigates potential eavesdropping and detection risks while reliably concealing secret videos during transmission. Meanwhile, video quality suffers only minor degradation, preserving transmission fidelity. These results confirm SemCovert's effectiveness in enabling secure and covert transmission without compromising semantic communication performance.

Paperid: 708, https://arxiv.org/pdf/2512.12154.pdf

Abstract:
Accurate time-series forecasting is increasingly critical for planning and operations in low-carbon power systems. Emerging time-series large language models (TS-LLMs) now deliver this capability at scale, requiring no task-specific retraining, and are quickly becoming essential components within the Internet-of-Energy (IoE) ecosystem. However, their real-world deployment is complicated by a critical vulnerability: adversarial examples (AEs). Detecting these AEs is challenging because (i) adversarial perturbations are optimized across the entire input sequence and exploit global temporal dependencies, which renders local detection methods ineffective, and (ii) unlike traditional forecasting models with fixed input dimensions, TS-LLMs accept sequences of variable length, increasing variability that complicates detection. To address these challenges, we propose a plug-in detection framework that capitalizes on the TS-LLM's own variable-length input capability. Our method uses sampling-induced divergence as a detection signal. Given an input sequence, we generate multiple shortened variants and detect AEs by measuring the consistency of their forecasts: Benign sequences tend to produce stable predictions under sampling, whereas adversarial sequences show low forecast similarity, because perturbations optimized for a full-length sequence do not transfer reliably to shorter, differently-structured subsamples. We evaluate our approach on three representative TS-LLMs (TimeGPT, TimesFM, and TimeLLM) across three energy datasets: ETTh2 (Electricity Transformer Temperature), NI (Hourly Energy Consumption), and Consumption (Hourly Electricity Consumption and Production). Empirical results confirm strong and robust detection performance across both black-box and white-box attack scenarios, highlighting its practicality as a reliable safeguard for TS-LLM forecasting in real-world energy systems.

Paperid: 709, https://arxiv.org/pdf/2512.00833.pdf

Abstract:
Modern circuits face various threats like reverse engineering, theft of intellectual property (IP), side-channel attacks, etc. Here, we present a novel approach for IP protection based on logic encryption (LE). Unlike established schemes for logic locking, our work obfuscates the circuit's structure and functionality by encoding and encrypting the logic itself. We devise an end-to-end method for practical LE implementation based on standard cryptographic algorithms, key-bit randomization, simple circuit design techniques, and system-level synthesis operations, all in a correct-by-construction manner. Our extensive analysis demonstrates the remarkable efficacy of our scheme, outperforming prior art against a range of oracle-less attacks covering crucial threat vectors, all with lower design overheads. We provide a full open-source release.

Paperid: 710, https://arxiv.org/pdf/2511.18531.pdf

Abstract:
Despite rapid progress in logic locking (LL), reproducibility remains a challenge as codes are rarely made public. We present LockForge, a first-of-its-kind, multi-agent large language model (LLM) framework that turns LL descriptions in papers into executable and tested code. LockForge provides a carefully crafted pipeline realizing forethought, implementation, iterative refinement, and a multi-stage validation, all to systematically bridge the gap between prose and practice for complex LL schemes. For validation, we devise (i) an LLM-as-Judge stage with a scoring system considering behavioral checks, conceptual mechanisms, structural elements, and reproducibility on benchmarks, and (ii) an independent LLM-as-Examiner stage for ground-truth assessment. We apply LockForge to 10 seminal LL schemes, many of which lack reference implementations. Our evaluation on multiple SOTA LLMs, including ablation studies, reveals the significant complexity of the task. We show that an advanced reasoning model and a sophisticated, multi-stage framework like LockForge are required. We release all implementations and benchmarks, providing a reproducible and fair foundation for evaluation of further LL research.

Paperid: 711, https://arxiv.org/pdf/2511.13329.pdf

Abstract:
Embedding-as-a-Service (EaaS) is an effective and convenient deployment solution for addressing various NLP tasks. Nevertheless, recent research has shown that EaaS is vulnerable to model extraction attacks, which could lead to significant economic losses for model providers. For copyright protection, existing methods inject watermark embeddings into text embeddings and use them to detect copyright infringement. However, current watermarking methods often resist only a subset of attacks and fail to provide \textit{comprehensive} protection. To this end, we present the region-triggered semantic watermarking framework called RegionMarker, which defines trigger regions within a low-dimensional space and injects watermarks into text embeddings associated with these regions. By utilizing a secret dimensionality reduction matrix to project onto this subspace and randomly selecting trigger regions, RegionMarker makes it difficult for watermark removal attacks to evade detection. Furthermore, by embedding watermarks across the entire trigger region and using the text embedding as the watermark, RegionMarker is resilient to both paraphrasing and dimension-perturbation attacks. Extensive experiments on various datasets show that RegionMarker is effective in resisting different attack methods, thereby protecting the copyright of EaaS.

Paperid: 712, https://arxiv.org/pdf/2511.12265.pdf

Abstract:
Deep Neural Networks (DNNs) are known to be vulnerable to various adversarial perturbations. To address the safety concerns arising from these vulnerabilities, adversarial training (AT) has emerged as one of the most effective paradigms for enhancing the robustness of DNNs. However, existing AT frameworks primarily focus on a single or a limited set of attack types, leaving DNNs still exposed to attack types that may be encountered in practice but not addressed during training. In this paper, we propose an efficient fine-tuning method called Calibrated Adversarial Sampling (CAS) to address these issues. From the optimization perspective within the multi-armed bandit framework, it dynamically designs rewards and balances exploration and exploitation by considering the dynamic and interdependent characteristics of multiple robustness dimensions. Experiments on benchmark datasets show that CAS achieves superior overall robustness while maintaining high clean accuracy, providing a new paradigm for robust generalization of DNNs.

Paperid: 713, https://arxiv.org/pdf/2511.11018.pdf

Abstract:
Large language models (LLMs) are increasingly tasked with generating structured outputs. While structured generation methods ensure validity, they often lack output diversity, a critical limitation that we confirm in our preliminary study. We propose a novel method to enhance diversity in automaton-based structured generation. Our approach utilizes automata traversal history to steer LLMs towards novel structural patterns. Evaluations show our method significantly improves structural and content diversity while maintaining comparable generation efficiency. Furthermore, we conduct a case study showcasing the effectiveness of our method in generating diverse test cases for testing open-source libraries.

Paperid: 714, https://arxiv.org/pdf/2510.11203.pdf

Abstract:
LLM-based agents have demonstrated promising adaptability in real-world applications. However, these agents remain vulnerable to a wide range of attacks, such as tool poisoning and malicious instructions, that compromise their execution flow and can lead to serious consequences like data breaches and financial loss. Existing studies typically attempt to mitigate such anomalies by predefining specific rules and enforcing them at runtime to enhance safety. Yet, designing comprehensive rules is difficult, requiring extensive manual effort and still leaving gaps that result in false negatives. As agent systems evolve into complex software systems, we take inspiration from software system security and propose TraceAegis, a provenance-based analysis framework that leverages agent execution traces to detect potential anomalies. In particular, TraceAegis constructs a hierarchical structure to abstract stable execution units that characterize normal agent behaviors. These units are then summarized into constrained behavioral rules that specify the conditions necessary to complete a task. By validating execution traces against both hierarchical and behavioral constraints, TraceAegis is able to effectively detect abnormal behaviors. To evaluate the effectiveness of TraceAegis, we introduce TraceAegis-Bench, a dataset covering two representative scenarios: healthcare and corporate procurement. Each scenario includes 1,300 benign behaviors and 300 abnormal behaviors, where the anomalies either violate the agent's execution order or break the semantic consistency of its execution sequence. Experimental results demonstrate that TraceAegis achieves strong performance on TraceAegis-Bench, successfully identifying the majority of abnormal behaviors.

Paperid: 715, https://arxiv.org/pdf/2510.10316.pdf

Abstract:
Since being proposed in 2006, differential privacy has become a standard method for quantifying certain risks in publishing or sharing analyses of sensitive data. At its heart, differential privacy measures risk in terms of the differences between probability distributions, which is a central topic in information theory. A differentially private algorithm is a channel between the underlying data and the output of the analysis. Seen in this way, the guarantees made by differential privacy can be understood in terms of properties of this channel. In this article we examine a few of the key connections between information theory and the formulation/application of differential privacy, giving an ``operational significance'' for relevant information measures.

Paperid: 716, https://arxiv.org/pdf/2510.06530.pdf

Abstract:
The quality and experience of mobile communication have significantly improved with the introduction of 5G, and these improvements are expected to continue beyond the 5G era. However, vulnerabilities in control-plane protocols, such as Radio Resource Control (RRC) and Non-Access Stratum (NAS), pose significant security threats, such as Blind Denial of Service (DoS) attacks. Despite the availability of existing anomaly detection methods that leverage rule-based systems or traditional machine learning methods, these methods have several limitations, including the need for extensive training data, predefined rules, and limited explainability. Addressing these challenges, we propose a novel anomaly detection framework that leverages the capabilities of Large Language Models (LLMs) in zero-shot mode with unordered data and short natural language attack descriptions within the Open Radio Access Network (O-RAN) architecture. We analyse robustness to prompt variation, demonstrate the practicality of automating the attack descriptions and show that detection quality relies on the semantic completeness of the description rather than its phrasing or length. We utilise an RRC/NAS dataset to evaluate the solution and provide an extensive comparison of open-source and proprietary LLM implementations to demonstrate superior performance in attack detection. We further validate the practicality of our framework within O-RAN's real-time constraints, illustrating its potential for detecting other Layer-3 attacks.

Paperid: 717, https://arxiv.org/pdf/2510.04397.pdf

Abstract:
Software vulnerabilities (SVs) pose a critical threat to safety-critical systems, driving the adoption of AI-based approaches such as machine learning and deep learning for software vulnerability detection. Despite promising results, most existing methods are limited to a single programming language. This is problematic given the multilingual nature of modern software, which is often complex and written in multiple languages. Current approaches often face challenges in capturing both shared and language-specific knowledge of source code, which can limit their performance on diverse programming languages and real-world codebases. To address this gap, we propose MULVULN, a novel multilingual vulnerability detection approach that learns from source code across multiple languages. MULVULN captures both the shared knowledge that generalizes across languages and the language-specific knowledge that reflects unique coding conventions. By integrating these aspects, it achieves more robust and effective detection of vulnerabilities in real-world multilingual software systems. The rigorous and extensive experiments on the real-world and diverse REEF dataset, consisting of 4,466 CVEs with 30,987 patches across seven programming languages, demonstrate the superiority of MULVULN over thirteen effective and state-of-the-art baselines. Notably, MULVULN achieves substantially higher F1-score, with improvements ranging from 1.45% to 23.59% compared to the baseline methods.

Paperid: 718, https://arxiv.org/pdf/2509.25072.pdf

Abstract:
Privacy-preserving technologies have introduced a paradigm shift that allows for realizable secure computing in real-world systems. The significant barrier to the practical adoption of these primitives is the computational and communication overhead that is incurred when applied at scale. In this paper, we present an overview of our efforts to bridge the gap between this overhead and practicality for privacy-preserving learning systems using multi-party computation (MPC), zero-knowledge proofs (ZKPs), and fully homomorphic encryption (FHE). Through meticulous hardware/software/algorithm co-design, we show progress towards enabling LLM-scale applications in privacy-preserving settings. We demonstrate the efficacy of our solutions in several contexts, including DNN IP ownership, ethical LLM usage enforcement, and transformer inference.

Paperid: 719, https://arxiv.org/pdf/2509.24444.pdf

Abstract:
The Open Network (TON) blockchain employs an asynchronous execution model that introduces unique security challenges for smart contracts, particularly race conditions arising from unpredictable message processing order. While previous work established vulnerability patterns through static analysis of audit reports, dynamic detection of temporal dependencies through systematic testing remains an open problem. We present BugMagnifier, a transaction simulation framework that systematically reveals vulnerabilities in TON smart contracts through controlled message orchestration. Built atop TON Sandbox and integrated with the TON Virtual Machine (TVM), our tool combines precise message queue manipulation with differential state analysis and probabilistic permutation testing to detect asynchronous execution flaws. Experimental evaluation demonstrates BugMagnifier's effectiveness through extensive parametric studies on purpose-built vulnerable contracts, revealing message ratio-dependent detection complexity that aligns with theoretical predictions. This quantitative model enables predictive vulnerability assessment while shifting discovery from manual expert analysis to automated evidence generation. By providing reproducible test scenarios for temporal vulnerabilities, BugMagnifier addresses a critical gap in the TON security tooling, offering practical support for safer smart contract development in asynchronous blockchain environments.

Paperid: 720, https://arxiv.org/pdf/2509.24272.pdf

Abstract:
Model Context Protocol (MCP) servers enable AI applications to connect to external systems in a plug-and-play manner, but their rapid proliferation also introduces severe security risks. Unlike mature software ecosystems with rigorous vetting, MCP servers still lack standardized review mechanisms, giving adversaries opportunities to distribute malicious implementations. Despite this pressing risk, the security implications of MCP servers remain underexplored. To address this gap, we present the first systematic study that treats MCP servers as active threat actors and decomposes them into core components to examine how adversarial developers can implant malicious intent. Specifically, we investigate three research questions: (i) what types of attacks malicious MCP servers can launch, (ii) how vulnerable MCP hosts and Large Language Models (LLMs) are to these attacks, and (iii) how feasible it is to carry out MCP server attacks in practice. Our study proposes a component-based taxonomy comprising twelve attack categories. For each category, we develop Proof-of-Concept (PoC) servers and demonstrate their effectiveness across diverse real-world host-LLM settings. We further show that attackers can generate large numbers of malicious servers at virtually no cost. We then test state-of-the-art scanners on the generated servers and found that existing detection approaches are insufficient. These findings highlight that malicious MCP servers are easy to implement, difficult to detect with current tools, and capable of causing concrete damage to AI agent systems. Addressing this threat requires coordinated efforts among protocol designers, host developers, LLM providers, and end users to build a more secure and resilient MCP ecosystem.

Paperid: 721, https://arxiv.org/pdf/2509.10823.pdf

Abstract:
The Open Network (TON) is a high-performance blockchain platform designed for scalability and efficiency, leveraging an asynchronous execution model and a multi-layered architecture. While TON's design offers significant advantages, it also introduces unique challenges for smart contract development and security. This paper introduces a comprehensive audit checklist for TON smart contracts, based on an analysis of 34 professional audit reports containing 233 real-world vulnerabilities. The checklist addresses TON-specific challenges, such as asynchronous message handling, and provides actionable insights for developers and auditors. We also present detailed case studies of vulnerabilities in TON smart contracts, highlighting their implications and offering lessons learned. By adopting this checklist, developers and auditors can systematically identify and mitigate vulnerabilities, enhancing the security and reliability of TON-based projects. Our work bridges the gap between Ethereum's mature audit methodologies and the emerging needs of the TON ecosystem, fostering a more secure and robust blockchain environment.

Paperid: 722, https://arxiv.org/pdf/2509.09787.pdf

Abstract:
Split Learning (SL) is a distributed learning approach that enables resource-constrained clients to collaboratively train deep neural networks (DNNs) by offloading most layers to a central server while keeping in- and output layers on the client-side. This setup enables SL to leverage server computation capacities without sharing data, making it highly effective in resource-constrained environments dealing with sensitive data. However, the distributed nature enables malicious clients to manipulate the training process. By sending poisoned intermediate gradients, they can inject backdoors into the shared DNN. Existing defenses are limited by often focusing on server-side protection and introducing additional overhead for the server. A significant challenge for client-side defenses is enforcing malicious clients to correctly execute the defense algorithm. We present ZORRO, a private, verifiable, and robust SL defense scheme. Through our novel design and application of interactive zero-knowledge proofs (ZKPs), clients prove their correct execution of a client-located defense algorithm, resulting in proofs of computational integrity attesting to the benign nature of locally trained DNN portions. Leveraging the frequency representation of model partitions enables ZORRO to conduct an in-depth inspection of the locally trained models in an untrusted environment, ensuring that each client forwards a benign checkpoint to its succeeding client. In our extensive evaluation, covering different model architectures as well as various attack strategies and data scenarios, we show ZORRO's effectiveness, as it reduces the attack success rate to less than 6\% while causing even for models storing \numprint{1000000} parameters on the client-side an overhead of less than 10 seconds.

Paperid: 723, https://arxiv.org/pdf/2509.02048.pdf

Abstract:
Machine learning models require datasets for effective training, but directly sharing raw data poses significant privacy risk such as membership inference attacks (MIA). To mitigate the risk, privacy-preserving techniques such as data perturbation, generalization, and synthetic data generation are commonly utilized. However, these methods often degrade data accuracy, specificity, and diversity, limiting the performance of downstream tasks and thus reducing data utility. Therefore, striking an optimal balance between privacy preservation and data utility remains a critical challenge. To address this issue, we introduce a novel bilevel optimization framework for the publication of private datasets, where the upper-level task focuses on data utility and the lower-level task focuses on data privacy. In the upper-level task, a discriminator guides the generation process to ensure that perturbed latent variables are mapped to high-quality samples, maintaining fidelity for downstream tasks. In the lower-level task, our framework employs local extrinsic curvature on the data manifold as a quantitative measure of individual vulnerability to MIA, providing a geometric foundation for targeted privacy protection. By perturbing samples toward low-curvature regions, our method effectively suppresses distinctive feature combinations that are vulnerable to MIA. Through alternating optimization of both objectives, we achieve a synergistic balance between privacy and utility. Extensive experimental evaluations demonstrate that our method not only enhances resistance to MIA in downstream tasks but also surpasses existing methods in terms of sample quality and diversity.

Paperid: 724, https://arxiv.org/pdf/2509.01791.pdf

Abstract:
Every day, our inboxes are flooded with unsolicited emails, ranging between annoying spam to more subtle phishing scams. Unfortunately, despite abundant prior efforts proposing solutions achieving near-perfect accuracy, the reality is that countering malicious emails still remains an unsolved dilemma. This "open problem" paper carries out a critical assessment of scientific works in the context of phishing email detection. First, we focus on the benchmark datasets that have been used to assess the methods proposed in research. We find that most prior work relied on datasets containing emails that -- we argue -- are not representative of current trends, and mostly encompass the English language. Based on this finding, we then re-implement and re-assess a variety of detection methods reliant on machine learning (ML), including large-language models (LLM), and release all of our codebase -- an (unfortunately) uncommon practice in related research. We show that most such methods achieve near-perfect performance when trained and tested on the same dataset -- a result which intrinsically hinders development (how can future research outperform methods that are already near perfect?). To foster the creation of "more challenging benchmarks" that reflect current phishing trends, we propose E-PhishGEN, an LLM-based (and privacy-savvy) framework to generate novel phishing-email datasets. We use our E-PhishGEN to create E-PhishLLM, a novel phishing-email detection dataset containing 16616 emails in three languages. We use E-PhishLLM to test the detectors we considered, showing a much lower performance than that achieved on existing benchmarks -- indicating a larger room for improvement. We also validate the quality of E-PhishLLM with a user study (n=30). To sum up, we show that phishing email detection is still an open problem -- and provide the means to tackle such a problem by future research.

Paperid: 725, https://arxiv.org/pdf/2509.00882.pdf

Abstract:
Traditional vulnerability detection methods rely heavily on predefined rule matching, which often fails to capture vulnerabilities accurately. With the rise of large language models (LLMs), leveraging their ability to understand code semantics has emerged as a promising direction for achieving more accurate and efficient vulnerability detection. However, current LLM-based approaches face significant challenges: instability in model outputs, limitations in context length, and hallucination. As a result, many existing solutions either use LLMs merely to enrich predefined rule sets, thereby keeping the detection process fundamentally rule-based, or over-rely on them, leading to poor robustness. To address these challenges, we propose a constraint-solving approach powered by LLMs named VULSOLVER. By modeling vulnerability detection as a constraint-solving problem, and by integrating static application security testing (SAST) with the semantic reasoning capabilities of LLMs, our method enables the LLM to act like a professional human security expert. We assess VULSOLVER on the OWASP Benchmark (1,023 labeled samples), achieving 96.29% accuracy, 96.55% F1-score, and 100% recall. Applied to popular GitHub repositories, VULSOLVER also identified 15 previously unknown high-severity vulnerabilities (CVSS 7.5-9.8), demonstrating its effectiveness in real-world security analysis.

Paperid: 726, https://arxiv.org/pdf/2508.16868.pdf

Abstract:
Negative-Bias Temperature Instability is a dominant aging mechanism in nanoscale CMOS circuits such as microprocessors. With this aging mechanism, the rate of device aging is dependent not only on overall operating conditions, such as heat, but also on user controllable inputs to the transistors. This dependence on input implies a possible timing fault-injection attack wherein a targeted path of logic is intentionally degraded through the purposeful, software-driven actions of an attacker, rendering a targeted bit effectively stuck. In this work, we describe such an attack mechanism, which we dub a "$\textbf{Targeted Wearout Attack}$", wherein an attacker with sufficient knowledge of the processor core, executing a carefully crafted software program with only user privilege, is able to degrade a functional unit within the processor with the aim of eliciting a particular desired incorrect calculation in a victim application. Here we give a general methodology for the attack. We then demonstrate a case study where a targeted path within the fused multiply-add pipeline in a RISC-V CPU sees a $>7x$ increase in wear over time than would be experienced under typical workloads. We show that an attacker could leverage such an attack, leading to targeted and silent data corruption in a co-running victim application using the same unit.

Paperid: 727, https://arxiv.org/pdf/2508.16406.pdf

Abstract:
Large Language Models (LLMs) remain vulnerable to jailbreak attacks, which attempt to elicit harmful responses from LLMs. The evolving nature and diversity of these attacks pose many challenges for defense systems, including (1) adaptation to counter emerging attack strategies without costly retraining, and (2) control of the trade-off between safety and utility. To address these challenges, we propose Retrieval-Augmented Defense (RAD), a novel framework for jailbreak detection that incorporates a database of known attack examples into Retrieval-Augmented Generation, which is used to infer the underlying, malicious user query and jailbreak strategy used to attack the system. RAD enables training-free updates for newly discovered jailbreak strategies and provides a mechanism to balance safety and utility. Experiments on StrongREJECT show that RAD substantially reduces the effectiveness of strong jailbreak attacks such as PAP and PAIR while maintaining low rejection rates for benign queries. We propose a novel evaluation scheme and show that RAD achieves a robust safety-utility trade-off across a range of operating points in a controllable manner.

Paperid: 728, https://arxiv.org/pdf/2508.08029.pdf

Abstract:
The introduction of 5G and the Open Radio Access Network (O-RAN) architecture has enabled more flexible and intelligent network deployments. However, the increased complexity and openness of these architectures also introduce novel security challenges, such as data manipulation attacks on the semi-standardised Shared Data Layer (SDL) within the O-RAN platform through malicious xApps. In particular, malicious xApps can exploit this vulnerability by introducing subtle Unicode-wise alterations (hypoglyphs) into the data that are being used by traditional machine learning (ML)-based anomaly detection methods. These Unicode-wise manipulations can potentially bypass detection and cause failures in anomaly detection systems based on traditional ML, such as AutoEncoders, which are unable to process hypoglyphed data without crashing. We investigate the use of Large Language Models (LLMs) for anomaly detection within the O-RAN architecture to address this challenge. We demonstrate that LLM-based xApps maintain robust operational performance and are capable of processing manipulated messages without crashing. While initial detection accuracy requires further improvements, our results highlight the robustness of LLMs to adversarial attacks such as hypoglyphs in input data. There is potential to use their adaptability through prompt engineering to further improve the accuracy, although this requires further research. Additionally, we show that LLMs achieve low detection latency (under 0.07 seconds), making them suitable for Near-Real-Time (Near-RT) RIC deployments.

Paperid: 729, https://arxiv.org/pdf/2508.06087.pdf

Abstract:
The preservation of privacy has emerged as a critical topic in the era of artificial intelligence. However, current work focuses on user-oriented privacy, overlooking severe enterprise data leakage risks exacerbated by the Retrieval-Augmented Generation paradigm. To address this gap, our paper introduces a novel objective: enterprise-oriented privacy concerns. Achieving this objective requires overcoming two fundamental challenges: existing methods such as data sanitization severely degrade model performance, and the field lacks public datasets for evaluation. We address these challenges with several solutions. (1) To prevent performance degradation, we propose ABack, a training-free mechanism that leverages a Hidden State Model to pinpoint the origin of a leakage intention and rewrite the output safely. (2) To solve the lack of datasets, we construct PriGenQA, a new benchmark for enterprise privacy scenarios in healthcare and finance. To ensure a rigorous evaluation, we move beyond simple static attacks by developing a powerful adaptive attacker with Group Relative Policy Optimization. Experiments show that against this superior adversary, ABack improves the overall privacy utility score by up to 15\% over strong baselines, avoiding the performance trade-offs of prior methods.

Paperid: 730, https://arxiv.org/pdf/2507.08158.pdf

Abstract:
Differential Privacy (DP) is a family of definitions that bound the worst-case privacy leakage of a mechanism. One important feature of the worst-case DP guarantee is it naturally implies protections against adversaries with less prior information, more sophisticated attack goals, and complex measures of a successful attack. However, the analytical tradeoffs between the adversarial model and the privacy protections conferred by DP are not well understood thus far. To that end, this work sheds light on what the worst-case guarantee of DP implies about the success of attackers that are more representative of real-world privacy risks. In this paper, we present a single flexible framework that generalizes and extends the patchwork of bounds on DP mechanisms found in prior work. Our framework allows us to compute high-probability guarantees for DP mechanisms on a large family of natural attack settings that previous bounds do not capture. One class of such settings is the approximate reconstruction of multiple individuals' data, such as inferring nearly entire columns of a tabular data set from noisy marginals and extracting sensitive information from DP-trained language models. We conduct two empirical case studies to illustrate the versatility of our bounds and compare them to the success of state-of-the-art attacks. Specifically, we study attacks that extract non-uniform PII from a DP-trained language model, as well as multi-column reconstruction attacks where the adversary has access to some columns in the clear and attempts to reconstruct the remaining columns for each person's record. We find that the absolute privacy risk of attacking non-uniform data is highly dependent on the adversary's prior probability of success. Our high probability bounds give us a nuanced understanding of the privacy leakage of DP mechanisms in a variety of previously understudied attack settings.

Paperid: 731, https://arxiv.org/pdf/2507.06969.pdf

Abstract:
Differentially private (DP) mechanisms are difficult to interpret and calibrate because existing methods for mapping standard privacy parameters to concrete privacy risks -- re-identification, attribute inference, and data reconstruction -- are both overly pessimistic and inconsistent. In this work, we use the hypothesis-testing interpretation of DP ($f$-DP), and determine that bounds on attack success can take the same unified form across re-identification, attribute inference, and data reconstruction risks. Our unified bounds are (1) consistent across a multitude of attack settings, and (2) tunable, enabling practitioners to evaluate risk with respect to arbitrary (including worst-case) levels of baseline risk. Empirically, our results are tighter than prior methods using $\varepsilon$-DP, RÃ©nyi DP, and concentrated DP. As a result, calibrating noise using our bounds can reduce the required noise by 20% at the same risk level, which yields, e.g., more than 15pp accuracy increase in a text classification task. Overall, this unifying perspective provides a principled framework for interpreting and calibrating the degree of protection in DP against specific levels of re-identification, attribute inference, or data reconstruction risk.

Paperid: 732, https://arxiv.org/pdf/2507.05093.pdf

Abstract:
Large Language Models (LLMs) have transformed human-machine interaction since ChatGPT's 2022 debut, with Retrieval-Augmented Generation (RAG) emerging as a key framework that enhances LLM outputs by integrating external knowledge. However, RAG's reliance on ingesting external documents introduces new vulnerabilities. This paper exposes a critical security gap at the data loading stage, where malicious actors can stealthily corrupt RAG pipelines by exploiting document ingestion. We propose a taxonomy of 9 knowledge-based poisoning attacks and introduce two novel threat vectors -- Content Obfuscation and Content Injection -- targeting common formats (DOCX, HTML, PDF). Using an automated toolkit implementing 19 stealthy injection techniques, we test five popular data loaders, finding a 74.4% attack success rate across 357 scenarios. We further validate these threats on six end-to-end RAG systems -- including white-box pipelines and black-box services like NotebookLM and OpenAI Assistants -- demonstrating high success rates and critical vulnerabilities that bypass filters and silently compromise output integrity. Our results emphasize the urgent need to secure the document ingestion process in RAG systems against covert content manipulations.

Paperid: 733, https://arxiv.org/pdf/2512.19016.pdf

Abstract:
Large Language Models (LLMs) are increasingly used in agentic systems, where their interactions with diverse tools and environments create complex, multi-stage safety challenges. However, existing benchmarks mostly rely on static, single-turn assessments that miss vulnerabilities from adaptive, long-chain attacks. To fill this gap, we introduce DREAM, a framework for systematic evaluation of LLM agents against dynamic, multi-stage attacks. At its core, DREAM uses a Cross-Environment Adversarial Knowledge Graph (CE-AKG) to maintain stateful, cross-domain understanding of vulnerabilities. This graph guides a Contextualized Guided Policy Search (C-GPS) algorithm that dynamically constructs attack chains from a knowledge base of 1,986 atomic actions across 349 distinct digital environments. Our evaluation of 12 leading LLM agents reveals a critical vulnerability: these attack chains succeed in over 70% of cases for most models, showing the power of stateful, cross-environment exploits. Through analysis of these failures, we identify two key weaknesses in current agents: contextual fragility, where safety behaviors fail to transfer across environments, and an inability to track long-term malicious intent. Our findings also show that traditional safety measures, such as initial defense prompts, are largely ineffective against attacks that build context over multiple interactions. To advance agent safety research, we release DREAM as a tool for evaluating vulnerabilities and developing more robust defenses.

Paperid: 734, https://arxiv.org/pdf/2512.18529.pdf

Abstract:
Explicit channel state information (CSI) feedback in IEEE~802.11 conveys \emph{transmit beamforming directions} by reporting quantized Givens rotation and phase angles that parametrize the right-singular subspace of the channel matrix. Because these angles encode fine-grained spatial signatures of the propagation environment, recent work have shown that plaintext CSI feedback can inadvertently reveal user activity, identity, and location to passive eavesdroppers. In this work, we introduce a standards-compatible \emph{differentially private (DP) quantization mechanism} that replaces deterministic angular quantization with an $\varepsilon$-DP stochastic quantizer applied directly to the Givens parameters of the transmit beamforming matrix. The mechanism preserves the 802.11 feedback structure, admits closed-form sensitivity bounds for the angular representation, and enables principled privacy calibration. Numerical simulations demonstrate strong privacy guarantees with minimal degradation in beamforming performance.

Paperid: 735, https://arxiv.org/pdf/2511.22147.pdf

Abstract:
As a mainstream technique for 3D reconstruction, 3D Gaussian splatting (3DGS) has been applied in a wide range of applications and services. Recent studies have revealed critical vulnerabilities in this pipeline and introduced computation cost attacks that lead to malicious resource occupancies and even denial-of-service (DoS) conditions, thereby hindering the reliable deployment of 3DGS. In this paper, we propose the first effective and comprehensive black-box defense framework, named RemedyGS, against such computation cost attacks, safeguarding 3DGS reconstruction systems and services. Our pipeline comprises two key components: a detector to identify the attacked input images with poisoned textures and a purifier to recover the benign images from their attacked counterparts, mitigating the adverse effects of these attacks. Moreover, we incorporate adversarial training into the purifier to enforce distributional alignment between the recovered and original natural images, thereby enhancing the defense efficacy. Experimental results demonstrate that our framework effectively defends against white-box, black-box, and adaptive attacks in 3DGS systems, achieving state-of-the-art performance in both safety and utility.

Paperid: 736, https://arxiv.org/pdf/2511.18772.pdf

Abstract:
Deep neural networks (DNNs) have become valuable intellectual property of model owners, due to the substantial resources required for their development. To protect these assets in the deployed environment, recent research has proposed model usage control mechanisms to ensure models cannot be used without proper authorization. These methods typically lock the utility of the model by embedding an access key into its parameters. However, they often assume static deployment, and largely fail to withstand continual post-deployment model updates, such as fine-tuning or task-specific adaptation. In this paper, we propose ADALOC, to endow key-based model usage control with adaptability during model evolution. It strategically selects a subset of weights as an intrinsic access key, which enables all model updates to be confined to this key throughout the evolution lifecycle. ADALOC enables using the access key to restore the keyed model to the latest authorized states without redistributing the entire network (i.e., adaptation), and frees the model owner from full re-keying after each model update (i.e., lock preservation). We establish a formal foundation to underpin ADALOC, providing crucial bounds such as the errors introduced by updates restricted to the access key. Experiments on standard benchmarks, such as CIFAR-100, Caltech-256, and Flowers-102, and modern architectures, including ResNet, DenseNet, and ConvNeXt, demonstrate that ADALOC achieves high accuracy under significant updates while retaining robust protections. Specifically, authorized usages consistently achieve strong task-specific performance, while unauthorized usage accuracy drops to near-random guessing levels (e.g., 1.01% on CIFAR-100), compared to up to 87.01% without ADALOC. This shows that ADALOC can offer a practical solution for adaptive and protected DNN deployment in evolving real-world scenarios.

Paperid: 737, https://arxiv.org/pdf/2511.16316.pdf

Abstract:
Integrated sensing and communication (ISAC) will be central to 6G-enabled transportation, providing both seamless connectivity and high-precision sensing. However, this tight integration exposes attack points not encountered in pure sensing and communication systems. In this article, we identify unique ISAC-induced security challenges and opportunities in three interrelated domains: cyber-physical (where manipulation of sensors and actuators can mislead perception and control), physical-layer (where over-the-air signals are vulnerable to spoofing and jamming) and protocol (where complex cryptographic protocols cannot detect lower-layer attacks). Building on these insights, we put forward a multi-domain security vision for 6G transportation and propose an integrated security framework that unifies protection across domains.

Paperid: 738, https://arxiv.org/pdf/2511.13808.pdf

Abstract:
A classifier using byte n-grams as features is the only approach we have found fast enough to meet requirements in size (sub 2 MB), speed (multiple GB/s), and latency (sub 10 ms) for deployment in numerous malware detection scenarios. However, we've consistently found that 6-8 grams achieve the best accuracy on our production deployments but have been unable to deploy regularly updated models due to the high cost of finding the top-k most frequent n-grams over terabytes of executable programs. Because the Zipfian distribution well models the distribution of n-grams, we exploit its properties to develop a new top-k n-gram extractor that is up to $35\times$ faster than the previous best alternative. Using our new Zipf-Gramming algorithm, we are able to scale up our production training set and obtain up to 30\% improvement in AUC at detecting new malware. We show theoretically and empirically that our approach will select the top-k items with little error and the interplay between theory and engineering required to achieve these results.

Paperid: 739, https://arxiv.org/pdf/2511.07040.pdf

Abstract:
Deep neural networks have recently achieved notable progress in 3D point cloud recognition, yet their vulnerability to adversarial perturbations poses critical security challenges in practical deployments. Conventional defense mechanisms struggle to address the evolving landscape of multifaceted attack patterns. Through systematic analysis of existing defenses, we identify that their unsatisfactory performance primarily originates from an entangled feature space, where adversarial attacks can be performed easily. To this end, we present 3D-ANC, a novel approach that capitalizes on the Neural Collapse (NC) mechanism to orchestrate discriminative feature learning. In particular, NC depicts where last-layer features and classifier weights jointly evolve into a simplex equiangular tight frame (ETF) arrangement, establishing maximally separable class prototypes. However, leveraging this advantage in 3D recognition confronts two substantial challenges: (1) prevalent class imbalance in point cloud datasets, and (2) complex geometric similarities between object categories. To tackle these obstacles, our solution combines an ETF-aligned classification module with an adaptive training framework consisting of representation-balanced learning (RBL) and dynamic feature direction loss (FDL). 3D-ANC seamlessly empowers existing models to develop disentangled feature spaces despite the complexity in 3D data distribution. Comprehensive evaluations state that 3D-ANC significantly improves the robustness of models with various structures on two datasets. For instance, DGCNN's classification accuracy is elevated from 27.2% to 80.9% on ModelNet40 -- a 53.7% absolute gain that surpasses leading baselines by 34.0%.

Paperid: 740, https://arxiv.org/pdf/2510.21004.pdf

Abstract:
The rapid advancement of generative models has enabled the creation of increasingly stealthy synthetic voices, commonly referred to as audio deepfakes. A recent technique, FOICE [USENIX'24], demonstrates a particularly alarming capability: generating a victim's voice from a single facial image, without requiring any voice sample. By exploiting correlations between facial and vocal features, FOICE produces synthetic voices realistic enough to bypass industry-standard authentication systems, including WeChat Voiceprint and Microsoft Azure. This raises serious security concerns, as facial images are far easier for adversaries to obtain than voice samples, dramatically lowering the barrier to large-scale attacks. In this work, we investigate two core research questions: (RQ1) can state-of-the-art audio deepfake detectors reliably detect FOICE-generated speech under clean and noisy conditions, and (RQ2) whether fine-tuning these detectors on FOICE data improves detection without overfitting, thereby preserving robustness to unseen voice generators such as SpeechT5. Our study makes three contributions. First, we present the first systematic evaluation of FOICE detection, showing that leading detectors consistently fail under both standard and noisy conditions. Second, we introduce targeted fine-tuning strategies that capture FOICE-specific artifacts, yielding significant accuracy improvements. Third, we assess generalization after fine-tuning, revealing trade-offs between specialization to FOICE and robustness to unseen synthesis pipelines. These findings expose fundamental weaknesses in today's defenses and motivate new architectures and training protocols for next-generation audio deepfake detection.

Paperid: 741, https://arxiv.org/pdf/2510.20518.pdf

Abstract:
AI-based sensing at wireless edge devices has the potential to significantly enhance Artificial Intelligence (AI) applications, particularly for vision and perception tasks such as in autonomous driving and environmental monitoring. AI systems rely both on efficient model learning and inference. In the inference phase, features extracted from sensing data are utilized for prediction tasks (e.g., classification or regression). In edge networks, sensors and model servers are often not co-located, which requires communication of features. As sensitive personal data can be reconstructed by an adversary, transformation of the features are required to reduce the risk of privacy violations. While differential privacy mechanisms provide a means of protecting finite datasets, protection of individual features has not been addressed. In this paper, we propose a novel framework for privacy-preserving AI-based sensing, where devices apply transformations of extracted features before transmission to a model server.

Paperid: 742, https://arxiv.org/pdf/2510.09985.pdf

Abstract:
Machine learning has become a crucial part of our lives, with applications spanning nearly every aspect of our daily activities. However, using personal information in machine learning applications has sparked significant security and privacy concerns about user data. To address these challenges, different privacy-preserving machine learning (PPML) frameworks have been developed to protect sensitive information in machine learning applications. These frameworks generally attempt to balance design trade-offs such as computational efficiency, communication overhead, security guarantees, and scalability. Despite the advancements, selecting the optimal framework and parameters for specific deployment scenarios remains a complex and critical challenge for privacy and security application developers. We present Prismo, an open-source recommendation system designed to aid in selecting optimal parameters and frameworks for different PPML application scenarios. Prismo enables users to explore a comprehensive space of PPML frameworks through various properties based on user-defined objectives. It supports automated filtering of suitable candidate frameworks by considering parameters such as the number of parties in multi-party computation or federated learning and computation cost constraints in homomorphic encryption. Prismo models every use case into a Linear Integer Programming optimization problem, ensuring tailored solutions are recommended for each scenario. We evaluate Prismo's effectiveness through multiple use cases, demonstrating its ability to deliver best-fit solutions in different deployment scenarios.

Paperid: 743, https://arxiv.org/pdf/2510.07136.pdf

Abstract:
We study the problem of spectral graph clustering under edge differential privacy (DP). Specifically, we develop three mechanisms: (i) graph perturbation via randomized edge flipping combined with adjacency matrix shuffling, which enforces edge privacy while preserving key spectral properties of the graph. Importantly, shuffling considerably amplifies the guarantees: whereas flipping edges with a fixed probability alone provides only a constant epsilon edge DP guarantee as the number of nodes grows, the shuffled mechanism achieves (epsilon, delta) edge DP with parameters that tend to zero as the number of nodes increase; (ii) private graph projection with additive Gaussian noise in a lower-dimensional space to reduce dimensionality and computational complexity; and (iii) a noisy power iteration method that distributes Gaussian noise across iterations to ensure edge DP while maintaining convergence. Our analysis provides rigorous privacy guarantees and a precise characterization of the misclassification error rate. Experiments on synthetic and real-world networks validate our theoretical analysis and illustrate the practical privacy-utility trade-offs.

Paperid: 744, https://arxiv.org/pdf/2510.03996.pdf

Abstract:
The widespread adoption of Machine Learning as a Service raises critical privacy and security concerns, particularly about data confidentiality and trust in both cloud providers and the machine learning models. Homomorphic Encryption (HE) has emerged as a promising solution to this problems, allowing computations on encrypted data without decryption. Despite its potential, existing approaches to integrate HE into neural networks are often limited to specific architectures, leaving a wide gap in providing a framework for easy development of HE-friendly privacy-preserving neural network models similar to what we have in the broader field of machine learning. In this paper, we present FHEON, a configurable framework for developing privacy-preserving convolutional neural network (CNN) models for inference using HE. FHEON introduces optimized and configurable implementations of privacy-preserving CNN layers including convolutional layers, average pooling layers, ReLU activation functions, and fully connected layers. These layers are configured using parameters like input channels, output channels, kernel size, stride, and padding to support arbitrary CNN architectures. We assess the performance of FHEON using several CNN architectures, including LeNet-5, VGG-11, VGG- 16, ResNet-20, and ResNet-34. FHEON maintains encrypted-domain accuracies within +/- 1% of their plaintext counterparts for ResNet-20 and LeNet-5 models. Notably, on a consumer-grade CPU, the models build on FHEON achieved 98.5% accuracy with a latency of 13 seconds on MNIST using LeNet-5, and 92.2% accuracy with a latency of 403 seconds on CIFAR-10 using ResNet-20. Additionally, FHEON operates within a practical memory budget requiring not more than 42.3 GB for VGG-16.

Paperid: 745, https://arxiv.org/pdf/2510.03995.pdf

Abstract:
Deep learning has become a cornerstone of modern machine learning. It relies heavily on vast datasets and significant computational resources for high performance. This data often contains sensitive information, making privacy a major concern in deep learning. Spiking Neural Networks (SNNs) have emerged as an energy-efficient alternative to conventional deep learning approaches. Nevertheless, SNNs still depend on large volumes of data, inheriting all the privacy challenges of deep learning. Homomorphic encryption addresses this challenge by allowing computations to be performed on encrypted data, ensuring data confidentiality throughout the entire processing pipeline. In this paper, we introduce PRIVSPIKE, a privacy-preserving inference framework for SNNs using the CKKS homomorphic encryption scheme. PRIVSPIKE supports arbitrary depth SNNs and introduces two key algorithms for evaluating the Leaky Integrate-and-Fire activation function: (1) a polynomial approximation algorithm designed for high-performance SNN inference, and (2) a novel scheme-switching algorithm that optimizes precision at a higher computational cost. We evaluate PRIVSPIKE on MNIST, CIFAR-10, Neuromorphic MNIST, and CIFAR-10 DVS using models from LeNet-5 and ResNet-19 architectures, achieving encrypted inference accuracies of 98.10%, 79.3%, 98.1%, and 66.0%, respectively. On a consumer-grade CPU, SNN LeNet-5 models achieved inference times of 28 seconds on MNIST and 212 seconds on Neuromorphic MNIST. For SNN ResNet-19 models, inference took 784 seconds on CIFAR-10 and 1846 seconds on CIFAR-10 DVS. These results establish PRIVSPIKE as a viable and efficient solution for secure SNN inference, bridging the gap between energy-efficient deep neural networks and strong cryptographic privacy guarantees while outperforming prior encrypted SNN solutions.

Paperid: 746, https://arxiv.org/pdf/2510.01670.pdf

Abstract:
Computer-Use Agents (CUAs) are an increasingly deployed class of agents that take actions on GUIs to accomplish user goals. In this paper, we show that CUAs consistently exhibit Blind Goal-Directedness (BGD): a bias to pursue goals regardless of feasibility, safety, reliability, or context. We characterize three prevalent patterns of BGD: (i) lack of contextual reasoning, (ii) assumptions and decisions under ambiguity, and (iii) contradictory or infeasible goals. We develop BLIND-ACT, a benchmark of 90 tasks capturing these three patterns. Built on OSWorld, BLIND-ACT provides realistic environments and employs LLM-based judges to evaluate agent behavior, achieving 93.75% agreement with human annotations. We use BLIND-ACT to evaluate nine frontier models, including Claude Sonnet and Opus 4, Computer-Use-Preview, and GPT-5, observing high average BGD rates (80.8%) across them. We show that BGD exposes subtle risks that arise even when inputs are not directly harmful. While prompting-based interventions lower BGD levels, substantial risk persists, highlighting the need for stronger training- or inference-time interventions. Qualitative analysis reveals observed failure modes: execution-first bias (focusing on how to act over whether to act), thought-action disconnect (execution diverging from reasoning), and request-primacy (justifying actions due to user request). Identifying BGD and introducing BLIND-ACT establishes a foundation for future research on studying and mitigating this fundamental risk and ensuring safe CUA deployment.

Paperid: 747, https://arxiv.org/pdf/2509.21475.pdf

Abstract:
Decentralization has a geographic dimension that conventional metrics such as stake distribution overlook. Where validators run affects resilience to regional shocks (outages, disasters, government intervention) and fairness in reward access. Yet in permissionless systems, locations cannot be mandated, but they emerge from incentives. Today, Ethereum's validators cluster along the Atlantic (EU and U.S. East Coast), where latency is structurally favorable. This raises a key question: when some regions already enjoy latency advantages, how does protocol design shape validator incentives and the geography of (de)centralization? We develop a latency-calibrated agent-based model and compare two Ethereum block-building paradigms: a Single-Source Paradigm (SSP), akin to MEV-Boost, where proposers fetch full blocks from a relay that also propagates them; and a Multi-Source Paradigm (MSP), where proposers aggregate value from multiple sources and broadcast the block themselves. Simulations show that SSP concentrates around relay placement but more slowly, since proximity mainly affects propagation, and the marginal value of time is relatively uniform across regions. MSP centralizes faster: aggregating across sources makes marginal value location-dependent, amplifying payoff dispersion and migration toward latency minima. Source placement and consensus settings can dampen or intensify these effects, though once validators are already clustered, the impact of source placement on decentralization is marginal. In most cases, North America consistently emerges as the focal hub. These findings show that protocol design materially shapes validator geography and offer levers for promoting geographical decentralization.

Paperid: 748, https://arxiv.org/pdf/2509.21400.pdf

Abstract:
As the capabilities of Vision Language Models (VLMs) continue to improve, they are increasingly targeted by jailbreak attacks. Existing defense methods face two major limitations: (1) they struggle to ensure safety without compromising the model's utility; and (2) many defense mechanisms significantly reduce the model's inference efficiency. To address these challenges, we propose SafeSteer, a lightweight, inference-time steering framework that effectively defends against diverse jailbreak attacks without modifying model weights. At the core of SafeSteer is the innovative use of Singular Value Decomposition to construct a low-dimensional "safety subspace." By projecting and reconstructing the raw steering vector into this subspace during inference, SafeSteer adaptively removes harmful generation signals while preserving the model's ability to handle benign inputs. The entire process is executed in a single inference pass, introducing negligible overhead. Extensive experiments show that SafeSteer reduces the attack success rate by over 60% and improves accuracy on normal tasks by 1-2%, without introducing significant inference latency. These results demonstrate that robust and practical jailbreak defense can be achieved through simple, efficient inference-time control.

Paperid: 749, https://arxiv.org/pdf/2509.15213.pdf

Abstract:
Extended reality (XR) applications increasingly integrate Large Language Models (LLMs) to enhance user experience, scene understanding, and even generate executable XR content, and are often called "AI glasses". Despite these potential benefits, the integrated XR-LLM pipeline makes XR applications vulnerable to new forms of attacks. In this paper, we analyze LLM-Integated XR systems in the literature and in practice and categorize them along different dimensions from a systems perspective. Building on this categorization, we identify a common threat model and demonstrate a series of proof-of-concept attacks on multiple XR platforms that employ various LLM models (Meta Quest 3, Meta Ray-Ban, Android, and Microsoft HoloLens 2 running Llama and GPT models). Although these platforms each implement LLM integration differently, they share vulnerabilities where an attacker can modify the public context surrounding a legitimate LLM query, resulting in erroneous visual or auditory feedback to users, thus compromising their safety or privacy, sowing confusion, or other harmful effects. To defend against these threats, we discuss mitigation strategies and best practices for developers, including an initial defense prototype, and call on the community to develop new protection mechanisms to mitigate these risks.

Paperid: 750, https://arxiv.org/pdf/2509.13625.pdf

Abstract:
Large language models (LLMs) have significantly transformed natural language understanding and generation, but they raise privacy concerns due to potential exposure of sensitive information. Studies have highlighted the risk of information leakage, where adversaries can extract sensitive information embedded in the prompts. In this work, we introduce a novel private prediction framework for generating high-quality synthetic text with strong privacy guarantees. Our approach leverages the Differential Privacy (DP) framework to ensure worst-case theoretical bounds on information leakage without requiring any fine-tuning of the underlying models. The proposed method performs inference on private records and aggregates the resulting per-token output distributions. This enables the generation of longer and coherent synthetic text while maintaining privacy guarantees. Additionally, we propose a simple blending operation that combines private and public inference to further enhance utility. Empirical evaluations demonstrate that our approach outperforms previous state-of-the-art methods on in-context-learning (ICL) tasks, making it a promising direction for privacy-preserving text generation while maintaining high utility.

Paperid: 751, https://arxiv.org/pdf/2509.09630.pdf

Abstract:
Widespread reuse of open-source code in smart contract development boosts programming efficiency but significantly amplifies bug propagation across contracts, while dedicated methods for detecting similar smart contract functions remain very limited. Conventional abstract-syntax-tree (AST) based methods for smart contract similarity detection face challenges in handling intricate tree structures, which impedes detailed semantic comparison of code. Recent deep-learning based approaches tend to overlook code syntax and detection interpretability, resulting in suboptimal performance. To fill this research gap, we introduce SmartDetector, a novel approach for computing similarity between smart contract functions, explainable at the fine-grained statement level. Technically, SmartDetector decomposes the AST of a smart contract function into a series of smaller statement trees, each reflecting a structural element of the source code. Then, SmartDetector uses a classifier to compute the similarity score of two functions by comparing each pair of their statement trees. To address the infinite hyperparameter space of the classifier, we mathematically derive a cosine-wise diffusion process to efficiently search optimal hyperparameters. Extensive experiments conducted on three large real-world datasets demonstrate that SmartDetector outperforms current state-of-the-art methods by an average improvement of 14.01% in F1-score, achieving an overall average F1-score of 95.88%.

Paperid: 752, https://arxiv.org/pdf/2509.09112.pdf

Abstract:
Large Language Model (LLM) watermarking embeds detectable signals into generated text for copyright protection, misuse prevention, and content detection. While prior studies evaluate robustness using watermark removal attacks, these methods are often suboptimal, creating the misconception that effective removal requires large perturbations or powerful adversaries. To bridge the gap, we first formalize the system model for LLM watermark, and characterize two realistic threat models constrained on limited access to the watermark detector. We then analyze how different types of perturbation vary in their attack range, i.e., the number of tokens they can affect with a single edit. We observe that character-level perturbations (e.g., typos, swaps, deletions, homoglyphs) can influence multiple tokens simultaneously by disrupting the tokenization process. We demonstrate that character-level perturbations are significantly more effective for watermark removal under the most restrictive threat model. We further propose guided removal attacks based on the Genetic Algorithm (GA) that uses a reference detector for optimization. Under a practical threat model with limited black-box queries to the watermark detector, our method demonstrates strong removal performance. Experiments confirm the superiority of character-level perturbations and the effectiveness of the GA in removing watermarks under realistic constraints. Additionally, we argue there is an adversarial dilemma when considering potential defenses: any fixed defense can be bypassed by a suitable perturbation strategy. Motivated by this principle, we propose an adaptive compound character-level attack. Experimental results show that this approach can effectively defeat the defenses. Our findings highlight significant vulnerabilities in existing LLM watermark schemes and underline the urgency for the development of new robust mechanisms.

Paperid: 753, https://arxiv.org/pdf/2509.05376.pdf

Abstract:
Eye-tracking technology can aid in understanding neurodevelopmental disorders and tracing a person's identity. However, this technology poses a significant risk to privacy, as it captures sensitive information about individuals and increases the likelihood that data can be traced back to them. This paper proposes a human-centered framework designed to prevent identity backtracking while preserving the pedagogical benefits of AI-powered eye tracking in interactive learning environments. We explore how real-time data anonymization, ethical design principles, and regulatory compliance (such as GDPR) can be integrated to build trust and transparency. We first demonstrate the potential for backtracking student IDs and diagnoses in various scenarios using serious game-based eye-tracking data. We then provide a two-stage privacy-preserving framework that prevents participants from being tracked while still enabling diagnostic classification. The first phase covers four scenarios: I) Predicting disorder diagnoses based on different game levels. II) Predicting student IDs based on different game levels. III) Predicting student IDs based on randomized data. IV) Utilizing K-Means for out-of-sample data. In the second phase, we present a two-stage framework that preserves privacy. We also employ Federated Learning (FL) across multiple clients, incorporating a secure identity management system with dummy IDs and administrator-only access controls. In the first phase, the proposed framework achieved 99.3% accuracy for scenario 1, 63% accuracy for scenario 2, and 99.7% accuracy for scenario 3, successfully identifying and assigning a new student ID in scenario 4. In phase 2, we effectively prevented backtracking and established a secure identity management system with dummy IDs and administrator-only access controls, achieving an overall accuracy of 99.40%.

Paperid: 754, https://arxiv.org/pdf/2509.05259.pdf

Abstract:
Automatic Generation Control (AGC) is essential for power grid stability but remains vulnerable to stealthy cyberattacks, such as False Data Injection Attacks (FDIAs), which can disturb the system's stability while evading traditional detection methods. Unlike previous works that relied on blackbox approaches, this work proposes Kolmogorov-Arnold Networks (KAN) as an interpretable and accurate method for FDIA detection in AGC systems, considering the system nonlinearities. KAN models include a method for extracting symbolic equations, and are thus able to provide more interpretability than the majority of machine learning models. The proposed KAN is trained offline to learn the complex nonlinear relationships between the AGC measurements under different operating scenarios. After training, symbolic formulas that describe the trained model's behavior can be extracted and leveraged, greatly enhancing interpretability. Our findings confirm that the proposed KAN model achieves FDIA detection rates of up to 95.97% and 95.9% for the initial model and the symbolic formula, respectively, with a low false alarm rate, offering a reliable approach to enhancing AGC cybersecurity.

Paperid: 755, https://arxiv.org/pdf/2509.04260.pdf

Abstract:
In the rapidly evolving software development landscape, Python stands out for its simplicity, versatility, and extensive ecosystem. Python packages, as units of organization, reusability, and distribution, have become a pressing concern, highlighted by the considerable number of vulnerability reports. As a scripting language, Python often cooperates with other languages for performance or interoperability. This adds complexity to the vulnerabilities inherent to Python packages, and the effectiveness of current vulnerability detection tools remains underexplored. This paper addresses these gaps by introducing PyVul, the first comprehensive benchmark suite of Python-package vulnerabilities. PyVul includes 1,157 publicly reported, developer-verified vulnerabilities, each linked to its affected packages. To accommodate diverse detection techniques, it provides annotations at both commit and function levels. An LLM-assisted data cleansing method is incorporated to improve label accuracy, achieving 100% commit-level and 94% function-level accuracy, establishing PyVul as the most precise large-scale Python vulnerability benchmark. We further carry out a distribution analysis of PyVul, which demonstrates that vulnerabilities in Python packages involve multiple programming languages and exhibit a wide variety of types. Moreover, our analysis reveals that multi-lingual Python packages are potentially more susceptible to vulnerabilities. Evaluation of state-of-the-art detectors using this benchmark reveals a significant discrepancy between the capabilities of existing tools and the demands of effectively identifying real-world security issues in Python packages. Additionally, we conduct an empirical review of the top-ranked CWEs observed in Python packages, to diagnose the fine-grained limitations of current detection tools and highlight the necessity for future advancements in the field.

Paperid: 756, https://arxiv.org/pdf/2508.19620.pdf

Abstract:
Extending recommender systems to federated learning (FL) frameworks to protect the privacy of users or platforms while making recommendations has recently gained widespread attention in academia. This is due to the natural coupling of recommender systems and federated learning architectures: the data originates from distributed clients (mostly mobile devices held by users), which are highly related to privacy. In a centralized recommender system (CenRec), the central server collects clients' data, trains the model, and provides the service. Whereas in federated recommender systems (FedRec), the step of data collecting is omitted, and the step of model training is offloaded to each client. The server only aggregates the model and other knowledge, thus avoiding client privacy leakage. Some surveys of federated recommender systems discuss and analyze related work from the perspective of designing FL systems. However, their utility drops by ignoring specific recommendation scenarios' unique characteristics and practical challenges. For example, the statistical heterogeneity issue in cross-domain FedRec originates from the label drift of the data held by different platforms, which is mainly caused by the recommender itself, but not the federated architecture. Therefore, it should focus more on solving specific problems in real-world recommendation scenarios to encourage the deployment FedRec. To this end, this review comprehensively analyzes the coupling of recommender systems and federated learning from the perspective of recommendation researchers and practitioners. We establish a clear link between recommendation scenarios and FL frameworks, systematically analyzing scenario-specific approaches, practical challenges, and potential opportunities. We aim to develop guidance for the real-world deployment of FedRec, bridging the gap between existing research and applications.

Paperid: 757, https://arxiv.org/pdf/2507.22239.pdf

Abstract:
The increasing digitization of smart grids has improved operational efficiency but also introduced new cybersecurity vulnerabilities, such as False Data Injection Attacks (FDIAs) targeting Automatic Generation Control (AGC) systems. While machine learning (ML) and deep learning (DL) models have shown promise in detecting such attacks, their opaque decision-making limits operator trust and real-world applicability. This paper proposes a hybrid framework that integrates lightweight ML-based attack detection with natural language explanations generated by Large Language Models (LLMs). Classifiers such as LightGBM achieve up to 95.13% attack detection accuracy with only 0.004 s inference latency. Upon detecting a cyberattack, the system invokes LLMs, including GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o mini, to generate human-readable explanation of the event. Evaluated on 100 test samples, GPT-4o mini with 20-shot prompting achieved 93% accuracy in identifying the attack target, a mean absolute error of 0.075 pu in estimating attack magnitude, and 2.19 seconds mean absolute error (MAE) in estimating attack onset. These results demonstrate that the proposed framework effectively balances real-time detection with interpretable, high-fidelity explanations, addressing a critical need for actionable AI in smart grid cybersecurity.

Paperid: 758, https://arxiv.org/pdf/2507.22226.pdf

Abstract:
The increasing integration of IoT-connected devices in smart grids has introduced new vulnerabilities at the distribution level. Of particular concern is the potential for cyberattacks that exploit high-wattage IoT devices, such as EV chargers, to manipulate local demand and destabilize the grid. While previous studies have primarily focused on such attacks at the transmission level, this paper investigates their feasibility and impact at the distribution level. We examine how cyberattackers can target voltage-sensitive nodes, especially those exposed by the presence of high-consumption devices, to cause voltage deviation and service disruption. Our analysis demonstrates that conventional grid protections are insufficient against these intelligent, localized attacks. To address this, we propose resilience strategies using distributed generation (DGs), exploring their role in preemptive planning. This research highlights the urgent need for distribution-level cyber resilience planning in smart grids.

Paperid: 759, https://arxiv.org/pdf/2507.16241.pdf

Abstract:
This paper introduces eX-NIDS, a framework designed to enhance interpretability in flow-based Network Intrusion Detection Systems (NIDS) by leveraging Large Language Models (LLMs). In our proposed framework, flows labelled as malicious by NIDS are initially processed through a module called the Prompt Augmenter. This module extracts contextual information and Cyber Threat Intelligence (CTI)-related knowledge from these flows. This enriched, context-specific data is then integrated with an input prompt for an LLM, enabling it to generate detailed explanations and interpretations of why the flow was identified as malicious by NIDS. We compare the generated interpretations against a Basic-Prompt Explainer baseline, which does not incorporate any contextual information into the LLM's input prompt. Our framework is quantitatively evaluated using the Llama 3 and GPT-4 models, employing a novel evaluation method tailored for natural language explanations, focusing on their correctness and consistency. The results demonstrate that augmented LLMs can produce accurate and consistent explanations, serving as valuable complementary tools in NIDS to explain the classification of malicious flows. The use of augmented prompts enhances performance by over 20% compared to the Basic-Prompt Explainer.

Paperid: 760, https://arxiv.org/pdf/2507.09074.pdf

Abstract:
This paper presents a novel method of executable steganography using the alpha transparency layer of ICO image files to embed and deliver self-decompressing JavaScript payloads within web browsers. By targeting the least significant bit (LSB) of non-transparent alpha layer image values, the proposed method successfully conceals compressed JavaScript code inside a favicon image without affecting visual fidelity. Global web traffic loads 294 billion favicons daily and consume 0.9 petabytes of network bandwidth. A proof-of-concept implementation demonstrates that a 64x64 ICO image can embed up to 512 bytes uncompressed, or 0.8 kilobyte when using lightweight two-fold compression. On page load, a browser fetches the favicon as part of standard behavior, allowing an embedded loader script to extract and execute the payload entirely in memory using native JavaScript APIs and canvas pixel access. This creates a two-stage covert channel requiring no additional network or user requests. Testing across multiple browsers in both desktop and mobile environments confirms successful and silent execution of the embedded script. We evaluate the threat model, relate it to polymorphic phishing attacks that evade favicon-based detection, and analyze evasion of content security policies and antivirus scanners. We map nine example MITRE ATT&CK Framework objectives to single line JavaScript to execute arbitrarily in ICO files. Existing steganalysis and sanitization defenses are discussed, highlighting limitations in detecting or neutralizing alpha-channel exploits. The results demonstrate a stealthy and reusable attack surface that blurs traditional boundaries between static images and executable content. Because modern browsers report silent errors when developers specifically fail to load ICO files, this attack surface offers an interesting example of required web behaviors that in turn compromise security.

Paperid: 761, https://arxiv.org/pdf/2507.07735.pdf

Abstract:
Jailbreak attacks reveal critical vulnerabilities in Large Language Models (LLMs) by causing them to generate harmful or unethical content. Evaluating these threats is particularly challenging due to the evolving nature of LLMs and the sophistication required in effectively probing their vulnerabilities. Current benchmarks and evaluation methods struggle to fully address these challenges, leaving gaps in the assessment of LLM vulnerabilities. In this paper, we review existing jailbreak evaluation practices and identify three assumed desiderata for an effective jailbreak evaluation protocol. To address these challenges, we introduce GuardVal, a new evaluation protocol that dynamically generates and refines jailbreak prompts based on the defender LLM's state, providing a more accurate assessment of defender LLMs' capacity to handle safety-critical situations. Moreover, we propose a new optimization method that prevents stagnation during prompt refinement, ensuring the generation of increasingly effective jailbreak prompts that expose deeper weaknesses in the defender LLMs. We apply this protocol to a diverse set of models, from Mistral-7b to GPT-4, across 10 safety domains. Our findings highlight distinct behavioral patterns among the models, offering a comprehensive view of their robustness. Furthermore, our evaluation process deepens the understanding of LLM behavior, leading to insights that can inform future research and drive the development of more secure models.

Paperid: 762, https://arxiv.org/pdf/2512.21367.pdf

Abstract:
The rapid proliferation of satellite constellations, particularly in Low Earth Orbit (LEO), has fundamentally altered the global space infrastructure, shifting the risk landscape from purely kinetic collisions to complex cyber-physical threats. While traditional safety frameworks focus on debris mitigation, ground-based adversaries increasingly exploit radio-frequency links, supply chain vulnerabilities, and software update pathways to degrade space assets. This paper presents a comparative analysis of satellite cybersecurity across LEO, Medium Earth Orbit (MEO), and Geostationary Earth Orbit (GEO) regimes. By synthesizing data from 60 publicly documented security incidents with key vulnerability proxies--including Telemetry, Tracking, and Command (TT&C) anomalies, encryption weaknesses, and environmental stressors--we characterize how orbital altitude dictates attack feasibility and impact. Our evaluation reveals distinct threat profiles: GEO systems are predominantly targeted via high-frequency uplink exposure, whereas LEO constellations face unique risks stemming from limited power budgets, hardware constraints, and susceptibility to thermal and radiation-induced faults. We further bridge the gap between security and sustainability, arguing that unmitigated cyber vulnerabilities accelerate hardware obsolescence and debris accumulation, undermining efforts toward carbon-neutral space operations. The results demonstrate that weak encryption and command path irregularities are the most consistent predictors of adversarial success across all orbits.

Paperid: 763, https://arxiv.org/pdf/2512.15468.pdf

Abstract:
The success of large language models for code relies on vast amounts of code data, including public open-source repositories, such as GitHub, and private, confidential code from companies. This raises concerns about intellectual property compliance and the potential unauthorized use of license-restricted code. While membership inference (MI) techniques have been proposed to detect such unauthorized usage, their effectiveness can be undermined by semantically equivalent code transformation techniques, which modify code syntax while preserving semantic. In this work, we systematically investigate whether semantically equivalent code transformation rules might be leveraged to evade MI detection. The results reveal that model accuracy drops by only 1.5% in the worst case for each rule, demonstrating that transformed datasets can effectively serve as substitutes for fine-tuning. Additionally, we find that one of the rules (RenameVariable) reduces MI success by 10.19%, highlighting its potential to obscure the presence of restricted code. To validate these findings, we conduct a causal analysis confirming that variable renaming has the strongest causal effect in disrupting MI detection. Notably, we find that combining multiple transformations does not further reduce MI effectiveness. Our results expose a critical loophole in license compliance enforcement for training large language models for code, showing that MI detection can be substantially weakened by transformation-based obfuscation techniques.

Paperid: 764, https://arxiv.org/pdf/2512.11431.pdf

Abstract:
The Domain Name System Security Extensions (DNSSEC) are critical for preventing DNS spoofing, yet its specifications contain ambiguities and vulnerabilities that elude traditional "break-and-fix" approaches. A holistic, foundational security analysis of the protocol has thus remained an open problem. This paper introduces DNSSECVerif, the first framework for comprehensive, automated formal security analysis of the DNSSEC protocol suite. Built on the SAPIC+ symbolic verifier, our high-fidelity model captures protocol-level interactions, including cryptographic operations and stateful caching with fine-grained concurrency control. Using DNSSECVerif, we formally prove four of DNSSEC's core security guarantees and uncover critical ambiguities in the standards--notably, the insecure coexistence of NSEC and NSEC3. Our model also automatically rediscovers three classes of known attacks, demonstrating fundamental weaknesses in the protocol design. To bridge the model-to-reality gap, we validate our findings through targeted testing of mainstream DNS software and a large-scale measurement study of over 2.2 million open resolvers, confirming the real-world impact of these flaws. Our work provides crucial, evidence-based recommendations for hardening DNSSEC specifications and implementations.

Paperid: 765, https://arxiv.org/pdf/2512.09882.pdf

Abstract:
We present the first comprehensive evaluation of AI agents against human cybersecurity professionals in a live enterprise environment. We evaluate ten cybersecurity professionals alongside six existing AI agents and ARTEMIS, our new agent scaffold, on a large university network consisting of ~8,000 hosts across 12 subnets. ARTEMIS is a multi-agent framework featuring dynamic prompt generation, arbitrary sub-agents, and automatic vulnerability triaging. In our comparative study, ARTEMIS placed second overall, discovering 9 valid vulnerabilities with an 82% valid submission rate and outperforming 9 of 10 human participants. While existing scaffolds such as Codex and CyAgent underperformed relative to most human participants, ARTEMIS demonstrated technical sophistication and submission quality comparable to the strongest participants. We observe that AI agents offer advantages in systematic enumeration, parallel exploitation, and cost -- certain ARTEMIS variants cost $18/hour versus $60/hour for professional penetration testers. We also identify key capability gaps: AI agents exhibit higher false-positive rates and struggle with GUI-based tasks.

Paperid: 766, https://arxiv.org/pdf/2512.07814.pdf

Abstract:
Large language models for code (LLM4Code) have greatly improved developer productivity but also raise privacy concerns due to their reliance on open-source repositories containing abundant personally identifiable information (PII). Prior work shows that commercial models can reproduce sensitive PII, yet existing studies largely treat PII as a single category and overlook the heterogeneous risks among different types. We investigate whether distinct PII types vary in their likelihood of being learned and leaked by LLM4Code, and whether this relationship is causal. Our methodology includes building a dataset with diverse PII types, fine-tuning representative models of different scales, computing training dynamics on real PII data, and formulating a structural causal model to estimate the causal effect of learnability on leakage. Results show that leakage risks differ substantially across PII types and correlate with their training dynamics: easy-to-learn instances such as IP addresses exhibit higher leakage, while harder types such as keys and passwords leak less frequently. Ambiguous types show mixed behaviors. This work provides the first causal evidence that leakage risks are type-dependent and offers guidance for developing type-aware and learnability-aware defenses for LLM4Code.

Paperid: 767, https://arxiv.org/pdf/2512.02600.pdf

Abstract:
Recent years have shown increased cyber attacks targeting less secure elements in the software supply chain and causing significant damage to businesses and organizations. The US and EU governments and industry are equally interested in enhancing software security, including supply chain and vulnerability response. On June 26, 2025, researchers from the NSF-supported Secure Software Supply Chain Center (S3C2) and the Software Innovation Campus Paderborn (SICP) conducted a Vulnerability Response Summit with a diverse set of 9 practitioners from 9 companies. The goal of the Summit is to enable sharing between industry practitioners having practical experiences and challenges with software supply chain security, including vulnerability response, and helping to form new collaborations. We conducted five panel discussions based on open-ended questions regarding experiences with vulnerability reports, tools used for vulnerability discovery and management, organizational structures to report vulnerability response and management, preparedness and implementations for Cyber Resilience Act1 (CRA) and NIS22, and bug bounties. The open discussions enabled mutual sharing and shed light on common challenges that industry practitioners with practical experience face when securing their software supply chain, including vulnerability response. In this paper, we provide a summary of the Summit. Full panel questions can be found in the appendix.

Paperid: 768, https://arxiv.org/pdf/2511.21020.pdf

Abstract:
Location-Based Services (LBSs) offer significant convenience to mobile users but pose significant privacy risks, as attackers can infer sensitive personal information through spatiotemporal correlations in user trajectories. Since users' sensitivity to location data varies based on factors such as stay duration, access frequency, and semantic sensitivity, implementing personalized privacy protection is imperative. This paper proposes a Personalized Trajectory Privacy Protection Mechanism (PTPPM) to address these challenges. Our approach begins by modeling an attacker's knowledge of a user's trajectory spatiotemporal correlations, which enables the attacker to identify possible location sets and disregard low-probability location sets. To combat this, we integrate geo-indistinguishability with distortion privacy, allowing users to customize their privacy preferences through a configurable privacy budget and expected inference error bound. This approach provides the theoretical framework for constructing a Protection Location Set (PLS) that obscures users' actual locations. Additionally, we introduce a Personalized Privacy Budget Allocation Algorithm (PPBA), which assesses the sensitivity of locations based on trajectory data and allocates privacy budgets accordingly. This algorithm considers factors such as location semantics and road network constraints. Furthermore, we propose a Permute-and-Flip mechanism that generates perturbed locations while minimizing perturbation distance, thus balancing privacy protection and Quality of Service (QoS). Simulation results demonstrate that our mechanism outperforms existing benchmarks, offering superior privacy protection while maintaining user QoS requirements.

Paperid: 769, https://arxiv.org/pdf/2511.18581.pdf

Abstract:
Many recent studies showed that LLMs are vulnerable to jailbreak attacks, where an attacker can perturb the input of an LLM to induce it to generate an output for a harmful question. In general, existing jailbreak techniques either optimize a semantic template intended to induce the LLM to produce harmful outputs or optimize a suffix that leads the LLM to initiate its response with specific tokens (e.g., "Sure"). In this work, we introduce TASO (Template and Suffix Optimization), a novel jailbreak method that optimizes both a template and a suffix in an alternating manner. Our insight is that suffix optimization and template optimization are complementary to each other: suffix optimization can effectively control the first few output tokens but cannot control the overall quality of the output, while template optimization provides guidance for the entire output but cannot effectively control the initial tokens, which significantly impact subsequent responses. Thus, they can be combined to improve the attack's effectiveness. We evaluate the effectiveness of TASO on benchmark datasets (including HarmBench and AdvBench) on 24 leading LLMs (including models from the Llama family, OpenAI, and DeepSeek). The results demonstrate that TASO can effectively jailbreak existing LLMs. We hope our work can inspire future studies in exploring this direction.

Paperid: 770, https://arxiv.org/pdf/2511.18568.pdf

Abstract:
Cellular networks have become foundational to modern communication, supporting a broad range of applications, from civilian use to enterprise systems and military tactical networks. The advent of fifth-generation and beyond cellular networks (B5G) introduces emerging compute capabilities into the Radio Access Network (RAN), transforming it from a traditionally closed, vendor-locked infrastructure into an open and programmable ecosystem. This evolution, exemplified by Open-RAN (O-RAN), enables the deployment of control-plane applications from diverse sources, which can dynamically influence user-plane traffic in response to real-time events. As cellular infrastructures become more disaggregated and software-driven, security becomes an increasingly critical concern. Zero-Trust Architecture (ZTA) has emerged as a promising security paradigm that discards implicit trust assumptions by acknowledging that threats may arise from both external and internal sources. ZTA mandates comprehensive and fine-grained security mechanisms across both control and user planes to contain adversarial movements and enhance breach detection and attack response actions. In this paper, we explore the adoption of ZTA in the context of 5G and beyond, with a particular focus on O-RAN as an architectural enabler. We analyze how ZTA principles align with the architectural and operational characteristics of O-RAN, and identify key challenges and opportunities for embedding zero-trust mechanisms within O-RAN-based cellular networks.

Paperid: 771, https://arxiv.org/pdf/2511.17671.pdf

Abstract:
Language agents are rapidly expanding from single-user assistants to multi-user collaborators in shared workspaces and groups. However, today's language models lack a mechanism for isolating user interactions and concurrent tasks, creating a new attack vector inherent to this new setting: cross-user poisoning (CUP). In a CUP attack, an adversary injects ordinary-looking messages that poison the persistent, shared state, which later triggers the agent to execute unintended, attacker-specified actions on behalf of benign users. We validate CUP on real systems, successfully attacking popular multi-user agents. To study the phenomenon systematically, we present MURMUR, a framework that composes single-user tasks into concurrent, group-based scenarios using an LLM to generate realistic, history-aware user interactions. We observe that CUP attacks succeed at high rates and their effects persist across multiple tasks, thus posing fundamental risks to multi-user LLM deployments. Finally, we introduce a first-step defense with task-based clustering to mitigate this new class of vulnerability

Paperid: 772, https://arxiv.org/pdf/2511.14301.pdf

Abstract:
Transformer models are foundational to natural language processing (NLP) applications, yet remain vulnerable to backdoor attacks introduced through poisoned data, which implant hidden behaviors during training. To strengthen the ability to prevent such compromises, recent research has focused on designing increasingly stealthy attacks to stress-test existing defenses, pairing backdoor behaviors with stylized artifact or token-level perturbation triggers. However, this trend diverts attention from the harder and more realistic case: making the model respond to semantic triggers such as specific names or entities, where a successful backdoor could manipulate outputs tied to real people or events in deployed systems. Motivated by this growing disconnect, we introduce SteganoBackdoor, bringing stealth techniques back into line with practical threat models. Leveraging innocuous properties from natural-language steganography, SteganoBackdoor applies a gradient-guided data optimization process to transform semantic trigger seeds into steganographic carriers that embed a high backdoor payload, remain fluent, and exhibit no representational resemblance to the trigger. Across diverse experimental settings, SteganoBackdoor achieves over 99% attack success at an order-of-magnitude lower data-poisoning rate than prior approaches while maintaining unparalleled evasion against a comprehensive suite of data-level defenses. By revealing this practical and covert attack, SteganoBackdoor highlights an urgent blind spot in current defenses and demands immediate attention to adversarial data defenses and real-world threat modeling.

Paperid: 773, https://arxiv.org/pdf/2511.05919.pdf

Abstract:
LLMs are now an integral part of information retrieval. As such, their role as question answering chatbots raises significant concerns due to their shown vulnerability to adversarial man-in-the-middle (MitM) attacks. Here, we propose the first principled attack evaluation on LLM factual memory under prompt injection via Xmera, our novel, theory-grounded MitM framework. By perturbing the input given to "victim" LLMs in three closed-book and fact-based QA settings, we undermine the correctness of the responses and assess the uncertainty of their generation process. Surprisingly, trivial instruction-based attacks report the highest success rate (up to ~85.3%) while simultaneously having a high uncertainty for incorrectly answered questions. To provide a simple defense mechanism against Xmera, we train Random Forest classifiers on the response uncertainty levels to distinguish between attacked and unattacked queries (average AUC of up to ~96%). We believe that signaling users to be cautious about the answers they receive from black-box and potentially corrupt LLMs is a first checkpoint toward user cyberspace safety.

Paperid: 774, https://arxiv.org/pdf/2511.04215.pdf

Abstract:
Large language models (LLMs) increasingly employ guardrails to enforce ethical, legal, and application-specific constraints on their outputs. While effective at mitigating harmful responses, these guardrails introduce a new class of vulnerabilities by exposing observable decision patterns. In this work, we present the first study of black-box LLM guardrail reverse-engineering attacks. We propose Guardrail Reverse-engineering Attack (GRA), a reinforcement learning-based framework that leverages genetic algorithm-driven data augmentation to approximate the decision-making policy of victim guardrails. By iteratively collecting input-output pairs, prioritizing divergence cases, and applying targeted mutations and crossovers, our method incrementally converges toward a high-fidelity surrogate of the victim guardrail. We evaluate GRA on three widely deployed commercial systems, namely ChatGPT, DeepSeek, and Qwen3, and demonstrate that it achieves an rule matching rate exceeding 0.92 while requiring less than $85 in API costs. These findings underscore the practical feasibility of guardrail extraction and highlight significant security risks for current LLM safety mechanisms. Our findings expose critical vulnerabilities in current guardrail designs and highlight the urgent need for more robust defense mechanisms in LLM deployment.

Paperid: 775, https://arxiv.org/pdf/2510.24920.pdf

Abstract:
Software supply chains, while providing immense economic and software development value, are only as strong as their weakest link. Over the past several years, there has been an exponential increase in cyberattacks specifically targeting vulnerable links in critical software supply chains. These attacks disrupt the day-to-day functioning and threaten the security of nearly everyone on the internet, from billion-dollar companies and government agencies to hobbyist open-source developers. The ever-evolving threat of software supply chain attacks has garnered interest from both the software industry and US government in improving software supply chain security. On Thursday, March 6th, 2025, four researchers from the NSF-backed Secure Software Supply Chain Center (S3C2) conducted a Secure Software Supply Chain Summit with a diverse set of 18 practitioners from 17 organizations. The goals of the Summit were: (1) to enable sharing between participants from different industries regarding practical experiences and challenges with software supply chain security; (2) to help form new collaborations; and (3) to learn about the challenges facing participants to inform our future research directions. The summit consisted of discussions of six topics relevant to the government agencies represented, including software bill of materials (SBOMs); compliance; malicious commits; build infrastructure; culture; and large language models (LLMs) and security. For each topic of discussion, we presented a list of questions to participants to spark conversation. In this report, we provide a summary of the summit. The open questions and challenges that remained after each topic are listed at the end of each topic's section, and the initial discussion questions for each topic are provided in the appendix.

Paperid: 776, https://arxiv.org/pdf/2510.23457.pdf

Abstract:
The 5G protocol lacks a robust base station authentication mechanism during the initial bootstrapping phase, leaving it susceptible to threats such as fake base station attacks. Conventional solutions, including digital signatures based on Public Key Infrastructures (PKIs) and identity-based signatures, are inadequate against quantum-capable adversaries. While integrating NIST's Post-Quantum Cryptography (PQC) standards is a leading approach for quantum resistance, their suitability for 5G base station authentication remains unexplored. Moreover, current solutions are predominantly centralized and lack security features such as distributed authentication. This work presents, to our knowledge, the first comprehensive network-level performance characterization of integrating NIST-PQC standards and conventional digital signatures (including threshold and identity-based schemes) into 5G base station authentication. Our findings reveal significant feasibility concerns, with direct PQC adoption hindered by protocol constraints and large signature sizes. We also highlight the performance limitations of conventional methods due to the overhead of certificate chains. To mitigate these challenges, we propose BORG, a transitional authentication solution based on a Hierarchical Identity-Based Threshold Signature scheme with a Fail-Stop property. BORG offers post-mortem post-quantum forgery detection and distributed trust via threshold and compact signatures, well-suited for 5G's stringent requirements. Our performance analysis underscores an important warning on the infeasibility of direct PQC integration and positions BORG as an effective transitional solution toward future quantum-resilient 5G authentication.

Paperid: 777, https://arxiv.org/pdf/2510.16028.pdf

Abstract:
Neural networks increasingly run on hardware outside the user's control (cloud GPUs, inference marketplaces). Yet ML-as-a-Service reveals little about what actually ran or whether returned outputs faithfully reflect the intended inputs. Users lack recourse against service downgrades (model swaps, quantization, graph rewrites, or discrepancies like altered ad embeddings). Verifying outputs is hard because floating-point(FP) execution on heterogeneous accelerators is inherently nondeterministic. Existing approaches are either impractical for real FP neural networks or reintroduce vendor trust. We present NAO: a Nondeterministic tolerance Aware Optimistic verification protocol that accepts outputs within principled operator-level acceptance regions rather than requiring bitwise equality. NAO combines two error models: (i) sound per-operator IEEE-754 worst-case bounds and (ii) tight empirical percentile profiles calibrated across hardware. Discrepancies trigger a Merkle-anchored, threshold-guided dispute game that recursively partitions the computation graph until one operator remains, where adjudication reduces to a lightweight theoretical-bound check or a small honest-majority vote against empirical thresholds. Unchallenged results finalize after a challenge window, without requiring trusted hardware or deterministic kernels. We implement NAO as a PyTorch-compatible runtime and a contract layer currently deployed on Ethereum Holesky testnet. The runtime instruments graphs, computes per-operator bounds, and runs unmodified vendor kernels in FP32 with negligible overhead (0.3% on Qwen3-8B). Across CNNs, Transformers and diffusion models on A100, H100, RTX6000, RTX4090, empirical thresholds are $10^2-10^3$ times tighter than theoretical bounds, and bound-aware adversarial attacks achieve 0% success. NAO reconciles scalability with verifiability for real-world heterogeneous ML compute.

Paperid: 778, https://arxiv.org/pdf/2510.08609.pdf

Abstract:
Developers consistently use version constraints to specify acceptable versions of the dependencies for their project. \emph{Pinning} dependencies can reduce the likelihood of breaking changes, but comes with a cost of manually managing the replacement of outdated and vulnerable dependencies. On the other hand, \emph{floating} can be used to automatically get bug fixes and security fixes, but comes with the risk of breaking changes. Security practitioners advocate \emph{pinning} dependencies to prevent against software supply chain attacks, e.g., malicious package updates. However, since \emph{pinning} is the tightest version constraint, \emph{pinning} is the most likely to result in outdated dependencies. Nevertheless, how the likelihood of becoming outdated or vulnerable dependencies changes across version constraint types is unknown. The goal of this study is to aid developers in making an informed dependency version constraint choice by empirically evaluating the likelihood of dependencies becoming outdated or vulnerable across version constraint types at scale. In this study, we first identify the trends in dependency version constraint usage and the patterns of version constraint type changes made by developers in the npm, PyPI, and Cargo ecosystems. We then modeled the dependency state transitions using survival analysis and estimated how the likelihood of becoming outdated or vulnerable changes when using \emph{pinning} as opposed to the rest of the version constraint types. We observe that among outdated and vulnerable dependencies, the most commonly used version constraint type is \emph{floating-minor}, with \emph{pinning} being the next most common. We also find that \emph{floating-major} is the least likely to result in outdated and \emph{floating-minor} is the least likely to result in vulnerable dependencies.

Paperid: 779, https://arxiv.org/pdf/2509.25624.pdf

Abstract:
As LLMs advance into autonomous agents with tool-use capabilities, they introduce security challenges that extend beyond traditional content-based LLM safety concerns. This paper introduces Sequential Tool Attack Chaining (STAC), a novel multi-turn attack framework that exploits agent tool use. STAC chains together tool calls that each appear harmless in isolation but, when combined, collectively enable harmful operations that only become apparent at the final execution step. We apply our framework to automatically generate and systematically evaluate 483 STAC cases, featuring 1,352 sets of user-agent-environment interactions and spanning diverse domains, tasks, agent types, and 10 failure modes. Our evaluations show that state-of-the-art LLM agents, including GPT-4.1, are highly vulnerable to STAC, with attack success rates (ASR) exceeding 90% in most cases. The core design of STAC's automated framework is a closed-loop pipeline that synthesizes executable multi-step tool chains, validates them through in-environment execution, and reverse-engineers stealthy multi-turn prompts that reliably induce agents to execute the verified malicious sequence. We further perform defense analysis against STAC and find that existing prompt-based defenses provide limited protection. To address this gap, we propose a new reasoning-driven defense prompt that achieves far stronger protection, cutting ASR by up to 28.8%. These results highlight a crucial gap: defending tool-enabled agents requires reasoning over entire action sequences and their cumulative effects, rather than evaluating isolated prompts or responses.

Paperid: 780, https://arxiv.org/pdf/2509.20411.pdf

Abstract:
Machine learning-based cybersecurity systems are highly vulnerable to adversarial attacks, while Generative Adversarial Networks (GANs) act as both powerful attack enablers and promising defenses. This survey systematically reviews GAN-based adversarial defenses in cybersecurity (2021--August 31, 2025), consolidating recent progress, identifying gaps, and outlining future directions. Using a PRISMA-compliant systematic literature review protocol, we searched five major digital libraries. From 829 initial records, 185 peer-reviewed studies were retained and synthesized through quantitative trend analysis and thematic taxonomy development. We introduce a four-dimensional taxonomy spanning defensive function, GAN architecture, cybersecurity domain, and adversarial threat model. GANs improve detection accuracy, robustness, and data utility across network intrusion detection, malware analysis, and IoT security. Notable advances include WGAN-GP for stable training, CGANs for targeted synthesis, and hybrid GAN models for improved resilience. Yet, persistent challenges remain such as instability in training, lack of standardized benchmarks, high computational cost, and limited explainability. GAN-based defenses demonstrate strong potential but require advances in stable architectures, benchmarking, transparency, and deployment. We propose a roadmap emphasizing hybrid models, unified evaluation, real-world integration, and defenses against emerging threats such as LLM-driven cyberattacks. This survey establishes the foundation for scalable, trustworthy, and adaptive GAN-powered defenses.

Paperid: 781, https://arxiv.org/pdf/2509.11080.pdf

Abstract:
Recommender systems (RecSys) have been widely applied to various applications, including E-commerce, finance, healthcare, social media and have become increasingly influential in shaping user behavior and decision-making, highlighting their growing impact in various domains. However, recent studies have shown that RecSys are vulnerable to membership inference attacks (MIAs), which aim to infer whether user interaction record was used to train a target model or not. MIAs on RecSys models can directly lead to a privacy breach. For example, via identifying the fact that a purchase record that has been used to train a RecSys associated with a specific user, an attacker can infer that user's special quirks. In recent years, MIAs have been shown to be effective on other ML tasks, e.g., classification models and natural language processing. However, traditional MIAs are ill-suited for RecSys due to the unseen posterior probability. Although MIAs on RecSys form a newly emerging and rapidly growing research area, there has been no systematic survey on this topic yet. In this article, we conduct the first comprehensive survey on RecSys MIAs. This survey offers a comprehensive review of the latest advancements in RecSys MIAs, exploring the design principles, challenges, attack and defense associated with this emerging field. We provide a unified taxonomy that categorizes different RecSys MIAs based on their characterizations and discuss their pros and cons. Based on the limitations and gaps identified in this survey, we point out several promising future research directions to inspire the researchers who wish to follow this area. This survey not only serves as a reference for the research community but also provides a clear description for researchers outside this research domain.

Paperid: 782, https://arxiv.org/pdf/2509.05643.pdf

Abstract:
Coverage-guided fuzzing has been widely applied to address zero-day vulnerabilities in general-purpose software and operating systems. This approach relies on instrumenting the target code at compile time. However, applying it to industrial systems remains challenging, due to proprietary and closed-source compiler toolchains and lack of access to source code. FuzzBox addresses these limitations by integrating emulation with fuzzing: it dynamically instruments code during execution in a virtualized environment, for the injection of fuzz inputs, failure detection, and coverage analysis, without requiring source code recompilation and hardware-specific dependencies. We show the effectiveness of FuzzBox through experiments in the context of a proprietary MILS (Multiple Independent Levels of Security) hypervisor for industrial applications. Additionally, we analyze the applicability of FuzzBox across commercial IoT firmware, showcasing its broad portability.

Paperid: 783, https://arxiv.org/pdf/2509.03331.pdf

Abstract:
The adoption of Large Language Models (LLMs) for automated software vulnerability patching has shown promising outcomes on carefully curated evaluation sets. Nevertheless, existing datasets predominantly rely on superficial validation methods rather than exploit-based verification, leading to overestimated performance in security-sensitive applications. This paper introduces VulnRepairEval, an evaluation framework anchored in functional Proof-of-Concept (PoC) exploits. Our framework delivers a comprehensive, containerized evaluation pipeline that enables reproducible differential assessment, where repair success requires the original exploit to fail execution against the modified code. The benchmark construction involved extensive data curation: we processed over 400 CVEs and approximately 2,500 potential sources to extract a collection of authentic vulnerability instances (23 Python CVEs) amenable to automated testing with working PoCs. Through VulnRepairEval, we conduct a comprehensive evaluation of 12 popular LLMs and observe a significant performance deficit: even the top-performing model successfully addresses merely 5/23 instances (about 21.7%), exposing critical weaknesses in security-focused applications. Our failure analysis reveals that most unsuccessful attempts stem from imprecise vulnerability identification and patches containing syntactic or semantic errors. Enhanced prompting strategies and multi-agent approaches yield minimal improvements, with overall effectiveness remaining largely unaffected. This work contributes a stringent, practical evaluation framework for LLM-driven vulnerability remediation and underscores the necessity for assessment protocols that authentically reflect real-world exploitation scenarios.

Paperid: 784, https://arxiv.org/pdf/2508.21265.pdf

Abstract:
This research explores the use of superconductor electronics (SCE) for accelerating fully homomorphic encryption (FHE), focusing on the Number-Theoretic Transform (NTT), a key computational bottleneck in FHE schemes. We present SCE-NTT, a dedicated hardware accelerator based on superconductive single flux quantum (SFQ) logic and memory, targeting high performance and energy efficiency beyond the limits of conventional CMOS. To address SFQ constraints such as limited dense RAM and restricted fanin/fanout, we propose a deeply pipelined NTT-128 architecture using shift register memory (SRM). Designed for N=128 32-bit coefficients, NTT-128 comprises log2(N)=7 processing elements (PEs), each featuring a butterfly unit (BU), dual coefficient memories operating in ping-pong mode via FIFO-based SRM queues, and twiddle factor buffers. The BU integrates a Shoup modular multiplier optimized for a small area, leveraging precomputed twiddle factors. A new RSFQ cell library with over 50 parameterized cells, including compound logic units, was developed for implementation. Functional and timing correctness were validated using JoSIM analog simulations and Verilog models. A multiphase clocking scheme was employed to enhance robustness and reduce path-balancing overhead, improving circuit reliability. Fabricated results show the NTT-128 unit achieves 531 million NTT/sec at 34 GHz, over 100x faster than state-of-the-art CMOS equivalents. We also project that the architecture can scale to larger sizes, such as a 2^14-point NTT in approximately 482 ns. Key-switch throughput is estimated at 1.63 million operations/sec, significantly exceeding existing hardware. These results demonstrate the strong potential of SCE-based accelerators for scalable, energy-efficient secure computation in the post-quantum era, with further gains anticipated through advances in fabrication.

Paperid: 785, https://arxiv.org/pdf/2508.18665.pdf

Abstract:
Large language models (LLMs) based Recommender Systems (RecSys) can flexibly adapt recommendation systems to different domains. It utilizes in-context learning (ICL), i.e., the prompts, to customize the recommendation functions, which include sensitive historical user-specific item interactions, e.g., implicit feedback like clicked items or explicit product reviews. Such private information may be exposed to novel privacy attack. However, no study has been done on this important issue. We design four membership inference attacks (MIAs), aiming to reveal whether victims' historical interactions have been used by system prompts. They are \emph{direct inquiry, hallucination, similarity, and poisoning attacks}, each of which utilizes the unique features of LLMs or RecSys. We have carefully evaluated them on three LLMs that have been used to develop ICL-LLM RecSys and two well-known RecSys benchmark datasets. The results confirm that the MIA threat on LLM RecSys is realistic: direct inquiry and poisoning attacks showing significantly high attack advantages. We have also analyzed the factors affecting these attacks, such as the number of shots in system prompts and the position of the victim in the shots.

Paperid: 786, https://arxiv.org/pdf/2508.18652.pdf

Abstract:
Retrieval-augmented generation (RAG) systems are widely deployed in real-world applications in diverse domains such as finance, healthcare, and cybersecurity. However, many studies showed that they are vulnerable to knowledge corruption attacks, where an attacker can inject adversarial texts into the knowledge database of a RAG system to induce the LLM to generate attacker-desired outputs. Existing studies mainly focus on attacking specific queries or queries with similar topics (or keywords). In this work, we propose UniC-RAG, a universal knowledge corruption attack against RAG systems. Unlike prior work, UniC-RAG jointly optimizes a small number of adversarial texts that can simultaneously attack a large number of user queries with diverse topics and domains, enabling an attacker to achieve various malicious objectives, such as directing users to malicious websites, triggering harmful command execution, or launching denial-of-service attacks. We formulate UniC-RAG as an optimization problem and further design an effective solution to solve it, including a balanced similarity-based clustering method to enhance the attack's effectiveness. Our extensive evaluations demonstrate that UniC-RAG is highly effective and significantly outperforms baselines. For instance, UniC-RAG could achieve over 90% attack success rate by injecting 100 adversarial texts into a knowledge database with millions of texts to simultaneously attack a large set of user queries (e.g., 2,000). Additionally, we evaluate existing defenses and show that they are insufficient to defend against UniC-RAG, highlighting the need for new defense mechanisms in RAG systems.

Paperid: 787, https://arxiv.org/pdf/2508.13246.pdf

Abstract:
In this study, we disclose a worrying new vulnerability in Large Language Models (LLMs), which we term \textbf{involuntary jailbreak}. Unlike existing jailbreak attacks, this weakness is distinct in that it does not involve a specific attack objective, such as generating instructions for \textit{building a bomb}. Prior attack methods predominantly target localized components of the LLM guardrail. In contrast, involuntary jailbreaks may potentially compromise the entire guardrail structure, which our method reveals to be surprisingly fragile. We merely employ a single universal prompt to achieve this goal. In particular, we instruct LLMs to generate several questions that would typically be rejected, along with their corresponding in-depth responses (rather than a refusal). Remarkably, this simple prompt strategy consistently jailbreaks the majority of leading LLMs, including Claude Opus 4.1, Grok 4, Gemini 2.5 Pro, and GPT 4.1. We hope this problem can motivate researchers and practitioners to re-evaluate the robustness of LLM guardrails and contribute to stronger safety alignment in future.

Paperid: 788, https://arxiv.org/pdf/2508.11907.pdf

Abstract:
Federated learning (FL) offers a promising paradigm for collaborative model training while preserving data privacy. However, its susceptibility to gradient inversion attacks poses a significant challenge, necessitating robust privacy protection mechanisms. This paper introduces a novel theoretical framework to decipher the intricate interplay between attack and protection complexities in privacy-preserving FL. We formally define "Attack Complexity" as the minimum computational and data resources an adversary requires to reconstruct private data below a given error threshold, and "Protection Complexity" as the expected distortion introduced by privacy mechanisms. Leveraging Maximum Bayesian Privacy (MBP), we derive tight theoretical bounds for protection complexity, demonstrating its scaling with model dimensionality and privacy budget. Furthermore, we establish comprehensive bounds for attack complexity, revealing its dependence on privacy leakage, gradient distortion, model dimension, and the chosen privacy level. Our findings quantitatively illuminate the fundamental trade-offs between privacy guarantees, system utility, and the effort required for both attacking and defending. This framework provides critical insights for designing more secure and efficient federated learning systems.

Paperid: 789, https://arxiv.org/pdf/2508.02840.pdf

Abstract:
The increasing complexity of software systems has led to a surge in cybersecurity vulnerabilities, necessitating efficient and scalable solutions for vulnerability assessment. However, the deployment of large pre-trained models in real-world scenarios is hindered by their substantial computational and storage demands. To address this challenge, we propose a novel resource-efficient framework that integrates knowledge distillation and particle swarm optimization to enable automated vulnerability assessment. Our framework employs a two-stage approach: First, particle swarm optimization is utilized to optimize the architecture of a compact student model, balancing computational efficiency and model capacity. Second, knowledge distillation is applied to transfer critical vulnerability assessment knowledge from a large teacher model to the optimized student model. This process significantly reduces the model size while maintaining high performance. Experimental results on an enhanced MegaVul dataset, comprising 12,071 CVSS (Common Vulnerability Scoring System) v3 annotated vulnerabilities, demonstrate the effectiveness of our approach. Our approach achieves a 99.4% reduction in model size while retaining 89.3% of the original model's accuracy. Furthermore, it outperforms state-of-the-art baselines by 1.7% in accuracy with 60% fewer parameters. The framework also reduces training time by 72.1% and architecture search time by 34.88% compared to traditional genetic algorithms.

Paperid: 790, https://arxiv.org/pdf/2508.01469.pdf

Abstract:
We present VWAttacker, the first systematic testing framework for analyzing the security of Voice over WiFi (VoWiFi) User Equipment (UE) implementations. VWAttacker includes a complete VoWiFi network testbed that communicates with Commercial-Off-The-Shelf (COTS) UEs based on a simple interface to test the behavior of diverse VoWiFi UE implementations; uses property-guided adversarial testing to uncover security issues in different UEs systematically. To reduce manual effort in extracting and testing properties, we introduce an LLM-based, semi-automatic, and scalable approach for property extraction and testcase (TC) generation. These TCs are systematically mutated by two domain-specific transformations. Furthermore, we introduce two deterministic oracles to detect property violations automatically. Coupled with these techniques, VWAttacker extracts 63 properties from 11 specifications, evaluates 1,116 testcases, and detects 13 issues in 21 UEs. The issues range from enforcing a DH shared secret to 0 to supporting weak algorithms. These issues result in attacks that expose the victim UE's identity or establish weak channels, thus severely hampering the security of cellular networks. We responsibly disclose the findings to all the related vendors. At the time of writing, one of the vulnerabilities has been acknowledged by MediaTek with high severity.

Paperid: 791, https://arxiv.org/pdf/2507.23453.pdf

Abstract:
This paper investigates defenses for LLM-based evaluation systems against prompt injection. We formalize a class of threats called blind attacks, where a candidate answer is crafted independently of the true answer to deceive the evaluator. To counter such attacks, we propose a framework that augments Standard Evaluation (SE) with Counterfactual Evaluation (CFE), which re-evaluates the submission against a deliberately false ground-truth answer. An attack is detected if the system validates an answer under both standard and counterfactual conditions. Experiments show that while standard evaluation is highly vulnerable, our SE+CFE framework significantly improves security by boosting attack detection with minimal performance trade-offs.

Paperid: 792, https://arxiv.org/pdf/2507.18365.pdf

Abstract:
Recommender systems (RecSys) have become an essential component of many web applications. The core of the system is a recommendation model trained on highly sensitive user-item interaction data. While privacy-enhancing techniques are actively studied in the research community, the real-world model development still depends on minimal privacy protection, e.g., via controlled access. Users of such systems should have the right to choose \emph{not} to share highly sensitive interactions. However, there is no method allowing the user to know which interactions are more sensitive than others. Thus, quantifying the privacy risk of RecSys training data is a critical step to enabling privacy-aware RecSys model development and deployment. We propose a membership-inference attack (MIA)- based privacy scoring method, RecPS, to measure privacy risks at both the interaction and user levels. The RecPS interaction-level score definition is motivated and derived from differential privacy, which is then extended to the user-level scoring method. A critical component is the interaction-level MIA method RecLiRA, which gives high-quality membership estimation. We have conducted extensive experiments on well-known benchmark datasets and RecSys models to show the unique features and benefits of RecPS scoring in risk assessment and RecSys model unlearning.

Paperid: 793, https://arxiv.org/pdf/2507.09411.pdf

Abstract:
Large Language Models (LLMs) have transformed software development and automated code generation. Motivated by these advancements, this paper explores the feasibility of LLMs in modifying malware source code to generate variants. We introduce LLMalMorph, a semi-automated framework that leverages semantical and syntactical code comprehension by LLMs to generate new malware variants. LLMalMorph extracts function-level information from the malware source code and employs custom-engineered prompts coupled with strategically defined code transformations to guide the LLM in generating variants without resource-intensive fine-tuning. To evaluate LLMalMorph, we collected 10 diverse Windows malware samples of varying types, complexity and functionality and generated 618 variants. Our thorough experiments demonstrate that it is possible to reduce the detection rates of antivirus engines of these malware variants to some extent while preserving malware functionalities. In addition, despite not optimizing against any Machine Learning (ML)-based malware detectors, several variants also achieved notable attack success rates against an ML-based malware classifier. We also discuss the limitations of current LLM capabilities in generating malware variants from source code and assess where this emerging technology stands in the broader context of malware variant generation.

Paperid: 794, https://arxiv.org/pdf/2507.05132.pdf

Abstract:
The Internet of Medical Things (IoMT) represents a paradigm shift in the healthcare sector, enabling the interconnection of medical devices, sensors, and systems to enhance patient monitoring, diagnosis, and management. The rapid evolution of IoMT presents significant benefits to the healthcare domains. However, there is a rapid increase in distributed denial of service (DDoS) attacks on the IoMT networks due to several vulnerabilities in the IoMT-connected devices, which negatively impact patients' health and can even lead to deaths. Thus, in this paper, we aim to save lives via investigating an extreme learning machine for detecting DDoS attacks on IoMT devices. The proposed approach achieves a high accuracy at a low implementation budget. Thus, it can reduce the implementation cost of the DDoS detection system, making the model capable of executing on the fog level.

Paperid: 795, https://arxiv.org/pdf/2512.23307.pdf

Abstract:
Neural ranking models have achieved remarkable progress and are now widely deployed in real-world applications such as Retrieval-Augmented Generation (RAG). However, like other neural architectures, they remain vulnerable to adversarial manipulations: subtle character-, word-, or phrase-level perturbations can poison retrieval results and artificially promote targeted candidates, undermining the integrity of search engines and downstream systems. Existing defenses either rely on heuristics with poor generalization or on certified methods that assume overly strong adversarial knowledge, limiting their practical use. To address these challenges, we propose RobustMask, a novel defense that combines the context-prediction capability of pretrained language models with a randomized masking-based smoothing mechanism. Our approach strengthens neural ranking models against adversarial perturbations at the character, word, and phrase levels. Leveraging both the pairwise comparison ability of ranking models and probabilistic statistical analysis, we provide a theoretical proof of RobustMask's certified top-K robustness. Extensive experiments further demonstrate that RobustMask successfully certifies over 20% of candidate documents within the top-10 ranking positions against adversarial perturbations affecting up to 30% of their content. These results highlight the effectiveness of RobustMask in enhancing the adversarial robustness of neural ranking models, marking a significant step toward providing stronger security guarantees for real-world retrieval systems.

Paperid: 796, https://arxiv.org/pdf/2512.18264.pdf

Abstract:
As vision-language models (VLMs) become widely adopted, VLM-based attribute inference attacks have emerged as a serious privacy concern, enabling adversaries to infer private attributes from images shared on social media. This escalating threat calls for dedicated protection methods to safeguard user privacy. However, existing methods often degrade the visual quality of images or interfere with vision-based functions on social media, thereby failing to achieve a desirable balance between privacy protection and user experience. To address this challenge, we propose a novel protection method that jointly optimizes privacy suppression and utility preservation under a visual consistency constraint. While our method is conceptually effective, fair comparisons between methods remain challenging due to the lack of publicly available evaluation datasets. To fill this gap, we introduce VPI-COCO, a publicly available benchmark comprising 522 images with hierarchically structured privacy questions and corresponding non-private counterparts, enabling fine-grained and joint evaluation of protection methods in terms of privacy preservation and user experience. Building upon this benchmark, experiments on multiple VLMs demonstrate that our method effectively reduces PAR below 25%, keeps NPAR above 88%, maintains high visual consistency, and generalizes well to unseen and paraphrased privacy questions, demonstrating its strong practical applicability for real-world VLM deployments.

Paperid: 797, https://arxiv.org/pdf/2512.17538.pdf

Abstract:
Autonomous AI agents lack traceable accountability mechanisms, creating a fundamental dilemma where systems must either operate as ``downgraded tools'' or risk real-world abuse. This vulnerability stems from the limitations of traditional key-based authentication, which guarantees neither the operator's physical identity nor the agent's code integrity. To bridge this gap, we propose BAID (Binding Agent ID), a comprehensive identity infrastructure establishing verifiable user-code binding. BAID integrates three orthogonal mechanisms: local binding via biometric authentication, decentralized on-chain identity management, and a novel zkVM-based Code-Level Authentication protocol. By leveraging recursive proofs to treat the program binary as the identity, this protocol provides cryptographic guarantees for operator identity, agent configuration integrity, and complete execution provenance, thereby effectively preventing unauthorized operation and code substitution. We implement and evaluate a complete prototype system, demonstrating the practical feasibility of blockchain-based identity management and zkVM-based authentication protocol.

Paperid: 798, https://arxiv.org/pdf/2512.15892.pdf

Abstract:
Recent advances in large language models (LLMs) have enabled a new generation of autonomous agents that operate over sustained periods and manage sensitive resources on behalf of users. Trusted for their ability to act without direct oversight, such agents are increasingly considered in high-stakes domains including financial management, dispute resolution, and governance. Yet in practice, agents execute on infrastructure controlled by a host, who can tamper with models, inputs, or outputs, undermining any meaningful notion of autonomy. We address this gap by introducing VET (Verifiable Execution Traces), a formal framework that achieves host-independent authentication of agent outputs and takes a step toward host-independent autonomy. Central to VET is the Agent Identity Document (AID), which specifies an agent's configuration together with the proof systems required for verification. VET is compositional: it supports multiple proof mechanisms, including trusted hardware, succinct cryptographic proofs, and notarized TLS transcripts (Web Proofs). We implement VET for an API-based LLM agent and evaluate our instantiation on realistic workloads. We find that for today's black-box, secret-bearing API calls, Web Proofs appear to be the most practical choice, with overhead typically under 3$\times$ compared to direct API calls, while for public API calls, a lower-overhead TEE Proxy is often sufficient. As a case study, we deploy a verifiable trading agent that produces proofs for each decision and composes Web Proofs with a TEE Proxy. Our results demonstrate that practical, host-agnostic authentication is already possible with current technology, laying the foundation for future systems that achieve full host-independent autonomy.

Paperid: 799, https://arxiv.org/pdf/2512.03358.pdf

Abstract:
Quantum Federated Learning (QFL) promises to revolutionize distributed machine learning by combining the computational power of quantum devices with collaborative model training. Yet, privacy of both data and models remains a critical challenge. In this work, we propose a privacy-preserving QFL framework where a network of $n$ quantum devices trains local models and transmits them to a central server under a multi-layered privacy protocol. Our design leverages Singular Value Decomposition (SVD), Quantum Key Distribution (QKD), and Analytic Quantum Gradient Descent (AQGD) to secure data preparation, model sharing, and training stages. Through theoretical analysis and experiments on contemporary quantum platforms and datasets, we demonstrate that the framework robustly safeguards data and model confidentiality while maintaining training efficiency.

Paperid: 800, https://arxiv.org/pdf/2512.02301.pdf

Abstract:
This work presents vQFL (vehicular Quantum Federated Learning), a new framework that leverages quantum machine learning techniques to tackle key privacy and security issues in autonomous vehicular networks. Furthermore, we propose a server-side adapted fine-tuning method, ft-VQFL,to achieve enhanced and more resilient performance. By integrating quantum federated learning with differential privacy and quantum key distribution (QKD), our quantum vanguard approach creates a multi-layered defense against both classical and quantum threats while preserving model utility. Extensive experimentation with industry-standard datasets (KITTI, Waymo, and nuScenes) demonstrates that vQFL maintains accuracy comparable to standard QFL while significantly improving privacy guaranties and communication security. Our implementation using various quantum models (VQC, QCNN, and SamplerQNN) reveals minimal performance overhead despite the added security measures. This work establishes a crucial foundation for quantum-resistant autonomous vehicle systems that can operate securely in the post-quantum era while efficiently processing the massive data volumes (20-40TB/day per vehicle) generated by modern autonomous fleets. The modular design of the framework allows for seamless integration with existing vehicular networks, positioning vQFL as an essential component for future intelligent transportation infrastructure.

Paperid: 801, https://arxiv.org/pdf/2512.00412.pdf

Abstract:
Large Reasoning Models (LRMs) have emerged as a powerful advancement in multi-step reasoning tasks, offering enhanced transparency and logical consistency through explicit chains of thought (CoT). However, these models introduce novel safety and reliability risks, such as CoT-hijacking and prompt-induced inefficiencies, which are not fully captured by existing evaluation methods. To address this gap, we propose RT-LRM, a unified benchmark designed to assess the trustworthiness of LRMs. RT-LRM evaluates three core dimensions: truthfulness, safety and efficiency. Beyond metric-based evaluation, we further introduce the training paradigm as a key analytical perspective to investigate the systematic impact of different training strategies on model trustworthiness. We achieve this by designing a curated suite of 30 reasoning tasks from an observational standpoint. We conduct extensive experiments on 26 models and identify several valuable insights into the trustworthiness of LRMs. For example, LRMs generally face trustworthiness challenges and tend to be more fragile than Large Language Models (LLMs) when encountering reasoning-induced risks. These findings uncover previously underexplored vulnerabilities and highlight the need for more targeted evaluations. In addition, we release a scalable toolbox for standardized trustworthiness research to support future advancements in this important field. Our code and datasets will be open-sourced.

Paperid: 802, https://arxiv.org/pdf/2511.20002.pdf

Abstract:
Conventional adversarial attacks focus on manipulating a single decision of neural networks. However, real-world models often operate in a sequence of decisions, where an isolated mistake can be easily corrected, but cascading errors can lead to severe risks. This paper reveals a novel threat: a single perturbation can hijack the whole decision chain. We demonstrate the feasibility of manipulating a model's outputs toward multiple, predefined outcomes, such as simultaneously misclassifying "non-motorized lane" signs as "motorized lane" and "pedestrian" as "plastic bag". To expose this threat, we introduce Semantic-Aware Universal Perturbations (SAUPs), which induce varied outcomes based on the semantics of the inputs. We overcome optimization challenges by developing an effective algorithm, which searches for perturbations in normalized space with a semantic separation strategy. To evaluate the practical threat of SAUPs, we present RIST, a new real-world image dataset with fine-grained semantic annotations. Extensive experiments on three multimodal large language models demonstrate their vulnerability, achieving a 70% attack success rate when controlling five distinct targets using just an adversarial frame.

Paperid: 803, https://arxiv.org/pdf/2511.11784.pdf

Abstract:
Jailbreak attacks designed to bypass safety mechanisms pose a serious threat by prompting LLMs to generate harmful or inappropriate content, despite alignment with ethical guidelines. Crafting universal filtering rules remains difficult due to their inherent dependence on specific contexts. To address these challenges without relying on threshold calibration or model fine-tuning, this work introduces a semantic consistency analysis between successful and unsuccessful responses, demonstrating that a negation-aware scoring approach captures meaningful patterns. Building on this insight, a novel detection framework called NegBLEURT Forest is proposed to evaluate the degree of alignment between outputs elicited by adversarial prompts and expected safe behaviors. It identifies anomalous responses using the Isolation Forest algorithm, enabling reliable jailbreak detection. Experimental results show that the proposed method consistently achieves top-tier performance, ranking first or second in accuracy across diverse models using the crafted dataset, while competing approaches exhibit notable sensitivity to model and data variations.

Paperid: 804, https://arxiv.org/pdf/2511.11549.pdf

Abstract:
Verifying user attributes to provide fine-grained access control to databases is fundamental to attribute-based authentication. Either a single (central) authority verifies all the attributes, or multiple independent authorities verify the attributes distributedly. In the central setup, the authority verifies all user attributes, and the user downloads only the authorized record. While this is communication efficient, it reveals all user attributes to the authority. A distributed setup prevents this privacy breach by letting each authority verify and learn only one attribute. Motivated by this, Jafarpisheh~et~al. introduced an information-theoretic formulation, called distributed attribute-based private access control (DAPAC). With $N$ non-colluding authorities (servers), $N$ attributes and $K$ possible values for each attribute, the DAPAC system lets each server learn only the single attribute value that it verifies, and is oblivious to the remaining $N-1$. The user retrieves its designated record, without learning anything about the remaining database records. The goal is to maximize the rate, i.e., the ratio of desired message size to total download size. However, not all attributes are sensitive, and DAPAC's privacy constraints can be too restrictive, negatively affecting the rate. To leverage the heterogeneous privacy requirements of user attributes, we propose heterogeneous (Het)DAPAC, a framework which off-loads verification of $N-D$ of the $N$ attributes to a central server, and retains DAPAC's architecture for the $D$ sensitive attributes. We first present a HetDAPAC scheme, which improves the rate from $\frac{1}{2K}$ to $\frac{1}{K+1}$, while sacrificing the privacy of a few non-sensitive attributes. Unlike DAPAC, our scheme entails a download imbalance across servers; we propose a second scheme achieving a balanced per-server download and a rate of $\frac{D+1}{2KD}$.

Paperid: 805, https://arxiv.org/pdf/2511.09880.pdf

Abstract:
Many machine learning models are fine-tuned from large language models (LLMs) to achieve high performance in specialized domains like code generation, biomedical analysis, and mathematical problem solving. However, this fine-tuning process often introduces a critical vulnerability: the systematic degradation of safety alignment, undermining ethical guidelines and increasing the risk of harmful outputs. Addressing this challenge, we introduce EnchTable, a novel framework designed to transfer and maintain safety alignment in downstream LLMs without requiring extensive retraining. EnchTable leverages a Neural Tangent Kernel (NTK)-based safety vector distillation method to decouple safety constraints from task-specific reasoning, ensuring compatibility across diverse model architectures and sizes. Additionally, our interference-aware merging technique effectively balances safety and utility, minimizing performance compromises across various task domains. We implemented a fully functional prototype of EnchTable on three different task domains and three distinct LLM architectures, and evaluated its performance through extensive experiments on eleven diverse datasets, assessing both utility and model safety. Our evaluations include LLMs from different vendors, demonstrating EnchTable's generalization capability. Furthermore, EnchTable exhibits robust resistance to static and dynamic jailbreaking attacks, outperforming vendor-released safety models in mitigating adversarial prompts. Comparative analyses with six parameter modification methods and two inference-time alignment baselines reveal that EnchTable achieves a significantly lower unsafe rate, higher utility score, and universal applicability across different task domains. Additionally, we validate EnchTable can be seamlessly integrated into various deployment pipelines without significant overhead.

Paperid: 806, https://arxiv.org/pdf/2510.25736.pdf

Abstract:
We revisit the problem of symmetric private information retrieval (SPIR) in settings where the database replication is modeled by a simple graph. Here, each vertex corresponds to a server, and a message is replicated on two servers if and only if there is an edge between them. To satisfy the requirement of database privacy, we let all the servers share some common randomness, independent of the messages. We aim to quantify the improvement in SPIR capacity, i.e., the maximum ratio of the number of desired and downloaded symbols, compared to the setting with graph-replicated common randomness. Towards this, we develop an algorithm to convert a class of PIR schemes into the corresponding SPIR schemes, thereby establishing a capacity lower bound on graphs for which such schemes exist. This includes the class of path and cyclic graphs for which we derive capacity upper bounds that are tighter than the trivial bounds given by the respective PIR capacities. For the special case of path graph with three vertices, we identify the SPIR capacity to be $\frac{1}{2}$.

Paperid: 807, https://arxiv.org/pdf/2510.24200.pdf

Abstract:
Federated Learning has seen an increased deployment in real-world scenarios recently, as it enables the distributed training of machine learning models without explicit data sharing between individual clients. Yet, the introduction of the so-called gradient inversion attacks has fundamentally challenged its privacy-preserving properties. Unfortunately, as these attacks mostly rely on direct data optimization without any formal guarantees, the vulnerability of real-world systems remains in dispute and requires tedious testing for each new federated deployment. To overcome these issues, recently the SPEAR attack was introduced, which is based on a theoretical analysis of the gradients of linear layers with ReLU activations. While SPEAR is an important theoretical breakthrough, the attack's practicality was severely limited by its exponential runtime in the batch size b. In this work, we fill this gap by applying State-of-the-Art techniques from Sparsely-Used Dictionary Learning to make the problem of gradient inversion on linear layers with ReLU activations tractable. Our experiments demonstrate that our new attack, SPEAR++, retains all desirable properties of SPEAR, such as robustness to DP noise and FedAvg aggregation, while being applicable to 10x bigger batch sizes.

Paperid: 808, https://arxiv.org/pdf/2510.23443.pdf

Abstract:
The growing intersection of cybersecurity and law creates a complex information space where traditional legal research tools struggle to deal with nuanced connections between cases, statutes, and technical vulnerabilities. This knowledge divide hinders collaboration between legal experts and cybersecurity professionals. To address this important gap, this work provides a first step towards intelligent systems capable of navigating the increasingly intricate cyber-legal domain. We demonstrate promising initial results on multilingual tasks.

Paperid: 809, https://arxiv.org/pdf/2510.22945.pdf

Abstract:
In this paper, we propose a groundbreaking quantum-secure federated learning (QFL) framework designed to safeguard distributed learning systems against the emerging threat of quantum-enabled adversaries. As classical cryptographic methods become increasingly vulnerable to quantum attacks, our framework establishes a resilient security architecture that remains robust even in the presence of quantum-capable attackers. We integrate and rigorously evaluate advanced quantum and post-quantum protocols including Quantum Key Distribution (QKD), Quantum Teleportation, Key Encapsulation Mechanisms (KEM) and Post-Quantum Cryptography (PQC) to fortify the QFL process against both classical and quantum threats. These mechanisms are systematically analyzed and implemented to demonstrate their seamless interoperability within a secure and scalable QFL ecosystem. Through comprehensive theoretical modeling and experimental validation, this work provides a detailed security and performance assessment of the proposed framework. Our findings lay a strong foundation for next-generation federated learning systems that are inherently secure in the quantum era.

Paperid: 810, https://arxiv.org/pdf/2510.14894.pdf

Abstract:
To preserve privacy, multi-party computation (MPC) enables executing Machine Learning (ML) algorithms on secret-shared or encrypted data. However, existing MPC frameworks are not optimized for sparse data. This makes them unsuitable for ML applications involving sparse data, e.g., recommender systems or genomics. Even in plaintext, such applications involve high-dimensional sparse data, that cannot be processed without sparsity-related optimizations due to prohibitively large memory requirements. Since matrix multiplication is central in ML algorithms, we propose MPC algorithms to multiply secret sparse matrices. On the one hand, our algorithms avoid the memory issues of the "dense" data representation of classic secure matrix multiplication algorithms. On the other hand, our algorithms can significantly reduce communication costs (some experiments show a factor 1000) for realistic problem sizes. We validate our algorithms in two ML applications in which existing protocols are impractical. An important question when developing MPC algorithms is what assumptions can be made. In our case, if the number of non-zeros in a row is a sensitive piece of information then a short runtime may reveal that the number of non-zeros is small. Existing approaches make relatively simple assumptions, e.g., that there is a universal upper bound to the number of non-zeros in a row. This often doesn't align with statistical reality, in a lot of sparse datasets the amount of data per instance satisfies a power law. We propose an approach which allows adopting a safe upper bound on the distribution of non-zeros in rows/columns of sparse matrices.

Paperid: 811, https://arxiv.org/pdf/2510.11974.pdf

Abstract:
Cyber threat intelligence (CTI) is central to modern cybersecurity, providing critical insights for detecting and mitigating evolving threats. With the natural language understanding and reasoning capabilities of large language models (LLMs), there is increasing interest in applying them to CTI, which calls for benchmarks that can rigorously evaluate their performance. Several early efforts have studied LLMs on some CTI tasks but remain limited: (i) they adopt only closed-book settings, relying on parametric knowledge without leveraging CTI knowledge bases; (ii) they cover only a narrow set of tasks, lacking a systematic view of the CTI landscape; and (iii) they restrict evaluation to single-source analysis, unlike realistic scenarios that require reasoning across multiple sources. To fill these gaps, we present CTIArena, the first benchmark for evaluating LLM performance on heterogeneous, multi-source CTI under knowledge-augmented settings. CTIArena spans three categories, structured, unstructured, and hybrid, further divided into nine tasks that capture the breadth of CTI analysis in modern security operations. We evaluate ten widely used LLMs and find that most struggle in closed-book setups but show noticeable gains when augmented with security-specific knowledge through our designed retrieval-augmented techniques. These findings highlight the limitations of general-purpose LLMs and the need for domain-tailored techniques to fully unlock their potential for CTI.

Paperid: 812, https://arxiv.org/pdf/2510.11251.pdf

Abstract:
The widespread use of large language models (LLMs) and open-source code has raised ethical and security concerns regarding the distribution and attribution of source code, including unauthorized redistribution, license violations, and misuse of code for malicious purposes. Watermarking has emerged as a promising solution for source attribution, but existing techniques rely heavily on hand-crafted transformation rules, abstract syntax tree (AST) manipulation, or task-specific training, limiting their scalability and generality across languages. Moreover, their robustness against attacks remains limited. To address these limitations, we propose CodeMark-LLM, an LLM-driven watermarking framework that embeds watermark into source code without compromising its semantics or readability. CodeMark-LLM consists of two core components: (i) Semantically Consistent Embedding module that applies functionality-preserving transformations to encode watermark bits, and (ii) Differential Comparison Extraction module that identifies the applied transformations by comparing the original and watermarked code. Leveraging the cross-lingual generalization ability of LLM, CodeMark-LLM avoids language-specific engineering and training pipelines. Extensive experiments across diverse programming languages and attack scenarios demonstrate its robustness, effectiveness, and scalability.

Paperid: 813, https://arxiv.org/pdf/2510.05633.pdf

Abstract:
Over the years, the forensics community has proposed several deep learning-based detectors to mitigate the risks of generative AI. Recently, frequency-domain artifacts (particularly periodic peaks in the magnitude spectrum), have received significant attention, as they have been often considered a strong indicator of synthetic image generation. However, state-of-the-art detectors are typically used as black-boxes, and it still remains unclear whether they truly rely on these peaks. This limits their interpretability and trust. In this work, we conduct a systematic study to address this question. We propose a strategy to remove spectral peaks from images and analyze the impact of this operation on several detectors. In addition, we introduce a simple linear detector that relies exclusively on frequency peaks, providing a fully interpretable baseline free from the confounding influence of deep learning. Our findings reveal that most detectors are not fundamentally dependent on spectral peaks, challenging a widespread assumption in the field and paving the way for more transparent and reliable forensic tools.

Paperid: 814, https://arxiv.org/pdf/2510.05068.pdf

Abstract:
We consider the problem of decentralized constrained optimization with multiple agents $E_1,\ldots,E_N$ who jointly wish to learn the optimal solution set while keeping their feasible sets $\mathcal{P}_1,\ldots,\mathcal{P}_N$ private from each other. We assume that the objective function $f$ is known to all agents and each feasible set is a collection of points from a universal alphabet $\mathcal{P}_{alph}$. A designated agent (leader) starts the communication with the remaining (non-leader) agents, and is the first to retrieve the solution set. The leader searches for the solution by sending queries to and receiving answers from the non-leaders, such that the information on the individual feasible sets revealed to the leader should be no more than nominal, i.e., what is revealed from learning the solution set alone. We develop achievable schemes for obtaining the solution set at nominal information leakage, and characterize their communication costs under two communication setups between agents. In this work, we focus on two kinds of network setups: i) ring, where each agent communicates with two adjacent agents, and ii) star, where only the leader communicates with the remaining agents. We show that, if the leader first learns the joint feasible set through an existing private set intersection (PSI) protocol and then deduces the solution set, the information leaked to the leader is greater than nominal. Moreover, we draw connection of our schemes to threshold PSI (ThPSI), which is a PSI-variant where the intersection is revealed only when its cardinality is larger than a threshold value. Finally, for various realizations of $f$ mapped uniformly at random to a fixed range of values, our schemes are more communication-efficient with a high probability compared to retrieving the entire feasible set through PSI.

Paperid: 815, https://arxiv.org/pdf/2510.04027.pdf

Abstract:
With the increasing need to safeguard data privacy in machine learning models, differential privacy (DP) is one of the major frameworks to build privacy-preserving models. Support Vector Machines (SVMs) are widely used traditional machine learning models due to their robust margin guarantees and strong empirical performance in binary classification. However, applying DP to multi-class SVMs is inadequate, as the standard one-versus-rest (OvR) and one-versus-one (OvO) approaches repeatedly query each data sample when building multiple binary classifiers, thus consuming the privacy budget proportionally to the number of classes. To overcome this limitation, we explore all-in-one SVM approaches for DP, which access each data sample only once to construct multi-class SVM boundaries with margin maximization properties. We propose a novel differentially Private Multi-class SVM (PMSVM) with weight and gradient perturbation methods, providing rigorous sensitivity and convergence analyses to ensure DP in all-in-one SVMs. Empirical results demonstrate that our approach surpasses existing DP-SVM methods in multi-class scenarios.

Paperid: 816, https://arxiv.org/pdf/2510.02386.pdf

Abstract:
Leaderboards for LRMs have turned evaluation into a competition, incentivizing developers to optimize directly on benchmark suites. A shortcut to achieving higher rankings is to incorporate evaluation benchmarks into the training data, thereby yielding inflated performance, known as benchmark contamination. Surprisingly, our studies find that evading contamination detections for LRMs is alarmingly easy. We focus on the two scenarios where contamination may occur in practice: (I) when the base model evolves into LRM via SFT and RL, we find that contamination during SFT can be originally identified by contamination detection methods. Yet, even a brief GRPO training can markedly conceal contamination signals that most detection methods rely on. Further empirical experiments and theoretical analysis indicate that PPO style importance sampling and clipping objectives are the root cause of this detection concealment, indicating that a broad class of RL methods may inherently exhibit similar concealment capability; (II) when SFT contamination with CoT is applied to advanced LRMs as the final stage, most contamination detection methods perform near random guesses. Without exposure to non-members, contaminated LRMs would still have more confidence when responding to those unseen samples that share similar distributions to the training set, and thus, evade existing memorization-based detection methods. Together, our findings reveal the unique vulnerability of LRMs evaluations: Model developers could easily contaminate LRMs to achieve inflated leaderboards performance while leaving minimal traces of contamination, thereby strongly undermining the fairness of evaluation and threatening the integrity of public leaderboards. This underscores the urgent need for advanced contamination detection methods and trustworthy evaluation protocols tailored to LRMs.

Paperid: 817, https://arxiv.org/pdf/2510.01359.pdf

Abstract:
Code-capable large language model (LLM) agents are increasingly embedded into software engineering workflows where they can read, write, and execute code, raising the stakes of safety-bypass ("jailbreak") attacks beyond text-only settings. Prior evaluations emphasize refusal or harmful-text detection, leaving open whether agents actually compile and run malicious programs. We present JAWS-BENCH (Jailbreaks Across WorkSpaces), a benchmark spanning three escalating workspace regimes that mirror attacker capability: empty (JAWS-0), single-file (JAWS-1), and multi-file (JAWS-M). We pair this with a hierarchical, executable-aware Judge Framework that tests (i) compliance, (ii) attack success, (iii) syntactic correctness, and (iv) runtime executability, moving beyond refusal to measure deployable harm. Using seven LLMs from five families as backends, we find that under prompt-only conditions in JAWS-0, code agents accept 61% of attacks on average; 58% are harmful, 52% parse, and 27% run end-to-end. Moving to single-file regime in JAWS-1 drives compliance to ~ 100% for capable models and yields a mean ASR (Attack Success Rate) ~ 71%; the multi-file regime (JAWS-M) raises mean ASR to ~ 75%, with 32% instantly deployable attack code. Across models, wrapping an LLM in an agent substantially increases vulnerability -- ASR raises by 1.6x -- because initial refusals are frequently overturned during later planning/tool-use steps. Category-level analyses identify which attack classes are most vulnerable and most readily deployable, while others exhibit large execution gaps. These findings motivate execution-aware defenses, code-contextual safety filters, and mechanisms that preserve refusal decisions throughout the agent's multi-step reasoning and tool use.

Paperid: 818, https://arxiv.org/pdf/2510.00791.pdf

Abstract:
Quantum key distribution (QKD) enables Alice and Bob to exchange a secret key over a public, untrusted quantum channel. Compared to classical key exchange, QKD achieves everlasting security: after the protocol execution the key is secure against adversaries that can do unbounded computations. On the flip side, while classical key exchange can be achieved non-interactively (with two simultaneous messages between Alice and Bob), no non-interactive protocol is known that provides everlasting security, even using quantum information. In this work, we make progress on this problem. Our main technical contribution is a computational variant of the celebrated monogamy of entanglement game, where the secret is only computationally hidden from the players, rather than information-theoretically. In these settings, we prove a negligible bound on the maximal winning probability over all strategies. As a direct application, we obtain a non-interactive (simultaneous message) QKD protocol from any post-quantum classical non-interactive key exchange, which satisfies everlastingly secure assuming Alice and Bob agree on the same key. The protocol only uses EPR pairs and standard and Hadamard basis measurements, making it suitable for near-term quantum hardware. We also propose how to convert this protocol into a two-round protocol that satisfies the standard notion of everlasting security. Finally, we prove a no-go theorem which establishes that (in contrast to the case of ordinary multi-round QKD) entanglement is necessary for non-interactive QKD, i.e., the messages sent by Alice and Bob cannot both be unentangled with their respective quantum memories if the protocol is to be everlastingly secure.

Paperid: 819, https://arxiv.org/pdf/2509.25448.pdf

Abstract:
Large language models (LLMs) are often modified after release through post-processing such as post-training or quantization, which makes it challenging to determine whether one model is derived from another. Existing provenance detection methods have two main limitations: (1) they embed signals into the base model before release, which is infeasible for already published models, or (2) they compare outputs across models using hand-crafted or random prompts, which are not robust to post-processing. In this work, we propose LLMPrint, a novel detection framework that constructs fingerprints by exploiting LLMs' inherent vulnerability to prompt injection. Our key insight is that by optimizing fingerprint prompts to enforce consistent token preferences, we can obtain fingerprints that are both unique to the base model and robust to post-processing. We further develop a unified verification procedure that applies to both gray-box and black-box settings, with statistical guarantees. We evaluate LLMPrint on five base models and around 700 post-trained or quantized variants. Our results show that LLMPrint achieves high true positive rates while keeping false positive rates near zero.

Paperid: 820, https://arxiv.org/pdf/2509.22022.pdf

Abstract:
Distributed Point Functions (DPFs) enable sharing secret point functions across multiple parties, supporting privacy-preserving technologies such as Private Information Retrieval, and anonymous communications. While 2-party PRG-based schemes with logarithmic key sizes have been known for a decade, extending these solutions to multi-party settings has proven challenging. In particular, PRG-based multi-party DPFs have historically struggled with practicality due to key sizes growing exponentially with the number of parties and the field size. Our work addresses this efficiency bottleneck by optimizing the PRG-based multi-party DPF scheme of Boyle et al. (EUROCRYPT'15). By leveraging the honest-majority assumption, we eliminate the exponential factor present in this scheme. Our construction is the first PRG-based multi-party DPF scheme with practical key sizes, and provides key up to 3x smaller than the best known multi-party DPF. This work demonstrates that with careful optimization, PRG-based multi-party DPFs can achieve practical performances, and even obtain top performances.

Paperid: 821, https://arxiv.org/pdf/2509.20680.pdf

Abstract:
Fine-tuning large language models (LLMs) with local data is a widely adopted approach for organizations seeking to adapt LLMs to their specific domains. Given the shared characteristics in data across different organizations, the idea of collaboratively fine-tuning an LLM using data from multiple sources presents an appealing opportunity. However, organizations are often reluctant to share local data, making centralized fine-tuning impractical. Federated learning (FL), a privacy-preserving framework, enables clients to retain local data while sharing only model parameters for collaborative training, offering a potential solution. While fine-tuning LLMs on centralized datasets risks data leakage through next-token prediction, the iterative aggregation process in FL results in a global model that encapsulates generalized knowledge, which some believe protects client privacy. In this paper, however, we present contradictory findings through extensive experiments. We show that attackers can still extract training data from the global model, even using straightforward generation methods, with leakage increasing as the model size grows. Moreover, we introduce an enhanced attack strategy tailored to FL, which tracks global model updates during training to intensify privacy leakage. To mitigate these risks, we evaluate privacy-preserving techniques in FL, including differential privacy, regularization-constrained updates and adopting LLMs with safety alignment. Our results provide valuable insights and practical guidelines for reducing privacy risks when training LLMs with FL.

Paperid: 822, https://arxiv.org/pdf/2509.19101.pdf

Abstract:
Backdoor attacks pose a significant security threat to natural language processing (NLP) systems, but existing methods lack explainable trigger mechanisms and fail to quantitatively model vulnerability patterns. This work pioneers the quantitative connection between explainable artificial intelligence (XAI) and backdoor attacks, introducing Sensitron, a novel modular framework for crafting stealthy and robust backdoor triggers. Sensitron employs a progressive refinement approach where Dynamic Meta-Sensitivity Analysis (DMSA) first identifies potentially vulnerable input tokens, Hierarchical SHAP Estimation (H-SHAP) then provides explainable attribution to precisely pinpoint the most influential tokens, and finally a Plug-and-Rank mechanism that generates contextually appropriate triggers. We establish the first mathematical correlation (Sensitivity Ranking Correlation, SRC=0.83) between explainability scores and empirical attack success, enabling precise targeting of model vulnerabilities. Sensitron achieves 97.8% Attack Success Rate (ASR) (+5.8% over state-of-the-art (SOTA)) with 85.4% ASR at 0.1% poisoning rate, demonstrating robust resistance against multiple SOTA defenses. This work reveals fundamental NLP vulnerabilities and provides new attack vectors through weaponized explainability.

Paperid: 823, https://arxiv.org/pdf/2509.16870.pdf

Abstract:
Intelligent software systems powered by Large Language Models (LLMs) are increasingly deployed in critical sectors, raising concerns about their safety during runtime. Through an industry-academic collaboration when deploying an LLM-powered virtual customer assistant, a critical software engineering challenge emerged: how to enhance a safer deployment of LLM-powered software systems at runtime? While LlamaGuard, the current state-of-the-art runtime guardrail, offers protection against unsafe inputs, our study reveals a Defense Success Rate (DSR) drop of 24% under obfuscation- and template-based jailbreak attacks. In this paper, we propose DecipherGuard, a novel framework that integrates a deciphering layer to counter obfuscation-based prompts and a low-rank adaptation mechanism to enhance guardrail effectiveness against template-based attacks. Empirical evaluation on over 22,000 prompts demonstrates that DecipherGuard improves DSR by 36% to 65% and Overall Guardrail Performance (OGP) by 20% to 50% compared to LlamaGuard and two other runtime guardrails. These results highlight the effectiveness of DecipherGuard in defending LLM-powered software systems against jailbreak attacks during runtime.

Paperid: 824, https://arxiv.org/pdf/2509.12574.pdf

Abstract:
Existing watermarking methods for large language models (LLMs) mainly embed watermark by adjusting the token sampling prediction or post-processing, lacking intrinsic coupling with LLMs, which may significantly reduce the semantic quality of the generated marked texts. Traditional watermarking methods based on training or fine-tuning may be extendable to LLMs. However, most of them are limited to the white-box scenario, or very time-consuming due to the massive parameters of LLMs. In this paper, we present a new watermarking framework for LLMs, where the watermark is embedded into the LLM by manipulating the internal parameters of the LLM, and can be extracted from the generated text without accessing the LLM. Comparing with related methods, the proposed method entangles the watermark with the intrinsic parameters of the LLM, which better balances the robustness and imperceptibility of the watermark. Moreover, the proposed method enables us to extract the watermark under the black-box scenario, which is computationally efficient for use. Experimental results have also verified the feasibility, superiority and practicality. This work provides a new perspective different from mainstream works, which may shed light on future research.

Paperid: 825, https://arxiv.org/pdf/2509.06504.pdf

Abstract:
With the growing demand for cross-language codebase migration, evaluating LLMs' security implications in translation tasks has become critical. Existing evaluations primarily focus on syntactic or functional correctness at the function level, neglecting the critical dimension of security. To enable security evaluation, we construct STED (Security-centric Translation Evaluation Dataset), the first dataset specifically designed for evaluating the security implications of LLM-based code translation. It comprises 720 security-related code samples across five programming languages and nine high-impact CWE categories, sourced from CVE/NVD and manually verified for translation tasks. Our evaluation framework consists of two independent assessment modules: (1) rigorous evaluation by security researchers, and (2) automated analysis via LLM-as-a-judge. Together they evaluate three critical aspects: functional correctness, vulnerability preservation, and vulnerability introduction rates. Our large-scale evaluation of five state-of-the-art LLMs across 6,000 translation instances reveals significant security degradation, with 28.6-45% of translations introducing new vulnerabilities--particularly for web-related flaws like input validation, where LLMs show consistent weaknesses. Furthermore, we develop a Retrieval-Augmented Generation (RAG)-based mitigation strategy that reduces translation-induced vulnerabilities by 32.8%, showing the potential of knowledge-enhanced prompting.

Paperid: 826, https://arxiv.org/pdf/2509.06326.pdf

Abstract:
As on-device LLMs(e.g., Apple on-device Intelligence) are widely adopted to reduce network dependency, improve privacy, and enhance responsiveness, verifying the legitimacy of models running on local devices becomes critical. Existing attestation techniques are not suitable for billion-parameter Large Language Models (LLMs), struggling to remain both time- and memory-efficient while addressing emerging threats in the LLM era. In this paper, we present AttestLLM, the first-of-its-kind attestation framework to protect the hardware-level intellectual property (IP) of device vendors by ensuring that only authorized LLMs can execute on target platforms. AttestLLM leverages an algorithm/software/hardware co-design approach to embed robust watermarking signatures onto the activation distributions of LLM building blocks. It also optimizes the attestation protocol within the Trusted Execution Environment (TEE), providing efficient verification without compromising inference throughput. Extensive proof-of-concept evaluations on LLMs from Llama, Qwen, and Phi families for on-device use cases demonstrate AttestLLM's attestation reliability, fidelity, and efficiency. Furthermore, AttestLLM enforces model legitimacy and exhibits resilience against model replacement and forgery attacks.

Paperid: 827, https://arxiv.org/pdf/2509.05681.pdf

Abstract:
Decentralized Finance (DeFi) attacks have resulted in significant losses, often orchestrated through Adversarial Exploiter Contracts (AECs) that exploit vulnerabilities in victim smart contracts. To proactively identify such threats, this paper targets the explainable detection of AECs. Existing detection methods struggle to capture semantic dependencies and lack interpretability, limiting their effectiveness and leaving critical knowledge gaps in AEC analysis. To address these challenges, we introduce SEASONED, an effective, self-explanatory, and robust framework for AEC detection. SEASONED extracts semantic information from contract bytecode to construct a semantic relation graph (SRG), and employs a self-counterfactual explainable detector (SCFED) to classify SRGs and generate explanations that highlight the core attack logic. SCFED further enhances robustness, generalizability, and data efficiency by extracting representative information from these explanations. Both theoretical analysis and experimental results demonstrate the effectiveness of SEASONED, which showcases outstanding detection performance, robustness, generalizability, and data efficiency learning ability. To support further research, we also release a new dataset of 359 AECs.

Paperid: 828, https://arxiv.org/pdf/2509.04080.pdf

Abstract:
The security of Industrial Control Systems (ICSs) is critical to ensuring the safety of industrial processes and personnel. The rapid adoption of Industrial Internet of Things (IIoT) technologies has expanded system functionality but also increased the attack surface, exposing ICSs to a growing range of cyber threats. Honeypots provide a means to detect and analyze such threats by emulating target systems and capturing attacker behavior. However, traditional ICS honeypots, often limited to software-based simulations of a single Programmable Logic Controller (PLC), lack the realism required to engage sophisticated adversaries. In this work, we introduce a modular honeynet framework named ICSLure. The framework has been designed to emulate realistic ICS environments. Our approach integrates physical PLCs interacting with live data sources via industrial protocols such as Modbus and Profinet RTU, along with virtualized network components including routers, switches, and Remote Terminal Units (RTUs). The system incorporates comprehensive monitoring capabilities to collect detailed logs of attacker interactions. We demonstrate that our framework enables coherent and high-fidelity emulation of real-world industrial plants. This high-interaction environment significantly enhances the quality of threat data collected and supports advanced analysis of ICS-specific attack strategies, contributing to more effective detection and mitigation techniques.

Paperid: 829, https://arxiv.org/pdf/2508.21457.pdf

Abstract:
Phishing is a pervasive form of social engineering in which attackers impersonate trusted entities to steal information or induce harmful actions. Text-based phishing dominates for its low cost, scalability, and concealability, advantages recently amplified by large language models (LLMs) that enable ``Phishing-as-a-Service'' attacks at scale within minutes. Despite the growing research into LLM-facilitated phishing attacks, consolidated systematic research on the phishing attack life cycle remains scarce. In this work, we present the first systematization of knowledge (SoK) on LLM-generated phishing, offering an end-to-end analysis that spans generation techniques, attack features, and mitigation strategies. We introduce Generation-Characterization-Defense (GenCharDef), which systematizes the ways in which LLM-generated phishing differs from traditional phishing across methodologies, security perspectives, data dependencies, and evaluation practices. This framework highlights unique challenges of LLM-driven phishing, providing a coherent foundation for understanding the evolving threat landscape and guiding the design of more resilient defenses.

Paperid: 830, https://arxiv.org/pdf/2508.21393.pdf

Abstract:
Fine-tuning large language models (LLMs) is crucial for adapting them to specific tasks, yet it remains computationally demanding and raises concerns about correctness and privacy, particularly in untrusted environments. Although parameter-efficient methods like Low-Rank Adaptation (LoRA) significantly reduce resource requirements, ensuring the security and verifiability of fine-tuning under zero-knowledge constraints remains an unresolved challenge. To address this, we introduce zkLoRA, the first framework to integrate LoRA fine-tuning with zero-knowledge proofs (ZKPs), achieving provable security and correctness. zkLoRA employs advanced cryptographic techniques -- such as lookup arguments, sumcheck protocols, and polynomial commitments -- to verify both arithmetic and non-arithmetic operations in Transformer-based architectures. The framework provides end-to-end verifiability for forward propagation, backward propagation, and parameter updates during LoRA fine-tuning, while safeguarding the privacy of model parameters and training data. Leveraging GPU-based implementations, zkLoRA demonstrates practicality and efficiency through experimental validation on open-source LLMs like LLaMA, scaling up to 13 billion parameters. By combining parameter-efficient fine-tuning with ZKPs, zkLoRA bridges a critical gap, enabling secure and trustworthy deployment of LLMs in sensitive or untrusted environments.

Paperid: 831, https://arxiv.org/pdf/2508.20863.pdf

Abstract:
Large Language Models (LLMs) are increasingly being integrated into the scientific peer-review process, raising new questions about their reliability and resilience to manipulation. In this work, we investigate the potential for hidden prompt injection attacks, where authors embed adversarial text within a paper's PDF to influence the LLM-generated review. We begin by formalising three distinct threat models that envision attackers with different motivations -- not all of which implying malicious intent. For each threat model, we design adversarial prompts that remain invisible to human readers yet can steer an LLM's output toward the author's desired outcome. Using a user study with domain scholars, we derive four representative reviewing prompts used to elicit peer reviews from LLMs. We then evaluate the robustness of our adversarial prompts across (i) different reviewing prompts, (ii) different commercial LLM-based systems, and (iii) different peer-reviewed papers. Our results show that adversarial prompts can reliably mislead the LLM, sometimes in ways that adversely affect a "honest-but-lazy" reviewer. Finally, we propose and empirically assess methods to reduce detectability of adversarial prompts under automated content checks.

Paperid: 832, https://arxiv.org/pdf/2508.19774.pdf

Abstract:
Pickle deserialization vulnerabilities have persisted throughout Python's history, remaining widely recognized yet unresolved. Due to its ability to transparently save and restore complex objects into byte streams, many AI/ML frameworks continue to adopt pickle as the model serialization protocol despite its inherent risks. As the open-source model ecosystem grows, model-sharing platforms such as Hugging Face have attracted massive participation, significantly amplifying the real-world risks of pickle exploitation and opening new avenues for model supply chain poisoning. Although several state-of-the-art scanners have been developed to detect poisoned models, their incomplete understanding of the poisoning surface leaves the detection logic fragile and allows attackers to bypass them. In this work, we present the first systematic disclosure of the pickle-based model poisoning surface from both model loading and risky function perspectives. Our research demonstrates how pickle-based model poisoning can remain stealthy and highlights critical gaps in current scanning solutions. On the model loading surface, we identify 22 distinct pickle-based model loading paths across five foundational AI/ML frameworks, 19 of which are entirely missed by existing scanners. We further develop a bypass technique named Exception-Oriented Programming (EOP) and discover 9 EOP instances, 7 of which can bypass all scanners. On the risky function surface, we discover 133 exploitable gadgets, achieving almost a 100% bypass rate. Even against the best-performing scanner, these gadgets maintain an 89% bypass rate. By systematically revealing the pickle-based model poisoning surface, we achieve practical and robust bypasses against real-world scanners. We responsibly disclose our findings to corresponding vendors, receiving acknowledgments and a $6000 bug bounty.

Paperid: 833, https://arxiv.org/pdf/2508.18370.pdf

Abstract:
Large language models (LLMs) have demonstrated exceptional capabilities when trained within executable runtime environments, notably excelling at software engineering tasks through verified feedback loops. Yet, scalable and generalizable execution-grounded environments remain scarce, limiting progress in training more capable ML agents. We introduce CTF-Dojo, the first large-scale executable runtime tailored for training LLMs with verifiable feedback, featuring 658 fully functional Capture-The-Flag (CTF)-style challenges containerized in Docker with guaranteed reproducibility. To enable rapid scaling without manual intervention, we develop CTF-Forge, an automated pipeline that transforms publicly available artifacts into ready-to-use execution environments in minutes, eliminating weeks of expert configuration traditionally required. We trained LLM-based agents on just 486 high-quality, execution-verified trajectories from CTF-Dojo, achieving up to 11.6% absolute gains over strong baselines across three competitive benchmarks: InterCode-CTF, NYU CTF Bench, and Cybench. Our best-performing 32B model reaches 31.9% Pass@1, establishing a new open-weight state-of-the-art that rivals frontier models like DeepSeek-V3-0324 and Gemini-2.5-Flash. By framing CTF-style tasks as a benchmark for executable-agent learning, CTF-Dojo demonstrates that execution-grounded training signals are not only effective but pivotal in advancing high-performance ML agents without dependence on costly proprietary systems.

Paperid: 834, https://arxiv.org/pdf/2508.12897.pdf

Abstract:
Large Reasoning Models (LRMs) have demonstrated impressive performance across various tasks due to their powerful reasoning capabilities. However, their safety performance remains a significant concern. In this paper, we explore the reasons behind the vulnerability of LRMs. Based on this, we propose a novel method to improve the safety of LLMs without sacrificing their reasoning capability. Specifically, we exploit the competition between LRM's reasoning ability and safety ability, and achieve jailbreak by improving LRM's reasoning performance to reduce its safety performance. We then introduce an alignment strategy based on Fuzzification to balance Safety-Reasoning (FuSaR), by detoxifying the harmful reasoning process, where both the dangerous entities and the dangerous procedures in the reasoning steps are hidden. FuSaR successfully mitigates safety risks while preserving core reasoning information. We validate this strategy through alignment experiments on several open-source LRMs using detoxified reasoning data. The results compared with existing baselines conclusively show that FuSaR is an efficient alignment strategy to simultaneously enhance both the reasoning capability and safety of LRMs.

Paperid: 835, https://arxiv.org/pdf/2508.03829.pdf

Abstract:
The growing deployment of Large Language Models (LLMs) in real-world applications has raised concerns about their potential misuse in generating harmful or deceptive content. To address this issue, watermarking techniques have emerged as a promising solution by embedding identifiable binary messages into generated text for origin verification and misuse tracing. While recent efforts have explored multi-bit watermarking schemes capable of embedding rich information such as user identifiers, they typically suffer from the fundamental trade-off between text quality and decoding accuracy: to ensure reliable message decoding, they have to restrict the size of preferred token sets during encoding, yet such restrictions reduce the quality of the generated content. In this work, we propose MajorMark, a novel watermarking method that improves this trade-off through majority bit-aware encoding. MajorMark selects preferred token sets based on the majority bit of the message, enabling a larger and more flexible sampling of tokens. In contrast to prior methods that rely on token frequency analysis for decoding, MajorMark employs a clustering-based decoding strategy, which maintains high decoding accuracy even when the preferred token set is large, thus preserving both content quality and decoding accuracy. We further introduce MajorMark$^+$, which partitions the message into multiple blocks to independently encode and deterministically decode each block, thereby further enhancing the quality of watermarked text and improving decoding accuracy. Extensive experiments on state-of-the-art LLMs demonstrate that our methods significantly enhance both decoding accuracy and text generation quality, outperforming prior multi-bit watermarking baselines.

Paperid: 836, https://arxiv.org/pdf/2508.03681.pdf

Abstract:
Transparency and explainability are two important aspects to be considered when employing black-box machine learning models in high-stake applications. Providing counterfactual explanations is one way of catering this requirement. However, this also poses a threat to the privacy of the institution that is providing the explanation, as well as the user who is requesting it. In this work, we are primarily concerned with the user's privacy who wants to retrieve a counterfactual instance, without revealing their feature vector to the institution. Our framework retrieves the exact nearest neighbor counterfactual explanation from a database of accepted points while achieving perfect, information-theoretic, privacy for the user. First, we introduce the problem of private counterfactual retrieval (PCR) and propose a baseline PCR scheme that keeps the user's feature vector information-theoretically private from the institution. Building on this, we propose two other schemes that reduce the amount of information leaked about the institution database to the user, compared to the baseline scheme. Second, we relax the assumption of mutability of all features, and consider the setting of immutable PCR (I-PCR). Here, the user retrieves the nearest counterfactual without altering a private subset of their features, which constitutes the immutable set, while keeping their feature vector and immutable set private from the institution. For this, we propose two schemes that preserve the user's privacy information-theoretically, but ensure varying degrees of database privacy. Third, we extend our PCR and I-PCR schemes to incorporate user's preference on transforming their attributes, so that a more actionable explanation can be received. Finally, we present numerical results to support our theoretical findings, and compare the database leakage of the proposed schemes.

Paperid: 837, https://arxiv.org/pdf/2508.02116.pdf

Abstract:
As a versatile AI application, voice assistants (VAs) have become increasingly popular, but are vulnerable to security threats. Attackers have proposed various inaudible attacks, but are limited by cost, distance, or LoS. Therefore, we propose \name~Attack, a long-range, cross-barrier, and interference-free inaudible voice attack via solid channels. We begin by thoroughly analyzing the dispersion effect in solid channels, revealing its unique impact on signal propagation. To avoid distortions in voice commands, we design a modular command generation model that parameterizes attack distance, victim audio, and medium dispersion features to adapt to variations in the solid-channel state. Additionally, we propose SUAD Defense, a universal defense that uses ultrasonic perturbation signals to block inaudible voice attacks (IVAs) without impacting normal speech. Since the attack can occur at arbitrary frequencies and times, we propose a training method that randomizes both time and frequency to generate perturbation signals that break ultrasonic commands. Notably, the perturbation signal is modulated to an inaudible frequency without affecting the functionality of voice commands for VAs. Experiments on six smartphones have shown that SUAD Attack achieves activation success rates above 89.8% and SUAD Defense blocks IVAs with success rates exceeding 98%.

Paperid: 838, https://arxiv.org/pdf/2508.01845.pdf

Abstract:
Adversarial attacks against computer vision systems have emerged as a critical research area that challenges the fundamental assumptions about neural network robustness and security. This comprehensive survey examines the evolving landscape of adversarial techniques, revealing their dual nature as both sophisticated security threats and valuable defensive tools. We provide a systematic analysis of adversarial attack methodologies across three primary domains: pixel-space attacks, physically realizable attacks, and latent-space attacks. Our investigation traces the technical evolution from early gradient-based methods such as FGSM and PGD to sophisticated optimization techniques incorporating momentum, adaptive step sizes, and advanced transferability mechanisms. We examine how physically realizable attacks have successfully bridged the gap between digital vulnerabilities and real-world threats through adversarial patches, 3D textures, and dynamic optical perturbations. Additionally, we explore the emergence of latent-space attacks that leverage semantic structure in internal representations to create more transferable and meaningful adversarial examples. Beyond traditional offensive applications, we investigate the constructive use of adversarial techniques for vulnerability assessment in biometric authentication systems and protection against malicious generative models. Our analysis reveals critical research gaps, particularly in neural style transfer protection and computational efficiency requirements. This survey contributes a comprehensive taxonomy, evolution analysis, and identification of future research directions, aiming to advance understanding of adversarial vulnerabilities and inform the development of more robust and trustworthy computer vision systems.

Paperid: 839, https://arxiv.org/pdf/2507.18036.pdf

Abstract:
The intellectual property of deep neural network (DNN) models can be protected with DNN watermarking, which embeds copyright watermarks into model parameters (white-box), model behavior (black-box), or model outputs (box-free), and the watermarks can be subsequently extracted to verify model ownership or detect model theft. Despite recent advances, these existing methods are inherently intrusive, as they either modify the model parameters or alter the structure. This natural intrusiveness raises concerns about watermarking-induced shifts in model behavior and the additional cost of fine-tuning, further exacerbated by the rapidly growing model size. As a result, model owners are often reluctant to adopt DNN watermarking in practice, which limits the development of practical Watermarking as a Service (WaaS) systems. To address this issue, we introduce Nonintrusive Watermarking as a Service (NWaaS), a novel trustless paradigm designed for X-to-Image models, in which we hypothesize that with the model untouched, an owner-defined watermark can still be extracted from model outputs. Building on this concept, we propose ShadowMark, a concrete implementation of NWaaS which addresses critical deployment challenges by establishing a robust and nonintrusive side channel in the protected model's black-box API, leveraging a key encoder and a watermark decoder. It is significantly distinctive from existing solutions by attaining the so-called absolute fidelity and being applicable to different DNN architectures, while being also robust against existing attacks, eliminating the fidelity-robustness trade-off. Extensive experiments on image-to-image, noise-to-image, noise-and-text-to-image, and text-to-image models, demonstrate the efficacy and practicality of ShadowMark for real-world deployment of nonintrusive DNN watermarking.

Paperid: 840, https://arxiv.org/pdf/2507.17736.pdf

Abstract:
We introduce the problem of symmetric private information retrieval (SPIR) on replicated databases modeled by a simple graph. In this model, each vertex corresponds to a server, and a message is replicated on two servers if and only if there is an edge between them. We consider the setting where the server-side common randomness necessary to accomplish SPIR is also replicated at the servers according to the graph, and we call this as message-specific common randomness. In this setting, we establish a lower bound on the SPIR capacity, i.e., the maximum download rate, for general graphs, by proposing an achievable SPIR scheme. Next, we prove that, for any SPIR scheme to be feasible, the minimum size of message-specific randomness should be equal to the size of a message. Finally, by providing matching upper bounds, we derive the exact SPIR capacity for the class of path and regular graphs.

Paperid: 841, https://arxiv.org/pdf/2507.09133.pdf

Abstract:
With the increasing complexity of cyberattacks, the proactive and forward-looking nature of threat intelligence has become more crucial for threat detection and provenance analysis. However, translating high-level attack patterns described in Tactics, Techniques, and Procedures (TTP) intelligence into actionable security policies remains a significant challenge. This challenge arises from the semantic gap between high-level threat intelligence and low-level provenance log. To address this issue, this paper introduces CLIProv, a novel approach for detecting threat behaviors in a host system. CLIProv employs a multimodal framework that leverages contrastive learning to align the semantics of provenance logs with threat intelligence, effectively correlating system intrusion activities with attack patterns. Furthermore, CLIProv formulates threat detection as a semantic search problem, identifying attack behaviors by searching for threat intelligence that is most semantically similar to the log sequence. By leveraging attack pattern information in threat intelligence, CLIProv identifies TTPs and generates complete and concise attack scenarios. Experimental evaluations on standard datasets show that CLIProv effectively identifies attack behaviors in system provenance logs, offering valuable references for potential techniques. Compared to state-of-the-art methods, CLIProv achieves higher precision and significantly improved detection efficiency.

Paperid: 842, https://arxiv.org/pdf/2507.06850.pdf

Abstract:
The rapid adoption of Large Language Model (LLM) agents and multi-agent systems enables remarkable capabilities in natural language processing and generation. However, these systems introduce unprecedented security vulnerabilities that extend beyond traditional content generation attacks to system-level compromise. This paper presents a comprehensive evaluation of the security of LLMs used as reasoning engines within autonomous agents, highlighting how they can be exploited as attack vectors capable of achieving complete computer takeover. We focus on how different attack surfaces and trust boundaries - Direct Prompt Injection, RAG Backdoor, and Inter Agent Trust - can be leveraged to orchestrate such takeovers. We demonstrate that adversaries can effectively coerce popular LLMs (including GPT-4, Claude-4 and Gemini-2.5) into autonomously installing and executing malware on victim machines. Our evaluation of 18 state-of-the-art LLMs reveals an alarming scenario: 94.4% of models succumb to Direct Prompt Injection and 83.3% are vulnerable to the more stealth and evasive RAG Backdoor Attack. Notably, we tested trust boundaries within multi-agent systems, where LLM agents interact and influence each other, and we revealed a critical security flaw: LLMs which successfully resist direct injection or RAG backdoor will execute identical payloads when requested by peer agents. Our findings show that 100.0% of tested LLMs can be compromised through Inter-Agent Trust Exploitation attacks and that every model exhibits context-dependent security behaviors that create exploitable blind spots. Our results also highlight the need to increase awareness and research on the security risks of LLMs, showing a paradigm shift in cybersecurity threats, where AI tools themselves become sophisticated attack vectors.

Paperid: 843, https://arxiv.org/pdf/2507.03064.pdf

Abstract:
Today's Internet of Things (IoT) has evolved from simple sensing and actuation devices to those with embedded processing and intelligent services, enabling rich collaborations between users and their devices. However, enabling such collaboration becomes challenging when transient devices need to interact with host devices in temporarily visited environments. In such cases, fine-grained access control policies are necessary to ensure secure interactions; however, manually implementing them is often impractical for non-expert users. Moreover, at run-time, the system must automatically configure the devices and enforce such fine-grained access control rules. Additionally, the system must address the heterogeneity of devices. In this paper, we present CollabIoT, a system that enables secure and seamless device collaboration in transient IoT environments. CollabIoT employs a Large language Model (LLM)-driven approach to convert users' high-level intents to fine-grained access control policies. To support secure and seamless device collaboration, CollabIoT adopts capability-based access control for authorization and uses lightweight proxies for policy enforcement, providing hardware-independent abstractions. We implement a prototype of CollabIoT's policy generation and auto configuration pipelines and evaluate its efficacy on an IoT testbed and in large-scale emulated environments. We show that our LLM-based policy generation pipeline is able to generate functional and correct policies with 100% accuracy. At runtime, our evaluation shows that our system configures new devices in ~150 ms, and our proxy-based data plane incurs network overheads of up to 2 ms and access control overheads up to 0.3 ms.

Paperid: 844, https://arxiv.org/pdf/2507.02424.pdf

Abstract:
Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can generate hundreds of thousands of alerts per hour, overwhelming analysts with logs requiring rapidly evolving expertise. Conventional machine-learning detectors reduce alert volume but still yield many false positives, while standard Retrieval-Augmented Generation (RAG) pipelines often retrieve irrelevant context and fail to justify predictions. We present CyberRAG, a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks. A central LLM agent orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii) tool adapters for enrichment and alerting; and (iii) an iterative retrieval-and-reason loop that queries a domain-specific knowledge base until evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning. This architecture autonomously refines threat labels and natural-language justifications, reducing false positives and enhancing interpretability. It is also extensible: new attack types can be supported by adding classifiers without retraining the core agent. CyberRAG was evaluated on SQL Injection, XSS, and SSTI, achieving over 94\% accuracy per class and a final classification accuracy of 94.92\% through semantic orchestration. Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based expert evaluation, with robustness preserved against adversarial and unseen payloads. These results show that agentic, specialist-oriented RAG can combine high detection accuracy with trustworthy, SOC-ready prose, offering a flexible path toward partially automated cyber-defense workflows.

Paperid: 845, https://arxiv.org/pdf/2507.01487.pdf

Abstract:
Ishai et al. (FOCS'06) introduced secure shuffling as an efficient building block for private data aggregation. Recently, the field of differential privacy has revived interest in secure shufflers by highlighting the privacy amplification they can provide in various computations. Although several works argue for the utility of secure shufflers, they often treat them as black boxes; overlooking the practical vulnerabilities and performance trade-offs of existing implementations. This leaves a central question open: what makes a good secure shuffler? This survey addresses that question by identifying, categorizing, and comparing 26 secure protocols that realize the necessary shuffling functionality. To enable a meaningful comparison, we adapt and unify existing security definitions into a consistent set of properties. We also present an overview of privacy-preserving technologies that rely on secure shufflers, offer practical guidelines for selecting appropriate protocols, and outline promising directions for future work.

Paperid: 846, https://arxiv.org/pdf/2512.22306.pdf

Abstract:
Large Language Models (LLMs) have demonstrated significant potential in automated software security, particularly in vulnerability detection. However, existing benchmarks primarily focus on isolated, single-vulnerability samples or function-level classification, failing to reflect the complexity of real-world software where multiple interacting vulnerabilities often coexist within large files. Recent studies indicate that LLMs suffer from "count bias" and "selection bias" in multi-label tasks, yet this has not been rigorously quantified in the domain of code security. In this work, we introduce a comprehensive benchmark for Multi-Vulnerability Detection across four major languages: C, C++, Python, and JavaScript. We construct a dataset of 40,000 files by systematically injecting controlled counts of vulnerabilities (1, 3, 5, and 9) into long-context code samples (7.5k-10k tokens) sourced from CodeParrot. We evaluate five state-of-the-art LLMs, including GPT-4o-mini, Llama-3.3-70B, and the Qwen-2.5 series. Our results reveal a sharp degradation in performance as vulnerability density increases. While Llama-3.3-70B achieves near-perfect F1 scores (approximately 0.97) on single-vulnerability C tasks, performance drops by up to 40% in high-density settings. Notably, Python and JavaScript show distinct failure modes compared to C/C++, with models exhibiting severe "under-counting" (Recall dropping to less than 0.30) in complex Python files.

Paperid: 847, https://arxiv.org/pdf/2512.15641.pdf

Abstract:
The rapid advancement of deep learning has turned models into highly valuable assets due to their reliance on massive data and costly training processes. However, these models are increasingly vulnerable to leakage and theft, highlighting the critical need for robust intellectual property protection. Model watermarking has emerged as an effective solution, with black-box watermarking gaining significant attention for its practicality and flexibility. Nonetheless, existing black-box methods often fail to better balance covertness (hiding the watermark to prevent detection and forgery) and robustness (ensuring the watermark resists removal)-two essential properties for real-world copyright verification. In this paper, we propose ComMark, a novel black-box model watermarking framework that leverages frequency-domain transformations to generate compressed, covert, and attack-resistant watermark samples by filtering out high-frequency information. To further enhance watermark robustness, our method incorporates simulated attack scenarios and a similarity loss during training. Comprehensive evaluations across diverse datasets and architectures demonstrate that ComMark achieves state-of-the-art performance in both covertness and robustness. Furthermore, we extend its applicability beyond image recognition to tasks including speech recognition, sentiment analysis, image generation, image captioning, and video recognition, underscoring its versatility and broad applicability.

Paperid: 848, https://arxiv.org/pdf/2512.12112.pdf

Abstract:
Industry 5.0's increasing integration of IT and OT systems is transforming industrial operations but also expanding the cyber-physical attack surface. Industrial Control Systems (ICS) face escalating security challenges as traditional siloed defences fail to provide coherent, cross-domain threat insights. We present BRIDG-ICS (BRIDge for Industrial Control Systems), an AI-driven Knowledge Graph (KG) framework for context-aware threat analysis and quantitative assessment of cyber resilience in smart manufacturing environments. BRIDG-ICS fuses heterogeneous industrial and cybersecurity data into an integrated Industrial Security Knowledge Graph linking assets, vulnerabilities, and adversarial behaviours with probabilistic risk metrics (e.g. exploit likelihood, attack cost). This unified graph representation enables multi-stage attack path simulation using graph-analytic techniques. To enrich the graph's semantic depth, the framework leverages Large Language Models (LLMs): domain-specific LLMs extract cybersecurity entities, predict relationships, and translate natural-language threat descriptions into structured graph triples, thereby populating the knowledge graph with missing associations and latent risk indicators. This unified AI-enriched KG supports multi-hop, causality-aware threat reasoning, improving visibility into complex attack chains and guiding data-driven mitigation. In simulated industrial scenarios, BRIDG-ICS scales well, reduces potential attack exposure, and can enhance cyber-physical system resilience in Industry 5.0 settings.

Paperid: 849, https://arxiv.org/pdf/2512.09958.pdf

Abstract:
Quantum federated learning (QFL) is emerging as a key enabler for intelligent, secure, and privacy-preserving model training in next-generation 6G networks. By leveraging the computational advantages of quantum devices, QFL offers significant improvements in learning efficiency and resilience against quantum-era threats. However, future 6G environments are expected to be highly dynamic, decentralized, and data-intensive, which necessitates moving beyond traditional centralized federated learning frameworks. To meet this demand, blockchain technology provides a decentralized, tamper-resistant infrastructure capable of enabling trustless collaboration among distributed quantum edge devices. This paper presents QFLchain, a novel framework that integrates QFL with blockchain to support scalable and secure 6G intelligence. In this work, we investigate four key pillars of \textit{QFLchain} in the 6G context: (i) communication and consensus overhead, (ii) scalability and storage overhead, (iii) energy inefficiency, and (iv) security vulnerability. A case study is also presented, demonstrating potential advantages of QFLchain, based on simulation, over state-of-the-art approaches in terms of training performance.

Paperid: 850, https://arxiv.org/pdf/2512.03207.pdf

Abstract:
The Privacy Sandbox, launched in 2019, is a series of proposals from Google to reduce ``cross-site and cross-app tracking while helping to keep online content and services free for all''. Over the years, Google implemented, experimented, and deprecated some of these APIs into their own products (Chrome, Android, etc.) which raised concerns about the potential of these mechanisms to fundamentally disrupt the advertising, mobile, and web ecosystems. As a result, it is paramount for researchers to understand the consequences that these new technologies, and future ones, will have on billions of users if and when deployed. In this report, we outline our call for privacy, security, usability, and utility evaluations of these APIs, our efforts materialized through the creation and operation of Privacy Sandstorm (https://privacysandstorm.github.io); a research portal to systematically gather resources (overview, analyses, artifacts, etc.) about such proposals. We find that our inventory provides a better visibility and broader perspective on the research findings in that space than what Google lets show through official channels.

Paperid: 851, https://arxiv.org/pdf/2512.01247.pdf

Abstract:
AI character platforms, which allow users to engage in conversations with AI personas, are a rapidly growing application domain. However, their immersive and personalized nature, combined with technical vulnerabilities, raises significant safety concerns. Despite their popularity, a systematic evaluation of their safety has been notably absent. To address this gap, we conduct the first large-scale safety study of AI character platforms, evaluating 16 popular platforms using a benchmark set of 5,000 questions across 16 safety categories. Our findings reveal a critical safety deficit: AI character platforms exhibit an average unsafe response rate of 65.1%, substantially higher than the 17.7% average rate of the baselines. We further discover that safety performance varies significantly across different characters and is strongly correlated with character features such as demographics and personality. Leveraging these insights, we demonstrate that our machine learning model is able identify less safe characters with an F1-score of 0.81. This predictive capability can be beneficial for platforms, enabling improved mechanisms for safer interactions, character search/recommendations, and character creation. Overall, the results and findings offer valuable insights for enhancing platform governance and content moderation for safer AI character platforms.

Paperid: 852, https://arxiv.org/pdf/2511.20801.pdf

Abstract:
Graph Neural Networks (GNNs) have become an effective tool for malware detection by capturing program execution through graph-structured representations. However, important challenges remain regarding scalability, interpretability, and the availability of reliable datasets. This paper brings together six related studies that collectively address these issues. The portfolio begins with a survey of graph-based malware detection and explainability, then advances to new graph reduction methods, integrated reduction-learning approaches, and investigations into the consistency of explanations. It also introduces dual explanation techniques based on subgraph matching and develops ensemble-based models with attention-guided stacked GNNs to improve interpretability. In parallel, curated datasets of control flow graphs are released to support reproducibility and enable future research. Together, these contributions form a coherent line of research that strengthens GNN-based malware detection by enhancing efficiency, increasing transparency, and providing solid experimental foundations.

Paperid: 853, https://arxiv.org/pdf/2511.14005.pdf

Abstract:
Volumetric video has emerged as a key paradigm in eXtended Reality (XR) and immersive multimedia because it enables highly interactive, spatially consistent 3D experiences. However, the transport-layer security for such 3D content remains largely unaddressed. Existing volumetric streaming pipelines inherit uniform encryption schemes from 2D video, overlooking the heterogeneous privacy sensitivity of different geometry and the strict motion-to-photon latency constraints of real-time XR. We take an initial step toward content-aware secure volumetric video delivery by introducing Privis, a saliency-guided transport framework that (i) partitions volumetric assets into independent units, (ii) applies lightweight authenticated encryption with adaptive key rotation, and (iii) employs selective traffic shaping to balance confidentiality and low latency. Privis specifies a generalized transport-layer security architecture for volumetric media, defining core abstractions and adaptive protection mechanisms. We further explore a prototype implementation and present initial latency measurements to illustrate feasibility and design tradeoffs, providing early empirical guidance toward future work on real-time, saliency-conditioned secure delivery.

Paperid: 854, https://arxiv.org/pdf/2511.12448.pdf

Abstract:
We introduce SeedAIchemy, an automated LLM-driven corpus generation tool that makes it easier for developers to implement fuzzing effectively. SeedAIchemy consists of five modules which implement different approaches at collecting publicly available files from the internet. Four of the five modules use large language model (LLM) workflows to construct search terms designed to maximize corpus quality. Corpora generated by SeedAIchemy perform significantly better than a naive corpus and similarly to a manually-curated corpus on a diverse range of target programs and libraries.

Paperid: 855, https://arxiv.org/pdf/2511.11693.pdf

Abstract:
Generative vision-language models like Stable Diffusion demonstrate remarkable capabilities in creative media synthesis, but they also pose substantial risks of producing unsafe, offensive, or culturally inappropriate content when prompted adversarially. Current defenses struggle to align outputs with human values without sacrificing generation quality or incurring high costs. To address these challenges, we introduce VALOR (Value-Aligned LLM-Overseen Rewriter), a modular, zero-shot agentic framework for safer and more helpful text-to-image generation. VALOR integrates layered prompt analysis with human-aligned value reasoning: a multi-level NSFW detector filters lexical and semantic risks; a cultural value alignment module identifies violations of social norms, legality, and representational ethics; and an intention disambiguator detects subtle or indirect unsafe implications. When unsafe content is detected, prompts are selectively rewritten by a large language model under dynamic, role-specific instructions designed to preserve user intent while enforcing alignment. If the generated image still fails a safety check, VALOR optionally performs a stylistic regeneration to steer the output toward a safer visual domain without altering core semantics. Experiments across adversarial, ambiguous, and value-sensitive prompts show that VALOR significantly reduces unsafe outputs by up to 100.00% while preserving prompt usefulness and creativity. These results highlight VALOR as a scalable and effective approach for deploying safe, aligned, and helpful image generation systems in open-world settings.

Paperid: 856, https://arxiv.org/pdf/2511.08985.pdf

Abstract:
Model watermarking techniques can embed watermark information into the protected model for ownership declaration by constructing specific input-output pairs. However, existing watermarks are easily removed when facing model stealing attacks, and make it difficult for model owners to effectively verify the copyright of stolen models. In this paper, we analyze the root cause of the failure of current watermarking methods under model stealing scenarios and then explore potential solutions. Specifically, we introduce a robust watermarking framework, DeepTracer, which leverages a novel watermark samples construction method and a same-class coupling loss constraint. DeepTracer can incur a high-coupling model between watermark task and primary task that makes adversaries inevitably learn the hidden watermark task when stealing the primary task functionality. Furthermore, we propose an effective watermark samples filtering mechanism that elaborately select watermark key samples used in model ownership verification to enhance the reliability of watermarks. Extensive experiments across multiple datasets and models demonstrate that our method surpasses existing approaches in defending against various model stealing attacks, as well as watermark attacks, and achieves new state-of-the-art effectiveness and robustness.

Paperid: 857, https://arxiv.org/pdf/2511.03341.pdf

Abstract:
Barrett's algorithm is one of the most widely used methods for performing modular multiplication, a critical nonlinear operation in modern privacy computing techniques such as homomorphic encryption (HE) and zero-knowledge proofs (ZKP). Since modular multiplication dominates the processing time in these applications, computational complexity and memory limitations significantly impact performance. Computing-in-Memory (CiM) is a promising approach to tackle this problem. However, existing schemes currently suffer from two main problems: 1) Most works focus on low bit-width modular multiplication, which is inadequate for mainstream cryptographic algorithms such as elliptic curve cryptography (ECC) and the RSA algorithm, both of which require high bit-width operations; 2) Recent efforts targeting large number modular multiplication rely on inefficient in-memory logic operations, resulting in high scaling costs for larger bit-widths and increased latency. To address these issues, we propose LaMoS, an efficient SRAM-based CiM design for large-number modular multiplication, offering high scalability and area efficiency. First, we analyze the Barrett's modular multiplication method and map the workload onto SRAM CiM macros for high bit-width cases. Additionally, we develop an efficient CiM architecture and dataflow to optimize large-number modular multiplication. Finally, we refine the mapping scheme for better scalability in high bit-width scenarios using workload grouping. Experimental results show that LaMoS achieves a $7.02\times$ speedup and reduces high bit-width scaling costs compared to existing SRAM-based CiM designs.

Paperid: 858, https://arxiv.org/pdf/2510.23674.pdf

Abstract:
Security in code generation remains a pivotal challenge when applying large language models (LLMs). This paper introduces RefleXGen, an innovative method that significantly enhances code security by integrating Retrieval-Augmented Generation (RAG) techniques with guided self-reflection mechanisms inherent in LLMs. Unlike traditional approaches that rely on fine-tuning LLMs or developing specialized secure code datasets - processes that can be resource-intensive - RefleXGen iteratively optimizes the code generation process through self-assessment and reflection without the need for extensive resources. Within this framework, the model continuously accumulates and refines its knowledge base, thereby progressively improving the security of the generated code. Experimental results demonstrate that RefleXGen substantially enhances code security across multiple models, achieving a 13.6% improvement with GPT-3.5 Turbo, a 6.7% improvement with GPT-4o, a 4.5% improvement with CodeQwen, and a 5.8% improvement with Gemini. Our findings highlight that improving the quality of model self-reflection constitutes an effective and practical strategy for strengthening the security of AI-generated code.

Paperid: 859, https://arxiv.org/pdf/2510.23673.pdf

Abstract:
The Model Context Protocol (MCP) has emerged as a standardized interface enabling seamless integration between Large Language Models (LLMs) and external data sources and tools. While MCP significantly reduces development complexity and enhances agent capabilities, its openness and extensibility introduce critical security vulnerabilities that threaten system trustworthiness and user data protection. This paper systematically analyzes the security landscape of MCP-based systems, identifying three principal threat categories: (1) agent hijacking attacks stemming from protocol design deficiencies; (2) traditional web vulnerabilities in MCP servers; and (3) supply chain security. To address these challenges, we comprehensively survey existing defense strategies, examining both proactive server-side scanning approaches, ranging from layered detection pipelines and agentic auditing frameworks to zero-trust registry systems, and runtime interaction monitoring solutions that provide continuous oversight and policy enforcement. Our analysis reveals that MCP security fundamentally represents a paradigm shift where the attack surface extends from traditional code execution to semantic interpretation of natural language metadata, necessitating novel defense mechanisms tailored to this unique threat model.

Paperid: 860, https://arxiv.org/pdf/2510.22944.pdf

Abstract:
Large language models (LLMs) have become indispensable for automated code generation, yet the quality and security of their outputs remain a critical concern. Existing studies predominantly concentrate on adversarial attacks or inherent flaws within the models. However, a more prevalent yet underexplored issue concerns how the quality of a benign but poorly formulated prompt affects the security of the generated code. To investigate this, we first propose an evaluation framework for prompt quality encompassing three key dimensions: goal clarity, information completeness, and logical consistency. Based on this framework, we construct and publicly release CWE-BENCH-PYTHON, a large-scale benchmark dataset containing tasks with prompts categorized into four distinct levels of normativity (L0-L3). Extensive experiments on multiple state-of-the-art LLMs reveal a clear correlation: as prompt normativity decreases, the likelihood of generating insecure code consistently and markedly increases. Furthermore, we demonstrate that advanced prompting techniques, such as Chain-of-Thought and Self-Correction, effectively mitigate the security risks introduced by low-quality prompts, substantially improving code safety. Our findings highlight that enhancing the quality of user prompts constitutes a critical and effective strategy for strengthening the security of AI-generated code.

Paperid: 861, https://arxiv.org/pdf/2510.22007.pdf

Abstract:
Text watermarking plays a crucial role in ensuring the traceability and accountability of large language model (LLM) outputs and mitigating misuse. While promising, most existing methods assume perfect pseudorandomness. In practice, repetition in generated text induces collisions that create structured dependence, compromising Type I error control and invalidating standard analyses. We introduce a statistical framework that captures this structure through a hierarchical two-layer partition. At its core is the concept of minimal units -- the smallest groups treatable as independent across units while permitting dependence within. Using minimal units, we define a non-asymptotic efficiency measure and cast watermark detection as a minimax hypothesis testing problem. Applied to Gumbel-max and inverse-transform watermarks, our framework produces closed-form optimal rules. It explains why discarding repeated statistics often improves performance and shows that within-unit dependence must be addressed unless degenerate. Both theory and experiments confirm improved detection power with rigorous Type I error control. These results provide the first principled foundation for watermark detection under imperfect pseudorandomness, offering both theoretical insight and practical guidance for reliable tracing of model outputs.

Paperid: 862, https://arxiv.org/pdf/2510.19934.pdf

Abstract:
Differentially private (DP) decentralized Federated Learning (FL) allows local users to collaborate without sharing their data with a central server. However, accurately quantifying the privacy budget of private FL algorithms is challenging due to the co-existence of complex algorithmic components such as decentralized communication and local updates. This paper addresses privacy accounting for two decentralized FL algorithms within the $f$-differential privacy ($f$-DP) framework. We develop two new $f$-DP-based accounting methods tailored to decentralized settings: Pairwise Network $f$-DP (PN-$f$-DP), which quantifies privacy leakage between user pairs under random-walk communication, and Secret-based $f$-Local DP (Sec-$f$-LDP), which supports structured noise injection via shared secrets. By combining tools from $f$-DP theory and Markov chain concentration, our accounting framework captures privacy amplification arising from sparse communication, local iterations, and correlated noise. Experiments on synthetic and real datasets demonstrate that our methods yield consistently tighter $(ε,δ)$ bounds and improved utility compared to Rényi DP-based approaches, illustrating the benefits of $f$-DP in decentralized privacy accounting.

Paperid: 863, https://arxiv.org/pdf/2510.16461.pdf

Abstract:
Software-defined wide area network (SD-WAN) has emerged as a new paradigm for steering a large-scale network flexibly by adopting distributed software-defined network (SDN) controllers. The key to building a logically centralized but physically distributed control-plane is running diverse cluster management protocols to achieve consistency through an exchange of control traffic. Meanwhile, we observe that the control traffic exposes unique time-series patterns and directional relationships due to the operational structure even though the traffic is encrypted, and this pattern can disclose confidential information such as control-plane topology and protocol dependencies, which can be exploited for severe attacks. With this insight, we propose a new SD-WAN fingerprinting system, called Heimdallr. It analyzes periodical and operational patterns of SD-WAN cluster management protocols and the context of flow directions from the collected control traffic utilizing a deep learning-based approach, so that it can classify the cluster management protocols automatically from miscellaneous control traffic datasets. Our evaluation, which is performed in a realistic SD-WAN environment consisting of geographically distant three campus networks and one enterprise network shows that Heimdallr can classify SD-WAN control traffic with $\geq$ 93%, identify individual protocols with $\geq$ 80% macro F-1 scores, and finally can infer control-plane topology with $\geq$ 70% similarity.

Paperid: 864, https://arxiv.org/pdf/2510.16251.pdf

Abstract:
Dynamic program analysis is invaluable for malware detection, debugging, and performance profiling. However, software-based instrumentation incurs high overhead and can be evaded by anti-analysis techniques. In this paper, we propose LibIHT, a hardware-assisted tracing framework that leverages on-CPU branch tracing features (Intel Last Branch Record and Branch Trace Store) to efficiently capture program control-flow with minimal performance impact. Our approach reconstructs control-flow graphs (CFGs) by collecting hardware generated branch execution data in the kernel, preserving program behavior against evasive malware. We implement LibIHT as an OS kernel module and user-space library, and evaluate it on both benign benchmark programs and adversarial anti-instrumentation samples. Our results indicate that LibIHT reduces runtime overhead by over 150x compared to Intel Pin (7x vs 1,053x slowdowns), while achieving high fidelity in CFG reconstruction (capturing over 99% of execution basic blocks and edges). Although this hardware-assisted approach sacrifices the richer semantic detail available from full software instrumentation by capturing only branch addresses, this trade-off is acceptable for many applications where performance and low detectability are paramount. Our findings show that hardware-based tracing captures control flow information significantly faster, reduces detection risk and performs dynamic analysis with minimal interference.

Paperid: 865, https://arxiv.org/pdf/2510.16083.pdf

Abstract:
Credential stuffing attacks have caused significant harm to online users who frequently reuse passwords across multiple websites. While prior research has attempted to detect users with reused passwords or identify malicious login attempts, existing methods often compromise usability by restricting password creation or website access, and their reliance on complex account-sharing mechanisms hinders real-world deployment. To address these limitations, we propose PassREfinder-FL, a novel framework that predicts credential stuffing risks across websites. We introduce the concept of password reuse relations -- defined as the likelihood of users reusing passwords between websites -- and represent them as edges in a website graph. Using graph neural networks (GNNs), we perform a link prediction task to assess credential reuse risk between sites. Our approach scales to a large number of arbitrary websites by incorporating public website information and linking newly observed websites as nodes in the graph. To preserve user privacy, we extend PassREfinder-FL with a federated learning (FL) approach that eliminates the need to share user sensitive information across administrators. Evaluation on a real-world dataset of 360 million breached accounts from 22,378 websites shows that PassREfinder-FL achieves an F1-score of 0.9153 in the FL setting. We further validate that our FL-based GNN achieves a 4-11% performance improvement over other state-of-the-art GNN models through an ablation study. Finally, we demonstrate that the predicted results can be used to quantify password reuse likelihood as actionable risk scores.

Paperid: 866, https://arxiv.org/pdf/2510.15186.pdf

Abstract:
A core challenge for autonomous LLM agents in collaborative settings is balancing robust privacy understanding and preservation alongside task efficacy. Existing privacy benchmarks only focus on simplistic, single-turn interactions where private information can be trivially omitted without affecting task outcomes. In this paper, we introduce MAGPIE (Multi-AGent contextual PrIvacy Evaluation), a novel benchmark of 200 high-stakes tasks designed to evaluate privacy understanding and preservation in multi-agent collaborative, non-adversarial scenarios. MAGPIE integrates private information as essential for task resolution, forcing agents to balance effective collaboration with strategic information control. Our evaluation reveals that state-of-the-art agents, including GPT-5 and Gemini 2.5-Pro, exhibit significant privacy leakage, with Gemini 2.5-Pro leaking up to 50.7% and GPT-5 up to 35.1% of the sensitive information even when explicitly instructed not to. Moreover, these agents struggle to achieve consensus or task completion and often resort to undesirable behaviors such as manipulation and power-seeking (e.g., Gemini 2.5-Pro demonstrating manipulation in 38.2% of the cases). These findings underscore that current LLM agents lack robust privacy understanding and are not yet adequately aligned to simultaneously preserve privacy and maintain effective collaboration in complex environments.

Paperid: 867, https://arxiv.org/pdf/2510.13462.pdf

Abstract:
Large language models (LLMs) with Mixture-of-Experts (MoE) architectures achieve impressive performance and efficiency by dynamically routing inputs to specialized subnetworks, known as experts. However, this sparse routing mechanism inherently exhibits task preferences due to expert specialization, introducing a new and underexplored vulnerability to backdoor attacks. In this work, we investigate the feasibility and effectiveness of injecting backdoors into MoE-based LLMs by exploiting their inherent expert routing preferences. We thus propose BadSwitch, a novel backdoor framework that integrates task-coupled dynamic trigger optimization with a sensitivity-guided Top-S expert tracing mechanism. Our approach jointly optimizes trigger embeddings during pretraining while identifying S most sensitive experts, subsequently constraining the Top-K gating mechanism to these targeted experts. Unlike traditional backdoor attacks that rely on superficial data poisoning or model editing, BadSwitch primarily embeds malicious triggers into expert routing paths with strong task affinity, enabling precise and stealthy model manipulation. Through comprehensive evaluations across three prominent MoE architectures (Switch Transformer, QwenMoE, and DeepSeekMoE), we demonstrate that BadSwitch can efficiently hijack pre-trained models with up to 100% success rate (ASR) while maintaining the highest clean accuracy (ACC) among all baselines. Furthermore, BadSwitch exhibits strong resilience against both text-level and model-level defense mechanisms, achieving 94.07% ASR and 87.18% ACC on the AGNews dataset. Our analysis of expert activation patterns reveals fundamental insights into MoE vulnerabilities. We anticipate this work will expose security risks in MoE systems and contribute to advancing AI safety.

Paperid: 868, https://arxiv.org/pdf/2510.12117.pdf

Abstract:
Chatbot providers (e.g., OpenAI) rely on tiered subscription schemes to generate revenue, offering basic models for free users, and advanced models for paying subscribers. However, a finer-grained pay-to-unlock scheme for premium features (e.g., math, coding) is thought to be more economically viable for the providers. Such a scheme requires a feature-locking technique (FLoTE) which is (i) effective in refusing locked features, (ii) utility-preserving for unlocked features, (iii) robust against evasion or unauthorized credential sharing, and (iv) scalable to multiple features and users. However, existing FLoTEs (e.g., password-locked models) are not robust or scalable. We present Locket, the first robust and scalable FLoTE to enable pay-to-unlock schemes. Locket uses a novel merging approach to attach adapters to an LLM for refusing unauthorized features. Our comprehensive evaluation shows that Locket is effective ($100$% refusal on locked features), utility-preserving ($\leq 7$% utility degradation in unlocked features), robust ($\leq 5$% attack success rate), and scales to multiple features and clients.

Paperid: 869, https://arxiv.org/pdf/2510.07452.pdf

Abstract:
Language models (LMs) may memorize personally identifiable information (PII) from training data, enabling adversaries to extract it during inference. Existing defense mechanisms such as differential privacy (DP) reduce this leakage, but incur large drops in utility. Based on a comprehensive study using circuit discovery to identify the computational circuits responsible PII leakage in LMs, we hypothesize that specific PII leakage circuits in LMs should be responsible for this behavior. Therefore, we propose PATCH (Privacy-Aware Targeted Circuit PatcHing), a novel approach that first identifies and subsequently directly edits PII circuits to reduce leakage. PATCH achieves better privacy-utility trade-off than existing defenses, e.g., reducing recall of PII leakage from LMs by up to 65%. Finally, PATCH can be combined with DP to reduce recall of residual leakage of an LM to as low as 0.01%. Our analysis shows that PII leakage circuits persist even after the application of existing defense mechanisms. In contrast, PATCH can effectively mitigate their impact.

Paperid: 870, https://arxiv.org/pdf/2510.05753.pdf

Abstract:
With the emergence of powerful large-scale foundation models, the training paradigm is increasingly shifting from from-scratch training to transfer learning. This enables high utility training with small, domain-specific datasets typical in sensitive applications. Membership inference attacks (MIAs) provide an empirical estimate of the privacy leakage by machine learning models. Yet, prior assessments of MIAs against models fine-tuned with transfer learning rely on a small subset of possible attacks. We address this by comparing performance of diverse MIAs in transfer learning settings to help practitioners identify the most efficient attacks for privacy risk evaluation. We find that attack efficacy decreases with the increase in training data for score-based MIAs. We find that there is no one MIA which captures all privacy risks in models trained with transfer learning. While the Likelihood Ratio Attack (LiRA) demonstrates superior performance across most experimental scenarios, the Inverse Hessian Attack (IHA) proves to be more effective against models fine-tuned on PatchCamelyon dataset in high data regime.

Paperid: 871, https://arxiv.org/pdf/2510.05244.pdf

Abstract:
AI agents are vulnerable to indirect prompt injection attacks, where malicious instructions embedded in external content or tool outputs cause unintended or harmful behavior. Inspired by the well-established concept of firewalls, we show that a simple, modular and model-agnostic defense operating at the agent--tool interface achieves perfect security (0% or the lowest possible attack success rate) with high utility (task success rate) across four public benchmarks: AgentDojo, Agent Security Bench, InjecAgent and tau-Bench, while achieving a state-of-the-art security-utility tradeoff compared to prior results. Specifically, we employ a defense based on two firewalls: a Tool-Input Firewall (Minimizer) and a Tool-Output Firewall (Sanitizer). Unlike prior complex approaches, this firewall defense makes minimal assumptions on the agent and can be deployed out-of-the-box, while maintaining strong performance without compromising utility. However, our analysis also reveals critical limitations in these existing benchmarks, including flawed success metrics, implementation bugs, and most importantly, weak attacks, hindering significant progress in the field. To foster more meaningful progress, we present targeted fixes to these issues for AgentDojo and Agent Security Bench while proposing best-practices for more robust benchmark design. Further, we demonstrate that although these firewalls push the state-of-the-art on existing benchmarks, it is still possible to bypass them in practice, underscoring the need to incorporate stronger attacks in security benchmarks. Overall, our work shows that existing agentic security benchmarks are easily saturated by a simple approach and highlights the need for stronger agentic security benchmarks with carefully chosen evaluation metrics and strong adaptive attacks.

Paperid: 872, https://arxiv.org/pdf/2510.04085.pdf

Abstract:
Gluing theorem for random unitaries [Schuster, Haferkamp, Huang, QIP 2025] have found numerous applications, including designing low depth random unitaries [Schuster, Haferkamp, Huang, QIP 2025], random unitaries in ${\sf QAC0}$ [Foxman, Parham, Vasconcelos, Yuen'25] and generically shortening the key length of pseudorandom unitaries [Ananth, Bostanci, Gulati, Lin EUROCRYPT'25]. We present an alternate method of combining Haar random unitaries from the gluing lemma from [Schuster, Haferkamp, Huang, QIP 2025] that is secure against adversaries with inverse query access to the joined unitary. As a consequence, we show for the first time that strong pseudorandom unitaries can generically have their length extended, and can be constructed using only $O(n^{1/c})$ bits of randomness, for any constant $c$, if any family of strong pseudorandom unitaries exists.

Paperid: 873, https://arxiv.org/pdf/2510.03612.pdf

Abstract:
Vision-language model (VLM)-based web agents increasingly power high-stakes selection tasks like content recommendation or product ranking by combining multimodal perception with preference reasoning. Recent studies reveal that these agents are vulnerable against attackers who can bias selection outcomes through preference manipulations using adversarial pop-ups, image perturbations, or content tweaks. Existing work, however, either assumes strong white-box access, with limited single-modal perturbations, or uses impractical settings. In this paper, we demonstrate, for the first time, that joint exploitation of visual and textual channels yields significantly more powerful preference manipulations under realistic attacker capabilities. We introduce Cross-Modal Preference Steering (CPS) that jointly optimizes imperceptible modifications to an item's visual and natural language descriptions, exploiting CLIP-transferable image perturbations and RLHF-induced linguistic biases to steer agent decisions. In contrast to prior studies that assume gradient access, or control over webpages, or agent memory, we adopt a realistic black-box threat setup: a non-privileged adversary can edit only their own listing's images and textual metadata, with no insight into the agent's model internals. We evaluate CPS on agents powered by state-of-the-art proprietary and open source VLMs including GPT-4.1, Qwen-2.5VL and Pixtral-Large on both movie selection and e-commerce tasks. Our results show that CPS is significantly more effective than leading baseline methods. For instance, our results show that CPS consistently outperforms baselines across all models while maintaining 70% lower detection rates, demonstrating both effectiveness and stealth. These findings highlight an urgent need for robust defenses as agentic systems play an increasingly consequential role in society.

Paperid: 874, https://arxiv.org/pdf/2510.02194.pdf

Abstract:
Large Language Models (LLMs) have achieved remarkable progress across a wide range of tasks, but remain vulnerable to safety risks such as harmful content generation and jailbreak attacks. Existing safety techniques -- including external guardrails, inference-time guidance, and post-training alignment -- each face limitations in balancing safety, utility, and controllability. In this work, we propose UpSafe$^\circ$C, a unified framework for enhancing LLM safety through safety-aware upcycling. Our approach first identifies safety-critical layers and upcycles them into a sparse Mixture-of-Experts (MoE) structure, where the router acts as a soft guardrail that selectively activates original MLPs and added safety experts. We further introduce a two-stage SFT strategy to strengthen safety discrimination while preserving general capabilities. To enable flexible control at inference time, we introduce a safety temperature mechanism, allowing dynamic adjustment of the trade-off between safety and utility. Experiments across multiple benchmarks, base model, and model scales demonstrate that UpSafe$^\circ$C achieves robust safety improvements against harmful and jailbreak inputs, while maintaining competitive performance on general tasks. Moreover, analysis shows that safety temperature provides fine-grained inference-time control that achieves the Pareto-optimal frontier between utility and safety. Our results highlight a new direction for LLM safety: moving from static alignment toward dynamic, modular, and inference-aware control.

Paperid: 875, https://arxiv.org/pdf/2510.01082.pdf

Abstract:
Pressure sensors are widely integrated into modern Heating, Ventilation and Air Conditioning (HVAC) systems. As they are sensitive to acoustic pressure, they can be a source of eavesdropping. This paper introduces HVAC-EAR, which reconstructs intelligible speech from low-resolution, noisy pressure data with two key contributions: (i) We achieve intelligible reconstruction from as low as 0.5 kHz sampling rate, surpassing prior work limited to hot word detection, by employing a complex-valued conformer with a Complex Unified Attention Block to capture phoneme dependencies; (ii) HVAC-EAR mitigates transient HVAC noise by reconstructing both magnitude and phase of missing frequencies. For the first time, evaluations on real-world HVAC deployments show significant intelligibility, raising novel privacy concerns.

Paperid: 876, https://arxiv.org/pdf/2509.25926.pdf

Abstract:
Large language models (LLMs) have become increasingly popular due to their ability to interact with unstructured content. As such, LLMs are now a key driver behind the automation of language processing systems, such as AI agents. Unfortunately, these advantages have come with a vulnerability to prompt injections, an attack where an adversary subverts the LLM's intended functionality with an injected task. Past approaches have proposed detectors and finetuning to provide robustness, but these techniques are vulnerable to adaptive attacks or cannot be used with state-of-the-art models. To this end we propose type-directed privilege separation for LLMs, a method that systematically prevents prompt injections. We restrict the ability of an LLM to interact with third-party data by converting untrusted content to a curated set of data types; unlike raw strings, each data type is limited in scope and content, eliminating the possibility for prompt injections. We evaluate our method across several case studies and find that designs leveraging our principles can systematically prevent prompt injection attacks while maintaining high utility.

Paperid: 877, https://arxiv.org/pdf/2509.24484.pdf

Abstract:
Pseudorandom unitaries (PRUs), one of the key quantum pseudorandom notions, are efficiently computable unitaries that are computationally indistinguishable from Haar random unitaries. While there is evidence to believe that PRUs are weaker than one-way functions, so far its relationship with other quantum cryptographic primitives (that are plausibly weaker than one-way functions) has not been fully established. In this work, we focus on quantum cryptographic primitives with classical communication, referred to as QCCC primitives. Our main result shows that QCCC bit commitments and QCCC key agreement, cannot be constructed from pseudorandom unitaries in a black-box manner. Our core technical contribution is to show (in a variety of settings) the difficulty of distinguishing identical versus independent Haar unitaries by separable channels. Our result strictly improves upon prior works which studied similar problems in the context of learning theory [Anshu, Landau, Liu, STOC 2022] and cryptography [Ananth, Gulati, Lin, TCC 2024].

Paperid: 878, https://arxiv.org/pdf/2509.24432.pdf

Abstract:
The quantum Haar random oracle model is an idealized model where every party has access to a single Haar random unitary and its inverse. We construct strong pseudorandom unitaries in the quantum Haar random oracle model. This strictly improves upon prior works who either only prove the existence of pseudorandom unitaries in the inverseless quantum Haar random oracle model [Ananth, Bostanci, Gulati, Lin, EUROCRYPT 2025] or prove the existence of a weaker notion (implied by strong pseudorandom unitaries) in the quantum Haar random oracle model [Hhan, Yamada, 2024]. Our results also present a viable approach for building quantum pseudorandomness from random quantum circuits and analyzing pseudorandom objects in nature.

Paperid: 879, https://arxiv.org/pdf/2509.22745.pdf

Abstract:
Recent large language models (LLMs) have increasingly adopted the Mixture-of-Experts (MoE) architecture for efficiency. MoE-based LLMs heavily depend on a superficial safety mechanism in which harmful inputs are routed safety-critical experts. However, our analysis reveals that routing decisions for harmful inputs drift significantly after fine-tuning, exposing a critical vulnerability to harmful fine-tuning (HFT) attacks. Existing defenses, primarily designed for monolithic LLMs, are less effective for MoE LLMs as they fail to prevent drift in harmful input routing. To address this limitation, we propose SafeMoE, a safe fine-tuning method tailored to MoE LLMs. SafeMoE directly mitigates routing drift by penalizing the gap between the routing weights of a fine-tuned model and those of the initial safety-aligned model, thereby preserving the safety-aligned routing of harmful inputs to safety-critical experts. Experiments on open-source MoE LLMs ranging from 7B to 141B parameters demonstrate that SafeMoE effectively mitigates HFT attacks, reducing the harmfulness score of OLMoE from 62.0 to 5.0, for example, while maintaining task utility within 1% degradation and incurring only 2% overhead. It significantly outperforms state-of-the-art defense methods for safeguarding LLM fine-tuning and remains effective in recent large-scale MoE LLMs such as gpt-oss and Llama 4. Our implementation is available at https://anonymous.4open.science/r/SafeMoE.

Paperid: 880, https://arxiv.org/pdf/2509.20384.pdf

Abstract:
Fuzzing is effective for vulnerability discovery but struggles with complex targets such as compilers, interpreters, and database engines, which accept textual input that must satisfy intricate syntactic and semantic constraints. Although language models (LMs) have attracted interest for this task due to their vast latent knowledge and reasoning potential, their practical adoption has been limited. The major challenges stem from insufficient exploration of deep program logic among real-world codebases, and the high cost of leveraging larger models. To overcome these challenges, we propose R1-Fuzz, the first framework that leverages reinforcement learning (RL) to specialize cost-efficient LMs and integrate them for complex textual fuzzing input generation. R1-Fuzz introduces two key designs: coverage-slicing-based question construction and a distance-based reward calculation. Through RL-based post-training of a model with our constructed dataset, R1-Fuzz designs a fuzzing workflow that tightly integrates LMs to reason deep program semantics during fuzzing. Evaluations on diverse real-world targets show that our design enables a small model, named R1-Fuzz-7B, to rival or even outperform much larger models in real-world fuzzing. Notably, R1-Fuzz achieves up to 75\% higher coverage than state-of-the-art fuzzers and discovers 29 previously unknown vulnerabilities, demonstrating its practicality.

Paperid: 881, https://arxiv.org/pdf/2509.20362.pdf

Abstract:
Autonomous Target Tracking (ATT) systems, especially ATT drones, are widely used in applications such as surveillance, border control, and law enforcement, while also being misused in stalking and destructive actions. Thus, the security of ATT is highly critical for real-world applications. Under the scope, we present a new type of attack: distance-pulling attacks (DPA) and a systematic study of it, which exploits vulnerabilities in ATT systems to dangerously reduce tracking distances, leading to drone capturing, increased susceptibility to sensor attacks, or even physical collisions. To achieve these goals, we present FlyTrap, a novel physical-world attack framework that employs an adversarial umbrella as a deployable and domain-specific attack vector. FlyTrap is specifically designed to meet key desired objectives in attacking ATT drones: physical deployability, closed-loop effectiveness, and spatial-temporal consistency. Through novel progressive distance-pulling strategy and controllable spatial-temporal consistency designs, FlyTrap manipulates ATT drones in real-world setups to achieve significant system-level impacts. Our evaluations include new datasets, metrics, and closed-loop experiments on real-world white-box and even commercial ATT drones, including DJI and HoverAir. Results demonstrate FlyTrap's ability to reduce tracking distances within the range to be captured, sensor attacked, or even directly crashed, highlighting urgent security risks and practical implications for the safe deployment of ATT systems.

Paperid: 882, https://arxiv.org/pdf/2509.16390.pdf

Abstract:
Roaming settlement in 5G and beyond networks demands secure, efficient, and trustworthy mechanisms for billing reconciliation between mobile operators. While blockchain promises decentralization and auditability, existing solutions suffer from critical limitations-namely, data privacy risks, assumptions of mutual trust, and scalability bottlenecks. To address these challenges, we present B5GRoam, a novel on-chain and zero-trust framework for secure, privacy-preserving, and scalable roaming settlements. B5GRoam introduces a cryptographically verifiable call detail record (CDR) submission protocol, enabling smart contracts to authenticate usage claims without exposing sensitive data. To preserve privacy, we integrate non-interactive zero-knowledge proofs (zkSNARKs) that allow on-chain verification of roaming activity without revealing user or network details. To meet the high-throughput demands of 5G environments, B5GRoam leverages Layer 2 zk-Rollups, significantly reducing gas costs while maintaining the security guarantees of Layer 1. Experimental results demonstrate a throughput of over 7,200 tx/s with strong privacy and substantial cost savings. By eliminating intermediaries and enhancing verifiability, B5GRoam offers a practical and secure foundation for decentralized roaming in future mobile networks.

Paperid: 883, https://arxiv.org/pdf/2509.14657.pdf

Abstract:
The rapid proliferation of IoT nodes equipped with microphones and capable of performing on-device audio classification exposes highly sensitive data while operating under tight resource constraints. To protect against this, we present a defence-in-depth architecture comprising a security protocol that treats the edge device, cellular network and cloud backend as three separate trust domains, linked by TPM-based remote attestation and mutually authenticated TLS 1.3. A STRIDE-driven threat model and attack-tree analysis guide the design. At startup, each boot stage is measured into TPM PCRs. The node can only decrypt its LUKS-sealed partitions after the cloud has verified a TPM quote and released a one-time unlock key. This ensures that rogue or tampered devices remain inert. Data in transit is protected by TLS 1.3 and hybridised with Kyber and Dilithium to provide post-quantum resilience. Meanwhile, end-to-end encryption and integrity hashes safeguard extracted audio features. Signed, rollback-protected AI models and tamper-responsive sensors harden firmware and hardware. Data at rest follows a 3-2-1 strategy comprising a solid-state drive sealed with LUKS, an offline cold archive encrypted with a hybrid post-quantum cipher and an encrypted cloud replica. Finally, we set out a plan for evaluating the physical and logical security of the proposed protocol.

Paperid: 884, https://arxiv.org/pdf/2509.13581.pdf

Abstract:
Modern optical mouse sensors, with their advanced precision and high responsiveness, possess an often overlooked vulnerability: they can be exploited for side-channel attacks. This paper introduces Mic-E-Mouse, the first-ever side-channel attack that targets high-performance optical mouse sensors to covertly eavesdrop on users. We demonstrate that audio signals can induce subtle surface vibrations detectable by a mouse's optical sensor. Remarkably, user-space software on popular operating systems can collect and broadcast this sensitive side channel, granting attackers access to raw mouse data without requiring direct system-level permissions. Initially, the vibration signals extracted from mouse data are of poor quality due to non-uniform sampling, a non-linear frequency response, and significant quantization. To overcome these limitations, Mic-E-Mouse employs a sophisticated end-to-end data filtering pipeline that combines Wiener filtering, resampling corrections, and an innovative encoder-only spectrogram neural filtering technique. We evaluate the attack's efficacy across diverse conditions, including speaking volume, mouse polling rate and DPI, surface materials, speaker languages, and environmental noise. In controlled environments, Mic-E-Mouse improves the signal-to-noise ratio (SNR) by up to +19 dB for speech reconstruction. Furthermore, our results demonstrate a speech recognition accuracy of roughly 42% to 61% on the AudioMNIST and VCTK datasets. All our code and datasets are publicly accessible on https://sites.google.com/view/mic-e-mouse.

Paperid: 885, https://arxiv.org/pdf/2509.12386.pdf

Abstract:
ML models are susceptible to risks to security, privacy, and fairness. Several defenses are designed to protect against their intended risks, but can inadvertently affect susceptibility to other unrelated risks, known as unintended interactions. Several jurisdictions are preparing ML regulatory frameworks that require ML practitioners to assess the susceptibility of ML models to different risks. A library for valuating unintended interactions that can be used by (a) practitioners to evaluate unintended interactions at scale prior to model deployment and (b) researchers to design defenses which do not suffer from an unintended increase in unrelated risks. Ideally, such a library should be i) comprehensive by including representative attacks, defenses and metrics for different risks, ii) extensible to new modules due to its modular design, iii) consistent with a user-friendly API template for inputs and outputs, iv) applicable to evaluate previously unexplored unintended interactions. We present AMULET, a Python library that covers risks to security, privacy, and fairness, which satisfies all these requirements. AMULET can be used to evaluate unexplored unintended interactions, compare effectiveness between defenses or attacks, and include new attacks and defenses.

Paperid: 886, https://arxiv.org/pdf/2509.12233.pdf

Abstract:
The Internet of Electric Vehicles (IoEV) envisions a tightly coupled ecosystem of electric vehicles (EVs), charging infrastructure, and grid services, yet it remains vulnerable to cyberattacks, unreliable battery-state predictions, and opaque decision processes that erode trust and performance. To address these challenges, we introduce a novel Agentic Artificial Intelligence (AAI) framework tailored for IoEV, where specialized agents collaborate to deliver autonomous threat mitigation, robust analytics, and interpretable decision support. Specifically, we design an AAI architecture comprising dedicated agents for cyber-threat detection and response at charging stations, real-time State of Charge (SoC) estimation, and State of Health (SoH) anomaly detection, all coordinated through a shared, explainable reasoning layer; develop interpretable threat-mitigation mechanisms that proactively identify and neutralize attacks on both physical charging points and learning components; propose resilient SoC and SoH models that leverage continuous and adversarial-aware learning to produce accurate, uncertainty-aware forecasts with human-readable explanations; and implement a three-agent pipeline, where each agent uses LLM-driven reasoning and dynamic tool invocation to interpret intent, contextualize tasks, and execute formal optimizations for user-centric assistance. Finally, we validate our framework through comprehensive experiments across diverse IoEV scenarios, demonstrating significant improvements in security and prediction accuracy. All datasets, models, and code will be released publicly.

Paperid: 887, https://arxiv.org/pdf/2509.10594.pdf

Abstract:
Artificial Intelligence (AI) and Large Language Models (LLMs) are reshaping today's business practices, however, their adoption within small and medium-sized enterprises (SMEs) raises significant technical, ethical and trust issues. This paper proposes a structured, multi-phased framework designed to embed trust and ethical principles throughout the AI lifecycle for their secure and responsible use in SMEs. Structured around four pillars, i.e., Data, Algorithms, Human oversight, and Model Architecture, the framework bridges theoretical ethical principles with operational practice, enhancing AI capabilities in diverse SME applications. Ultimately, this paper offers a structured roadmap for responsible AI adoption, framing trust and ethics as a catalyst for resilience, competitiveness, and sustainable innovation in SMEs.

Paperid: 888, https://arxiv.org/pdf/2509.10313.pdf

Abstract:
End-to-end encryption (E2EE) has emerged as a fundamental element of modern digital communication, protecting data from unauthorized access during transmission. By design, E2EE ensures that only the intended recipient can decrypt the information, making it inaccessible even to service providers. Yet, this powerful safeguard of individual privacy and digital trust also introduces a paradox: it can simultaneously prevent law enforcement efforts by hiding potential malicious activities. This paper examines the dual role of E2EE, its critical importance to privacy, the challenges it

Paperid: 889, https://arxiv.org/pdf/2509.09488.pdf

Abstract:
Diffusion models have significantly advanced text-to-image generation, enabling the creation of highly realistic images conditioned on textual prompts and seeds. Given the considerable intellectual and economic value embedded in such prompts, prompt theft poses a critical security and privacy concern. In this paper, we investigate prompt-stealing attacks targeting diffusion models. We reveal that numerical optimization-based prompt recovery methods are fundamentally limited as they do not account for the initial random noise used during image generation. We identify and exploit a noise-generation vulnerability (CWE-339), prevalent in major image-generation frameworks, originating from PyTorch's restriction of seed values to a range of $2^{32}$ when generating the initial random noise on CPUs. Through a large-scale empirical analysis conducted on images shared via the popular platform CivitAI, we demonstrate that approximately 95% of these images' seed values can be effectively brute-forced in 140 minutes per seed using our seed-recovery tool, SeedSnitch. Leveraging the recovered seed, we propose PromptPirate, a genetic algorithm-based optimization method explicitly designed for prompt stealing. PromptPirate surpasses state-of-the-art methods, i.e., PromptStealer, P2HP, and CLIP-Interrogator, achieving an 8-11% improvement in LPIPS similarity. Furthermore, we introduce straightforward and effective countermeasures that render seed stealing, and thus optimization-based prompt stealing, ineffective. We have disclosed our findings responsibly and initiated coordinated mitigation efforts with the developers to address this critical vulnerability.

Paperid: 890, https://arxiv.org/pdf/2509.09097.pdf

Abstract:
As on-device large language model (LLM) systems become increasingly prevalent, federated fine-tuning enables advanced language understanding and generation directly on edge devices; however, it also involves processing sensitive, user-specific data, raising significant privacy concerns within the federated learning framework. To address these challenges, we propose DP-FedLoRA, a privacy-enhanced federated fine-tuning framework that integrates LoRA-based adaptation with differential privacy in a communication-efficient setting. Each client locally clips and perturbs its LoRA matrices using Gaussian noise to satisfy ($Îµ$, $Î´$)-differential privacy. We further provide a theoretical analysis demonstrating the unbiased nature of the updates and deriving bounds on the variance introduced by noise, offering practical guidance for privacy-budget calibration. Experimental results across mainstream benchmarks show that DP-FedLoRA delivers competitive performance while offering strong privacy guarantees, paving the way for scalable and privacy-preserving LLM deployment in on-device environments.

Paperid: 891, https://arxiv.org/pdf/2509.08524.pdf

Abstract:
Symbolic execution is a powerful technique for software testing, but suffers from limitations when encountering external functions, such as native methods or third-party libraries. Existing solutions often require additional context, expensive SMT solvers, or manual intervention to approximate these functions through symbolic stubs. In this work, we propose a novel approach to automatically generate symbolic stubs for external functions during symbolic execution that leverages Genetic Programming. When the symbolic executor encounters an external function, AutoStub generates training data by executing the function on randomly generated inputs and collecting the outputs. Genetic Programming then derives expressions that approximate the behavior of the function, serving as symbolic stubs. These automatically generated stubs allow the symbolic executor to continue the analysis without manual intervention, enabling the exploration of program paths that were previously intractable. We demonstrate that AutoStub can automatically approximate external functions with over 90% accuracy for 55% of the functions evaluated, and can infer language-specific behaviors that reveal edge cases crucial for software testing.

Paperid: 892, https://arxiv.org/pdf/2509.07941.pdf

Abstract:
Code generation has emerged as a pivotal capability of Large Language Models(LLMs), revolutionizing development efficiency for programmers of all skill levels. However, the complexity of data structures and algorithmic logic often results in functional deficiencies and security vulnerabilities in generated code, reducing it to a prototype requiring extensive manual debugging. While Retrieval-Augmented Generation (RAG) can enhance correctness and security by leveraging external code manuals, it simultaneously introduces new attack surfaces. In this paper, we pioneer the exploration of attack surfaces in Retrieval-Augmented Code Generation (RACG), focusing on malicious dependency hijacking. We demonstrate how poisoned documentation containing hidden malicious dependencies (e.g., matplotlib_safe) can subvert RACG, exploiting dual trust chains: LLM reliance on RAG and developers' blind trust in LLM suggestions. To construct poisoned documents, we propose ImportSnare, a novel attack framework employing two synergistic strategies: 1)Position-aware beam search optimizes hidden ranking sequences to elevate poisoned documents in retrieval results, and 2)Multilingual inductive suggestions generate jailbreaking sequences to manipulate LLMs into recommending malicious dependencies. Through extensive experiments across Python, Rust, and JavaScript, ImportSnare achieves significant attack success rates (over 50% for popular libraries such as matplotlib and seaborn) in general, and is also able to succeed even when the poisoning ratio is as low as 0.01%, targeting both custom and real-world malicious packages. Our findings reveal critical supply chain risks in LLM-powered development, highlighting inadequate security alignment for code generation tasks. To support future research, we will release the multilingual benchmark suite and datasets. The project homepage is https://importsnare.github.io.

Paperid: 893, https://arxiv.org/pdf/2509.02411.pdf

Abstract:
Mobile Large Language Models (LLMs) are revolutionizing diverse fields such as healthcare, finance, and education with their ability to perform advanced natural language processing tasks on-the-go. However, the deployment of these models in mobile and edge environments introduces significant challenges related to privacy and security due to their resource-intensive nature and the sensitivity of the data they process. This survey provides a comprehensive overview of privacy and security issues associated with mobile LLMs, systematically categorizing existing solutions such as differential privacy, federated learning, and prompt encryption. Furthermore, we analyze vulnerabilities unique to mobile LLMs, including adversarial attacks, membership inference, and side-channel attacks, offering an in-depth comparison of their effectiveness and limitations. Despite recent advancements, mobile LLMs face unique hurdles in achieving robust security while maintaining efficiency in resource-constrained environments. To bridge this gap, we propose potential applications, discuss open challenges, and suggest future research directions, paving the way for the development of trustworthy, privacy-compliant, and scalable mobile LLM systems.

Paperid: 894, https://arxiv.org/pdf/2509.02028.pdf

Abstract:
Language-vision understanding has driven the development of advanced perception systems, most notably the emerging paradigm of Referring Multi-Object Tracking (RMOT). By leveraging natural-language queries, RMOT systems can selectively track objects that satisfy a given semantic description, guided through Transformer-based spatial-temporal reasoning modules. End-to-End (E2E) RMOT models further unify feature extraction, temporal memory, and spatial reasoning within a Transformer backbone, enabling long-range spatial-temporal modeling over fused textual-visual representations. Despite these advances, the reliability and robustness of RMOT remain underexplored. In this paper, we examine the security implications of RMOT systems from a design-logic perspective, identifying adversarial vulnerabilities that compromise both the linguistic-visual referring and track-object matching components. Additionally, we uncover a novel vulnerability in advanced RMOT models employing FIFO-based memory, whereby targeted and consistent attacks on their spatial-temporal reasoning introduce errors that persist within the history buffer over multiple subsequent frames. We present VEIL, a novel adversarial framework designed to disrupt the unified referring-matching mechanisms of RMOT models. We show that carefully crafted digital and physical perturbations can corrupt the tracking logic reliability, inducing track ID switches and terminations. We conduct comprehensive evaluations using the Refer-KITTI dataset to validate the effectiveness of VEIL and demonstrate the urgent need for security-aware RMOT designs for critical large-scale applications.

Paperid: 895, https://arxiv.org/pdf/2508.03125.pdf

Abstract:
Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication, but this reliance introduces substantial safety vulnerabilities. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion, which limit their effectiveness, adaptability, and stealthiness. In this paper, we propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system. MAST integrates Monte Carlo Tree Search with Direct Preference Optimization to train an attack policy model that adaptively generates effective multi-round tampering strategies. Furthermore, to preserve stealthiness, we impose dual semantic and embedding similarity constraints during the tampering process. Comprehensive experiments across diverse tasks, communication architectures, and LLMs demonstrate that MAST consistently achieves high attack success rates while significantly enhancing stealthiness compared to baselines. These findings highlight the effectiveness, stealthiness, and adaptability of MAST, underscoring the need for robust communication safeguards in LLM-MAS.

Paperid: 896, https://arxiv.org/pdf/2507.15112.pdf

Abstract:
Machine unlearning seeks to remove unwanted information from trained models, initially at the individual-sample level, but increasingly at the level of entire sub-populations. In many deployments, models must delete whole topical domains to satisfy privacy, legal, or quality requirements, e.g., removing several users' posts under GDPR or copyrighted web content. Existing unlearning tools remain largely sample-oriented, and straightforward point deletion often leaves enough residual signal for downstream learners to recover the unwanted domain. We introduce distributional unlearning, a data-centric, model-agnostic framework that asks: Given examples from an unwanted distribution and a retained distribution, what is the smallest set of points whose removal makes the edited dataset far from the unwanted domain yet close to the retained one? Using Kullback-Leibler divergence to quantify removal and preservation, we derive the exact Pareto frontier in the Gaussian case and prove that any model retrained on the edited data incurs log-loss shifts bounded by the divergence thresholds. We propose a simple distance-based selection rule satisfying these constraints with a quadratic reduction in deletion budget compared to random removal. Experiments on synthetic Gaussians, Jigsaw Toxic Comments, SMS spam, and CIFAR-10 show 15-72% fewer deletions than random, with negligible impact on retained performance.

Paperid: 897, https://arxiv.org/pdf/2507.14201.pdf

Abstract:
We present ExCyTIn-Bench, the first benchmark to Evaluate an LLM agent x on the task of Cyber Threat Investigation through security questions derived from investigation graphs. Real-world security analysts must sift through a large number of heterogeneous alert signals and security logs, follow multi-hop chains of evidence, and compile an incident report. With the developments of LLMs, building LLM-based agents for automatic thread investigation is a promising direction. To assist the development and evaluation of LLM agents, we construct a dataset from a controlled Azure tenant that covers 8 simulated real-world multi-step attacks, 57 log tables from Microsoft Sentinel and related services, and 589 automatically generated questions. We leverage security logs extracted with expert-crafted detection logic to build threat investigation graphs, and then generate questions with LLMs using paired nodes on the graph, taking the start node as background context and the end node as answer. Anchoring each question to these explicit nodes and edges not only provides automatic, explainable ground truth answers but also makes the pipeline reusable and readily extensible to new logs. This also enables the automatic generation of procedural tasks with verifiable rewards, which can be naturally extended to training agents via reinforcement learning. Our comprehensive experiments with different models confirm the difficulty of the task: with the base setting, the average reward across all evaluated models is 0.249, and the best achieved is 0.368, leaving substantial headroom for future research. Code and data are coming soon!

Paperid: 898, https://arxiv.org/pdf/2507.06986.pdf

Abstract:
Machine learning models, particularly decision trees (DTs), are widely adopted across various domains due to their interpretability and efficiency. However, as ML models become increasingly integrated into privacy-sensitive applications, concerns about their confidentiality have grown, particularly in light of emerging threats such as model extraction and fault injection attacks. Assessing the vulnerability of DTs under such attacks is therefore important. In this work, we present BarkBeetle, a novel attack that leverages fault injection to extract internal structural information of DT models. BarkBeetle employs a bottom-up recovery strategy that uses targeted fault injection at specific nodes to efficiently infer feature splits and threshold values. Our proof-of-concept implementation demonstrates that BarkBeetle requires significantly fewer queries and recovers more structural information compared to prior approaches, when evaluated on DTs trained with public UCI datasets. To validate its practical feasibility, we implement BarkBeetle on a Raspberry Pi RP2350 board and perform fault injections using the Faultier voltage glitching tool. As BarkBeetle targets general DT models, we also provide an in-depth discussion on its applicability to a broader range of tree-based applications, including data stream classification, DT variants, and cryptography schemes.

Paperid: 899, https://arxiv.org/pdf/2507.04752.pdf

Abstract:
Large Language Models (LLMs) have revolutionized various fields with their exceptional capabilities in understanding, processing, and generating human-like text. This paper investigates the potential of LLMs in advancing Network Intrusion Detection Systems (NIDS), analyzing current challenges, methodologies, and future opportunities. It begins by establishing a foundational understanding of NIDS and LLMs, exploring the enabling technologies that bridge the gap between intelligent and cognitive systems in AI-driven NIDS. While Intelligent NIDS leverage machine learning and deep learning to detect threats based on learned patterns, they often lack contextual awareness and explainability. In contrast, Cognitive NIDS integrate LLMs to process both structured and unstructured security data, enabling deeper contextual reasoning, explainable decision-making, and automated response for intrusion behaviors. Practical implementations are then detailed, highlighting LLMs as processors, detectors, and explainers within a comprehensive AI-driven NIDS pipeline. Furthermore, the concept of an LLM-centered Controller is proposed, emphasizing its potential to coordinate intrusion detection workflows, optimizing tool collaboration and system performance. Finally, this paper identifies critical challenges and opportunities, aiming to foster innovation in developing reliable, adaptive, and explainable NIDS. By presenting the transformative potential of LLMs, this paper seeks to inspire advancement in next-generation network security systems.

Paperid: 900, https://arxiv.org/pdf/2512.23881.pdf

Abstract:
Audio-language models combine audio encoders with large language models to enable multimodal reasoning, but they also introduce new security vulnerabilities. We propose a universal targeted latent space attack, an encoder-level adversarial attack that manipulates audio latent representations to induce attacker-specified outputs in downstream language generation. Unlike prior waveform-level or input-specific attacks, our approach learns a universal perturbation that generalizes across inputs and speakers and does not require access to the language model. Experiments on Qwen2-Audio-7B-Instruct demonstrate consistently high attack success rates with minimal perceptual distortion, revealing a critical and previously underexplored attack surface at the encoder level of multimodal systems.

Paperid: 901, https://arxiv.org/pdf/2512.10426.pdf

Abstract:
Healthcare has become exceptionally sophisticated, as wearables and connected medical devices are revolutionising remote patient monitoring, emergency response, medication management, diagnosis, and predictive and prescriptive analytics. Internet of Things and Cloud computing integrated systems (IoT-Cloud) facilitate sensing, automation, and processing for these healthcare applications. While real-time response is crucial for alleviating patient emergencies, protecting patient privacy is extremely important in data-driven healthcare. In this paper, we propose a multi-layer IoT, Edge and Cloud architecture to enhance the speed of response for emergency healthcare by distributing tasks based on response criticality and permanence of storage. Privacy of patient data is assured by proposing a Differential Privacy framework across several machine learning models such as K-means, Logistic Regression, Random Forest and Naive Bayes. We establish a comprehensive threat model identifying three adversary classes and evaluate Laplace, Gaussian, and hybrid noise mechanisms across varying privacy budgets, with supervised algorithms achieving up to 86% accuracy. The proposed hybrid Laplace-Gaussian noise mechanism with adaptive budget allocation provides a balanced approach, offering moderate tails and better privacy-utility trade-offs for both low and high dimension datasets. At the practical threshold of $\varepsilon = 5.0$, supervised algorithms achieve 82-84% accuracy while reducing attribute inference attacks by up to 18% and data reconstruction correlation by 70%. Blockchain security further ensures trusted communication through time-stamping, traceability, and immutability for analytics applications. Edge computing demonstrates 8$\times$ latency reduction for emergency scenarios, validating the hierarchical architecture for time-critical operations.

Paperid: 902, https://arxiv.org/pdf/2512.07086.pdf

Abstract:
Large Language Models (LLMs) have become foundational components in a wide range of applications, including natural language understanding and generation, embodied intelligence, and scientific discovery. As their computational requirements continue to grow, these models are increasingly deployed as cloud-based services, allowing users to access powerful LLMs via the Internet. However, this deployment model introduces a new class of threat: denial-of-service (DoS) attacks via unbounded reasoning, where adversaries craft specially designed inputs that cause the model to enter excessively long or infinite generation loops. These attacks can exhaust backend compute resources, degrading or denying service to legitimate users. To mitigate such risks, many LLM providers adopt a closed-source, black-box setting to obscure model internals. In this paper, we propose ThinkTrap, a novel input-space optimization framework for DoS attacks against LLM services even in black-box environments. The core idea of ThinkTrap is to first map discrete tokens into a continuous embedding space, then undertake efficient black-box optimization in a low-dimensional subspace exploiting input sparsity. The goal of this optimization is to identify adversarial prompts that induce extended or non-terminating generation across several state-of-the-art LLMs, achieving DoS with minimal token overhead. We evaluate the proposed attack across multiple commercial, closed-source LLM services. Our results demonstrate that, even far under the restrictive request frequency limits commonly enforced by these platforms, typically capped at ten requests per minute (10 RPM), the attack can degrade service throughput to as low as 1% of its original capacity, and in some cases, induce complete service failure.

Paperid: 903, https://arxiv.org/pdf/2512.00414.pdf

Abstract:
This paper introduces BeaCon, a novel tool for the automated generation of adjustable container security policies. Unlike prior approaches, BeaCon leverages dynamic analysis to simulate realistic environments, uncovering container execution paths that may remain hidden during the profiling phase. To address the challenge of exploring vast profiling spaces, we employ efficient heuristics to reveal additional system events with minimal effort. In addition, BeaCon incorporates a security and functionality scoring mechanism to prioritize system calls and capabilities based on their impact on the host OS kernel's security and the functionality of containerized applications. By integrating these scores, BeaCon achieves a customized balance between security and functionality, enabling cloud providers to enforce security measures while maintaining tenant availability. We implemented a prototype of BeaCon using eBPF kernel technology and conducted extensive evaluations. Results from the top 15 containers, which revealed significant improvements, demonstrate that BeaCon identifies an average of 16.5% additional syscalls by applying diverse environments. Furthermore, we evaluated its effectiveness in mitigating risks associated with 45 known vulnerabilities (e.g., CVEs), showcasing its potential to significantly enhance container security. Additionally, we performed proof-of-concept demonstrations for two well-known security vulnerabilities, showing that BeaCon successfully reduces attack surface by blocking these exploits.

Paperid: 904, https://arxiv.org/pdf/2511.09266.pdf

Abstract:
Modern enterprise networks comprise diverse and heterogeneous systems that support a wide range of services, making it challenging for administrators to track and analyze sophisticated attacks such as advanced persistent threats (APTs), which often exploit multiple vectors. To address this challenge, we introduce the concept of network-level security provenance, which enables the systematic establishment of causal relationships across hosts at the network level, facilitating the accurate identification of the root causes of security incidents. Building on this concept, we present SecTracer as a framework for a network-wide provenance analysis. SecTracer offers three main contributions: (i) comprehensive and efficient forensic data collection in enterprise networks via software-defined networking (SDN), (ii) reconstruction of attack histories through provenance graphs to provide a clear and interpretable view of intrusions, and (iii) proactive attack prediction using probabilistic models. We evaluated the effectiveness and efficiency of SecTracer through a real-world APT simulation, demonstrating its capability to enhance threat mitigation while introducing less than 1% network throughput overhead and negligible latency impact.

Paperid: 905, https://arxiv.org/pdf/2511.02083.pdf

Abstract:
Watermarking has emerged as a promising technique to track AI-generated content and differentiate it from authentic human creations. While prior work extensively studies watermarking for autoregressive large language models (LLMs) and image diffusion models, none address discrete diffusion language models, which are becoming popular due to their high inference throughput. In this paper, we introduce the first watermarking method for discrete diffusion models by applying the distribution-preserving Gumbel-max trick at every diffusion step and seeding the randomness with the sequence index to enable reliable detection. We experimentally demonstrate that our scheme is reliably detectable on state-of-the-art diffusion language models and analytically prove that it is distortion-free with an exponentially decaying probability of false detection in the token sequence length.

Paperid: 906, https://arxiv.org/pdf/2510.22555.pdf

Abstract:
Graph Neural Networks(GNNs) are vulnerable to backdoor attacks, where adversaries implant malicious triggers to manipulate model predictions. Existing trigger generators are often simplistic in structure and overly reliant on specific features, confining them to a single graph learning paradigm, such as graph supervised learning, graph contrastive learning, or graph prompt learning. This specialized design, which aligns the trigger with one learning objective, results in poor transferability when applied to other learning paradigms. For instance, triggers generated for the graph supervised learning paradigm perform poorly when tested within graph contrastive learning or graph prompt learning environments. Furthermore, these simple generators often fail to utilize complex structural information or node diversity within the graph data. These constraints limit the attack success rates of such methods in general testing scenarios. Therefore, to address these limitations, we propose Cross-Paradigm Graph Backdoor Attacks with Promptable Subgraph Triggers(CP-GBA), a new transferable graph backdoor attack that employs graph prompt learning(GPL) to train a set of universal subgraph triggers. First, we distill a compact yet expressive trigger set from target graphs, which is structured as a queryable repository, by jointly enforcing class-awareness, feature richness, and structural fidelity. Second, we conduct the first exploration of the theoretical transferability of GPL to train these triggers under prompt-based objectives, enabling effective generalization to diverse and unseen test-time paradigms. Extensive experiments across multiple real-world datasets and defense scenarios show that CP-GBA achieves state-of-the-art attack success rates.

Paperid: 907, https://arxiv.org/pdf/2510.16054.pdf

Abstract:
When users submit queries to Large Language Models (LLMs), their prompts can often contain sensitive data, forcing a difficult choice: Send the query to a powerful proprietary LLM providers to achieving state-of-the-art performance and risk data exposure, or relying on smaller, local models guarantees data privacy but often results in a degradation of task performance. Prior approaches have relied on static pipelines that use LLM rewriting, which shatters linguistic coherence and indiscriminately removes privacy-sensitive information, including task-critical content. We reformulate this challenge (Privacy-Conscious Delegation) as a sequential decision-making problem and introduce a novel reinforcement learning (RL) framework called PrivacyPAD to solve it. Our framework trains an agent to dynamically route text chunks, learning a policy that optimally balances the trade-off between privacy leakage and task performance. It implicitly distinguishes between replaceable Personally Identifiable Information (PII) (which it shields locally) and task-critical PII (which it strategically sends to the remote model for maximal utility). To validate our approach in complex scenarios, we also introduce a new medical dataset with high PII density. Our framework achieves a new state-of-the-art on the privacy-utility frontier, demonstrating the necessity of learned, adaptive policies for deploying LLMs in sensitive environments.

Paperid: 908, https://arxiv.org/pdf/2510.15798.pdf

Abstract:
Distributed SDN (Software-Defined Networking) controllers have rapidly become an integral element of Wide Area Networks (WAN), particularly within SD-WAN, providing scalability and fault-tolerance for expansive network infrastructures. However, the architecture of these controllers introduces new potential attack surfaces that have thus far received inadequate attention. In response to these concerns, we introduce Ambusher, a testing tool designed to discover vulnerabilities within protocols used in distributed SDN controllers. Ambusher achieves this by leveraging protocol state fuzzing, which systematically finds attack scenarios based on an inferred state machine. Since learning states from a cluster is complicated, Ambusher proposes a novel methodology that extracts a single and relatively simple state machine, achieving efficient state-based fuzzing. Our evaluation of Ambusher, conducted on a real SD-WAN deployment spanning two campus networks and one enterprise network, illustrates its ability to uncover 6 potential vulnerabilities in the widely used distributed controller platform.

Paperid: 909, https://arxiv.org/pdf/2510.14086.pdf

Abstract:
The ubiquity of closed-weight language models with public-facing APIs has generated interest in forensic methods, both for extracting hidden model details (e.g., parameters) and for identifying models by their outputs. One successful approach to these goals has been to exploit the geometric constraints imposed by the language model architecture and parameters. In this work, we show that a lesser-known geometric constraint--namely, that language model outputs lie on the surface of a high-dimensional ellipse--functions as a signature for the model and can be used to identify the source model of a given output. This ellipse signature has unique properties that distinguish it from existing model-output association methods like language model fingerprints. In particular, the signature is hard to forge: without direct access to model parameters, it is practically infeasible to produce log-probabilities (logprobs) on the ellipse. Secondly, the signature is naturally occurring, since all language models have these elliptical constraints. Thirdly, the signature is self-contained, in that it is detectable without access to the model inputs or the full weights. Finally, the signature is compact and redundant, as it is independently detectable in each logprob output from the model. We evaluate a novel technique for extracting the ellipse from small models and discuss the practical hurdles that make it infeasible for production-scale models. Finally, we use ellipse signatures to propose a protocol for language model output verification, analogous to cryptographic symmetric-key message authentication systems.

Paperid: 910, https://arxiv.org/pdf/2510.11151.pdf

Abstract:
Large language Models (LLMs) have shown remarkable proficiency in code generation tasks across various programming languages. However, their outputs often contain subtle but critical vulnerabilities, posing significant risks when deployed in security-sensitive or mission-critical systems. This paper introduces TypePilot, an agentic AI framework designed to enhance the security and robustness of LLM-generated code by leveraging strongly typed and verifiable languages, using Scala as a representative example. We evaluate the effectiveness of our approach in two settings: formal verification with the Stainless framework and general-purpose secure code generation. Our experiments with leading open-source LLMs reveal that while direct code generation often fails to enforce safety constraints, just as naive prompting for more secure code, our type-focused agentic pipeline substantially mitigates input validation and injection vulnerabilities. The results demonstrate the potential of structured, type-guided LLM workflows to improve the SotA of the trustworthiness of automated code generation in high-assurance domains.

Paperid: 911, https://arxiv.org/pdf/2510.10990.pdf

Abstract:
Text data has become extremely valuable on large language models (LLMs) and even lead to general artificial intelligence (AGI). A lot of high-quality text in the real world is private and cannot be freely used due to privacy concerns. Therefore, differentially private (DP) synthetic text generation has been proposed, aiming to produce high-utility synthetic data while protecting sensitive information. However, existing DP synthetic text generation imposes uniform guarantees that often overprotect non-sensitive content, resulting in substantial utility loss and computational overhead. Therefore, we propose Secret-Protected Evolution (SecPE), a novel framework that extends private evolution with secret-aware protection. Theoretically, we show that SecPE satisfies $(\mathrm{p}, \mathrm{r})$-secret protection, constituting a relaxation of Gaussian DP that enables tighter utility-privacy trade-offs, while also substantially reducing computational complexity relative to baseline methods. Empirically, across the OpenReview, PubMed, and Yelp benchmarks, SecPE consistently achieves lower Fréchet Inception Distance (FID) and higher downstream task accuracy than GDP-based Aug-PE baselines, while requiring less noise to attain the same level of protection. Our results highlight that secret-aware guarantees can unlock more practical and effective privacy-preserving synthetic text generation.

Paperid: 912, https://arxiv.org/pdf/2510.10321.pdf

Abstract:
Software vulnerabilities remain a persistent risk, yet static and dynamic analyses often overlook structural dependencies that shape insecure behaviors. Viewing programs as heterogeneous graphs, we capture control- and data-flow relations as complex interaction networks. Our hybrid framework combines these graph representations with light-weight (<4B) local LLMs, uniting topological features with semantic reasoning while avoiding the cost and privacy concerns of large cloud models. Evaluated on Java vulnerability detection (binary classification), our method achieves 93.57% accuracy-an 8.36% gain over Graph Attention Network-based embeddings and 17.81% over pretrained LLM baselines such as Qwen2.5 Coder 3B. Beyond accuracy, the approach extracts salient subgraphs and generates natural language explanations, improving interpretability for developers. These results pave the way for scalable, explainable, and locally deployable tools that can shift vulnerability analysis from purely syntactic checks to deeper structural and semantic insights, facilitating broader adoption in real-world secure software development.

Paperid: 913, https://arxiv.org/pdf/2510.07968.pdf

Abstract:
Large Language Models (LLMs) have shown remarkable performance across various applications, but their deployment in sensitive domains raises significant concerns. To mitigate these risks, numerous defense strategies have been proposed. However, most existing studies assess these defenses in isolation, overlooking their broader impacts across other risk dimensions. In this work, we take the first step in investigating unintended interactions caused by defenses in LLMs, focusing on the complex interplay between safety, fairness, and privacy. Specifically, we propose CrossRiskEval, a comprehensive evaluation framework to assess whether deploying a defense targeting one risk inadvertently affects others. Through extensive empirical studies on 14 defense-deployed LLMs, covering 12 distinct defense strategies, we reveal several alarming side effects: 1) safety defenses may suppress direct responses to sensitive queries related to bias or privacy, yet still amplify indirect privacy leakage or biased outputs; 2) fairness defenses increase the risk of misuse and privacy leakage; 3) privacy defenses often impair safety and exacerbate bias. We further conduct a fine-grained neuron-level analysis to uncover the underlying mechanisms of these phenomena. Our analysis reveals the existence of conflict-entangled neurons in LLMs that exhibit opposing sensitivities across multiple risk dimensions. Further trend consistency analysis at both task and neuron levels confirms that these neurons play a key role in mediating the emergence of unintended behaviors following defense deployment. We call for a paradigm shift in LLM risk evaluation, toward holistic, interaction-aware assessment of defense strategies.

Paperid: 914, https://arxiv.org/pdf/2510.07806.pdf

Abstract:
Modern web application recovery presents a critical dilemma. Coarse-grained snapshot rollbacks cause unacceptable data loss for legitimate users. Surgically removing an attack's impact is hindered by a fundamental challenge in high-concurrency environments: it is difficult to attribute resulting file and database modifications to a specific attack-related request. We present ANCORA, a system for precise intrusion recovery in web applications without invasive instrumentation. ANCORA first isolates the full sequence of syscalls triggered by a single malicious request. Based on this sequence, ANCORA addresses file and database modifications separately. To trace file changes, it builds a provenance graph that reveals all modifications, including those by exploit-spawned processes. To attribute database operations, a more difficult challenge due to connection pooling, ANCORA introduces a novel spatiotemporal anchor. This anchor uses the request's network connection tuple and active time window to pinpoint exact database operations. With all malicious file and database operations precisely identified, ANCORA performs a unified rewind and selective replay recovery. It reverts the system to a clean snapshot taken before the attack, then selectively re-applies only legitimate operations to both the file system and database. This completely removes the attack's effects while preserving concurrent legitimate data. We evaluated ANCORA on 10 web applications and 20 CVE-based attack scenarios with concurrency up to 150 connections. Experiments demonstrate ANCORA achieves 99.9% recovery accuracy with manageable overhead: up to 19.8% response latency increase and 17.8% QPS decrease in worst cases, and recovery throughput of 110.7 database operations per second and 27.2 affected files per second, effectively preserving legitimate data.

Paperid: 915, https://arxiv.org/pdf/2510.05181.pdf

Abstract:
Millions of users rely on a market of cloud-based services to obtain access to state-of-the-art large language models. However, it has been very recently shown that the de facto pay-per-token pricing mechanism used by providers creates a financial incentive for them to strategize and misreport the (number of) tokens a model used to generate an output. In this paper, we develop an auditing framework based on martingale theory that enables a trusted third-party auditor who sequentially queries a provider to detect token misreporting. Crucially, we show that our framework is guaranteed to always detect token misreporting, regardless of the provider's (mis-)reporting policy, and not falsely flag a faithful provider as unfaithful with high probability. To validate our auditing framework, we conduct experiments across a wide range of (mis-)reporting policies using several large language models from the $\texttt{Llama}$, $\texttt{Gemma}$ and $\texttt{Ministral}$ families, and input prompts from a popular crowdsourced benchmarking platform. The results show that our framework detects an unfaithful provider after observing fewer than $\sim 70$ reported outputs, while maintaining the probability of falsely flagging a faithful provider below $α= 0.05$.

Paperid: 916, https://arxiv.org/pdf/2510.04257.pdf

Abstract:
Multimodal agents built on large vision-language models (LVLMs) are increasingly deployed in open-world settings but remain highly vulnerable to prompt injection, especially through visual inputs. We introduce AgentTypo, a black-box red-teaming framework that mounts adaptive typographic prompt injection by embedding optimized text into webpage images. Our automatic typographic prompt injection (ATPI) algorithm maximizes prompt reconstruction by substituting captioners while minimizing human detectability via a stealth loss, with a Tree-structured Parzen Estimator guiding black-box optimization over text placement, size, and color. To further enhance attack strength, we develop AgentTypo-pro, a multi-LLM system that iteratively refines injection prompts using evaluation feedback and retrieves successful past examples for continual learning. Effective prompts are abstracted into generalizable strategies and stored in a strategy repository, enabling progressive knowledge accumulation and reuse in future attacks. Experiments on the VWA-Adv benchmark across Classifieds, Shopping, and Reddit scenarios show that AgentTypo significantly outperforms the latest image-based attacks such as AgentAttack. On GPT-4o agents, our image-only attack raises the success rate from 0.23 to 0.45, with consistent results across GPT-4V, GPT-4o-mini, Gemini 1.5 Pro, and Claude 3 Opus. In image+text settings, AgentTypo achieves 0.68 ASR, also outperforming the latest baselines. Our findings reveal that AgentTypo poses a practical and potent threat to multimodal agents and highlight the urgent need for effective defense.

Paperid: 917, https://arxiv.org/pdf/2509.24173.pdf

Abstract:
We study the problem of discrete distribution estimation under utility-optimized local differential privacy (ULDP), which enforces local differential privacy (LDP) on sensitive data while allowing more accurate inference on non-sensitive data. In this setting, we completely characterize the fundamental privacy-utility trade-off. The converse proof builds on several key ideas, including a generalized uniform asymptotic CramÃ©r-Rao lower bound, a reduction showing that it suffices to consider a newly defined class of extremal ULDP mechanisms, and a novel distribution decomposition technique tailored to ULDP constraints. For the achievability, we propose a class of utility-optimized block design (uBD) schemes, obtained as nontrivial modifications of the block design mechanism known to be optimal under standard LDP constraints, while incorporating the distribution decomposition idea used in the converse proof and a score-based linear estimator. These results provide a tight characterization of the estimation accuracy achievable under ULDP and reveal new insights into the structure of optimal mechanisms for privacy-preserving statistical inference.

Paperid: 918, https://arxiv.org/pdf/2509.22796.pdf

Abstract:
Open-source software projects are foundational to modern software ecosystems, with the Linux kernel standing out as a critical exemplar due to its ubiquity and complexity. Although security patches are continuously integrated into the Linux mainline kernel, downstream maintainers often delay their adoption, creating windows of vulnerability. A key reason for this lag is the difficulty in identifying security-critical patches, particularly those addressing exploitable vulnerabilities such as out-of-bounds (OOB) accesses and use-after-free (UAF) bugs. This challenge is exacerbated by intentionally silent bug fixes, incomplete or missing CVE assignments, delays in CVE issuance, and recent changes to the CVE assignment criteria for the Linux kernel. While fine-grained patch classification approaches exist, they exhibit limitations in both coverage and accuracy. In this work, we identify previously unexplored opportunities to significantly improve fine-grained patch classification. Specifically, by leveraging cues from commit titles/messages and diffs alongside appropriate code context, we develop DUALLM, a dual-method pipeline that integrates two approaches based on a Large Language Model (LLM) and a fine-tuned small language model. DUALLM achieves 87.4% accuracy and an F1-score of 0.875, significantly outperforming prior solutions. Notably, DUALLM successfully identified 111 of 5,140 recent Linux kernel patches as addressing OOB or UAF vulnerabilities, with 90 true positives confirmed by manual verification (many do not have clear indications in patch descriptions). Moreover, we constructed proof-of-concepts for two identified bugs (one UAF and one OOB), including one developed to conduct a previously unknown control-flow hijack as further evidence of the correctness of the classification.

Paperid: 919, https://arxiv.org/pdf/2509.16275.pdf

Abstract:
Modern software development pipelines face growing challenges in securing large codebases with extensive dependencies. Static analysis tools like Bandit are effective at vulnerability detection but suffer from high false positives and lack repair capabilities. Large Language Models (LLMs), in contrast, can suggest fixes but often hallucinate changes and lack self-validation. We present SecureFixAgent, a hybrid repair framework integrating Bandit with lightweight local LLMs (<8B parameters) in an iterative detect-repair-validate loop. To improve precision, we apply parameter-efficient LoRA-based fine-tuning on a diverse, curated dataset spanning multiple Python project domains, mitigating dataset bias and reducing unnecessary edits. SecureFixAgent uses Bandit for detection, the LLM for candidate fixes with explanations, and Bandit re-validation for verification, all executed locally to preserve privacy and reduce cloud reliance. Experiments show SecureFixAgent reduces false positives by 10.8% over static analysis, improves fix accuracy by 13.51%, and lowers false positives by 5.46% compared to pre-trained LLMs, typically converging within three iterations. Beyond metrics, developer studies rate explanation quality 4.5/5, highlighting its value for human trust and adoption. By combining verifiable security improvements with transparent rationale in a resource-efficient local framework, SecureFixAgent advances trustworthy, automated vulnerability remediation for modern pipelines.

Paperid: 920, https://arxiv.org/pdf/2509.13072.pdf

Abstract:
In today's evolving threat landscape, ensuring digital sovereignty has become mandatory for military organizations, especially given their increased development and investment in AI-driven cyber security solutions. To this end, a multi-angled framework is proposed in this article in order to define and assess digital sovereign control of data and AI-based models for military cyber security. This framework focuses on aspects such as context, autonomy, stakeholder involvement, and mitigation of risks in this domain. Grounded on the concepts of digital sovereignty and data sovereignty, the framework aims to protect sensitive defence assets against threats such as unauthorized access, ransomware, and supply-chain attacks. This approach reflects the multifaceted nature of digital sovereignty by preserving operational autonomy, assuring security and safety, securing privacy, and fostering ethical compliance of both military systems and decision-makers. At the same time, the framework addresses interoperability challenges among allied forces, strategic and legal considerations, and the integration of emerging technologies by considering a multidisciplinary approach that enhances the resilience and preservation of control over (critical) digital assets. This is done by adopting a design oriented research where systematic literature review is merged with critical thinking and analysis of field incidents in order to assure the effectivity and realism of the framework proposed.

Paperid: 921, https://arxiv.org/pdf/2509.10691.pdf

Abstract:
Decentralized federated learning faces privacy risks because model updates can leak data through inference attacks and membership inference, a concern that grows over many client exchanges. Differential privacy offers principled protection by injecting calibrated noise so confidential information remains secure on resource-limited IoT devices. Yet without transparency, black-box training cannot track noise already injected by previous clients and rounds, which forces worst-case additions and harms accuracy. We propose PrivateDFL, an explainable framework that joins hyperdimensional computing with differential privacy and keeps an auditable account of cumulative noise so each client adds only the difference between the required noise and what has already been accumulated. We evaluate on MNIST, ISOLET, and UCI-HAR to span image, signal, and tabular modalities, and we benchmark against transformer-based and deep learning-based baselines trained centrally with Differentially Private Stochastic Gradient Descent (DP-SGD) and Renyi Differential Privacy (RDP). PrivateDFL delivers higher accuracy, lower latency, and lower energy across IID and non-IID partitions while preserving formal (epsilon, delta) guarantees and operating without a central server. For example, under non-IID partitions, PrivateDFL achieves 24.42% higher accuracy than the Vision Transformer on MNIST while using about 10x less training time, 76x lower inference latency, and 11x less energy, and on ISOLET it exceeds Transformer accuracy by more than 80% with roughly 10x less training time, 40x lower inference latency, and 36x less training energy. Future work will extend the explainable accounting to adversarial clients and adaptive topologies with heterogeneous privacy budgets.

Paperid: 922, https://arxiv.org/pdf/2509.06350.pdf

Abstract:
Jailbreak attacks on Large Language Models (LLMs) have demonstrated various successful methods whereby attackers manipulate models into generating harmful responses that they are designed to avoid. Among these, Greedy Coordinate Gradient (GCG) has emerged as a general and effective approach that optimizes the tokens in a suffix to generate jailbreakable prompts. While several improved variants of GCG have been proposed, they all rely on fixed-length suffixes. However, the potential redundancy within these suffixes remains unexplored. In this work, we propose Mask-GCG, a plug-and-play method that employs learnable token masking to identify impactful tokens within the suffix. Our approach increases the update probability for tokens at high-impact positions while pruning those at low-impact positions. This pruning not only reduces redundancy but also decreases the size of the gradient space, thereby lowering computational overhead and shortening the time required to achieve successful attacks compared to GCG. We evaluate Mask-GCG by applying it to the original GCG and several improved variants. Experimental results show that most tokens in the suffix contribute significantly to attack success, and pruning a minority of low-impact tokens does not affect the loss values or compromise the attack success rate (ASR), thereby revealing token redundancy in LLM prompts. Our findings provide insights for developing efficient and interpretable LLMs from the perspective of jailbreak attacks.

Paperid: 923, https://arxiv.org/pdf/2509.05893.pdf

Abstract:
The Multi-Factor Key Derivation Function (MFKDF) offered a novel solution to the classic problem of usable client-side key management by incorporating multiple popular authentication factors into a key derivation process, but was later shown to be vulnerable to cryptanalysis that degraded its security over multiple invocations. In this paper, we present the Entropy State Transition Modeling Framework (ESTMF), a novel cryptanalytic technique designed to reveal pernicious leaks of entropy across multiple invocations of a cryptographic key derivation or hash function, and show that it can be used to correctly identify each of the known vulnerabilities in the original MFKDF construction. We then use these findings to propose a new construction for ``MFKDF2,'' a next-generation multi-factor key derivation function that can be proven to be end-to-end secure using the ESTMF. Finally, we discuss how MFKDF2 can be extended to support more authentication factors and usability features than the previous MFKDF construction, and derive several generalizable best-practices for the construction of new KDFs in the future.

Paperid: 924, https://arxiv.org/pdf/2509.05367.pdf

Abstract:
Large language models (LLMs) have undergone safety alignment efforts to mitigate harmful outputs. However, as LLMs become more sophisticated in reasoning, their intelligence may introduce new security risks. While traditional jailbreak attacks relied on singlestep attacks, multi-turn jailbreak strategies that adapt dynamically to context remain underexplored. In this work, we introduce TRIAL (Trolley-problem Reasoning for Interactive Attack Logic), a framework that leverages LLMs ethical reasoning to bypass their safeguards. TRIAL embeds adversarial goals within ethical dilemmas modeled on the trolley problem. TRIAL demonstrates high jailbreak success rates towards both open and close-source models. Our findings underscore a fundamental limitation in AI safety: as models gain advanced reasoning abilities, the nature of their alignment may inadvertently allow for more covert security vulnerabilities to be exploited. TRIAL raises an urgent need in reevaluating safety alignment oversight strategies, as current safeguards may prove insufficient against context-aware adversarial attack.

Paperid: 925, https://arxiv.org/pdf/2509.00781.pdf

Abstract:
Despite the ubiquity of modern face retrieval systems, their retrieval stage is often outsourced to third-party entities, posing significant risks to user portrait privacy. Although homomorphic encryption (HE) offers strong security guarantees by enabling arithmetic computations in the cipher space, its high computational inefficiency makes it unsuitable for real-time, real-world applications. To address this issue, we propose Cancelable Product Quantization, a highly efficient framework for secure face representation retrieval. Our hierarchical two-stage framework comprises: (i) a high-throughput cancelable PQ indexing module for fast candidate filtering, and (ii) a fine-grained cipher-space retrieval module for final precise face ranking. A tailored protection mechanism is designed to secure the indexing module for cancelable biometric authentication while ensuring efficiency. Experiments on benchmark datasets demonstrate that our method achieves an decent balance between effectiveness, efficiency and security.

Paperid: 926, https://arxiv.org/pdf/2508.21099.pdf

Abstract:
Despite the advancements in Text-to-Image (T2I) generation models, their potential for misuse or even abuse raises serious safety concerns. Model developers have made tremendous efforts to introduce safety mechanisms that can address these concerns in T2I models. However, the existing safety mechanisms, whether external or internal, either remain susceptible to evasion under distribution shifts or require extensive model-specific adjustments. To address these limitations, we introduce Safe-Control, an innovative plug-and-play safety patch designed to mitigate unsafe content generation in T2I models. Using data-driven strategies and safety-aware conditions, Safe-Control injects safety control signals into the locked T2I model, acting as an update in a patch-like manner. Model developers can also construct various safety patches to meet the evolving safety requirements, which can be flexibly merged into a single, unified patch. Its plug-and-play design further ensures adaptability, making it compatible with other T2I models of similar denoising architecture. We conduct extensive evaluations on six diverse and public T2I models. Empirical results highlight that Safe-Control is effective in reducing unsafe content generation across six diverse T2I models with similar generative architectures, yet it successfully maintains the quality and text alignment of benign images. Compared to seven state-of-the-art safety mechanisms, including both external and internal defenses, Safe-Control significantly outperforms all baselines in reducing unsafe content generation. For example, it reduces the probability of unsafe content generation to 7%, compared to approximately 20% for most baseline methods, under both unsafe prompts and the latest adversarial attacks.

Paperid: 927, https://arxiv.org/pdf/2508.08898.pdf

Abstract:
Blockchains are widely recognized for their immutability, which provides robust guarantees of data integrity and transparency. However, this same feature poses significant challenges in real-world situations that require regulatory compliance, correction of erroneous data, or removal of sensitive information. Redactable blockchains address the limitations of traditional ones by enabling controlled, auditable modifications to blockchain data, primarily through cryptographic mechanisms such as chameleon hash functions and alternative redaction schemes. This report examines the motivations for introducing redactability, surveys the cryptographic primitives that enable secure edits, and analyzes competing approaches and their shortcomings. Special attention is paid to the practical deployment of redactable blockchains in private settings, with discussions of use cases in healthcare, finance, Internet of drones, and federated learning. Finally, the report outlines further challenges, also in connection with reversible computing, and the future potential of redactable blockchains in building law-compliant, trustworthy, and scalable digital infrastructures.

Paperid: 928, https://arxiv.org/pdf/2508.03696.pdf

Abstract:
Text-to-Image (T2I) models have gained widespread adoption across various applications. Despite the success, the potential misuse of T2I models poses significant risks of generating Not-Safe-For-Work (NSFW) content. To investigate the vulnerability of T2I models, this paper delves into adversarial attacks to bypass the safety mechanisms under black-box settings. Most previous methods rely on word substitution to search adversarial prompts. Due to limited search space, this leads to suboptimal performance compared to gradient-based training. However, black-box settings present unique challenges to training gradient-driven attack methods, since there is no access to the internal architecture and parameters of T2I models. To facilitate the learning of adversarial prompts in black-box settings, we propose a novel prompt learning attack framework (PLA), where insightful gradient-based training tailored to black-box T2I models is designed by utilizing multimodal similarities. Experiments show that our new method can effectively attack the safety mechanisms of black-box T2I models including prompt filters and post-hoc safety checkers with a high success rate compared to state-of-the-art methods. Warning: This paper may contain offensive model-generated content.

Paperid: 929, https://arxiv.org/pdf/2508.02188.pdf

Abstract:
The emergence of the Internet of Agents (IoA) introduces critical challenges for communication privacy in sensitive, high-stakes domains. While standard Agent-to-Agent (A2A) protocols secure message content, they are not designed to protect the act of communication itself, leaving agents vulnerable to surveillance and traffic analysis. We find that the rich, event-driven nature of agent dialogues provides a powerful, yet untapped, medium for covert communication. To harness this potential, we introduce and formalize the Covert Event Channel, the first unified model for agent covert communication driven by three interconnected dimensions, which consist of the Storage, Timing,and Behavioral channels. Based on this model, we design and engineer Î CCAP, a novel protocol that operationalizes this event-driven paradigm. Our comprehensive evaluation demonstrates that Î CCAP achieves high capacity and robustness while remaining imperceptible to powerful LLM-based wardens, establishing its practical viability. By systematically engineering this channel, our work provides the foundational understanding essential for developing the next generation of monitoring systems and defensive protocols for a secure and trustworthy IoA.

Paperid: 930, https://arxiv.org/pdf/2507.21540.pdf

Abstract:
The increasing sophistication of large vision-language models (LVLMs) has been accompanied by advances in safety alignment mechanisms designed to prevent harmful content generation. However, these defenses remain vulnerable to sophisticated adversarial attacks. Existing jailbreak methods typically rely on direct and semantically explicit prompts, overlooking subtle vulnerabilities in how LVLMs compose information over multiple reasoning steps. In this paper, we propose a novel and effective jailbreak framework inspired by Return-Oriented Programming (ROP) techniques from software security. Our approach decomposes a harmful instruction into a sequence of individually benign visual gadgets. A carefully engineered textual prompt directs the sequence of inputs, prompting the model to integrate the benign visual gadgets through its reasoning process to produce a coherent and harmful output. This makes the malicious intent emergent and difficult to detect from any single component. We validate our method through extensive experiments on established benchmarks including SafeBench and MM-SafetyBench, targeting popular LVLMs. Results show that our approach consistently and substantially outperforms existing baselines on state-of-the-art models, achieving near-perfect attack success rates (over 0.90 on SafeBench) and improving ASR by up to 0.39. Our findings reveal a critical and underexplored vulnerability that exploits the compositional reasoning abilities of LVLMs, highlighting the urgent need for defenses that secure the entire reasoning process.

Paperid: 931, https://arxiv.org/pdf/2507.19598.pdf

Abstract:
Recent advancements in Large Language Models (LLMs) have significantly enhanced their code generation capabilities. However, their robustness against adversarial misuse, particularly through multi-turn malicious coding prompts, remains underexplored. In this work, we introduce code decomposition attacks, where a malicious coding task is broken down into a series of seemingly benign subtasks across multiple conversational turns to evade safety filters. To facilitate systematic evaluation, we introduce \benchmarkname{}, a large-scale benchmark designed to evaluate the robustness of code LLMs against both single-turn and multi-turn malicious prompts. Empirical results across open- and closed-source models reveal persistent vulnerabilities, especially under multi-turn scenarios. Fine-tuning on MOCHA improves rejection rates while preserving coding ability, and importantly, enhances robustness on external adversarial datasets with up to 32.4% increase in rejection rates without any additional supervision.

Paperid: 932, https://arxiv.org/pdf/2507.11310.pdf

Abstract:
Verifying the credibility of Cyber Threat Intelligence (CTI) is essential for reliable cybersecurity defense. However, traditional approaches typically treat this task as a static classification problem, relying on handcrafted features or isolated deep learning models. These methods often lack the robustness needed to handle incomplete, heterogeneous, or noisy intelligence, and they provide limited transparency in decision-making-factors that reduce their effectiveness in real-world threat environments. To address these limitations, we propose LRCTI, a Large Language Model (LLM)-based framework designed for multi-step CTI credibility verification. The framework first employs a text summarization module to distill complex intelligence reports into concise and actionable threat claims. It then uses an adaptive multi-step evidence retrieval mechanism that iteratively identifies and refines supporting information from a CTI-specific corpus, guided by LLM feedback. Finally, a prompt-based Natural Language Inference (NLI) module is applied to evaluate the credibility of each claim while generating interpretable justifications for the classification outcome. Experiments conducted on two benchmark datasets, CTI-200 and PolitiFact show that LRCTI improves F1-Macro and F1-Micro scores by over 5%, reaching 90.9% and 93.6%, respectively, compared to state-of-the-art baselines. These results demonstrate that LRCTI effectively addresses the core limitations of prior methods, offering a scalable, accurate, and explainable solution for automated CTI credibility verification

Paperid: 933, https://arxiv.org/pdf/2507.10898.pdf

Abstract:
The growing complexity of cyber threats and the limitations of traditional vulnerability detection tools necessitate novel approaches for securing software systems. We introduce MalCodeAI, a language-agnostic, multi-stage AI pipeline for autonomous code security analysis and remediation. MalCodeAI combines code decomposition and semantic reasoning using fine-tuned Qwen2.5-Coder-3B-Instruct models, optimized through Low-Rank Adaptation (LoRA) within the MLX framework, and delivers scalable, accurate results across 14 programming languages. In Phase 1, the model achieved a validation loss as low as 0.397 for functional decomposition and summarization of code segments after 200 iterations, 6 trainable layers, and a learning rate of 2 x 10^(-5). In Phase 2, for vulnerability detection and remediation, it achieved a best validation loss of 0.199 using the same number of iterations and trainable layers but with an increased learning rate of 4 x 10^(-5), effectively identifying security flaws and suggesting actionable fixes. MalCodeAI supports red-hat-style exploit tracing, CVSS-based risk scoring, and zero-shot generalization to detect complex, zero-day vulnerabilities. In a qualitative evaluation involving 15 developers, the system received high scores in usefulness (mean 8.06/10), interpretability (mean 7.40/10), and readability of outputs (mean 7.53/10), confirming its practical value in real-world development workflows. This work marks a significant advancement toward intelligent, explainable, and developer-centric software security solutions.

Paperid: 934, https://arxiv.org/pdf/2507.08862.pdf

Abstract:
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by retrieving external data to mitigate hallucinations and outdated knowledge issues. Benefiting from the strong ability in facilitating diverse data sources and supporting faithful reasoning, knowledge graphs (KGs) have been increasingly adopted in RAG systems, giving rise to KG-based RAG (KG-RAG) methods. Though RAG systems are widely applied in various applications, recent studies have also revealed its vulnerabilities to data poisoning attacks, where malicious information injected into external knowledge sources can mislead the system into producing incorrect or harmful responses. However, these studies focus exclusively on RAG systems using unstructured textual data sources, leaving the security risks of KG-RAG largely unexplored, despite the fact that KGs present unique vulnerabilities due to their structured and editable nature. In this work, we conduct the first systematic investigation of the security issue of KG-RAG methods through data poisoning attacks. To this end, we introduce a practical, stealthy attack setting that aligns with real-world implementation. We propose an attack strategy that first identifies adversarial target answers and then inserts perturbation triples to complete misleading inference chains in the KG, increasing the likelihood that KG-RAG methods retrieve and rely on these perturbations during generation. Through extensive experiments on two benchmarks and four recent KG-RAG methods, our attack strategy demonstrates strong effectiveness in degrading KG-RAG performance, even with minimal KG perturbations. In-depth analyses are also conducted to understand the safety threats within the internal stages of KG-RAG systems and to explore the robustness of LLMs against adversarial knowledge.

Paperid: 935, https://arxiv.org/pdf/2512.24602.pdf

Abstract:
Semantic communication (SemCom) has emerged as a promising paradigm for future wireless networks by prioritizing task-relevant meaning over raw data delivery, thereby reducing communication overhead and improving efficiency. However, shifting from bit-accurate transmission to task-oriented delivery introduces new security and privacy risks. These include semantic leakage, semantic manipulation, knowledge base vulnerabilities, model-related attacks, and threats to authenticity and availability. Most existing secure SemCom studies focus on analog SemCom, where semantic features are mapped to continuous channel inputs. In contrast, digital SemCom transmits semantic information through discrete bits or symbols within practical transceiver pipelines, offering stronger compatibility with realworld systems while exposing a distinct and underexplored attack surface. In particular, digital SemCom typically represents semantic information over a finite alphabet through explicit digital modulation, following two main routes: probabilistic modulation and deterministic modulation. These discrete mechanisms and practical transmission procedures introduce additional vulnerabilities affecting bit- or symbol-level semantic information, the modulation stage, and packet-based delivery and protocol operations. Motivated by these challenges and the lack of a systematic analysis of secure digital SemCom, this paper reviews SemCom fundamentals, clarifies the architectural differences between analog and digital SemCom and their security implications, organizes the threat landscape for digital SemCom, and discusses potential defenses. Finally, we outline open research directions toward secure and deployable digital SemCom systems.

Paperid: 936, https://arxiv.org/pdf/2512.21354.pdf

Abstract:
Contemporary large language model (LLM) agents are remarkably capable, but they still lack reliable safety controls and can produce unconstrained, unpredictable, and even actively harmful outputs. To address this, we introduce Reflection-Driven Control, a standardized and pluggable control module that can be seamlessly integrated into general agent architectures. Reflection-Driven Control elevates "self-reflection" from a post hoc patch into an explicit step in the agent's own reasoning process: during generation, the agent continuously runs an internal reflection loop that monitors and evaluates its own decision path. When potential risks are detected, the system retrieves relevant repair examples and secure coding guidelines from an evolving reflective memory, injecting these evidence-based constraints directly into subsequent reasoning steps. We instantiate Reflection-Driven Control in the setting of secure code generation and systematically evaluate it across eight classes of security-critical programming tasks. Empirical results show that Reflection-Driven Control substantially improves the security and policy compliance of generated code while largely preserving functional correctness, with minimal runtime and token overhead. Taken together, these findings indicate that Reflection-Driven Control is a practical path toward trustworthy AI coding agents: it enables designs that are simultaneously autonomous, safer by construction, and auditable.

Paperid: 937, https://arxiv.org/pdf/2512.17045.pdf

Abstract:
Modern blockchains increasingly adopt multi-proposer (MCP) consensus to remove single-leader bottlenecks and improve censorship resistance. However, MCP alone does not resolve how users should disseminate transactions to proposers. Today, users either naively replicate full transactions to many proposers, sacrificing goodput and exposing payloads to MEV, or target few proposers and accept weak censorship and latency guarantees. This yields a practical trilemma among censorship resistance, low latency, and reasonable cost (in fees or system goodput). We present Sedna, a user-facing protocol that replaces naive transaction replication with verifiable, rateless coding. Users privately deliver addressed symbol bundles to subsets of proposers; execution follows a deterministic order once enough symbols are finalized to decode. We prove Sedna guarantees liveness and \emph{until-decode privacy}, significantly reducing MEV exposure. Analytically, the protocol approaches the information-theoretic lower bound for bandwidth overhead, yielding a 2-3x efficiency improvement over naive replication. Sedna requires no consensus modifications, enabling incremental deployment.

Paperid: 938, https://arxiv.org/pdf/2512.13119.pdf

Abstract:
Most adversarial attacks on point clouds perturb a large number of points, causing widespread geometric changes and limiting applicability in real-world scenarios. While recent works explore sparse attacks by modifying only a few points, such approaches often struggle to maintain effectiveness due to the limited influence of individual perturbations. In this paper, we propose SCP, a sparse and cooperative perturbation framework that selects and leverages a compact subset of points whose joint perturbations produce amplified adversarial effects. Specifically, SCP identifies the subset where the misclassification loss is locally convex with respect to their joint perturbations, determined by checking the positivedefiniteness of the corresponding Hessian block. The selected subset is then optimized to generate high-impact adversarial examples with minimal modifications. Extensive experiments show that SCP achieves 100% attack success rates, surpassing state-of-the-art sparse attacks, and delivers superior imperceptibility to dense attacks with far fewer modifications.

Paperid: 939, https://arxiv.org/pdf/2512.04129.pdf

Abstract:
LLM-based multi-agent systems (MASs) have reshaped the digital landscape with their emergent coordination and problem-solving capabilities. However, current security evaluations of MASs are still confined to limited attack scenarios, leaving their security issues unclear and likely underestimated. To fill this gap, we propose TOMA, a topology-aware multi-hop attack scheme targeting MASs. By optimizing the propagation of contamination within the MAS topology and controlling the multi-hop diffusion of adversarial payloads originating from the environment, TOMA unveils new and effective attack vectors without requiring privileged access or direct agent manipulation. Experiments demonstrate attack success rates ranging from 40% to 78% across three state-of-the-art MAS architectures: \textsc{Magentic-One}, \textsc{LangManus}, and \textsc{OWL}, and five representative topologies, revealing intrinsic MAS vulnerabilities that may be overlooked by existing research. Inspired by these findings, we propose a conceptual defense framework based on topology trust, and prototype experiments show its effectiveness in blocking 94.8% of adaptive and composite attacks.

Paperid: 940, https://arxiv.org/pdf/2512.04008.pdf

Abstract:
Training with differential privacy (DP) provides a guarantee to members in a dataset that they cannot be identified by users of the released model. However, those data providers, and, in general, the public, lack methods to efficiently verify that models trained on their data satisfy DP guarantees. The amount of compute needed to verify DP guarantees for current algorithms scales with the amount of compute required to train the model. In this paper we design the first DP algorithm with near optimal privacy-utility trade-offs but whose DP guarantees can be verified cheaper than training. We focus on DP stochastic convex optimization (DP-SCO), where optimal privacy-utility trade-offs are known. Here we show we can obtain tight privacy-utility trade-offs by privately minimizing a series of regularized objectives and only using the standard DP composition bound. Crucially, this method can be verified with much less compute than training. This leads to the first known DP-SCO algorithm with near optimal privacy-utility whose DP verification scales better than training cost, significantly reducing verification costs on large datasets.

Paperid: 941, https://arxiv.org/pdf/2511.23406.pdf

Abstract:
In this paper, we explore how quantum resources can be used to increase the rate of private distributed matrix multiplication (PDMM). In PDMM, a user who has two high-dimensional matrices, $A$ and $B$, and lacks the computational capabilities to apply matrix multiplication locally, divides the matrices $A$ and $B$ into $K$ and $L$ sub-blocks, respectively. Then, the user sends them to $N$ servers to apply the required multiplication privately from any $T$ servers. The goal is to reduce the number of servers needed to perform the required matrix multiplication. In the quantum setting, we allow the servers to share an entangled state and respond over quantum channels. Upon receiving the qudits, the user applies measurements to obtain the required multiplication. There are two main regimes in the PDMM literature: The high-privacy regime and the low-privacy regime where $T$ is less than $K$ and $L$. First, in the high-privacy regime, the state-of-the-art classical code is called the gap additive secure polynomial (GASP) code. We define a feasibility requirement in the quantum setting for the GASP code such that the highest performance is achieved when it is satisfied. When it is not satisfied, we address two main concerns. The first is to find a relation between the minimum privacy requirement and the dimensions of the two matrices needed for the feasibility condition to be satisfied. Second, we develop a new family of codes that can work in the quantum setting. Second, since GASP does not work efficiently in the low-privacy regimes compared to cyclic-addition degree tables (CAT) and discretely optimized GASP (DOG), we show that the feasibility condition developed for GASP can be adopted for both CAT and DOG codes as well. In addition, we propose another set of codes that can be used in the low privacy regime in the quantum setting when the feasibility requirement is not satisfied.

Paperid: 942, https://arxiv.org/pdf/2511.20500.pdf

Abstract:
Advanced Persistent Threats (APT) pose a major cybersecurity challenge due to their stealth, persistence, and adaptability. Traditional machine learning detectors struggle with class imbalance, high dimensional features, and scarce real world traces. They often lack transferability-performing well in the training domain but degrading in novel attack scenarios. We propose a hybrid transfer framework that integrates Transfer Learning, Explainable AI (XAI), contrastive learning, and Siamese networks to improve cross-domain generalization. An attention-based autoencoder supports knowledge transfer across domains, while Shapley Additive exPlanations (SHAP) select stable, informative features to reduce dimensionality and computational cost. A Siamese encoder trained with a contrastive objective aligns source and target representations, increasing anomaly separability and mitigating feature drift. We evaluate on real-world traces from the DARPA Transparent Computing (TC) program and augment with synthetic attack scenarios to test robustness. Across source to target transfers, the approach delivers improved detection scores with classical and deep baselines, demonstrating a scalable, explainable, and transferable solution for APT detection.

Paperid: 943, https://arxiv.org/pdf/2511.20480.pdf

Abstract:
Advanced Persistent Threats (APTs) pose a significant challenge in cybersecurity due to their stealthy and long-term nature. Modern supervised learning methods require extensive labeled data, which is often scarce in real-world cybersecurity environments. In this paper, we propose an innovative approach that leverages AutoEncoders for unsupervised anomaly detection, augmented by active learning to iteratively improve the detection of APT anomalies. By selectively querying an oracle for labels on uncertain or ambiguous samples, we minimize labeling costs while improving detection rates, enabling the model to improve its detection accuracy with minimal data while reducing the need for extensive manual labeling. We provide a detailed formulation of the proposed Attention Adversarial Dual AutoEncoder-based anomaly detection framework and show how the active learning loop iteratively enhances the model. The framework is evaluated on real-world imbalanced provenance trace databases produced by the DARPA Transparent Computing program, where APT-like attacks constitute as little as 0.004\% of the data. The datasets span multiple operating systems, including Android, Linux, BSD, and Windows, and cover two attack scenarios. The results have shown significant improvements in detection rates during active learning and better performance compared to other existing approaches.

Paperid: 944, https://arxiv.org/pdf/2511.19558.pdf

Abstract:
Text-to-image diffusion models can emit copyrighted, unsafe, or private content. Safety alignment aims to suppress specific concepts, yet evaluations seldom test whether safety persists under benign downstream fine-tuning routinely applied after deployment (e.g., LoRA personalization, style/domain adapters). We study the stability of current safety methods under benign fine-tuning and observe frequent breakdowns. As true safety alignment must withstand even benign post-deployment adaptations, we introduce the SPQR benchmark (Safety-Prompt adherence-Quality-Robustness). SPQR is a single-scored metric that provides a standardized and reproducible framework to evaluate how well safety-aligned diffusion models preserve safety, utility, and robustness under benign fine-tuning, by reporting a single leaderboard score to facilitate comparisons. We conduct multilingual, domain-specific, and out-of-distribution analyses, along with category-wise breakdowns, to identify when safety alignment fails after benign fine-tuning, ultimately showcasing SPQR as a concise yet comprehensive benchmark for T2I safety alignment techniques for T2I models.

Paperid: 945, https://arxiv.org/pdf/2511.15517.pdf

Abstract:
Modern high-throughput BFT consensus protocols use streamlined push-pull mechanisms to disseminate blocks and keep happy-path performance optimal. Yet state-of-the-art designs lack a principled and efficient way to exchange blocks, which leaves them open to targeted attacks and performance collapse under network asynchrony. This work introduces the concept of a block synchronizer, a simple abstraction that drives incremental block retrieval and enforces resource-aware exchange. Its interface and role fit cleanly inside a modern BFT consensus stack. We also uncover a new attack, where an adversary steers honest validators into redundant, uncoordinated pulls that exhaust bandwidth and stall progress. Beluga is a modular and scarcity-aware instantiation of the block synchronizer. It achieves optimal common-case latency while bounding the cost of recovery under faults and adversarial behavior. We integrate Beluga into Mysticeti, the consensus core of the Sui blockchain, and show on a geo-distributed AWS deployment that Beluga sustains optimal performance in the optimistic path and, under attack, delivers up to 3x higher throughput and 25x lower latency than prior designs. The Sui blockchain adopted Beluga in production.

Paperid: 946, https://arxiv.org/pdf/2511.15361.pdf

Abstract:
Blockchain consensus faces a trilemma of security, latency, and decentralization. High-throughput systems often require a reduction in decentralization or robustness against strong adversaries, while highly decentralized and secure systems tend to have lower performance. We present BlueBottle, a two-layer consensus architecture. The core layer, BB-Core, is an n=5f+1 protocol that trades some fault tolerance for a much lower finality latency with a medium-sized core validator set. Our experiments show that BB-Core reduces latency by 20-25% in comparison to Mysticeti. The guard layer, BB-Guard, provides decentralized timestamping, proactive misbehavior detection in BB-Core, and a synchronous recovery path. When it observes equivocations or liveness failures in the core -- while tolerating up to f<3n/5 faulty nodes in the primary layer -- guard validators disseminate evidence, agree on misbehaving parties for exclusion or slashing, and either restart the core protocol (for liveness violations) or select a canonical fork (for safety violations). Together, these layers enable optimistic sub-second finality at high throughput while maintaining strong safety and liveness under a mild synchrony assumption.

Paperid: 947, https://arxiv.org/pdf/2511.11411.pdf

Abstract:
Smart Contract Reusable Components(SCRs) play a vital role in accelerating the development of business-specific contracts by promoting modularity and code reuse. However, the risks associated with SCR usage violations have become a growing concern. One particular type of SCR usage violation, known as a logic-level usage violation, is becoming especially harmful. This violation occurs when the SCR adheres to its specified usage rules but fails to align with the specific business logic of the current context, leading to significant vulnerabilities. Detecting such violations necessitates a deep semantic understanding of the contract's business logic, including the ability to extract implicit usage patterns and analyze fine-grained logical behaviors. To address these challenges, we propose SCRUTINEER, the first automated and practical system for detecting logic-level usage violations of SCRs. First, we design a composite feature extraction approach that produces three complementary feature representations, supporting subsequent analysis. We then introduce a Large Language Model-powered knowledge construction framework, which leverages comprehension-oriented prompts and domain-specific tools to extract logic-level usage and build the SCR knowledge base. Next, we develop a Retrieval-Augmented Generation-driven inspector, which combines a rapid retrieval strategy with both comprehensive and targeted analysis to identify potentially insecure logic-level usages. Finally, we implement a logic-level usage violation analysis engine that integrates a similarity-based checker and a snapshot-based inference conflict checker to enable accurate and robust detection. We evaluate SCRUTINEER from multiple perspectives on 3 ground-truth datasets. The results show that SCRUTINEER achieves a precision of 80.77%, a recall of 82.35%, and an F1-score of 81.55% in detecting logic-level usage violations of SCRs.

Paperid: 948, https://arxiv.org/pdf/2511.10698.pdf

Abstract:
Recent studies have demonstrated that hypergraph neural networks (HGNNs) are susceptible to adversarial attacks. However, existing methods rely on the specific information mechanisms of target HGNNs, overlooking the common vulnerability caused by the significant differences in hyperedge pivotality along aggregation paths in most HGNNs, thereby limiting the transferability and effectiveness of attacks. In this paper, we present a novel framework, i.e., Transferable Hypergraph Attack via Injecting Nodes into Pivotal Hyperedges (TH-Attack), to address these limitations. Specifically, we design a hyperedge recognizer via pivotality assessment to obtain pivotal hyperedges within the aggregation paths of HGNNs. Furthermore, we introduce a feature inverter based on pivotal hyperedges, which generates malicious nodes by maximizing the semantic divergence between the generated features and the pivotal hyperedges features. Lastly, by injecting these malicious nodes into the pivotal hyperedges, TH-Attack improves the transferability and effectiveness of attacks. Extensive experiments are conducted on six authentic datasets to validate the effectiveness of TH-Attack and the corresponding superiority to state-of-the-art methods.

Paperid: 949, https://arxiv.org/pdf/2511.10050.pdf

Abstract:
Traffic sign recognition plays a critical role in ensuring safe and efficient transportation of autonomous vehicles but remain vulnerable to adversarial attacks using stickers or laser projections. While existing attack vectors demonstrate security concerns, they suffer from visual detectability or implementation constraints, suggesting unexplored vulnerability surfaces in TSR systems. We introduce the Adversarial Retroreflective Patch (ARP), a novel attack vector that combines the high deployability of patch attacks with the stealthiness of laser projections by utilizing retroreflective materials activated only under victim headlight illumination. We develop a retroreflection simulation method and employ black-box optimization to maximize attack effectiveness. ARP achieves $\geq$93.4\% success rate in dynamic scenarios at 35 meters and $\geq$60\% success rate against commercial TSR systems in real-world conditions. Our user study demonstrates that ARP attacks maintain near-identical stealthiness to benign signs while achieving $\geq$1.9\% higher stealthiness scores than previous patch attacks. We propose the DPR Shield defense, employing strategically placed polarized filters, which achieves $\geq$75\% defense success rates for stop signs and speed limit signs against micro-prism patches.

Paperid: 950, https://arxiv.org/pdf/2511.04716.pdf

Abstract:
Cognitive diagnosis models (CDMs) are pivotal for creating fine-grained learner profiles in modern intelligent education platforms. However, these models are trained on sensitive student data, raising significant privacy concerns. While membership inference attacks (MIA) have been studied in various domains, their application to CDMs remains a critical research gap, leaving their privacy risks unquantified. This paper is the first to systematically investigate MIA against CDMs. We introduce a novel and realistic grey box threat model that exploits the explainability features of these platforms, where a model's internal knowledge state vectors are exposed to users through visualizations such as radar charts. We demonstrate that these vectors can be accurately reverse-engineered from such visualizations, creating a potent attack surface. Based on this threat model, we propose a profile-based MIA (P-MIA) framework that leverages both the model's final prediction probabilities and the exposed internal knowledge state vectors as features. Extensive experiments on three real-world datasets against mainstream CDMs show that our grey-box attack significantly outperforms standard black-box baselines. Furthermore, we showcase the utility of P-MIA as an auditing tool by successfully evaluating the efficacy of machine unlearning techniques and revealing their limitations.

Paperid: 951, https://arxiv.org/pdf/2510.27080.pdf

Abstract:
Security applications are increasingly relying on large language models (LLMs) for cyber threat detection; however, their opaque reasoning often limits trust, particularly in decisions that require domain-specific cybersecurity knowledge. Because security threats evolve rapidly, LLMs must not only recall historical incidents but also adapt to emerging vulnerabilities and attack patterns. Retrieval-Augmented Generation (RAG) has demonstrated effectiveness in general LLM applications, but its potential for cybersecurity remains underexplored. In this work, we introduce a RAG-based framework designed to contextualize cybersecurity data and enhance LLM accuracy in knowledge retention and temporal reasoning. Using external datasets and the Llama-3-8B-Instruct model, we evaluate baseline RAG, an optimized hybrid retrieval approach, and conduct a comparative analysis across multiple performance metrics. Our findings highlight the promise of hybrid retrieval in strengthening the adaptability and reliability of LLMs for cybersecurity tasks.

Paperid: 952, https://arxiv.org/pdf/2510.26610.pdf

Abstract:
Semantic communication (SemCom) aims to transmit only task-relevant information, thereby improving communication efficiency but also exposing semantic information to potential eavesdropping. In this paper, we propose a deep reinforcement learning (DRL)-empowered multi-level jamming approach to enhance the security of SemCom systems over MIMO fading wiretap channels. This approach combines semantic layer jamming, achieved by encoding task-irrelevant text, and physical layer jamming, achieved by encoding random Gaussian noise. These two-level jamming signals are superposed with task-relevant semantic information to protect the transmitted semantics from eavesdropping. A deep deterministic policy gradient (DDPG) algorithm is further introduced to dynamically design and optimize the precoding matrices for both taskrelevant semantic information and multi-level jamming signals, aiming to enhance the legitimate user's image reconstruction while degrading the eavesdropper's performance. To jointly train the SemCom model and the DDPG agent, we propose an alternating optimization strategy where the two modules are updated iteratively. Experimental results demonstrate that, compared with both the encryption-based (ESCS) and encoded jammer-based (EJ) benchmarks, our method achieves comparable security while improving the legitimate user's peak signalto-noise ratio (PSNR) by up to approximately 0.6 dB.

Paperid: 953, https://arxiv.org/pdf/2510.23274.pdf

Abstract:
While semantic communication (SemCom) improves transmission efficiency by focusing on task-relevant information, it also raises critical privacy concerns. Many existing secure SemCom approaches rely on restrictive or impractical assumptions, such as favorable channel conditions for the legitimate user or prior knowledge of the eavesdropper's model. To address these limitations, this paper proposes a novel secure SemCom framework for image transmission over wiretap channels, leveraging differential privacy (DP) to provide approximate privacy guarantees. Specifically, our approach first extracts disentangled semantic representations from source images using generative adversarial network (GAN) inversion method, and then selectively perturbs private semantic representations with approximate DP noise. Distinct from conventional DP-based protection methods, we introduce DP noise with learnable pattern, instead of traditional white Gaussian or Laplace noise, achieved through adversarial training of neural networks (NNs). This design mitigates the inherent non-invertibility of DP while effectively protecting private information. Moreover, it enables explicitly controllable security levels by adjusting the privacy budget according to specific security requirements, which is not achieved in most existing secure SemCom approaches. Experimental results demonstrate that, compared with the previous DP-based method and direct transmission, the proposed method significantly degrades the reconstruction quality for the eavesdropper, while introducing only slight degradation in task performance. Under comparable security levels, our approach achieves an LPIPS advantage of 0.06-0.29 and an FPPSR advantage of 0.10-0.86 for the legitimate user compared with the previous DP-based method.

Paperid: 954, https://arxiv.org/pdf/2510.22622.pdf

Abstract:
The misuse of advanced generative AI models has resulted in the widespread proliferation of falsified data, particularly forged human-centric audiovisual content, which poses substantial societal risks (e.g., financial fraud and social instability). In response to this growing threat, several works have preliminarily explored countermeasures. However, the lack of sufficient and diverse training data, along with the absence of a standardized benchmark, hinder deeper exploration. To address this challenge, we first build Mega-MMDF, a large-scale, diverse, and high-quality dataset for multimodal deepfake detection. Specifically, we employ 21 forgery pipelines through the combination of 10 audio forgery methods, 12 visual forgery methods, and 6 audio-driven face reenactment methods. Mega-MMDF currently contains 0.1 million real samples and 1.1 million forged samples, making it one of the largest and most diverse multimodal deepfake datasets, with plans for continuous expansion. Building on it, we present DeepfakeBench-MM, the first unified benchmark for multimodal deepfake detection. It establishes standardized protocols across the entire detection pipeline and serves as a versatile platform for evaluating existing methods as well as exploring novel approaches. DeepfakeBench-MM currently supports 5 datasets and 11 multimodal deepfake detectors. Furthermore, our comprehensive evaluations and in-depth analyses uncover several key findings from multiple perspectives (e.g., augmentation, stacked forgery). We believe that DeepfakeBench-MM, together with our large-scale Mega-MMDF, will serve as foundational infrastructures for advancing multimodal deepfake detection.

Paperid: 955, https://arxiv.org/pdf/2510.21272.pdf

Abstract:
Decentralized Finance (DeFi) smart contracts manage billions of dollars, making them a prime target for exploits. Price manipulation vulnerabilities, often via flash loans, are a devastating class of attacks causing significant financial losses. Existing detection methods are limited. Reactive approaches analyze attacks only after they occur, while proactive static analysis tools rely on rigid, predefined heuristics, limiting adaptability. Both depend on known attack patterns, failing to identify novel variants or comprehend complex economic logic. We propose PMDetector, a hybrid framework combining static analysis with Large Language Model (LLM)-based reasoning to proactively detect price manipulation vulnerabilities. Our approach uses a formal attack model and a three-stage pipeline. First, static taint analysis identifies potentially vulnerable code paths. Second, a two-stage LLM process filters paths by analyzing defenses and then simulates attacks to evaluate exploitability. Finally, a static analysis checker validates LLM results, retaining only high-risk paths and generating comprehensive vulnerability reports. To evaluate its effectiveness, we built a dataset of 73 real-world vulnerable and 288 benign DeFi protocols. Results show PMDetector achieves 88% precision and 90% recall with Gemini 2.5-flash, significantly outperforming state-of-the-art static analysis and LLM-based approaches. Auditing a vulnerability with PMDetector costs just $0.03 and takes 4.0 seconds with GPT-4.1, offering an efficient and cost-effective alternative to manual audits.

Paperid: 956, https://arxiv.org/pdf/2510.20932.pdf

Abstract:
This study investigates the vulnerabilities of autonomous navigation and landing systems in Urban Air Mobility (UAM) vehicles. Specifically, it focuses on Trojan attacks that target deep learning models, such as Convolutional Neural Networks (CNNs). Trojan attacks work by embedding covert triggers within a model's training data. These triggers cause specific failures under certain conditions, while the model continues to perform normally in other situations. We assessed the vulnerability of Urban Autonomous Aerial Vehicles (UAAVs) using the DroNet framework. Our experiments showed a significant drop in accuracy, from 96.4% on clean data to 73.3% on data triggered by Trojan attacks. To conduct this study, we collected a custom dataset and trained models to simulate real-world conditions. We also developed an evaluation framework designed to identify Trojan-infected models. This work demonstrates the potential security risks posed by Trojan attacks and lays the groundwork for future research on enhancing the resilience of UAM systems.

Paperid: 957, https://arxiv.org/pdf/2510.20468.pdf

Abstract:
Recent years have seen a surge in interest in digital content watermarking techniques, driven by the proliferation of generative models and increased legal pressure. With an ever-growing percentage of AI-generated content available online, watermarking plays an increasingly important role in ensuring content authenticity and attribution at scale. There have been many works assessing the robustness of watermarking to removal attacks, yet, watermark forging, the scenario when a watermark is stolen from genuine content and applied to malicious content, remains underexplored. In this work, we investigate watermark forging in the context of widely used post-hoc image watermarking. Our contributions are as follows. First, we introduce a preference model to assess whether an image is watermarked. The model is trained using a ranking loss on purely procedurally generated images without any need for real watermarks. Second, we demonstrate the model's capability to remove and forge watermarks by optimizing the input image through backpropagation. This technique requires only a single watermarked image and works without knowledge of the watermarking model, making our attack much simpler and more practical than attacks introduced in related work. Third, we evaluate our proposed method on a variety of post-hoc image watermarking models, demonstrating that our approach can effectively forge watermarks, questioning the security of current watermarking approaches. Our code and further resources are publicly available.

Paperid: 958, https://arxiv.org/pdf/2510.20340.pdf

Abstract:
Runtime introspection of dependencies, i.e., the ability to observe which dependencies are currently used during program execution, is fundamental for Software Supply Chain security. Yet, Java has no support for it. We solve this problem with Classport, a system that embeds dependency information into Java class files, enabling the retrieval of dependency information at runtime. We evaluate Classport on six real-world projects, demonstrating the feasibility in identifying dependencies at runtime. Runtime dependency introspection with Classport opens important avenues for runtime integrity checking.

Paperid: 959, https://arxiv.org/pdf/2510.12812.pdf

Abstract:
Despite rapid progress in deep learning-based image watermarking, the capacity of current robust methods remains limited to the scale of only a few hundred bits. Such plateauing progress raises the question: How far are we from the fundamental limits of image watermarking? To this end, we present an analysis that establishes upper bounds on the message-carrying capacity of images under PSNR and linear robustness constraints. Our results indicate theoretical capacities are orders of magnitude larger than what current models achieve. Our experiments show this gap between theoretical and empirical performance persists, even in minimal, easily analysable setups. This suggests a fundamental problem. As proof that larger capacities are indeed possible, we train ChunkySeal, a scaled-up version of VideoSeal, which increases capacity 4 times to 1024 bits, all while preserving image quality and robustness. These findings demonstrate modern methods have not yet saturated watermarking capacity, and that significant opportunities for architectural innovation and training strategies remain.

Paperid: 960, https://arxiv.org/pdf/2510.12626.pdf

Abstract:
Unclonable cryptography leverages the quantum no-cloning principle to copy-protect cryptographic functionalities. While most existing works address the basic single-copy security, the stronger notion of multi-copy security remains largely unexplored. We introduce a generic compiler that upgrades collusion-resistant unclonable primitives to achieve multi-copy security, assuming only one-way functions. Using this framework, we obtain the first multi-copy secure constructions of public-key quantum money (termed quantum coins), single-decryptor encryption, unclonable encryption, and more. We also introduce an extended notion of quantum coins, called upgradable quantum coins, which allow weak (almost-public) verification under weaker assumptions and can be upgraded to full public verification under stronger assumptions by the bank simply publishing additional classical information. Along the way, we give a generic compiler that upgrades single-copy secure single-decryptor encryption to a collusion-resistant one, assuming the existence of functional encryption, and construct the first multi-challenge secure unclonable encryption scheme, which we believe are of independent interest.

Paperid: 961, https://arxiv.org/pdf/2510.12462.pdf

Abstract:
Large Language Models (LLMs) are increasingly being used to autonomously evaluate the quality of content in communication systems, e.g., to assess responses in telecom customer support chatbots. However, the impartiality of these AI "judges" is not guaranteed, and any biases in their evaluation criteria could skew outcomes and undermine user trust. In this paper, we systematically investigate judgment biases in two LLM-as-a-judge models (i.e., GPT-Judge and JudgeLM) under the point-wise scoring setting, encompassing 11 types of biases that cover both implicit and explicit forms. We observed that state-of-the-art LLM judges demonstrate robustness to biased inputs, generally assigning them lower scores than the corresponding clean samples. Providing a detailed scoring rubric further enhances this robustness. We further found that fine-tuning an LLM on high-scoring yet biased responses can significantly degrade its performance, highlighting the risk of training on biased data. We also discovered that the judged scores correlate with task difficulty: a challenging dataset like GPQA yields lower average scores, whereas an open-ended reasoning dataset (e.g., JudgeLM-val) sees higher average scores. Finally, we proposed four potential mitigation strategies to ensure fair and reliable AI judging in practical communication scenarios.

Paperid: 962, https://arxiv.org/pdf/2510.10766.pdf

Abstract:
As autonomous vehicles become an essential component of modern transportation, they are increasingly vulnerable to threats such as GPS spoofing attacks. This study presents an adaptive detection approach utilizing a dynamically tuned Density Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, designed to adjust the detection threshold (ε) in real-time. The threshold is updated based on the recursive mean and standard deviation of displacement errors between GPS and in-vehicle sensors data, but only at instances classified as non-anomalous. Furthermore, an initial threshold, determined from 120,000 clean data samples, ensures the capability to identify even subtle and gradual GPS spoofing attempts from the beginning. To assess the performance of the proposed method, five different subsets from the real-world Honda Research Institute Driving Dataset (HDD) are selected to simulate both large and small magnitude GPS spoofing attacks. The modified algorithm effectively identifies turn-by-turn, stop, overshoot, and multiple small biased spoofing attacks, achieving detection accuracies of 98.621%, 99.960.1%, 99.880.1%, and 98.380.1%, respectively. This work provides a substantial advancement in enhancing the security and safety of AVs against GPS spoofing threats.

Paperid: 963, https://arxiv.org/pdf/2510.09624.pdf

Abstract:
With the proliferation of new blockchain-based cryptocurrencies/assets and platforms that make it possible to transact across them, it becomes important to consider not just whether the transfer of coins/assets can be tracked within their respective transaction ledger, but also if they can be tracked as they move across ledgers. This is especially important given that there are documented cases of criminals attempting to use these cross-ledger trades to obscure the flow of their coins/assets. In this paper, we perform a systematic review of the various tracing techniques for blockchain transactions proposed in literature, categorize them using multiple criteria (such as tracing approach and targeted objective) and compare them. Based on the above categorization, we provide insights on the state of blockchain transaction tracing literature and identify the limitations of existing approaches. Finally, we suggest directions for future research in this area based on our analysis.

Paperid: 964, https://arxiv.org/pdf/2510.04754.pdf

Abstract:
Secure key leasing (SKL) enables the holder of a secret key for a cryptographic function to temporarily lease the key using quantum information. Later, the recipient can produce a deletion certificate, which proves that they no longer have access to the secret key. The security guarantee ensures that even a malicious recipient cannot continue to evaluate the function, after producing a valid deletion certificate. Most prior work considers an adversarial recipient that obtains a single leased key, which is insufficient for many applications. In the more realistic collusion-resistant setting, security must hold even when polynomially many keys are leased (and subsequently deleted). However, achieving collusion-resistant SKL from standard assumptions remains poorly understood, especially for functionalities beyond decryption. We improve upon this situation by introducing new pathways for constructing collusion-resistant SKL. Our main contributions are as follows: - A generalization of quantum-secure collusion-resistant traitor tracing called multi-level traitor tracing (MLTT), and a compiler that transforms an MLTT scheme for a primitive X into a collusion-resistant SKL scheme for primitive X. - The first bounded collusion-resistant SKL scheme for PRFs, assuming LWE. - A compiler that upgrades any single-key secure SKL scheme for digital signatures into one with unbounded collusion-resistance, assuming OWFs. - A compiler that upgrades collusion-resistant SKL schemes with classical certificates to ones having verification-query resilience, assuming OWFs.

Paperid: 965, https://arxiv.org/pdf/2510.01552.pdf

Abstract:
Large Language Models (LLMs) are intensively used to assist security analysts in counteracting the rapid exploitation of cyber threats, wherein LLMs offer cyber threat intelligence (CTI) to support vulnerability assessment and incident response. While recent work has shown that LLMs can support a wide range of CTI tasks such as threat analysis, vulnerability detection, and intrusion defense, significant performance gaps persist in practical deployments. In this paper, we investigate the intrinsic vulnerabilities of LLMs in CTI, focusing on challenges that arise from the nature of the threat landscape itself rather than the model architecture. Using large-scale evaluations across multiple CTI benchmarks and real-world threat reports, we introduce a novel categorization methodology that integrates stratification, autoregressive refinement, and human-in-the-loop supervision to reliably analyze failure instances. Through extensive experiments and human inspections, we reveal three fundamental vulnerabilities: spurious correlations, contradictory knowledge, and constrained generalization, that limit LLMs in effectively supporting CTI. Subsequently, we provide actionable insights for designing more robust LLM-powered CTI systems to facilitate future research.

Paperid: 966, https://arxiv.org/pdf/2509.23573.pdf

Paperid: 967, https://arxiv.org/pdf/2509.23571.pdf

Abstract:
As cyber threats continue to grow in scale and sophistication, blue team defenders increasingly require advanced tools to proactively detect and mitigate risks. Large Language Models (LLMs) offer promising capabilities for enhancing threat analysis. However, their effectiveness in real-world blue team threat-hunting scenarios remains insufficiently explored. This paper presents CyberTeam, a benchmark designed to guide LLMs in blue teaming practice. CyberTeam constructs a standardized workflow in two stages. First, it models realistic threat-hunting workflows by capturing the dependencies among analytical tasks from threat attribution to incident response. Next, each task is addressed through a set of operational modules tailored to its specific analytical requirements. This transforms threat hunting into a structured sequence of reasoning steps, with each step grounded in a discrete operation and ordered according to task-specific dependencies. Guided by this framework, LLMs are directed to perform threat-hunting tasks through modularized steps. Overall, CyberTeam integrates 30 tasks and 9 operational modules to guide LLMs through standardized threat analysis. We evaluate both leading LLMs and state-of-the-art cybersecurity agents, comparing CyberTeam against open-ended reasoning strategies. Our results highlight the improvements enabled by standardized design, while also revealing the limitations of open-ended reasoning in real-world threat hunting.

Paperid: 968, https://arxiv.org/pdf/2509.23558.pdf

Abstract:
Large language models (LLMs) have demonstrated remarkable capabilities, yet they also introduce novel security challenges. For instance, prompt jailbreaking attacks involve adversaries crafting sophisticated prompts to elicit responses from LLMs that deviate from human values. To uncover vulnerabilities in LLM alignment methods, we propose the PASS framework (\underline{P}rompt J\underline{a}ilbreaking via \underline{S}emantic and \underline{S}tructural Formalization). Specifically, PASS employs reinforcement learning to transform initial jailbreak prompts into formalized descriptions, which enhances stealthiness and enables bypassing existing alignment defenses. The jailbreak outputs are then structured into a GraphRAG system that, by leveraging extracted relevant terms and formalized symbols as contextual input alongside the original query, strengthens subsequent attacks and facilitates more effective jailbreaks. We conducted extensive experiments on common open-source models, demonstrating the effectiveness of our attack.

Paperid: 969, https://arxiv.org/pdf/2509.23519.pdf

Abstract:
Retrieval-Augmented Generation (RAG) enhances Large Language Models by grounding their outputs in external documents. These systems, however, remain vulnerable to attacks on the retrieval corpus, such as prompt injection. RAG-based search systems (e.g., Google's Search AI Overview) present an interesting setting for studying and protecting against such threats, as defense algorithms can benefit from built-in reliability signals -- like document ranking -- and represent a non-LLM challenge for the adversary due to decades of work to thwart SEO. Motivated by, but not limited to, this scenario, this work introduces ReliabilityRAG, a framework for adversarial robustness that explicitly leverages reliability information of retrieved documents. Our first contribution adopts a graph-theoretic perspective to identify a "consistent majority" among retrieved documents to filter out malicious ones. We introduce a novel algorithm based on finding a Maximum Independent Set (MIS) on a document graph where edges encode contradiction. Our MIS variant explicitly prioritizes higher-reliability documents and provides provable robustness guarantees against bounded adversarial corruption under natural assumptions. Recognizing the computational cost of exact MIS for large retrieval sets, our second contribution is a scalable weighted sample and aggregate framework. It explicitly utilizes reliability information, preserving some robustness guarantees while efficiently handling many documents. We present empirical results showing ReliabilityRAG provides superior robustness against adversarial attacks compared to prior methods, maintains high benign accuracy, and excels in long-form generation tasks where prior robustness-focused methods struggled. Our work is a significant step towards more effective, provably robust defenses against retrieved corpus corruption in RAG.

Paperid: 970, https://arxiv.org/pdf/2509.20767.pdf

Abstract:
Despite all the advantages associated with Network Intrusion Detection Systems (NIDSs) that utilize machine learning (ML) models, there is a significant reluctance among cyber security experts to implement these models in real-world production settings. This is primarily because of their opaque nature, meaning it is unclear how and why the models make their decisions. In this work, we design a deep learning-based NIDS, ExpIDS to have high decision tree explanation fidelity, i.e., the predictions of decision tree explanation corresponding to ExpIDS should be as close to ExpIDS's predictions as possible. ExpIDS can also adapt to changes in network traffic distribution (drift). With the help of extensive experiments, we verify that ExpIDS achieves higher decision tree explanation fidelity and a malicious traffic detection performance comparable to state-of-the-art NIDSs for common attacks with varying levels of real-world drift.

Paperid: 971, https://arxiv.org/pdf/2509.20472.pdf

Abstract:
Our capacity to process information depends on the computational power at our disposal. Information theory captures our ability to distinguish states or communicate messages when it is unconstrained with unrivaled elegance. For computationally bounded observers the situation is quite different. They can, for example, be fooled to believe that distributions are more random than they actually are. In our work, we go beyond the prevailing single-shot approach and take a new direction in computational quantum information theory that captures the essence of complexity-constrained information theory while retaining the look and feel of the unbounded asymptotic theory. As our foundational quantity, we define the computational relative entropy as the optimal error exponent in asymmetric hypothesis testing when restricted to polynomially many copies and quantum gates, defined in a mathematically rigorous way. Building on this foundation, we prove a computational analogue of Stein's lemma, establish computational versions of fundamental inequalities like Pinsker's bound, and demonstrate a computational smoothing property showing that computationally indistinguishable states yield equivalent information measures. We derive a computational entropy that operationally characterizes optimal compression rates for quantum states under computational limitations and show that our quantities apply to computational entanglement theory, proving a computational version of the Rains bound. Our framework reveals striking separations between computational and unbounded information measures, including quantum-classical gaps that arise from cryptographic assumptions, demonstrating that computational constraints fundamentally alter the information-theoretic landscape and open new research directions at the intersection of quantum information, complexity theory, and cryptography.

Paperid: 972, https://arxiv.org/pdf/2509.12899.pdf

Abstract:
Verifiable Secret Sharing (VSS) has been widespread in Distributed Privacy-preserving Machine Learning (DPML), because invalid shares from malicious dealers or participants can be recognized by verifying the commitment of the received shares for honest participants. However, the consistency and the computation and communitation burden of the VSS-based DPML schemes are still two serious challenges. Although Byzantine Fault Tolerance (BFT) system has been brought to guarantee the consistency and improve the efficiency of the existing VSS-based DPML schemes recently, we explore an Adaptive Share Delay Provision (ASDP) strategy, and launch an ASDP-based Customized Model Poisoning Attack (ACuMPA) for certain participants in this paper. We theoretically analyzed why the ASDP strategy and the ACuMPA algorithm works to the existing schemes. Next, we propose an [E]fficient [By]zantine [F]ault [T]olerant-based [Ve]rifiable [S]ecret-sharing (EByFTVeS) scheme. Finally, the validity, liveness, consistency and privacy of the EByFTVeS scheme are theoretically analyzed, while the efficiency of the EByFTVeS scheme outperforms that of the-state-of-art VSS scheme according to comparative experiment results.

Paperid: 973, https://arxiv.org/pdf/2509.10635.pdf

Abstract:
Machine learning has shown promise in facial dysmorphology, where characteristic facial features provide diagnostic clues for rare genetic disorders. GestaltMatcher, a leading framework in this field, has demonstrated clinical utility across multiple studies, but its reliance on centralized datasets limits further development, as patient data are siloed across institutions and subject to strict privacy regulations. We introduce a federated GestaltMatcher service based on a cross-silo horizontal federated learning framework, which allows hospitals to collaboratively train a global ensemble feature extractor without sharing patient images. Patient data are mapped into a shared latent space, and a privacy-preserving kernel matrix computation framework enables syndrome inference and discovery while safeguarding confidentiality. New participants can directly benefit from and contribute to the system by adopting the global feature extractor and kernel configuration from previous training rounds. Experiments show that the federated service retains over 90% of centralized performance and remains robust to both varying silo numbers and heterogeneous data distributions.

Paperid: 974, https://arxiv.org/pdf/2509.09089.pdf

Abstract:
Tag-based sanitizers attach a small "key" to each pointer and a matching "lock" tag to its target memory object, enabling runtime verification of pointer-object consistency and helping developers to detect potential memory violations. However, the limited tag encoding space challenges existing studies in assigning distinct tags to memory objects across temporal and spatial dimensions, leading to potential tag collisions. In this paper, we present ClusterTag, a novel cluster-based memory allocator aimed at simultaneously mitigating tag collisions in both temporal and spatial dimensions. The core design of ClusterTag effectively balances the significant mismatch between tag encoding space and memory objects: it divides memory objects into multiple independent clusters, thereby limiting tag collisions to finite chunks within each cluster. To mitigate tag collisions across clusters, we design a cluster-grained heap randomization scheme. This approach introduces random address intervals between clusters and further breaks the entropy limitation of the tag space. ClusterTag has been implemented as an independent memory allocator that seamlessly integrates with tag-based sanitizers such as HWASan, and maintains comparable performance overhead (within 1%) at various randomization densities. Security evaluations on the Juliet dataset indicate that ClusterTag exhibits deterministic results across 500 repeated tests (5,652 reported and 1,530 missed), while the existing three types of tag assignment strategies all exhibit probabilistic false negatives due to tag collisions. Quantitative analysis across three tag collision distance metrics-minimum, average, and unpredictability-demonstrates that ClusterTag achieves balanced improvements across all three, whereas prior tag assignment schemes (random, staggered, fixed) show significant trade-offs in at least one metric.

Paperid: 975, https://arxiv.org/pdf/2509.06703.pdf

Abstract:
The rise of model-sharing through frameworks and dedicated hubs makes Machine Learning significantly more accessible. Despite their benefits, these tools expose users to underexplored security risks, while security awareness remains limited among both practitioners and developers. To enable a more security-conscious culture in Machine Learning model sharing, in this paper we evaluate the security posture of frameworks and hubs, assess whether security-oriented mechanisms offer real protection, and survey how users perceive the security narratives surrounding model sharing. Our evaluation shows that most frameworks and hubs address security risks partially at best, often by shifting responsibility to the user. More concerningly, our analysis of frameworks advertising security-oriented settings and complete model sharing uncovered six 0-day vulnerabilities enabling arbitrary code execution. Through this analysis, we debunk the misconceptions that the model-sharing problem is largely solved and that its security can be guaranteed by the file format used for sharing. As expected, our survey shows that the surrounding security narrative leads users to consider security-oriented settings as trustworthy, despite the weaknesses shown in this work. From this, we derive takeaways and suggestions to strengthen the security of model-sharing ecosystems.

Paperid: 976, https://arxiv.org/pdf/2509.05377.pdf

Abstract:
Upon integrating Quantum Neural Network (QNN) as the local model, Quantum Federated Learning (QFL) has recently confronted notable challenges. Firstly, exploration is hindered over sharp minima, decreasing learning performance. Secondly, the steady gradient descent results in more stable and predictable model transmissions over wireless channels, making the model more susceptible to attacks from adversarial entities. Additionally, the local QFL model is vulnerable to noise produced by the quantum device's intermediate noise states, since it requires the use of quantum gates and circuits for training. This local noise becomes intertwined with learning parameters during training, impairing model precision and convergence rate. To address these issues, we propose a new QFL technique that incorporates differential privacy and introduces a dedicated noise estimation strategy to quantify and mitigate the impact of intermediate quantum noise. Furthermore, we design an adaptive noise generation scheme to alleviate privacy threats associated with the vanishing gradient variance phenomenon of QNN and enhance robustness against device noise. Experimental results demonstrate that our algorithm effectively balances convergence, reduces communication costs, and mitigates the adverse effects of intermediate quantum noise while maintaining strong privacy protection. Using real-world datasets, we achieved test accuracy of up to 98.47\% for the MNIST dataset and 83.85\% for the CIFAR-10 dataset while maintaining fast execution times.

Paperid: 977, https://arxiv.org/pdf/2509.05318.pdf

Abstract:
The use of unvetted third-party and internet data renders pre-trained models susceptible to backdoor attacks. Detecting backdoor samples is critical to prevent backdoor activation during inference or injection during training. However, existing detection methods often require the defender to have access to the poisoned models, extra clean samples, or significant computational resources to detect backdoor samples, limiting their practicality. To address this limitation, we propose a backdoor sample detection method based on perturbatio\textbf{N} discr\textbf{E}pancy consis\textbf{T}ency \textbf{E}valuation (\NETE). This is a novel detection method that can be used both pre-training and post-training phases. In the detection process, it only requires an off-the-shelf pre-trained model to compute the log probability of samples and an automated function based on a mask-filling strategy to generate perturbations. Our method is based on the interesting phenomenon that the change in perturbation discrepancy for backdoor samples is smaller than that for clean samples. Based on this phenomenon, we use curvature to measure the discrepancy in log probabilities between different perturbed samples and input samples, thereby evaluating the consistency of the perturbation discrepancy to determine whether the input sample is a backdoor sample. Experiments conducted on four typical backdoor attacks and five types of large language model backdoor attacks demonstrate that our detection strategy outperforms existing zero-shot black-box detection methods.

Paperid: 978, https://arxiv.org/pdf/2509.04999.pdf

Abstract:
Advanced Persistent Threats (APTs) present a considerable challenge to cybersecurity due to their stealthy, long-duration nature. Traditional supervised learning methods typically require large amounts of labeled data, which is often scarce in real-world scenarios. This paper introduces a novel approach that combines AutoEncoders for anomaly detection with active learning to iteratively enhance APT detection. By selectively querying an oracle for labels on uncertain or ambiguous samples, our method reduces labeling costs while improving detection accuracy, enabling the model to effectively learn with minimal data and reduce reliance on extensive manual labeling. We present a comprehensive formulation of the Attention Adversarial Dual AutoEncoder-based anomaly detection framework and demonstrate how the active learning loop progressively enhances the model's performance. The framework is evaluated on real-world, imbalanced provenance trace data from the DARPA Transparent Computing program, where APT-like attacks account for just 0.004\% of the data. The datasets, which cover multiple operating systems including Android, Linux, BSD, and Windows, are tested in two attack scenarios. The results show substantial improvements in detection rates during active learning, outperforming existing methods.

Paperid: 979, https://arxiv.org/pdf/2509.01742.pdf

Abstract:
While Trusted Execution Environments provide a strong foundation for secure cloud computing, they remain vulnerable to access pattern leakages. Oblivious Maps (OMAPs) mitigate this by fully hiding access patterns but suffer from high overhead due to randomized remapping and worst-case padding. We argue these costs are not fundamental. Modern accelerators featuring High-Bandwidth Memory (HBM) offer a new opportunity: Vaswani et al. [OSDI'18] point out that eavesdropping on HBM is difficult -- even for physical attackers -- as its memory channels are sealed together with processor cores inside the same physical package. Later, Hunt et al. [NSDI'20] show that, with proper isolation, HBM can be turned into an unobservable region where both data and memory traces are hidden. This motivates a rethink of OMAP design with HBM-backed solutions to finally overcome their traditional performance limits. Building on these insights, we present BOLT, a Bandwidth Optimized, Lightning-fast OMAP accelerator that, for the first time, achieves O(1) + O(log_2(log_2 (N))) bandwidth overhead. BOLT introduces three key innovations: (i) a new OMAP algorithm that leverages isolated HBM as an unobservable cache to accelerate oblivious access to large host memory; (ii) a self-hosted architecture that offloads execution and memory control from the host to mitigate CPU-side leakage; and (iii) tailored algorithm-architecture co-designs that maximize resource efficiency. We implement a prototype BOLT on a Xilinx U55C FPGA. Evaluations show that BOLT achieves up to 279x and 480x speedups in initialization and query time, respectively, over state-of-the-art OMAPs, including an industry implementation from Facebook.

Paperid: 980, https://arxiv.org/pdf/2509.00018.pdf

Abstract:
As a promising physical layer security technique, physical layer key generation (PLKG) enables legitimate users to obtain secret keys from wireless channel without security infrastructures. However, in harsh propagation environments, the channel characteristic becomes unsatisfactory, the key generation rate (KGR) is significantly deteriorated. In this paper, we propose a novel fluid antenna (FA) enabled PLKG system to address this challenge. Specifically, we first derive the closed-form expression of the KGR for FA array, and then jointly optimize the precoding matrix and the antenna positions via a particle swarm optimization (PSO) algorithm. Next, to further reduce the computational complexity of the optimization procedure, we develop an alternating optimization (AO) algorithm, which combines the projected gradient descent (PGD) and the PSO. Simulation results demonstrate that by exploiting the additional spatial degree of freedom (DoF), our FA enabled PLKG system is superior to the benchmarks, such as the conventional fixed-position antenna (FPA) array and the reconfigurable intelligent surface (RIS). It is worth highlighting that compared to the conventional uniform planar antenna (UPA), the FA enabled PLKG achieves a 35.42\% KGR performance improvement under PSO algorithm and a 67.73\% KGR performance improvement under AO algorithm, respectively.

Paperid: 981, https://arxiv.org/pdf/2508.19072.pdf

Abstract:
Advanced Persistent Threats (APTs) represent a growing menace to modern digital infrastructure. Unlike traditional cyberattacks, APTs are stealthy, adaptive, and long-lasting, often bypassing signature-based detection systems. This paper introduces a novel framework for APT detection that unites deep learning, reinforcement learning (RL), and active learning into a cohesive, adaptive defense system. Our system combines auto-encoders for latent behavioral encoding with a multi-agent ensemble of RL-based defenders, each trained to distinguish between benign and malicious process behaviors. We identify a critical challenge in existing detection systems: their static nature and inability to adapt to evolving attack strategies. To this end, our architecture includes multiple RL agents (Q-Learning, PPO, DQN, adversarial defenders), each analyzing latent vectors generated by an auto-encoder. When any agent is uncertain about its decision, the system triggers an active learning loop to simulate expert feedback, thus refining decision boundaries. An ensemble voting mechanism, weighted by each agent's performance, ensures robust final predictions.

Paperid: 982, https://arxiv.org/pdf/2508.12597.pdf

Abstract:
Unmanned aerial vehicle (UAV) individual (ID) identification is a critical security surveillance strategy in low-altitude integrated sensing and communication (ISAC) networks. In this paper, we propose a novel dynamic knowledge distillation (KD)-enabled wireless radio frequency fingerprint large language model (RFF-LLM) framework for UAV ID identification. First, we propose an RFF-LLM framework based on the modified GPT-2 model to improve the identification accuracy in complex outdoor environments. Then, considering the parameter overhead of the RFF-LLM, we design a dynamic KD strategy to compress the model. Specifically, the proximal policy optimization (PPO) algorithm is employed to dynamically adjust the distillation temperature, overcoming the local optimum dilemma inherent in static KD. As a next step, the knowledge of the RFF-LLM is adequately transferred to the lightweight Lite-HRNet model. Finally, our experiments are conducted based on the self-built drone RFF dataset of Release one, namely DRFF-R1, by collecting the I/Q signals of 20 commercial UAVs in channel 149. The experiment results show that the proposed framework achieves 98.38% ID identification accuracy with merely 0.15 million parameters and 2.74 ms response time, which outperforms the benchmarks.

Paperid: 983, https://arxiv.org/pdf/2508.09456.pdf

Abstract:
Vision-language models (VLMs) have shown significant advancements in tasks such as visual grounding, where they localize specific objects in images based on natural language queries and images. However, security issues in visual grounding tasks for VLMs remain underexplored, especially in the context of backdoor attacks. In this paper, we introduce a novel input-aware backdoor attack method, IAG, designed to manipulate the grounding behavior of VLMs. This attack forces the model to ground a specific target object in the input image, regardless of the user's query. We propose an adaptive trigger generator that embeds the semantic information of the attack target's description into the original image using a text-conditional U-Net, thereby overcoming the open-vocabulary attack challenge. To ensure the attack's stealthiness, we utilize a reconstruction loss to minimize visual discrepancies between poisoned and clean images. Additionally, we introduce a unified method for generating attack data. IAG is evaluated theoretically and empirically, demonstrating its feasibility and effectiveness. Notably, our ASR@0.5 on InternVL-2.5-8B reaches over 65\% on various testing sets. IAG also shows promising potential on manipulating Ferret-7B and LlaVA-1.5-7B with very little accuracy decrease on clean samples. Extensive specific experiments, such as ablation study and potential defense, also indicate the robustness and transferability of our attack.

Paperid: 984, https://arxiv.org/pdf/2508.05670.pdf

Abstract:
Game theory has long served as a foundational tool in cybersecurity to test, predict, and design strategic interactions between attackers and defenders. The recent advent of Large Language Models (LLMs) offers new tools and challenges for the security of computer systems; In this work, we investigate whether classical game-theoretic frameworks can effectively capture the behaviours of LLM-driven actors and bots. Using a reproducible framework for game-theoretic LLM agents, we investigate two canonical scenarios -- the one-shot zero-sum game and the dynamic Prisoner's Dilemma -- and we test whether LLMs converge to expected outcomes or exhibit deviations due to embedded biases. Our experiments involve four state-of-the-art LLMs and span five natural languages, English, French, Arabic, Vietnamese, and Mandarin Chinese, to assess linguistic sensitivity. For both games, we observe that the final payoffs are influenced by agents characteristics such as personality traits or knowledge of repeated rounds. Moreover, we uncover an unexpected sensitivity of the final payoffs to the choice of languages, which should warn against indiscriminate application of LLMs in cybersecurity applications and call for in-depth studies, as LLMs may behave differently when deployed in different countries. We also employ quantitative metrics to evaluate the internal consistency and cross-language stability of LLM agents, to help guide the selection of the most stable LLMs and optimising models for secure applications.

Paperid: 985, https://arxiv.org/pdf/2508.04155.pdf

Abstract:
Gradient inversion attacks pose significant privacy threats to distributed training frameworks such as federated learning, enabling malicious parties to reconstruct sensitive local training data from gradient communications between clients and an aggregation server during the aggregation process. While traditional encryption-based defenses, such as homomorphic encryption, offer strong privacy guarantees without compromising model utility, they often incur prohibitive computational overheads. To mitigate this, selective encryption has emerged as a promising approach, encrypting only a subset of gradient data based on the data's significance under a certain metric. However, there have been few systematic studies on how to specify this metric in practice. This paper systematically evaluates selective encryption methods with different significance metrics against state-of-the-art attacks. Our findings demonstrate the feasibility of selective encryption in reducing computational overhead while maintaining resilience against attacks. We propose a distance-based significance analysis framework that provides theoretical foundations for selecting critical gradient elements for encryption. Through extensive experiments on different model architectures (LeNet, CNN, BERT, GPT-2) and attack types, we identify gradient magnitude as a generally effective metric for protection against optimization-based gradient inversions. However, we also observe that no single selective encryption strategy is universally optimal across all attack scenarios, and we provide guidelines for choosing appropriate strategies for different model architectures and privacy requirements.

Paperid: 986, https://arxiv.org/pdf/2508.02805.pdf

Abstract:
Cellular Vehicle-to-Everything (C-V2X) technology enables low-latency, reliable communications essential for safety applications such as a Forward Collision Warning (FCW) system. C-V2X deployments operate under strict protocol compliance with the 3rd Generation Partnership Project (3GPP) and the Society of Automotive Engineers Standard (SAE) J2735 specifications to ensure interoperability. This paper presents a real-world testbed evaluation of protocol-compliant Denial-of-Service (DoS) attacks using User Datagram Protocol (UDP) flooding and oversized Basic Safety Message (BSM) attacks that 7 exploit transport- and application-layer vulnerabilities in C-V2X. The attacks presented in this study transmit valid messages over standard PC5 sidelinks, fully adhering to 3GPP and SAE J2735 specifications, but at abnormally high rates and with oversized payloads that overload the receiver resources without breaching any protocol rules such as IEEE 1609. Using a real-world connected vehicle 11 testbed with commercially available On-Board Units (OBUs), we demonstrate that high-rate UDP flooding and oversized payload of BSM flooding can severely degrade FCW performance. Results show that UDP flooding alone reduces packet delivery ratio by up to 87% and increases latency to over 400ms, while oversized BSM floods overload receiver processing resources, delaying or completely suppressing FCW alerts. When UDP and BSM attacks are executed simultaneously, they cause near-total communication failure, preventing FCW warnings entirely. These findings reveal that protocol-compliant communications do not necessarily guarantee safe or reliable operation of C-V2X-based safety applications.

Paperid: 987, https://arxiv.org/pdf/2508.01605.pdf

Abstract:
Text-to-image diffusion models (T2I DMs) have achieved remarkable success in generating high-quality and diverse images from text prompts, yet recent studies have revealed their vulnerability to backdoor attacks. Existing attack methods suffer from critical limitations: 1) they rely on unnatural adversarial prompts that lack human readability and require massive poisoned data; 2) their effectiveness is typically restricted to specific models, lacking generalizability; and 3) they can be mitigated by recent backdoor defenses. To overcome these challenges, we propose a novel backdoor attack framework that achieves three key properties: 1) \emph{Practicality}: Our attack requires only a few stealthy backdoor samples to generate arbitrary attacker-chosen target images, as well as ensuring high-quality image generation in benign scenarios. 2) \emph{Generalizability:} The attack is applicable across multiple T2I DMs without requiring model-specific redesign. 3) \emph{Robustness:} The attack remains effective against existing backdoor defenses and adaptive defenses. Our extensive experimental results on multiple T2I DMs demonstrate that with only 10 carefully crafted backdoored samples, our attack method achieves $>$90\% attack success rate with negligible degradation in benign image generation quality. We also conduct human evaluation to validate our attack effectiveness. Furthermore, recent backdoor detection and mitigation methods, as well as adaptive defense tailored to our attack are not sufficiently effective, highlighting the pressing need for more robust defense mechanisms against the proposed attack.

Paperid: 988, https://arxiv.org/pdf/2508.01479.pdf

Abstract:
We study the inverse problem of reconstructing high-dimensional trust embeddings from the one-dimensional Siamese trust scores that many distributed-security frameworks expose. Starting from two independent agents that publish time-stamped similarity scores for the same set of devices, we formalise the estimation task, derive an explicit direct-sum estimator that concatenates paired score series with four moment features, and prove that the resulting reconstruction map admits a unique fixed point under a contraction argument rooted in Banach theory. A suite of synthetic benchmarks (20 devices x 10 time steps) confirms that, even in the presence of Gaussian noise, the recovered embeddings preserve inter-device geometry as measured by Euclidean and cosine metrics; we complement these experiments with non-asymptotic error bounds that link reconstruction accuracy to score-sequence length. Beyond methodology, the paper demonstrates a practical privacy risk: publishing granular trust scores can leak latent behavioural information about both devices and evaluation models. We therefore discuss counter-measures -- score quantisation, calibrated noise, obfuscated embedding spaces -- and situate them within wider debates on transparency versus confidentiality in networked AI systems. All datasets, reproduction scripts and extended proofs accompany the submission so that results can be verified without proprietary code.

Paperid: 989, https://arxiv.org/pdf/2507.15818.pdf

Abstract:
We study the problem of semantic private information retrieval (Sem-PIR) with $T$ colluding servers (Sem-TPIR), i.e., servers that collectively share user queries. In Sem-TPIR, the message sizes are different, and message retrieval probabilities by any user are not uniform. This is a generalization of the classical PIR problem where the message sizes are equal and message retrieval probabilities are identical. The earlier work on Sem-PIR considered the case of no collusions, i.e., the collusion parameter of $T=1$. In this paper, we consider the general problem for arbitrary $T < N$. We find an upper bound on the retrieval rate and design a scheme that achieves this rate, i.e., we derive the exact capacity of Sem-TPIR.

Paperid: 990, https://arxiv.org/pdf/2507.14768.pdf

Abstract:
Motivated by federated learning (FL), secure aggregation (SA) aims to securely compute, as efficiently as possible, the sum of a set of inputs distributed across many users. To understand the impact of network topology, hierarchical secure aggregation (HSA) investigated the communication and secret key generation efficiency in a 3-layer relay network, where clusters of users are connected to the aggregation server through an intermediate layer of relays. Due to the pre-aggregation of the messages at the relays, HSA reduces the communication burden on the relay-to-server links and is able to support a large number of users. However, as the number of users increases, a practical challenge arises from heterogeneous security requirements--for example, users in different clusters may require varying levels of input protection. Motivated by this, we study weakly-secure HSA (WS-HSA) with collusion resilience, where instead of protecting all the inputs from any set of colluding users, only the inputs belonging to a predefined collection of user groups (referred to as security input sets) need to be protected against another predefined collection of user groups (referred to as collusion sets). Since the security input sets and collusion sets can be arbitrarily defined, our formulation offers a flexible framework for addressing heterogeneous security requirements in HSA. We characterize the optimal total key rate, i.e., the total number of independent key symbols required to ensure both server and relay security, for a broad range of parameter configurations. For the remaining cases, we establish lower and upper bounds on the optimal key rate, providing constant-factor gap optimality guarantees.

Paperid: 991, https://arxiv.org/pdf/2507.14212.pdf

Abstract:
Goal-oriented Communication (GoC) is a new paradigm that plans data transmission to occur only when it is instrumental for the receiver to achieve a certain goal. This leads to the advantage of reducing the frequency of transmissions significantly while maintaining adherence to the receiver's objectives. However, GoC scheduling also opens a timing-based side channel that an eavesdropper can exploit to obtain information about the state of the system. This type of attack sidesteps even information-theoretic security, as it exploits the timing of updates rather than their content. In this work, we study such an eavesdropping attack against pull-based goal-oriented scheduling for remote monitoring and control of Markov processes. We provide a theoretical framework for defining the effectiveness of the attack and propose possible countermeasures, including two practical heuristics that provide a balance between the performance gains offered by GoC and the amount of leaked information. Our results show that, while a naive goal-oriented scheduler allows the eavesdropper to correctly guess the system state about 60% of the time, our heuristic defenses can halve the leakage with a marginal reduction of the benefits of goal-oriented approaches.

Paperid: 992, https://arxiv.org/pdf/2507.08597.pdf

Abstract:
Machine learning models are commonly used for malware classification; however, they suffer from performance degradation over time due to concept drift. Adapting these models to changing data distributions requires frequent updates, which rely on costly ground truth annotations. While active learning can reduce the annotation burden, leveraging unlabeled data through semi-supervised learning remains a relatively underexplored approach in the context of malware detection. In this research, we introduce \texttt{ADAPT}, a novel pseudo-labeling semi-supervised algorithm for addressing concept drift. Our model-agnostic method can be applied to various machine learning models, including neural networks and tree-based algorithms. We conduct extensive experiments on five diverse malware detection datasets spanning Android, Windows, and PDF domains. The results demonstrate that our method consistently outperforms baseline models and competitive benchmarks. This work paves the way for more effective adaptation of machine learning models to concept drift in malware detection.

Paperid: 993, https://arxiv.org/pdf/2507.07483.pdf

Abstract:
With the rise of social media, vast amounts of user-uploaded videos (e.g., YouTube) are utilized as training data for Visual Object Tracking (VOT). However, the VOT community has largely overlooked video data-privacy issues, as many private videos have been collected and used for training commercial models without authorization. To alleviate these issues, this paper presents the first investigation on preventing personal video data from unauthorized exploitation by deep trackers. Existing methods for preventing unauthorized data use primarily focus on image-based tasks (e.g., image classification), directly applying them to videos reveals several limitations, including inefficiency, limited effectiveness, and poor generalizability. To address these issues, we propose a novel generative framework for generating Temporal Unlearnable Examples (TUEs), and whose efficient computation makes it scalable for usage on large-scale video datasets. The trackers trained w/ TUEs heavily rely on unlearnable noises for temporal matching, ignoring the original data structure and thus ensuring training video data-privacy. To enhance the effectiveness of TUEs, we introduce a temporal contrastive loss, which further corrupts the learning of existing trackers when using our TUEs for training. Extensive experiments demonstrate that our approach achieves state-of-the-art performance in video data-privacy protection, with strong transferability across VOT models, datasets, and temporal matching tasks.

Paperid: 994, https://arxiv.org/pdf/2507.01752.pdf

Abstract:
Gradient-based optimization is the workhorse of deep learning, offering efficient and scalable training via backpropagation. However, its reliance on large volumes of labeled data raises privacy and security concerns such as susceptibility to data poisoning attacks and the risk of overfitting. In contrast, black box optimization methods, which treat the model as an opaque function, relying solely on function evaluations to guide optimization, offer a promising alternative in scenarios where data access is restricted, adversarial risks are high, or overfitting is a concern. However, black box methods also pose significant challenges, including poor scalability to high-dimensional parameter spaces, as prevalent in large language models (LLMs), and high computational costs due to reliance on numerous model evaluations. This paper introduces BBoxER, an evolutionary black-box method for LLM post-training that induces an information bottleneck via implicit compression of the training data. Leveraging the tractability of information flow, we provide strong theoretical bounds on generalization, differential privacy, susceptibility to data poisoning attacks, and robustness to extraction attacks. BBoxER operates on top of pre-trained LLMs, offering a lightweight and modular enhancement suitable for deployment in restricted or privacy-sensitive environments, in addition to non-vacuous generalization guarantees. In experiments with LLMs, we demonstrate empirically that Retrofitting methods are able to learn, showing how a few iterations of BBoxER improve performance and generalize well on a benchmark of reasoning datasets. This positions BBoxER as an attractive add-on on top of gradient-based optimization.

Paperid: 995, https://arxiv.org/pdf/2507.00690.pdf

Abstract:
Adversarial attacks on point clouds often impose strict geometric constraints to preserve plausibility; however, such constraints inherently limit transferability and undefendability. While deformation offers an alternative, existing unstructured approaches may introduce unnatural distortions, making adversarial point clouds conspicuous and undermining their plausibility. In this paper, we propose CageAttack, a cage-based deformation framework that produces natural adversarial point clouds. It first constructs a cage around the target object, providing a structured basis for smooth, natural-looking deformation. Perturbations are then applied to the cage vertices, which seamlessly propagate to the point cloud, ensuring that the resulting deformations remain intrinsic to the object and preserve plausibility. Extensive experiments on seven 3D deep neural network classifiers across three datasets show that CageAttack achieves a superior balance among transferability, undefendability, and plausibility, outperforming state-of-the-art methods. Codes will be made public upon acceptance.

Paperid: 996, https://arxiv.org/pdf/2507.00595.pdf

Abstract:
Existing program verifiers can prove advanced properties about security protocol implementations, but are difficult to scale to large codebases because of the manual effort required. We develop a novel methodology called *Diodon* that addresses this challenge by splitting the codebase into the protocol implementation (the *Core*) and the remainder (the *Application*). This split allows us to apply powerful semi-automated verification techniques to the security-critical Core, while fully-automatic static analyses scale the verification to the entire codebase by ensuring that the Application cannot invalidate the security properties proved for the Core. The static analyses achieve that by proving *I/O independence*, i.e., that the I/O operations within the Application are independent of the Core's security-relevant data (such as keys), and that the Application meets the Core's requirements. We have proved Diodon sound by first showing that we can safely allow the Application to perform I/O independent of the security protocol, and second that manual verification and static analyses soundly compose. We evaluate Diodon on two case studies: an implementation of the signed Diffie-Hellman key exchange and a large (100k+ LoC) production Go codebase implementing a key exchange protocol for which we obtained secrecy and injective agreement guarantees by verifying a Core of about 1% of the code with the auto-active program verifier Gobra in less than three person months.

Paperid: 997, https://arxiv.org/pdf/2512.23438.pdf

Abstract:
Modern CPUs are black boxes, proprietary, and increasingly characterized by sophisticated microarchitectural flaws that evade traditional analysis. While some of these critical vulnerabilities have been uncovered through cumbersome manual effort, building an automated and systematic vulnerability detection framework for real-world post-silicon processors remains a challenge. In this paper, we present Fuzzilicon, the first post-silicon fuzzing framework for real-world x86 CPUs that brings deep introspection into the microcode and microarchitectural layers. Fuzzilicon automates the discovery of vulnerabilities that were previously only detectable through extensive manual reverse engineering, and bridges the visibility gap by introducing microcode-level instrumentation. At the core of Fuzzilicon is a novel technique for extracting feedback directly from the processor's microarchitecture, enabled by reverse-engineering Intel's proprietary microcode update interface. We develop a minimally intrusive instrumentation method and integrate it with a hypervisor-based fuzzing harness to enable precise, feedback-guided input generation, without access to Register Transfer Level (RTL). Applied to Intel's Goldmont microarchitecture, Fuzzilicon introduces 5 significant findings, including two previously unknown microcode-level speculative-execution vulnerabilities. Besides, the Fuzzilicon framework automatically rediscover the $μ$Spectre class of vulnerabilities, which were detected manually in the previous work. Fuzzilicon reduces coverage collection overhead by up to 31$\times$ compared to baseline techniques and achieves 16.27% unique microcode coverage of hookable locations, the first empirical baseline of its kind. As a practical, coverage-guided, and scalable approach to post-silicon fuzzing, Fuzzilicon establishes a new foundation to automate the discovery of complex CPU vulnerabilities.

Paperid: 998, https://arxiv.org/pdf/2512.21524.pdf

Abstract:
Modern hardware systems, driven by demands for high performance and application-specific functionality, have grown increasingly complex, introducing large surfaces for bugs and security-critical vulnerabilities. Fuzzing has emerged as a scalable solution for discovering such flaws. Yet, existing hardware fuzzers suffer from limited semantic awareness, inefficient test refinement, and high computational overhead due to reliance on slow device simulation. In this paper, we present GoldenFuzz, a novel two-stage hardware fuzzing framework that partially decouples test case refinement from coverage and vulnerability exploration. GoldenFuzz leverages a fast, ISA-compliant Golden Reference Model (GRM) as a ``digital twin'' of the Device Under Test (DUT). It fuzzes the GRM first, enabling rapid, low-cost test case refinement, accelerating deep architectural exploration and vulnerability discovery on DUT. During the fuzzing pipeline, GoldenFuzz iteratively constructs test cases by concatenating carefully chosen instruction blocks that balance the subtle inter- and intra-instructions quality. A feedback-driven mechanism leveraging insights from both high- and low-coverage samples further enhances GoldenFuzz's capability in hardware state exploration. Our evaluation of three RISC-V processors, RocketChip, BOOM, and CVA6, demonstrates that GoldenFuzz significantly outperforms existing fuzzers in achieving the highest coverage with minimal test case length and computational overhead. GoldenFuzz uncovers all known vulnerabilities and discovers five new ones, four of which are classified as highly severe with CVSS v3 severity scores exceeding seven out of ten. It also identifies two previously unknown vulnerabilities in the commercial BA51-H core extension.

Paperid: 999, https://arxiv.org/pdf/2512.21008.pdf

Abstract:
Mixture-of-Experts (MoE) architectures have advanced the scaling of Large Language Models (LLMs) by activating only a sparse subset of parameters per input, enabling state-of-the-art performance with reduced computational cost. As these models are increasingly deployed in critical domains, understanding and strengthening their alignment mechanisms is essential to prevent harmful outputs. However, existing LLM safety research has focused almost exclusively on dense architectures, leaving the unique safety properties of MoEs largely unexamined. The modular, sparsely-activated design of MoEs suggests that safety mechanisms may operate differently than in dense models, raising questions about their robustness. In this paper, we present GateBreaker, the first training-free, lightweight, and architecture-agnostic attack framework that compromises the safety alignment of modern MoE LLMs at inference time. GateBreaker operates in three stages: (i) gate-level profiling, which identifies safety experts disproportionately routed on harmful inputs, (ii) expert-level localization, which localizes the safety structure within safety experts, and (iii) targeted safety removal, which disables the identified safety structure to compromise the safety alignment. Our study shows that MoE safety concentrates within a small subset of neurons coordinated by sparse routing. Selective disabling of these neurons, approximately 3% of neurons in the targeted expert layers, significantly increases the averaged attack success rate (ASR) from 7.4% to 64.9% against the eight latest aligned MoE LLMs with limited utility degradation. These safety neurons transfer across models within the same family, raising ASR from 17.9% to 67.7% with one-shot transfer attack. Furthermore, GateBreaker generalizes to five MoE vision language models (VLMs) with 60.9% ASR on unsafe image inputs.

Paperid: 1000, https://arxiv.org/pdf/2512.16658.pdf

Abstract:
The rapid proliferation of deep neural networks (DNNs) across several domains has led to increasing concerns regarding intellectual property (IP) protection and model misuse. Trained DNNs represent valuable assets, often developed through significant investments. However, the ease with which models can be copied, redistributed, or repurposed highlights the urgent need for effective mechanisms to assert and verify model ownership. In this work, we propose an efficient and resilient white-box watermarking framework that embeds ownership information into the internal parameters of a DNN using chaotic sequences. The watermark is generated using a logistic map, a well-known chaotic function, producing a sequence that is sensitive to its initialization parameters. This sequence is injected into the weights of a chosen intermediate layer without requiring structural modifications to the model or degradation in predictive performance. To validate ownership, we introduce a verification process based on a genetic algorithm that recovers the original chaotic parameters by optimizing the similarity between the extracted and regenerated sequences. The effectiveness of the proposed approach is demonstrated through extensive experiments on image classification tasks using MNIST and CIFAR-10 datasets. The results show that the embedded watermark remains detectable after fine-tuning, with negligible loss in model accuracy. In addition to numerical recovery of the watermark, we perform visual analyses using weight density plots and construct activation-based classifiers to distinguish between original, watermarked, and tampered models. Overall, the proposed method offers a flexible and scalable solution for embedding and verifying model ownership in white-box settings well-suited for real-world scenarios where IP protection is critical.

Paperid: 1001, https://arxiv.org/pdf/2512.16538.pdf

Abstract:
As large language models (LLMs) are increasingly adopted for code vulnerability detection, their reliability and robustness across diverse vulnerability types have become a pressing concern. In traditional adversarial settings, code obfuscation has long been used as a general strategy to bypass auditing tools, preserving exploitability without tampering with the tools themselves. Numerous efforts have explored obfuscation methods and tools, yet their capabilities differ in terms of supported techniques, granularity, and programming languages, making it difficult to systematically assess their impact on LLM-based vulnerability detection. To address this gap, we provide a structured systematization of obfuscation techniques and evaluate them under a unified framework. Specifically, we categorize existing obfuscation methods into three major classes (layout, data flow, and control flow) covering 11 subcategories and 19 concrete techniques. We implement these techniques across four programming languages (Solidity, C, C++, and Python) using a consistent LLM-driven approach, and evaluate their effects on 15 LLMs spanning four model families (DeepSeek, OpenAI, Qwen, and LLaMA), as well as on two coding agents (GitHub Copilot and Codex). Our findings reveal both positive and negative impacts of code obfuscation on LLM-based vulnerability detection, highlighting conditions under which obfuscation leads to performance improvements or degradations. We further analyze these outcomes with respect to vulnerability characteristics, code properties, and model attributes. Finally, we outline several open problems and propose future directions to enhance the robustness of LLMs for real-world vulnerability detection.

Paperid: 1002, https://arxiv.org/pdf/2512.06364.pdf

Abstract:
Current mobile health platforms are predominantly individual-centric and lack the necessary primitives for coordinated, auditable, multi-actor workflows. However, in many settings worldwide, health decisions are enacted by multi-actor care networks rather than single users. We present JEEVHITAA, an Android/Flutter system that provides context-sensitive, role-aware sharing and verifiable information flows for care circles. JEEVHITAA ingests platform and device data (via Google Health Connect and BLE connectors), constructs multi-layer user profiles from sensor streams and tiered onboarding, and enforces fine-grained, time-bounded access control across permissioned care graphs. Data are end-to-end encrypted in local stores and during peer sync (Firebase), and provisions are made for document capture by camera or upload as PDF. An integrated retrieval-augmented LLM pipeline (i) produces structured, role-targeted summaries and action plans, (ii) enables users to gather advanced insights on health reports, and (iii) performs evidence-grounded user-relevant verification of arbitrary health content, returning provenance, confidence scores, and source citations. We describe the system architecture, connector abstractions, and security primitives, and evaluate robustness and compatibility using synthetic, ontology-driven simulations and vendor compatibility tests. Finally, we outline plans for longitudinal in-the-wild deployments to measure system performance, the correctness of access control, and the real-world effectiveness of relationship-aware credibility support.

Paperid: 1003, https://arxiv.org/pdf/2512.04436.pdf

Abstract:
Processor designs rely on iterative modifications and reuse well-established designs. However, this reuse of prior designs also leads to similar vulnerabilities across multiple processors. As processors grow increasingly complex with iterative modifications, efficiently detecting vulnerabilities from modern processors is critical. Inspired by software fuzzing, hardware fuzzing has recently demonstrated its effectiveness in detecting processor vulnerabilities. Yet, to our best knowledge, existing processor fuzzers fuzz each design individually, lacking the capability to understand known vulnerabilities in prior processors to fine-tune fuzzing to identify similar or new variants of vulnerabilities. To address this gap, we present ReFuzz, an adaptive fuzzing framework that leverages contextual bandit to reuse highly effective tests from prior processors to fuzz a processor-under-test (PUT) within a given ISA. By intelligently mutating tests that trigger vulnerabilities in prior processors, ReFuzz effectively detects similar and new variants of vulnerabilities in PUTs. ReFuzz uncovered three new security vulnerabilities and two new functional bugs. ReFuzz detected one vulnerability by reusing a test that triggers a known vulnerability in a prior processor. One functional bug exists across three processors that share design modules. The second bug has two variants. Additionally, ReFuzz reuses highly effective tests to enhance efficiency in coverage, achieving an average 511.23x coverage speedup and up to 9.33% more total coverage, compared to existing fuzzers.

Paperid: 1004, https://arxiv.org/pdf/2512.02306.pdf

Abstract:
Omni-modal Large Language Models (OLLMs) that process text, images, videos, and audio introduce new challenges for safety and value guardrails in human-AI interaction. Prior guardrail research largely targets unimodal settings and typically frames safeguarding as binary classification, which limits robustness across diverse modalities and tasks. To address this gap, we propose OmniGuard, the first family of omni-modal guardrails that performs safeguarding across all modalities with deliberate reasoning ability. To support the training of OMNIGUARD, we curate a large, comprehensive omni-modal safety dataset comprising over 210K diverse samples, with inputs that cover all modalities through both unimodal and cross-modal samples. Each sample is annotated with structured safety labels and carefully curated safety critiques from expert models through targeted distillation. Extensive experiments on 15 benchmarks show that OmniGuard achieves strong effectiveness and generalization across a wide range of multimodal safety scenarios. Importantly, OmniGuard provides a unified framework that enforces policies and mitigates risks in omni-modalities, paving the way toward building more robust and capable omnimodal safeguarding systems.

Paperid: 1005, https://arxiv.org/pdf/2512.01185.pdf

Abstract:
Multi-modal large language models (MLLMs), capable of processing text, images, and audio, have been widely adopted in various AI applications. However, recent MLLMs integrating images and text remain highly vulnerable to coordinated jailbreaks. Existing defenses primarily focus on the text, lacking robust multi-modal protection. As a result, studies indicate that MLLMs are more susceptible to malicious or unsafe instructions, unlike their text-only counterparts. In this paper, we proposed DefenSee, a robust and lightweight multi-modal black-box defense technique that leverages image variants transcription and cross-modal consistency checks, mimicking human judgment. Experiments on popular multi-modal jailbreak and benign datasets show that DefenSee consistently enhances MLLM robustness while better preserving performance on benign tasks compared to SOTA defenses. It reduces the ASR of jailbreak attacks to below 1.70% on MiniGPT4 using the MM-SafetyBench benchmark, significantly outperforming prior methods under the same conditions.

Paperid: 1006, https://arxiv.org/pdf/2511.23393.pdf

Abstract:
Federated Learning (FL) enables collaborative, privacy-preserving model training, but supporting the "Right to be Forgotten" is especially challenging because data influences the model through distributed and interleaved client updates. Existing exact unlearning methods typically require frequent retraining from scratch, resulting in high communication cost and long service downtime. To address this, we propose Federated Sequential Group-based Training (FedSGT), an exact unlearning framework for FL. FedSGT partitions the data into uniform groups, and each client may participate in multiple groups. To control communication overhead, each client can limit the number of groups it contributes to. FedSGT then trains multiple sequences of Parameter-Efficient Fine-Tuning (PEFT) modules, each corresponding to a different group permutation. Since the PEFT modules are lightweight and maintained server-side, FedSGT isolates the influence of different data groups into independent modules without incurring significant storage overhead and communication cost. Exact unlearning is thus achieved instantly by deactivating the modules corresponding to the group containing the unlearned data. Furthermore, using multiple training sequences helps maintain high model utility as deletion requests accumulate. We provide a rigorous theoretical analysis of both the deletion rate -- expected number of deletions before retraining is needed -- and the expected model performance. Experiments on various tasks demonstrate that FedSGT achieves a significantly longer service maintenance under multiple unlearning requests while maintaining comparable learning performance and training efficiency to other exact unlearning baselines. Extensive ablation studies validate the robustness of our method across a wide range of parameter settings.

Paperid: 1007, https://arxiv.org/pdf/2511.21180.pdf

Abstract:
Diffusion models exhibit notable fragility when faced with adversarial prompts, and strengthening attack capabilities is crucial for uncovering such vulnerabilities and building more robust generative systems. Existing works often rely on white-box access to model gradients or hand-crafted prompt engineering, which is infeasible in real-world deployments due to restricted access or poor attack effect. In this paper, we propose CAHS-Attack , a CLIP-Aware Heuristic Search attack method. CAHS-Attack integrates Monte Carlo Tree Search (MCTS) to perform fine-grained suffix optimization, leveraging a constrained genetic algorithm to preselect high-potential adversarial prompts as root nodes, and retaining the most semantically disruptive outcome at each simulation rollout for efficient local search. Extensive experiments demonstrate that our method achieves state-of-the-art attack performance across both short and long prompts of varying semantics. Furthermore, we find that the fragility of SD models can be attributed to the inherent vulnerability of their CLIP-based text encoders, suggesting a fundamental security risk in current text-to-image pipelines.

Paperid: 1008, https://arxiv.org/pdf/2511.14963.pdf

Abstract:
Machine learning (ML)-based malware detectors degrade over time as concept drift introduces new and evolving families unseen during training. Retraining is limited by the cost and time of manual labeling or sandbox analysis. Existing approaches mitigate this via drift detection and selective labeling, but fully label-free adaptation remains largely unexplored. Recent self-training methods use a previously trained model to generate pseudo-labels for unlabeled data and then train a new model on these labels. The unlabeled data are used only for inference and do not participate in training the earlier model. We argue that these unlabeled samples still carry valuable information that can be leveraged when incorporated appropriately into training. This paper introduces LFreeDA, an end-to-end framework that adapts malware classifiers to drift without manual labeling or drift detection. LFreeDA first performs unsupervised domain adaptation on malware images, jointly training on labeled and unlabeled samples to infer pseudo-labels and prune noisy ones. It then adapts a classifier on CFG representations using the labeled and selected pseudo-labeled data, leveraging the scalability of images for pseudo-labeling and the richer semantics of CFGs for final adaptation. Evaluations on the real-world MB-24+ dataset show that LFreeDA improves accuracy by up to 12.6% and F1 by 11.1% over no-adaptation lower bounds, and is only 4% and 3.4% below fully supervised upper bounds in accuracy and F1, respectively. It also matches the performance of state-of-the-art methods provided with ground truth labels for 300 target samples. Additional results on two controlled-drift benchmarks further confirm that LFreeDA maintains malware detection performance as malware evolves without human labeling.

Paperid: 1009, https://arxiv.org/pdf/2511.13502.pdf

Abstract:
Large language models (LLMs) perform in-context learning (ICL) by adapting to tasks from prompt demonstrations, which in practice often contain private or proprietary data. Although differential privacy (DP) with private voting is a pragmatic mitigation, DP-ICL implementations are error-prone, and worst-case DP bounds may substantially overestimate actual leakage, calling for practical auditing tools. We present a tight and efficient privacy auditing framework for DP-ICL systems that runs membership inference attacks and translates their success rates into empirical privacy guarantees using Gaussian DP. Our analysis of the private voting mechanism identifies vote configurations that maximize the auditing signal, guiding the design of audit queries that reliably reveal whether a canary demonstration is present in the context. The framework supports both black-box (API-only) and white-box (internal vote) threat models, and unifies auditing for classification and generation by reducing both to a binary decision problem. Experiments on standard text classification and generation benchmarks show that our empirical leakage estimates closely match theoretical DP budgets on classification tasks and are consistently lower on generation tasks due to conservative embedding-sensitivity bounds, making our framework a practical privacy auditor and verifier for real-world DP-ICL deployments.

Paperid: 1010, https://arxiv.org/pdf/2511.11020.pdf

Abstract:
Healthcare AI systems face major vulnerabilities to data poisoning that current defenses and regulations cannot adequately address. We analyzed eight attack scenarios in four categories: architectural attacks on convolutional neural networks, large language models, and reinforcement learning agents; infrastructure attacks exploiting federated learning and medical documentation systems; critical resource allocation attacks affecting organ transplantation and crisis triage; and supply chain attacks targeting commercial foundation models. Our findings indicate that attackers with access to only 100-500 samples can compromise healthcare AI regardless of dataset size, often achieving over 60 percent success, with detection taking an estimated 6 to 12 months or sometimes not occurring at all. The distributed nature of healthcare infrastructure creates many entry points where insiders with routine access can launch attacks with limited technical skill. Privacy laws such as HIPAA and GDPR can unintentionally shield attackers by restricting the analyses needed for detection. Supply chain weaknesses allow a single compromised vendor to poison models across 50 to 200 institutions. The Medical Scribe Sybil scenario shows how coordinated fake patient visits can poison data through legitimate clinical workflows without requiring a system breach. Current regulations lack mandatory adversarial robustness testing, and federated learning can worsen risks by obscuring attribution. We recommend multilayer defenses including required adversarial testing, ensemble-based detection, privacy-preserving security mechanisms, and international coordination on AI security standards. We also question whether opaque black-box models are suitable for high-stakes clinical decisions, suggesting a shift toward interpretable systems with verifiable safety guarantees.

Paperid: 1011, https://arxiv.org/pdf/2511.06573.pdf

Abstract:
Secure data hiding remains a fundamental challenge in digital communication, requiring a careful balance between computational efficiency and perceptual transparency. The balance between security and performance is increasingly fragile with the emergence of generative AI systems capable of autonomously generating and optimising sophisticated cryptanalysis and steganalysis algorithms, thereby accelerating the exposure of vulnerabilities in conventional data-hiding schemes. This work introduces SteganoSNN, a neuromorphic steganographic framework that exploits spiking neural networks (SNNs) to achieve secure, low-power, and high-capacity multimedia data hiding. Digitised audio samples are converted into spike trains using leaky integrate-and-fire (LIF) neurons, encrypted via a modulo-based mapping scheme, and embedded into the least significant bits of RGBA image channels using a dithering mechanism to minimise perceptual distortion. Implemented in Python using NEST and realised on a PYNQ-Z2 FPGA, SteganoSNN attains real-time operation with an embedding capacity of 8 bits per pixel. Experimental evaluations on the DIV2K 2017 dataset demonstrate image fidelity between 40.4 dB and 41.35 dB in PSNR and SSIM values consistently above 0.97, surpassing SteganoGAN in computational efficiency and robustness. SteganoSNN establishes a foundation for neuromorphic steganography, enabling secure, energy-efficient communication for Edge-AI, IoT, and biomedical applications.

Paperid: 1012, https://arxiv.org/pdf/2511.06064.pdf

Abstract:
Omics data is widely employed in medical research to identify disease mechanisms and contains highly sensitive personal information. Federated Learning (FL) with Differential Privacy (DP) can ensure the protection of omics data privacy against malicious user attacks. However, FL with the DP method faces an inherent trade-off: stronger privacy protection degrades predictive accuracy due to injected noise. On the other hand, Homomorphic Encryption (HE) allows computations on encrypted data and enables aggregation of encrypted gradients without DP-induced noise can increase the predictive accuracy. However, it may increase the computation cost. To improve the predictive accuracy while considering the computational ability of heterogeneous clients, we propose a Privacy-Preserving Machine Learning (PPML)-Hybrid method by introducing HE. In the proposed PPML-Hybrid method, clients distributed select either HE or DP based on their computational resources, so that HE clients contribute noise-free updates while DP clients reduce computational overhead. Meanwhile, clients with high computational resources clients can flexibly adopt HE or DP according to their privacy needs. Performance evaluation on omics datasets show that our proposed method achieves comparable predictive accuracy while significantly reducing computation time relative to HE-only. Additionally, it outperforms DP-only methods under equivalent or stricter privacy budgets.

Paperid: 1013, https://arxiv.org/pdf/2511.05797.pdf

Abstract:
Prompt injection attacks pose a critical threat to large language models (LLMs), with prior work focusing on cutting-edge LLM applications like personal copilots. In contrast, simpler LLM applications, such as customer service chatbots, are widespread on the web, yet their security posture and exposure to such attacks remain poorly understood. These applications often rely on third-party chatbot plugins that act as intermediaries to commercial LLM APIs, offering non-expert website builders intuitive ways to customize chatbot behaviors. To bridge this gap, we present the first large-scale study of 17 third-party chatbot plugins used by over 10,000 public websites, uncovering previously unknown prompt injection risks in practice. First, 8 of these plugins (used by 8,000 websites) fail to enforce the integrity of the conversation history transmitted in network requests between the website visitor and the chatbot. This oversight amplifies the impact of direct prompt injection attacks by allowing adversaries to forge conversation histories (including fake system messages), boosting their ability to elicit unintended behavior (e.g., code generation) by 3 to 8x. Second, 15 plugins offer tools, such as web-scraping, to enrich the chatbot's context with website-specific content. However, these tools do not distinguish the website's trusted content (e.g., product descriptions) from untrusted, third-party content (e.g., customer reviews), introducing a risk of indirect prompt injection. Notably, we found that ~13% of e-commerce websites have already exposed their chatbots to third-party content. We systematically evaluate both vulnerabilities through controlled experiments grounded in real-world observations, focusing on factors such as system prompt design and the underlying LLM. Our findings show that many plugins adopt insecure practices that undermine the built-in LLM safeguards.

Paperid: 1014, https://arxiv.org/pdf/2510.21214.pdf

Abstract:
Multimodal large language models (MLLMs) comprise of both visual and textual modalities to process vision language tasks. However, MLLMs are vulnerable to security-related issues, such as jailbreak attacks that alter the model's input to induce unauthorized or harmful responses. The incorporation of the additional visual modality introduces new dimensions to security threats. In this paper, we proposed a black-box jailbreak method via both text and image prompts to evaluate MLLMs. In particular, we designed text prompts with provocative instructions, along with image prompts that introduced mutation and multi-image capabilities. To strengthen the evaluation, we also designed a Re-attack strategy. Empirical results show that our proposed work can improve capabilities to assess the security of both open-source and closed-source MLLMs. With that, we identified gaps in existing defense methods to propose new strategies for both training-time and inference-time defense methods, and evaluated them across the new jailbreak methods. The experiment results showed that the re-designed defense methods improved protections against the jailbreak attacks.

Paperid: 1015, https://arxiv.org/pdf/2510.21124.pdf

Abstract:
Blockchain-based Attribute-Based Access Control (BC-ABAC) offers a decentralized paradigm for secure data governance but faces two inherent challenges: the transparency of blockchain ledgers threatens user privacy by enabling reidentification attacks through attribute analysis, while the computational complexity of policy matching clashes with blockchain's performance constraints. Existing solutions, such as those employing Zero-Knowledge Proofs (ZKPs), often incur high overhead and lack measurable anonymity guarantees, while efficiency optimizations frequently ignore privacy implications. To address these dual challenges, this paper proposes QAEBAC (Quantifiable Anonymity and Efficiency in Blockchain-Based Access Control with Attribute). QAE-BAC introduces a formal (r, t)-anonymity model to dynamically quantify the re-identification risk of users based on their access attributes and history. Furthermore, it features an Entropy-Weighted Path Tree (EWPT) that optimizes policy structure based on realtime anonymity metrics, drastically reducing policy matching complexity. Implemented and evaluated on Hyperledger Fabric, QAE-BAC demonstrates a superior balance between privacy and performance. Experimental results show that it effectively mitigates re-identification risks and outperforms state-of-the-art baselines, achieving up to an 11x improvement in throughput and an 87% reduction in latency, proving its practicality for privacy-sensitive decentralized applications.

Paperid: 1016, https://arxiv.org/pdf/2510.19615.pdf

Abstract:
Decompilation converts machine code into human-readable form, enabling analysis and debugging without source code. However, fidelity issues often degrade the readability and semantic accuracy of decompiled output. Existing methods, such as variable renaming or structural simplification, provide partial improvements but lack robust detection and correction, particularly for complex closed-source binaries. We present FidelityGPT, a framework that enhances decompiled code accuracy and readability by systematically detecting and correcting semantic distortions. FidelityGPT introduces distortion-aware prompt templates tailored to closed-source settings and integrates Retrieval-Augmented Generation (RAG) with a dynamic semantic intensity algorithm to locate distorted lines and retrieve semantically similar code from a database. A variable dependency algorithm further mitigates long-context limitations by analyzing redundant variables and integrating their dependencies into the prompt context. Evaluated on 620 function pairs from a binary similarity benchmark, FidelityGPT achieved an average detection accuracy of 89% and a precision of 83%. Compared to the state-of-the-art DeGPT (Fix Rate 83%, Corrected Fix Rate 37%), FidelityGPT attained 94% FR and 64% CFR, demonstrating significant gains in accuracy and readability. These results highlight its potential to advance LLM-based decompilation and reverse engineering.

Paperid: 1017, https://arxiv.org/pdf/2510.19537.pdf

Abstract:
Deep learning is widely applied to modern problems through neural networks, but the growing computational and energy demands of these models have driven interest in more efficient approaches. Spiking Neural Networks (SNNs), the third generation of neural networks, mimic the brain's event-driven behaviour, offering improved performance and reduced power use. At the same time, concerns about data privacy during cloud-based model execution have led to the adoption of cryptographic methods. This article introduces BioEncryptSNN, a spiking neural network based encryption-decryption framework for secure and noise-resilient data protection. Unlike conventional algorithms, BioEncryptSNN converts ciphertext into spike trains and exploits temporal neural dynamics to model encryption and decryption, optimising parameters such as key length, spike timing, and synaptic connectivity. Benchmarked against AES-128, RSA-2048, and DES, BioEncryptSNN preserved data integrity while achieving up to 4.1x faster encryption and decryption than PyCryptodome's AES implementation. The framework demonstrates scalability and adaptability across symmetric and asymmetric ciphers, positioning SNNs as a promising direction for secure, energy-efficient computing.

Paperid: 1018, https://arxiv.org/pdf/2510.18563.pdf

Abstract:
Multi-agent systems powered by large language models are advancing rapidly, yet the tension between mutual trust and security remains underexplored. We introduce and empirically validate the Trust-Vulnerability Paradox (TVP): increasing inter-agent trust to enhance coordination simultaneously expands risks of over-exposure and over-authorization. To investigate this paradox, we construct a scenario-game dataset spanning 3 macro scenes and 19 sub-scenes, and run extensive closed-loop interactions with trust explicitly parameterized. Using Minimum Necessary Information (MNI) as the safety baseline, we propose two unified metrics: Over-Exposure Rate (OER) to detect boundary violations, and Authorization Drift (AD) to capture sensitivity to trust levels. Results across multiple model backends and orchestration frameworks reveal consistent trends: higher trust improves task success but also heightens exposure risks, with heterogeneous trust-to-risk mappings across systems. We further examine defenses such as Sensitive Information Repartitioning and Guardian-Agent enablement, both of which reduce OER and attenuate AD. Overall, this study formalizes TVP, establishes reproducible baselines with unified metrics, and demonstrates that trust must be modeled and scheduled as a first-class security variable in multi-agent system design.

Paperid: 1019, https://arxiv.org/pdf/2510.16716.pdf

Abstract:
Large Language Models (LLMs) have demonstrated strong performance across diverse tasks, but fine-tuning them typically relies on cloud-based, centralized infrastructures. This requires data owners to upload potentially sensitive data to external servers, raising serious privacy concerns. An alternative approach is to fine-tune LLMs directly on edge devices using local data; however, this introduces a new challenge: the model owner must transfer proprietary models to the edge, which risks intellectual property (IP) leakage. To address this dilemma, we propose DistilLock, a TEE-assisted fine-tuning framework that enables privacy-preserving knowledge distillation on the edge. In DistilLock, a proprietary foundation model is executed within a trusted execution environment (TEE) enclave on the data owner's device, acting as a secure black-box teacher. This setup preserves both data privacy and model IP by preventing direct access to model internals. Furthermore, DistilLock employs a model obfuscation mechanism to offload obfuscated weights to untrusted accelerators for efficient knowledge distillation without compromising security. We demonstrate that DistilLock prevents unauthorized knowledge distillation processes and model-stealing attacks while maintaining high computational efficiency, but offering a secure and practical solution for edge-based LLM personalization.

Paperid: 1020, https://arxiv.org/pdf/2510.15476.pdf

Abstract:
Large Language Models (LLMs) have rapidly become integral to real-world applications, powering services across diverse sectors. However, their widespread deployment has exposed critical security risks, particularly through jailbreak prompts that can bypass model alignment and induce harmful outputs. Despite intense research into both attack and defense techniques, the field remains fragmented: definitions, threat models, and evaluation criteria vary widely, impeding systematic progress and fair comparison. In this Systematization of Knowledge (SoK), we address these challenges by (1) proposing a holistic, multi-level taxonomy that organizes attacks, defenses, and vulnerabilities in LLM prompt security; (2) formalizing threat models and cost assumptions into machine-readable profiles for reproducible evaluation; (3) introducing an open-source evaluation toolkit for standardized, auditable comparison of attacks and defenses; (4) releasing JAILBREAKDB, the largest annotated dataset of jailbreak and benign prompts to date; and (5) presenting a comprehensive evaluation and leaderboard of state-of-the-art methods. Our work unifies fragmented research, provides rigorous foundations for future studies, and supports the development of robust, trustworthy LLMs suitable for high-stakes deployment.

Paperid: 1021, https://arxiv.org/pdf/2510.08079.pdf

Abstract:
Secure key leasing allows a cryptographic key to be leased as a quantum state in such a way that the key can later be revoked in a verifiable manner. In this work, we propose a modular framework for constructing secure key leasing with a classical-lessor, where the lessor is entirely classical and, in particular, the quantum secret key can be both leased and revoked using only classical communication. Based on this framework, we obtain classical-lessor secure key leasing schemes for public-key encryption (PKE), pseudorandom function (PRF), and digital signature. We adopt the strong security notion known as security against verification key revealing attacks (VRA security) proposed by Kitagawa et al. (Eurocrypt 2025) into the classical-lessor setting, and we prove that all three of our schemes satisfy this notion under the learning with errors assumption. Our PKE scheme improves upon the previous construction by Goyal et al. (Eurocrypt 2025), and our PRF and digital signature schemes are respectively the first PRF and digital signature with classical-lessor secure key leasing property.

Paperid: 1022, https://arxiv.org/pdf/2510.07901.pdf

Abstract:
The necessity of blockchain systems to remain decentralised limits current solutions to blockchain governance and dynamic management, forcing a trade-off between control and decentralisation. In light of the above, this work proposes a dynamic and decentralised blockchain management mechanism based on digital twins. To ensure decentralisation, the proposed mechanism utilises multiple digital twins that the system's stakeholders control. To facilitate decentralised decision-making, the twins are organised in a secondary blockchain system that orchestrates agreement on, and propagation of decisions to the managed blockchain. This enables the management of blockchain systems without centralised control. A preliminary evaluation of the performance and impact of the overheads introduced by the proposed mechanism is conducted through simulation. The results demonstrate the proposed mechanism's ability to reach consensus on decisions quickly and reconfigure the primary blockchain with minimal overhead.

Paperid: 1023, https://arxiv.org/pdf/2510.06343.pdf

Abstract:
In safety-critical software systems, cybersecurity activities become essential, with risk assessment being one of the most critical. In many software teams, cybersecurity experts are either entirely absent or represented by only a small number of specialists. As a result, the workload for these experts becomes high, and software engineers would need to conduct cybersecurity activities themselves. This creates a need for a tool to support cybersecurity experts and engineers in evaluating vulnerabilities and threats during the risk assessment process. This paper explores the potential of leveraging locally hosted large language models (LLMs) with retrieval-augmented generation to support cybersecurity risk assessment in the forestry domain while complying with data protection and privacy requirements that limit external data sharing. We performed a design science study involving 12 experts in interviews, interactive sessions, and a survey within a large-scale project. The results demonstrate that LLMs can assist cybersecurity experts by generating initial risk assessments, identifying threats, and providing redundancy checks. The results also highlight the necessity for human oversight to ensure accuracy and compliance. Despite trust concerns, experts were willing to utilize LLMs in specific evaluation and assistance roles, rather than solely relying on their generative capabilities. This study provides insights that encourage the use of LLM-based agents to support the risk assessment process of cyber-physical systems in safety-critical domains.

Paperid: 1024, https://arxiv.org/pdf/2510.03992.pdf

Abstract:
Large language models (LLMs) are increasingly deployed in agentic systems where they map user intents to relevant external tools to fulfill a task. A critical step in this process is tool selection, where a retriever first surfaces candidate tools from a larger pool, after which the LLM selects the most appropriate one. This pipeline presents an underexplored attack surface where errors in selection can lead to severe outcomes like unauthorized data access or denial of service, all without modifying the agent's model or code. While existing evaluations measure task performance in benign settings, they overlook the specific vulnerabilities of the tool selection mechanism under adversarial conditions. To address this gap, we introduce ToolCert, the first statistical framework that formally certifies tool selection robustness. ToolCert models tool selection as a Bernoulli success process and evaluates it against a strong, adaptive attacker who introduces adversarial tools with misleading metadata, and are iteratively refined based on the agent's previous choices. By sampling these adversarial interactions, ToolCert produces a high-confidence lower bound on accuracy, formally quantifying the agent's worst-case performance. Our evaluation with ToolCert uncovers the severe fragility: under attacks injecting deceptive tools or saturating retrieval, the certified accuracy bound drops near zero, an average performance drop of over 60% compared to non-adversarial settings. For attacks targeting the retrieval and selection stages, the certified accuracy bound plummets to less than 20% after just a single round of adversarial adaptation. ToolCert thus reveals previously unexamined security threats inherent to tool selection and provides a principled method to quantify an agent's robustness to such threats, a necessary step for the safe deployment of agentic systems.

Paperid: 1025, https://arxiv.org/pdf/2510.03969.pdf

Abstract:
Large Language Models (LLMs) can produce catastrophic responses in conversational settings that pose serious risks to public safety and security. Existing evaluations often fail to fully reveal these vulnerabilities because they rely on fixed attack prompt sequences, lack statistical guarantees, and do not scale to the vast space of multi-turn conversations. In this work, we propose QRLLM, a novel, principled Certification framework for Catastrophic risks in multi-turn Conversation for LLMs that bounds the probability of an LLM generating catastrophic responses under multi-turn conversation distributions with statistical guarantees. We model multi-turn conversations as probability distributions over query sequences, represented by a Markov process on a query graph whose edges encode semantic similarity to capture realistic conversational flow, and quantify catastrophic risks using confidence intervals. We define several inexpensive and practical distributions: random node, graph path, adaptive with rejection. Our results demonstrate that these distributions can reveal substantial catastrophic risks in frontier models, with certified lower bounds as high as 70\% for the worst model, highlighting the urgent need for improved safety training strategies in frontier LLMs.

Paperid: 1026, https://arxiv.org/pdf/2510.03623.pdf

Abstract:
Explainable Artificial Intelligence (XAI) has aided machine learning (ML) researchers with the power of scrutinizing the decisions of the black-box models. XAI methods enable looking deep inside the models' behavior, eventually generating explanations along with a perceived trust and transparency. However, depending on any specific XAI method, the level of trust can vary. It is evident that XAI methods can themselves be a victim of post-adversarial attacks that manipulate the expected outcome from the explanation module. Among such attack tactics, fairwashing explanation (FE), manipulation explanation (ME), and backdoor-enabled manipulation attacks (BD) are the notable ones. In this paper, we try to understand these adversarial attack techniques, tactics, and procedures (TTPs) on explanation alteration and thus the effect on the model's decisions. We have explored a total of six different individual attack procedures on post-hoc explanation methods such as SHAP (SHapley Additive exPlanations), LIME (Local Interpretable Model-agnostic Explanation), and IG (Integrated Gradients), and investigated those adversarial attacks in cybersecurity applications scenarios such as phishing, malware, intrusion, and fraudulent website detection. Our experimental study reveals the actual effectiveness of these attacks, thus providing an urgency for immediate attention to enhance the resiliency of XAI methods and their applications.

Paperid: 1027, https://arxiv.org/pdf/2510.02717.pdf

Abstract:
The rapid expansion of the Internet of Things (IoT) has revolutionized modern industries by enabling smart automation and real time connectivity. However, this evolution has also introduced complex cybersecurity challenges due to the heterogeneous, resource constrained, and distributed nature of these environments. To address these challenges, this research presents CST AFNet, a novel dual attention based deep learning framework specifically designed for robust intrusion detection in IoT networks. The model integrates multi scale Convolutional Neural Networks (CNNs) for spatial feature extraction, Bidirectional Gated Recurrent Units (BiGRUs) for capturing temporal dependencies, and a dual attention mechanism, channel and temporal attention, to enhance focus on critical patterns in the data. The proposed method was trained and evaluated on the Edge IIoTset dataset, a comprehensive and realistic benchmark containing more than 2.2 million labeled instances spanning 15 attack types and benign traffic, collected from a seven layer industrial testbed. Our proposed model achieves outstanding accuracy for both 15 attack types and benign traffic. CST AFNet achieves 99.97 percent accuracy. Moreover, this model demonstrates exceptional performance with macro averaged precision, recall, and F1 score all above 99.3 percent. Experimental results show that CST AFNet achieves superior detection accuracy, significantly outperforming traditional deep learning models. The findings confirm that CST AFNet is a powerful and scalable solution for real time cyber threat detection in complex IoT and IIoT environments, paving the way for more secure, intelligent, and adaptive cyber physical systems.

Paperid: 1028, https://arxiv.org/pdf/2510.02711.pdf

Abstract:
The growing integration of drones across commercial, industrial, and civilian domains has introduced significant cybersecurity challenges, particularly due to the susceptibility of drone networks to a wide range of cyberattacks. Existing intrusion detection mechanisms often lack the adaptability, efficiency, and generalizability required for the dynamic and resource constrained environments in which drones operate. This paper proposes TSLT-Net, a novel lightweight and unified Temporal Spatial Transformer based intrusion detection system tailored specifically for drone networks. By leveraging self attention mechanisms, TSLT-Net effectively models both temporal patterns and spatial dependencies in network traffic, enabling accurate detection of diverse intrusion types. The framework includes a streamlined preprocessing pipeline and supports both multiclass attack classification and binary anomaly detection within a single architecture. Extensive experiments conducted on the ISOT Drone Anomaly Detection Dataset, consisting of more than 2.3 million labeled records, demonstrate the superior performance of TSLT-Net with 99.99 percent accuracy in multiclass detection and 100 percent in binary anomaly detection, while maintaining a minimal memory footprint of only 0.04 MB and 9722 trainable parameters. These results establish TSLT-Net as an effective and scalable solution for real time drone cybersecurity, particularly suitable for deployment on edge devices in mission critical UAV systems.

Paperid: 1029, https://arxiv.org/pdf/2510.01529.pdf

Abstract:
As large language models (LLMs) advance, ensuring AI safety and alignment is paramount. One popular approach is prompt guards, lightweight mechanisms designed to filter malicious queries while being easy to implement and update. In this work, we introduce a new attack that circumvents such prompt guards, highlighting their limitations. Our method consistently jailbreaks production models while maintaining response quality, even under the highly protected chat interfaces of Google Gemini (2.5 Flash/Pro), DeepSeek Chat (DeepThink), Grok (3), and Mistral Le Chat (Magistral). The attack exploits a resource asymmetry between the prompt guard and the main LLM, encoding a jailbreak prompt that lightweight guards cannot decode but the main model can. This reveals an attack surface inherent to lightweight prompt guards in modern LLM architectures and underscores the need to shift defenses from blocking malicious inputs to preventing malicious outputs. We additionally identify other critical alignment issues, such as copyrighted data extraction, training data extraction, and malicious response leakage during thinking.

Paperid: 1030, https://arxiv.org/pdf/2510.00164.pdf

Abstract:
Rollup protocols have recently received significant attention as a promising class of Layer 2 (L2) scalability solutions. By utilizing the Layer 1 (L1) blockchain solely as a bulletin board for a summary of the executed transactions and state changes, rollups enable secure off-chain execution while avoiding the complexity of other L2 mechanisms. However, to ensure data availability, current rollup protocols require the plaintext of executed transactions to be published on-chain, resulting in inherent privacy limitations. In this paper, we address this problem by introducing Calyx, the first privacy-preserving multi-token optimistic-Rollup protocol. Calyx guarantees full payment privacy for all L2 transactions, revealing no information about the sender, recipient, transferred amount, or token type. The protocol further supports atomic execution of multiple multi-token transactions and introduces a transaction fee scheme to enable broader application scenarios while ensuring the sustainable operation of the protocol. To enforce correctness, Calyx adopts an efficient one-step fraud-proof mechanism. We analyze the security and privacy guarantees of the protocol and provide an implementation and evaluation. Our results show that executing a single transaction costs approximately $0.06 (0.00002 ETH) and incurs only constant-size on-chain cost in asymptotic terms.

Paperid: 1031, https://arxiv.org/pdf/2510.00125.pdf

Abstract:
Machine unlearning is an emerging technique that removes the influence of a subset of training data (forget set) from a model without full retraining, with applications including privacy protection, content moderation, and model correction. The key challenge lies in ensuring that the model completely forgets the knowledge of the forget set without compromising its overall utility. Existing unlearning methods for large language models (LLMs) often utilize auxiliary language models, retain datasets, or even commercial AI services for effective unlearning and maintaining the model utility. However, dependence on these external resources is often impractical and could potentially introduce additional privacy risks. In this work, we propose direct token optimization (DTO), a novel self-contained unlearning approach for LLMs that directly optimizes the token level objectives and eliminates the need for external resources. Given a sequence to unlearn, we identify two categories of tokens: target tokens, which capture critical knowledge for unlearning, and the remaining non-target tokens, which are crucial for maintaining the model utility. The former are used to optimize the unlearning objective, while the latter serve to preserve the model's performance. The experimental results show that the proposed DTO achieves up to 16.8$\times$ improvement in forget quality on several benchmark datasets than the latest baselines while maintaining a comparable level of model utility.

Paperid: 1032, https://arxiv.org/pdf/2509.26509.pdf

Abstract:
The ever-increasing complexity of design specifications for processors and intellectual property (IP) presents a formidable challenge for early bug detection in the modern IC design cycle. The recent advancements in hardware fuzzing have proven effective in detecting bugs in RTL designs of cutting-edge processors. The modern IC design flow involves incremental updates and modifications to the hardware designs necessitating rigorous verification and extending the overall verification period. To accelerate this process, directed fuzzing has emerged focusing on generating targeted stimuli for specific regions of the design, avoiding the need for exhaustive, full-scale verification. However, a significant limitation of these hardware fuzzers lies in their reliance on an equivalent SW model of the hardware which fails to capture intrinsic hardware characteristics. To circumvent the aforementioned challenges, this work introduces TargetFuzz, an innovative and scalable targeted hardware fuzzing mechanism. It leverages SAT-based techniques to focus on specific regions of the hardware design while operating at its native hardware abstraction level, ensuring a more precise and comprehensive verification process. We evaluated this approach across a diverse range of RTL designs for various IP cores. Our experimental results demonstrate its capability to effectively target and fuzz a broad spectrum of sites within these designs, showcasing its extensive coverage and precision in addressing targeted regions. TargetFuzz demonstrates its capability to effectively scale 30x greater in terms of handling target sites, achieving 100% state coverage and 1.5x faster in terms of site coverage, and shows 90x improvement in target state coverage compared to Coverage-Guided Fuzzing, demonstrating its potential to advance the state-of-the-art in directed hardware fuzzing.

Paperid: 1033, https://arxiv.org/pdf/2509.25514.pdf

Abstract:
Function name prediction is crucial for understanding stripped binaries in software reverse engineering, a key step for \textbf{enabling subsequent vulnerability analysis and patching}. However, existing approaches often struggle with architecture-specific limitations, data scarcity, and diverse naming conventions. We present AGNOMIN, a novel architecture-agnostic approach for multi-label function name prediction in stripped binaries. AGNOMIN builds Feature-Enriched Hierarchical Graphs (FEHGs), combining Control Flow Graphs, Function Call Graphs, and dynamically learned \texttt{PCode} features. A hierarchical graph neural network processes this enriched structure to generate consistent function representations across architectures, vital for \textbf{scalable security assessments}. For function name prediction, AGNOMIN employs a Renée-inspired decoder, enhanced with an attention-based head layer and algorithmic improvements. We evaluate AGNOMIN on a comprehensive dataset of 9,000 ELF executable binaries across three architectures, demonstrating its superior performance compared to state-of-the-art approaches, with improvements of up to 27.17\% in precision and 55.86\% in recall across the testing dataset. Moreover, AGNOMIN generalizes well to unseen architectures, achieving 5.89\% higher recall than the closest baseline. AGNOMIN's practical utility has been validated through security hackathons, where it successfully aided reverse engineers in analyzing and patching vulnerable binaries across different architectures.

Paperid: 1034, https://arxiv.org/pdf/2509.25476.pdf

Abstract:
The growing complexity of global supply chains has made hardware Trojans a significant threat in sensor-based power electronics. Traditional Trojan designs depend on digital triggers or fixed threshold conditions that can be detected during standard testing. In contrast, we introduce Environmental Rate Manipulation (ERM), a novel Trojan triggering mechanism that activates by monitoring the rate of change in environmental parameters rather than their absolute values. This approach allows the Trojan to remain inactive under normal conditions and evade redundancy and sensor-fusion defenses. We implement a compact 14~$Î¼$m$^2$ circuit that measures capacitor charging rates in standard sensor front-ends and disrupts inverter pulse-width modulation PWM signals when a rapid change is induced. Experiments on a commercial Texas Instruments solar inverter demonstrate that ERM can trigger catastrophic driver chip failure. Furthermore, ETAP simulations indicate that a single compromised 100~kW inverter may initiate cascading grid instabilities. The attack's significance extends beyond individual sensors to entire classes of environmental sensing systems common in power electronics, demonstrating fundamental challenges for hardware security.

Paperid: 1035, https://arxiv.org/pdf/2509.25410.pdf

Abstract:
Cybercrimes such as online scams and fraud have become prevalent. Cybercriminals often abuse various global or regional events as themes of their fraudulent activities to breach user trust and attain a higher attack success rate. These attacks attempt to manipulate and deceive innocent people into interacting with meticulously crafted websites with malicious payloads, phishing, or fraudulent transactions. To deepen our understanding of the problem, this paper investigates how to characterize event-themed malicious website-based campaigns, with a case study on war-themed websites. We find that attackers tailor their attacks by exploiting the unique aspects of events, as evidenced by activities such as fundraising, providing aid, collecting essential supplies, or seeking updated news. We use explainable unsupervised clustering methods to draw further insights, which could guide the design of effective early defenses against various event-themed malicious web campaigns.

Paperid: 1036, https://arxiv.org/pdf/2509.23459.pdf

Abstract:
Large language models (LLMs) have shown promising performance on tasks that require reasoning, such as text-to-SQL, code generation, and debugging. However, regulatory frameworks with strict privacy requirements constrain their integration into sensitive systems. State-of-the-art LLMs are also proprietary, costly, and resource-intensive, making local deployment impractical. Consequently, utilizing such LLMs often requires sharing data with third-party providers, raising privacy concerns and risking noncompliance with regulations. Although fine-tuned small language models (SLMs) can outperform LLMs on certain tasks and be deployed locally to mitigate privacy concerns, they underperform on more complex tasks such as text-to-SQL translation. In this work, we introduce MaskSQL, a text-to-SQL framework that utilizes abstraction as a privacy protection mechanism to mask sensitive information in LLM prompts. Unlike redaction, which removes content entirely, or generalization, which broadens tokens, abstraction retains essential information while discarding unnecessary details, striking an effective privacy-utility balance for the text-to-SQL task. Moreover, by providing mechanisms to control the privacy-utility tradeoff, MaskSQL facilitates adoption across a broader range of use cases. Our experimental results show that MaskSQL outperforms leading SLM-based text-to-SQL models and achieves performance approaching state-of-the-art LLM-based models, while preserving privacy.

Paperid: 1037, https://arxiv.org/pdf/2509.23019.pdf

Abstract:
Watermarking for large language models (LLMs) embeds a statistical signal during generation to enable detection of model-produced text. While watermarking has proven effective in benign settings, its robustness under adversarial evasion remains contested. To advance a rigorous understanding and evaluation of such vulnerabilities, we propose the \emph{Bias-Inversion Rewriting Attack} (BIRA), which is theoretically motivated and model-agnostic. BIRA weakens the watermark signal by suppressing the logits of likely watermarked tokens during LLM-based rewriting, without any knowledge of the underlying watermarking scheme. Across recent watermarking methods, BIRA achieves over 99\% evasion while preserving the semantic content of the original text. Beyond demonstrating an attack, our results reveal a systematic vulnerability, emphasizing the need for stress testing and robust defenses.

Paperid: 1038, https://arxiv.org/pdf/2509.20808.pdf

Abstract:
Hardware Fuzzing emerged as one of the crucial techniques for finding security flaws in modern hardware designs by testing a wide range of input scenarios. One of the main challenges is creating high-quality input seeds that maximize coverage and speed up verification. Coverage-Guided Fuzzing (CGF) methods help explore designs more effectively, but they struggle to focus on specific parts of the hardware. Existing Directed Gray-box Fuzzing (DGF) techniques like DirectFuzz try to solve this by generating targeted tests, but it has major drawbacks, such as supporting only limited hardware description languages, not scaling well to large circuits, and having issues with abstraction mismatches. To address these problems, we introduce a novel framework, PROFUZZ, that follows the DGF approach and combines fuzzing with Automatic Test Pattern Generation (ATPG) for more efficient fuzzing. By leveraging ATPG's structural analysis capabilities, PROFUZZ can generate precise input seeds that target specific design regions more effectively while maintaining high fuzzing throughput. Our experiments show that PROFUZZ scales 30x better than DirectFuzz when handling multiple target sites, improves coverage by 11.66%, and runs 2.76x faster, highlighting its scalability and effectiveness for directed fuzzing in complex hardware systems.

Paperid: 1039, https://arxiv.org/pdf/2509.20391.pdf

Abstract:
The growing integration of drones into civilian, commercial, and defense sectors introduces significant cybersecurity concerns, particularly with the increased risk of network-based intrusions targeting drone communication protocols. Detecting and classifying these intrusions is inherently challenging due to the dynamic nature of drone traffic and the presence of multiple sophisticated attack vectors such as spoofing, injection, replay, and man-in-the-middle (MITM) attacks. This research aims to develop a robust and interpretable intrusion detection framework tailored for drone networks, with a focus on handling multi-class classification and model explainability. We present a comparative analysis of ensemble-based machine learning models, namely Random Forest, Extra Trees, AdaBoost, CatBoost, and XGBoost, trained on a labeled dataset comprising benign traffic and nine distinct intrusion types. Comprehensive data preprocessing was performed, including missing value imputation, scaling, and categorical encoding, followed by model training and extensive evaluation using metrics such as macro F1-score, ROC AUC, Matthews Correlation Coefficient, and Log Loss. Random Forest achieved the highest performance with a macro F1-score of 0.9998 and ROC AUC of 1.0000. To validate the superiority of the models, statistical tests, including Friedmans test, the Wilcoxon signed-rank test with Holm correction, and bootstrapped confidence intervals, were applied. Furthermore, explainable AI methods, SHAP and LIME, were integrated to interpret both global and local feature importance, enhancing model transparency and decision trustworthiness. The proposed approach not only delivers near-perfect accuracy but also ensures interpretability, making it highly suitable for real-time and safety-critical drone operations.

Paperid: 1040, https://arxiv.org/pdf/2509.18934.pdf

Abstract:
Adversarial smart contracts, mostly on EVM-compatible chains like Ethereum and BSC, are deployed as EVM bytecode to exploit vulnerable smart contracts typically for financial gains. Detecting such malicious contracts at the time of deployment is an important proactive strategy preventing loss from victim contracts. It offers a better cost-benefit than detecting vulnerabilities on diverse potential victims. However, existing works are not generic with limited detection types and effectiveness due to imbalanced samples, while the emerging LLM technologies, which show its potentials in generalization, have two key problems impeding its application in this task: hard digestion of compiled-code inputs, especially those with task-specific logic, and hard assessment of LLMs' certainty in their binary answers, i.e., yes-or-no answers. Therefore, we propose a generic adversarial smart contracts detection framework FinDet, which leverages LLMs with two enhancements addressing above two problems. FinDet takes as input only the EVM-bytecode contracts and identifies adversarial ones among them with high balanced accuracy. The first enhancement extracts concise semantic intentions and high-level behavioral logic from the low-level bytecode inputs, unleashing the LLM reasoning capability restricted by the task input. The second enhancement probes and measures the LLM uncertainty to its multi-round answering to the same query, improving the LLM answering robustness for binary classifications required by the task output. Our comprehensive evaluation shows that FinDet achieves a BAC of 0.9223 and a TPR of 0.8950, significantly outperforming existing baselines. It remains robust under challenging conditions including unseen attack patterns, low-data settings, and feature obfuscation. FinDet detects all 5 public and 20+ unreported adversarial contracts in a 10-day real-world test, confirmed manually.

Paperid: 1041, https://arxiv.org/pdf/2509.17722.pdf

Abstract:
Public key encryption with equality test (PKEET), proposed by Yang et al. (CT-RSA 2010), is a variant of public key encryption that enables an equality test to determine whether two ciphertexts correspond to the same plaintext. This test applies not only for ciphertexts generated under the same encryption key but also for those generated under different encryption keys. To date, several generic constructions of PKEET have been proposed. However, these generic constructions have the drawback of reliance on the random oracle model or a (hierarchical) identity-based encryption scheme. In this paper, we propose a generic construction of a PKEET scheme based on tag-based encryption without the random oracle model. Tag-based encryption is a weaker primitive than identity-based encryption. Our scheme allows to derive new PKEET schemes without the random oracle model. By instantiating our construction with the pairing-free tag-based encryption scheme by Kiltz (TCC 2006), we obtain a pairing-free PKEET scheme without the random oracle model. Moreover, by instantiating our construction with a tag-based encryption scheme based on the learning parity with noise (LPN) assumption, we obtain a PKEET scheme based on the LPN assumption without the random oracle model.

Paperid: 1042, https://arxiv.org/pdf/2509.17709.pdf

Abstract:
An ordered multi-signature scheme allows multiple signers to sign a common message in a sequential manner and allows anyone to verify the signing order of signers with a public-key list. In this work, we propose an ordered multi-signature scheme by modifying the sequential aggregate signature scheme by Chatterjee and Kabaleeshwaran (ACISP 2020). Our scheme offers compact public parameter size and the public-key aggregation property. This property allows us to compress a public-key list into a short aggregated key. We prove the security of our scheme under the symmetric external Diffie-Hellman (SXDH) assumption without the random oracle model.

Paperid: 1043, https://arxiv.org/pdf/2509.16625.pdf

Abstract:
Detecting intrusions in network traffic is a challenging task, particularly under limited supervision and constantly evolving attack patterns. While recent works have leveraged graph neural networks for network intrusion detection, they often decouple representation learning from anomaly detection, limiting the utility of the embeddings for identifying attacks. We propose GraphIDS, a self-supervised intrusion detection model that unifies these two stages by learning local graph representations of normal communication patterns through a masked autoencoder. An inductive graph neural network embeds each flow with its local topological context to capture typical network behavior, while a Transformer-based encoder-decoder reconstructs these embeddings, implicitly learning global co-occurrence patterns via self-attention without requiring explicit positional information. During inference, flows with unusually high reconstruction errors are flagged as potential intrusions. This end-to-end framework ensures that embeddings are directly optimized for the downstream task, facilitating the recognition of malicious traffic. On diverse NetFlow benchmarks, GraphIDS achieves up to 99.98% PR-AUC and 99.61% macro F1-score, outperforming baselines by 5-25 percentage points.

Paperid: 1044, https://arxiv.org/pdf/2509.15572.pdf

Abstract:
Modern AI-powered Integrated Development Environments (AI-IDEs) are increasingly defined by an Agent-centric architecture, where an LLM-powered Agent is deeply integrated to autonomously execute complex tasks. This tight integration, however, also introduces a new and critical attack surface. Attackers can exploit these components by injecting malicious instructions into untrusted external sources, effectively hijacking the Agent to perform harmful operations beyond the user's intention or awareness. This emerging threat has quickly attracted research attention, leading to various proposed attack vectors, such as hijacking Model Context Protocol (MCP) Servers to access private data. However, most existing approaches lack stealth and persistence, limiting their practical impact. We propose the Cuckoo Attack, a novel attack that achieves stealthy and persistent command execution by embedding malicious payloads into configuration files. These files, commonly used in AI-IDEs, execute system commands during routine operations, without displaying execution details to the user. Once configured, such files are rarely revisited unless an obvious runtime error occurs, creating a blind spot for attackers to exploit. We formalize our attack paradigm into two stages, including initial infection and persistence. Based on these stages, we analyze the practicality of the attack execution process and identify the relevant exploitation techniques. Furthermore, we analyze the impact of Cuckoo Attack, which can not only invade the developer's local computer but also achieve supply chain attacks through the spread of configuration files. We contribute seven actionable checkpoints for vendors to evaluate their product security. The critical need for these checks is demonstrated by our end-to-end Proof of Concept, which validated the proposed attack across nine mainstream Agent and AI-IDE pairs.

Paperid: 1045, https://arxiv.org/pdf/2509.13021.pdf

Abstract:
This work introduces xOffense, an AI-driven, multi-agent penetration testing framework that shifts the process from labor-intensive, expert-driven manual efforts to fully automated, machine-executable workflows capable of scaling seamlessly with computational infrastructure. At its core, xOffense leverages a fine-tuned, mid-scale open-source LLM (Qwen3-32B) to drive reasoning and decision-making in penetration testing. The framework assigns specialized agents to reconnaissance, vulnerability scanning, and exploitation, with an orchestration layer ensuring seamless coordination across phases. Fine-tuning on Chain-of-Thought penetration testing data further enables the model to generate precise tool commands and perform consistent multi-step reasoning. We evaluate xOffense on two rigorous benchmarks: AutoPenBench and AI-Pentest-Benchmark. The results demonstrate that xOffense consistently outperforms contemporary methods, achieving a sub-task completion rate of 79.17%, decisively surpassing leading systems such as VulnBot and PentestGPT. These findings highlight the potential of domain-adapted mid-scale LLMs, when embedded within structured multi-agent orchestration, to deliver superior, cost-efficient, and reproducible solutions for autonomous penetration testing.

Paperid: 1046, https://arxiv.org/pdf/2509.11864.pdf

Abstract:
Safety alignment is critical for the ethical deployment of large language models (LLMs), guiding them to avoid generating harmful or unethical content. Current alignment techniques, such as supervised fine-tuning and reinforcement learning from human feedback, remain fragile and can be bypassed by carefully crafted adversarial prompts. Unfortunately, such attacks rely on trial and error, lack generalizability across models, and are constrained by scalability and reliability. This paper presents NeuroStrike, a novel and generalizable attack framework that exploits a fundamental vulnerability introduced by alignment techniques: the reliance on sparse, specialized safety neurons responsible for detecting and suppressing harmful inputs. We apply NeuroStrike to both white-box and black-box settings: In the white-box setting, NeuroStrike identifies safety neurons through feedforward activation analysis and prunes them during inference to disable safety mechanisms. In the black-box setting, we propose the first LLM profiling attack, which leverages safety neuron transferability by training adversarial prompt generators on open-weight surrogate models and then deploying them against black-box and proprietary targets. We evaluate NeuroStrike on over 20 open-weight LLMs from major LLM developers. By removing less than 0.6% of neurons in targeted layers, NeuroStrike achieves an average attack success rate (ASR) of 76.9% using only vanilla malicious prompts. Moreover, Neurostrike generalizes to four multimodal LLMs with 100% ASR on unsafe image inputs. Safety neurons transfer effectively across architectures, raising ASR to 78.5% on 11 fine-tuned models and 77.7% on five distilled models. The black-box LLM profiling attack achieves an average ASR of 63.7% across five black-box models, including the Google Gemini family.

Paperid: 1047, https://arxiv.org/pdf/2509.11187.pdf

Abstract:
In recent years, learning-based Android malware detection has seen significant advancements, with detectors generally falling into three categories: string-based, image-based, and graph-based approaches. While these methods have shown strong detection performance, they often struggle to sustain robustness in real-world settings, particularly when facing code obfuscation and adversarial examples (AEs). Deep multimodal learning has emerged as a promising solution, leveraging the strengths of multiple feature types to enhance robustness and generalization. However, a systematic investigation of multimodal fusion for both accuracy and resilience remains underexplored. In this study, we propose DMLDroid, an Android malware detection based on multimodal fusion that leverages three different representations of malware features, including permissions & intents (tabular-based), DEX file representations (image-based), and API calls (graph-derived sequence-based). We conduct exhaustive experiments independently on each feature, as well as in combination, using different fusion strategies. Experimental results on the CICMalDroid 2020 dataset demonstrate that our multimodal approach with the dynamic weighted fusion mechanism achieves high performance, reaching 97.98% accuracy and 98.67% F1-score on original malware detection. Notably, the proposed method maintains strong robustness, sustaining over 98% accuracy and 98% F1-score under both obfuscation and adversarial attack scenarios. Our findings highlight the benefits of multimodal fusion in improving both detection accuracy and robustness against evolving Android malware threats.

Paperid: 1048, https://arxiv.org/pdf/2509.09942.pdf

Abstract:
Smart contracts automate the management of high-value assets, where vulnerabilities can lead to catastrophic financial losses. This challenge is amplified in Large Language Models (LLMs) by two interconnected failures: they operate as unauditable "black boxes" lacking a transparent reasoning process, and consequently, generate code riddled with critical security vulnerabilities. To address both issues, we propose SmartCoder-R1 (based on Qwen2.5-Coder-7B), a novel framework for secure and explainable smart contract generation. It begins with Continual Pre-training (CPT) to specialize the model. We then apply Long Chain-of-Thought Supervised Fine-Tuning (L-CoT SFT) on 7,998 expert-validated reasoning-and-code samples to train the model to emulate human security analysis. Finally, to directly mitigate vulnerabilities, we employ Security-Aware Group Relative Policy Optimization (S-GRPO), a reinforcement learning phase that refines the generation policy by optimizing a weighted reward signal for compilation success, security compliance, and format correctness. Evaluated against 17 baselines on a benchmark of 756 real-world functions, SmartCoder-R1 establishes a new state of the art, achieving top performance across five key metrics: a ComPass of 87.70%, a VulRate of 8.60%, a SafeAval of 80.16%, a FuncRate of 53.84%, and a FullRate of 50.53%. This FullRate marks a 45.79% relative improvement over the strongest baseline, DeepSeek-R1. Crucially, its generated reasoning also excels in human evaluations, achieving high-quality ratings for Functionality (82.7%), Security (85.3%), and Clarity (90.7%).

Paperid: 1049, https://arxiv.org/pdf/2509.06548.pdf

Abstract:
Malware classification is a contemporary and ongoing challenge in cyber-security: modern obfuscation techniques are able to evade traditional static analysis, while dynamic analysis is too resource intensive to be deployed at a large scale. One prominent line of research addresses these limitations by converting malware binaries into 2D images by heuristically reshaping them into a 2D grid before resizing using Lanczos resampling. These images can then be classified based on their textural information using computer vision approaches. While this approach can detect obfuscated malware more effectively than static analysis, the process of converting files into 2D images results in significant information loss due to both quantisation noise, caused by rounding to integer pixel values, and the introduction of 2D dependencies which do not exist in the original data. This loss of signal limits the classification performance of the downstream model. This work addresses these weaknesses by instead resizing the files into 1D signals which avoids the need for heuristic reshaping, and additionally these signals do not suffer from quantisation noise due to being stored in a floating-point format. It is shown that existing 2D CNN architectures can be readily adapted to classify these 1D signals for improved performance. Furthermore, a bespoke 1D convolutional neural network, based on the ResNet architecture and squeeze-and-excitation layers, was developed to classify these signals and evaluated on the MalNet dataset. It was found to achieve state-of-the-art performance on binary, type, and family level classification with F1 scores of 0.874, 0.503, and 0.507, respectively, paving the way for future models to operate on the proposed signal modality.

Paperid: 1050, https://arxiv.org/pdf/2509.05835.pdf

Abstract:
As generative audio models are rapidly evolving, AI-generated audios increasingly raise concerns about copyright infringement and misinformation spread. Audio watermarking, as a proactive defense, can embed secret messages into audio for copyright protection and source verification. However, current neural audio watermarking methods focus primarily on the imperceptibility and robustness of watermarking, while ignoring its vulnerability to security attacks. In this paper, we develop a simple yet powerful attack: the overwriting attack that overwrites the legitimate audio watermark with a forged one and makes the original legitimate watermark undetectable. Based on the audio watermarking information that the adversary has, we propose three categories of overwriting attacks, i.e., white-box, gray-box, and black-box attacks. We also thoroughly evaluate the proposed attacks on state-of-the-art neural audio watermarking methods. Experimental results demonstrate that the proposed overwriting attacks can effectively compromise existing watermarking schemes across various settings and achieve a nearly 100% attack success rate. The practicality and effectiveness of the proposed overwriting attacks expose security flaws in existing neural audio watermarking systems, underscoring the need to enhance security in future audio watermarking designs.

Paperid: 1051, https://arxiv.org/pdf/2509.05332.pdf

Abstract:
Autonomous vehicles (AVs) rely on complex perception and communication systems, making them vulnerable to adversarial attacks that can compromise safety. While simulation offers a scalable and safe environment for robustness testing, existing frameworks typically lack comprehensive supportfor modeling multi-domain adversarial scenarios. This paper introduces a novel, open-source integrated simulation framework designed to generate adversarial attacks targeting both perception and communication layers of AVs. The framework provides high-fidelity modeling of physical environments, traffic dynamics, and V2X networking, orchestrating these components through a unified core that synchronizes multiple simulators based on a single configuration file. Our implementation supports diverse perception-level attacks on LiDAR sensor data, along with communication-level threats such as V2X message manipulation and GPS spoofing. Furthermore, ROS 2 integration ensures seamless compatibility with third-party AV software stacks. We demonstrate the framework's effectiveness by evaluating the impact of generated adversarial scenarios on a state-of-the-art 3D object detector, revealing significant performance degradation under realistic conditions.

Paperid: 1052, https://arxiv.org/pdf/2509.04925.pdf

Abstract:
In today's fast-paced digital communication, the surge in network traffic data and frequency demands robust and precise network intrusion solutions. Conventional machine learning methods struggle to grapple with complex patterns within the vast network intrusion datasets, which suffer from data scarcity and class imbalance. As a result, we have integrated machine learning and deep learning techniques within the network intrusion detection system to bridge this gap. This study has developed TrailGate, a novel framework that combines machine learning and deep learning techniques. By integrating Transformer and Bidirectional Gated Recurrent Unit (BiGRU) architectures with advanced feature selection strategies and supplemented by data augmentation techniques, TrailGate can identifies common attack types and excels at detecting and mitigating emerging threats. This algorithmic fusion excels at detecting common and well-understood attack types and has the unique ability to swiftly identify and neutralize emerging threats that stem from existing paradigms.

Paperid: 1053, https://arxiv.org/pdf/2509.02289.pdf

Abstract:
Password managers provide significant security benefits to users. However, malicious client-side scripts and browser extensions can steal passwords after the manager has autofilled them into the web page. In this paper, we extend prior work by Stock and Johns, showing how password autofill can be hardened to prevent these local attacks. We implement our design in the Firefox browser and conduct experiments demonstrating that our defense successfully protects passwords from XSS attacks and malicious extensions. We also show that our implementation is compatible with 97% of the Alexa top 1000 websites. Next, we generalize our design, creating a second defense that prevents recently discovered local attacks against the FIDO2 protocols. We implement this second defense into Firefox, demonstrating that it protects the FIDO2 protocol against XSS attacks and malicious extensions. This defense is compatible with all websites, though it does require a small change (2-3 lines) to web servers implementing FIDO2.

Paperid: 1054, https://arxiv.org/pdf/2509.01835.pdf

Abstract:
High-quality datasets of real-world vulnerabilities and their corresponding verifiable exploits are crucial resources in software security research. Yet such resources remain scarce, as their creation demands intensive manual effort and deep security expertise. In this paper, we present CVE-GENIE, an automated, large language model (LLM)-based multi-agent framework designed to reproduce real-world vulnerabilities, provided in Common Vulnerabilities and Exposures (CVE) format, to enable creation of high-quality vulnerability datasets. Given a CVE entry as input, CVE-GENIE gathers the relevant resources of the CVE, automatically reconstructs the vulnerable environment, and (re)produces a verifiable exploit. Our systematic evaluation highlights the efficiency and robustness of CVE-GENIE's design and successfully reproduces approximately 51% (428 of 841) CVEs published in 2024-2025, complete with their verifiable exploits, at an average cost of $2.77 per CVE. Our pipeline offers a robust method to generate reproducible CVE benchmarks, valuable for diverse applications such as fuzzer evaluation, vulnerability patching, and assessing AI's security capabilities.

Paperid: 1055, https://arxiv.org/pdf/2508.19488.pdf

Abstract:
Cyber defense requires automating defensive decision-making under stealthy, deceptive, and continuously evolving adversarial strategies. The FlipIt game provides a foundational framework for modeling interactions between a defender and an advanced adversary that compromises a system without being immediately detected. In FlipIt, the attacker and defender compete to control a shared resource by performing a Flip action and paying a cost. However, the existing FlipIt frameworks rely on a small number of heuristics or specialized learning techniques, which can lead to brittleness and the inability to adapt to new attacks. To address these limitations, we introduce PoolFlip, a multi-agent gym environment that extends the FlipIt game to allow efficient learning for attackers and defenders. Furthermore, we propose Flip-PSRO, a multi-agent reinforcement learning (MARL) approach that leverages population-based training to train defender agents equipped to generalize against a range of unknown, potentially adaptive opponents. Our empirical results suggest that Flip-PSRO defenders are $2\times$ more effective than baselines to generalize to a heuristic attack not exposed in training. In addition, our newly designed ownership-based utility functions ensure that Flip-PSRO defenders maintain a high level of control while optimizing performance.

Paperid: 1056, https://arxiv.org/pdf/2508.19324.pdf

Abstract:
ICAO-compliant facial images, initially designed for secure biometric passports, are increasingly becoming central to identity verification in a wide range of application contexts, including border control, digital travel credentials, and financial services. While their standardization enables global interoperability, it also facilitates practices such as morphing and deepfakes, which can be exploited for harmful purposes like identity theft and illegal sharing of identity documents. Traditional countermeasures like Presentation Attack Detection (PAD) are limited to real-time capture and offer no post-capture protection. This survey paper investigates digital watermarking and steganography as complementary solutions that embed tamper-evident signals directly into the image, enabling persistent verification without compromising ICAO compliance. We provide the first comprehensive analysis of state-of-the-art techniques to evaluate the potential and drawbacks of the underlying approaches concerning the applications involving ICAO-compliant images and their suitability under standard constraints. We highlight key trade-offs, offering guidance for secure deployment in real-world identity systems.

Paperid: 1057, https://arxiv.org/pdf/2508.11824.pdf

Abstract:
The integration of Large Language Models (LLMs) into software engineering has revolutionized code generation, enabling unprecedented productivity through promptware and autonomous AI agents. However, this transformation introduces significant risks, including insecure code generation, hallucinated outputs, irreversible actions, and a lack of transparency and accountability. Incidents like the Replit database deletion underscore the urgent need for robust safety and governance mechanisms. This paper comprehensively analyzes the inherent challenges of LLM-assisted code generation, such as vulnerability inheritance, overtrust, misinterpretation, and the absence of standardized validation and rollback protocols. To address these, we propose the SAFE-AI Framework, a holistic approach emphasizing Safety, Auditability, Feedback, and Explainability. The framework integrates guardrails, sandboxing, runtime verification, risk-aware logging, human-in-the-loop systems, and explainable AI techniques to mitigate risks while fostering trust and compliance. We introduce a novel taxonomy of AI behaviors categorizing suggestive, generative, autonomous, and destructive actions to guide risk assessment and oversight. Additionally, we identify open problems, including the lack of standardized benchmarks for code specific hallucinations and autonomy levels, and propose future research directions for hybrid verification, semantic guardrails, and proactive governance tools. Through detailed comparisons of autonomy control, prompt engineering, explainability, and governance frameworks, this paper provides a roadmap for responsible AI integration in software engineering, aligning with emerging regulations like the EU AI Act and Canada's AIDA to ensure safe, transparent, and accountable AI-driven development.

Paperid: 1058, https://arxiv.org/pdf/2507.22371.pdf

Abstract:
With the increasing security issues in blockchain, smart contract vulnerability detection has become a research focus. Existing vulnerability detection methods have their limitations: 1) Static analysis methods struggle with complex scenarios. 2) Methods based on specialized pre-trained models perform well on specific datasets but have limited generalization capabilities. In contrast, general-purpose Large Language Models (LLMs) demonstrate impressive ability in adapting to new vulnerability patterns. However, they often underperform on specific vulnerability types compared to methods based on specialized pre-trained models. We also observe that explanations generated by general-purpose LLMs can provide fine-grained code understanding information, contributing to improved detection performance. Inspired by these observations, we propose SAEL, an LLM-based framework for smart contract vulnerability detection. We first design targeted prompts to guide LLMs in identifying vulnerabilities and generating explanations, which serve as prediction features. Next, we apply prompt-tuning on CodeT5 and T5 to process contract code and explanations, enhancing task-specific performance. To combine the strengths of each approach, we introduce an Adaptive Mixture-of-Experts architecture. This dynamically adjusts feature weights via a Gating Network, which selects relevant features using TopK filtering and Softmax normalization, and incorporates a Multi-Head Self-Attention mechanism to enhance cross-feature relationships. This design enables effective integration of LLM predictions, explanation features, and code features through gradient optimization. The loss function jointly considers both independent feature performance and overall weighted predictions. Experiments show that SAEL outperforms existing methods across various vulnerabilities.

Paperid: 1059, https://arxiv.org/pdf/2507.20014.pdf

Abstract:
As AI-driven dataspaces become integral to data sharing and collaborative analytics, ensuring privacy, performance, and policy compliance presents significant challenges. This paper provides a comprehensive review of privacy-preserving and policy-aware AI techniques, including Federated Learning, Differential Privacy, Trusted Execution Environments, Homomorphic Encryption, and Secure Multi-Party Computation, alongside strategies for aligning AI with regulatory frameworks such as GDPR and the EU AI Act. We propose a novel taxonomy to classify these techniques based on privacy levels, performance impacts, and compliance complexity, offering a clear framework for practitioners and researchers to navigate trade-offs. Key performance metrics -- latency, throughput, cost overhead, model utility, fairness, and explainability -- are analyzed to highlight the multi-dimensional optimization required in dataspaces. The paper identifies critical research gaps, including the lack of standardized privacy-performance KPIs, challenges in explainable AI for federated ecosystems, and semantic policy enforcement amidst regulatory fragmentation. Future directions are outlined, proposing a conceptual framework for policy-driven alignment, automated compliance validation, standardized benchmarking, and integration with European initiatives like GAIA-X, IDS, and Eclipse EDC. By synthesizing technical, ethical, and regulatory perspectives, this work lays the groundwork for developing trustworthy, efficient, and compliant AI systems in dataspaces, fostering innovation in secure and responsible data-driven ecosystems.

Paperid: 1060, https://arxiv.org/pdf/2507.14853.pdf

Abstract:
Federated Learning (FL) enables collaborative model training without sharing raw data, making it a promising approach for privacy-sensitive domains. Despite its potential, FL faces significant challenges, particularly in terms of communication overhead and data privacy. Privacy-preserving Techniques (PPTs) such as Homomorphic Encryption (HE) have been used to mitigate these concerns. However, these techniques introduce substantial computational and communication costs, limiting their practical deployment. In this work, we explore how Hybrid Homomorphic Encryption (HHE), a cryptographic protocol that combines symmetric encryption with HE, can be effectively integrated with FL to address both communication and privacy challenges, paving the way for scalable and secure decentralized learning system.

Paperid: 1061, https://arxiv.org/pdf/2507.10927.pdf

Abstract:
Cloud storage introduces critical privacy challenges for encrypted data retrieval, where fuzzy multi-keyword search enables approximate matching while preserving data confidentiality. Existing solutions face fundamental trade-offs between security and efficiency: linear-search mechanisms provide adaptive security but incur prohibitive overhead for large-scale data, while tree-based indexes improve performance at the cost of branch leakage vulnerabilities. To address these limitations, we propose DVFS - a dynamic verifiable fuzzy search service with three core innovations: (1) An \textit{adaptive-secure fuzzy search} method integrating locality-sensitive hashing with virtual binary trees, eliminating branch leakage while reducing search complexity from linear to sublinear ($O(\log n)$ time); (2) A \textit{dual-repository version control} mechanism supporting dynamic updates with forward privacy, preventing information leakage during operations; (3) A \textit{blockchain-based verification system} that ensures correctness and completeness via smart contracts, achieving $O(\log n)$ verification complexity. Our solution advances secure encrypted retrieval by simultaneously resolving the security-performance paradox and enabling trustworthy dynamic operations.

Paperid: 1062, https://arxiv.org/pdf/2507.10494.pdf

Abstract:
Split Learning (SL) -- splits a model into two distinct parts to help protect client data while enhancing Machine Learning (ML) processes. Though promising, SL has proven vulnerable to different attacks, thus raising concerns about how effective it may be in terms of data privacy. Recent works have shown promising results for securing SL through the use of a novel paradigm, named Function Secret Sharing (FSS), in which servers obtain shares of a function they compute and operate on a public input hidden with a random mask. However, these works fall short in addressing the rising number of attacks which exist on SL. In SplitHappens, we expand the combination of FSS and SL to U-shaped SL. Similarly to other works, we are able to make use of the benefits of SL by reducing the communication and computational costs of FSS. However, a U-shaped SL provides a higher security guarantee than previous works, allowing a client to keep the labels of the training data secret, without having to share them with the server. Through this, we are able to generalize the security analysis of previous works and expand it to different attack vectors, such as modern model inversion attacks as well as label inference attacks. We tested our approach for two different convolutional neural networks on different datasets. These experiments show the effectiveness of our approach in reducing the training time as well as the communication costs when compared to simply using FSS while matching prior accuracy.

Paperid: 1063, https://arxiv.org/pdf/2507.09580.pdf

Abstract:
Large language models (LLMs) have demonstrated remarkable capabilities across a variety of domains. However, their applications in cryptography, which serves as a foundational pillar of cybersecurity, remain largely unexplored. To address this gap, we propose \textbf{AICrypto}, the first comprehensive benchmark designed to evaluate the cryptographic capabilities of LLMs. The benchmark comprises 135 multiple-choice questions, 150 capture-the-flag (CTF) challenges, and 18 proof problems, covering a broad range of skills from factual memorization to vulnerability exploitation and formal reasoning. All tasks are carefully reviewed or constructed by cryptography experts to ensure correctness and rigor. To support automated evaluation of CTF challenges, we design an agent-based framework. To gain deeper insight into the current state of cryptographic proficiency in LLMs, we introduce human expert performance baselines for comparison across all task types. Our evaluation of 17 leading LLMs reveals that state-of-the-art models match or even surpass human experts in memorizing cryptographic concepts, exploiting common vulnerabilities, and routine proofs. However, they still lack a deep understanding of abstract mathematical concepts and struggle with tasks that require multi-step reasoning and dynamic analysis. We hope this work could provide insights for future research on LLMs in cryptographic applications. Our code and dataset are available at https://aicryptobench.github.io.

Paperid: 1064, https://arxiv.org/pdf/2507.06260.pdf

Abstract:
Nova Premier is Amazon's most capable multimodal foundation model and teacher for model distillation. It processes text, images, and video with a one-million-token context window, enabling analysis of large codebases, 400-page documents, and 90-minute videos in a single prompt. We present the first comprehensive evaluation of Nova Premier's critical risk profile under the Frontier Model Safety Framework. Evaluations target three high-risk domains -- Chemical, Biological, Radiological & Nuclear (CBRN), Offensive Cyber Operations, and Automated AI R&D -- and combine automated benchmarks, expert red-teaming, and uplift studies to determine whether the model exceeds release thresholds. We summarize our methodology and report core findings. Based on this evaluation, we find that Nova Premier is safe for public release as per our commitments made at the 2025 Paris AI Safety Summit. We will continue to enhance our safety evaluation and mitigation pipelines as new risks and capabilities associated with frontier models are identified.

Paperid: 1065, https://arxiv.org/pdf/2507.04457.pdf

Abstract:
Differentially private (DP) optimization has been widely adopted as a standard approach to provide rigorous privacy guarantees for training datasets. DP auditing verifies whether a model trained with DP optimization satisfies its claimed privacy level by estimating empirical privacy lower bounds through hypothesis testing. Recent O(1) frameworks improve auditing efficiency by checking the membership status of multiple audit samples in a single run, rather than checking individual samples across multiple runs. However, we reveal that there is no free lunch for this improved efficiency: data dependency and an implicit conflict between auditing and utility impair the tightness of the auditing results. Addressing these challenges, our key insights include reducing data dependency through uncorrelated data and resolving the auditing-utility conflict by decoupling the criteria for effective auditing and separating objectives for utility and auditing. We first propose a unified framework, UniAud, for data-independent auditing that maximizes auditing power through a novel uncorrelated canary construction and a self-comparison framework. We then extend this framework as UniAud++ for data-dependent auditing, optimizing the auditing and utility trade-off through multi-task learning with separate objectives for auditing and training. Experimental results validate that our black-box O(1) framework matches the state-of-the-art auditing results of O(T) auditing with thousands of runs, demonstrating the best efficiency-auditing trade-off across vision and language tasks. Additionally, our framework provides meaningful auditing with only slight utility degradation compared to standard DP training, showing the optimal utility-auditing trade-off and the benefit of requiring no extra training for auditing.

Paperid: 1066, https://arxiv.org/pdf/2507.04214.pdf

Abstract:
Cellular networks serve billions of users globally, yet concerns about reliability and security persist due to weaknesses in 3GPP standards. However, traditional analysis methods, including manual inspection and automated tools, struggle with increasingly expanding cellular network specifications. This paper investigates the feasibility of Large Language Models (LLMs) for automated cellular network specification refinement. To advance it, we leverage 200,000+ approved 3GPP Change Requests (CRs) that document specification revisions, constructing a valuable dataset for domain tasks. We introduce CR-eval, a principled evaluation framework, and benchmark 16 state-of-the-art LLMs, demonstrating that top models can discover security-related weaknesses in over 127 out of 200 test cases within five trials. To bridge potential gaps, we explore LLM specialization techniques, including fine-tuning an 8B model to match or surpass advanced LLMs like GPT-4o and DeepSeek-R1. Evaluations on 30 cellular attacks identify open challenges for achieving full automation. These findings confirm that LLMs can automate the refinement of cellular network specifications and provide valuable insights to guide future research in this direction.

Paperid: 1067, https://arxiv.org/pdf/2507.02735.pdf

Abstract:
Prompt injection attacks pose a significant security threat to LLM-integrated applications. Model-level defenses have shown strong effectiveness, but are currently deployed into commercial-grade models in a closed-source manner. We believe open-source models are needed by the AI security community, where co-development of attacks and defenses through open research drives scientific progress in mitigation against prompt injection attacks. To this end, we develop Meta SecAlign, the first open-source and open-weight LLM with built-in model-level defense that achieves commercial-grade model performance. We provide complete details of our training recipe, which utilizes an improved version of the SOTA SecAlign defense. Evaluations on 9 utility benchmarks and 7 security benchmarks show that Meta SecAlign, despite being trained on a generic instruction-tuning dataset, confers security in unseen downstream tasks, including tool-calling and agentic web navigation, in addition general instruction-following. Our best model -- Meta-SecAlign-70B -- achieves state-of-the-art robustness against prompt injection attacks and comparable utility to closed-source commercial LLM with model-level defense.

Paperid: 1068, https://arxiv.org/pdf/2507.01216.pdf

Abstract:
There is a huge gap between numerous intriguing applications fostered by on-device large language model (LLM) fine-tuning (FT) from fresh mobile data and the limited resources of a mobile device. While existing server-assisted methods (e.g., split learning or side-tuning) may enable LLM FT on the local mobile device, they suffer from heavy communication burdens of activation transmissions, and may disclose data and labels to the server. To address those issues, we develop PAE MobiLLM, a a privacy-aware and efficient LLM FT method which can be deployed on the mobile device via server-assisted additive side-tuning. To further accelerate FT convergence and improve computing efficiency, PAE MobiLLM integrates activation caching on the server side, which allows the server to reuse historical activations and saves the mobile device from repeatedly computing forward passes for the recurring data samples. Besides, to reduce communication cost, PAE MobiLLM develops an activation shortcut that transmits only the token involved in the loss calculation instead of full activation matrices to guide the side network tuning. Last but not least, PAE MobiLLM introduces the additive adapter side-network design which makes the server train the adapter modules based on device-defined prediction differences rather than raw ground-truth labels. In this way, the server can only assist device-defined side-network computing, and learn nothing about data and labels. Extensive experimental results demonstrate PAE MobiLLM's superiority.

Paperid: 1069, https://arxiv.org/pdf/2507.00829.pdf

Abstract:
This paper presents a critical examination of the surprising efficacy of Large Language Models (LLMs) in penetration testing. The paper thoroughly reviews the evolution of LLMs and their rapidly expanding capabilities which render them increasingly suitable for complex penetration testing operations. It systematically details the historical adoption of LLMs in both academic research and industry, showcasing their application across various offensive security tasks and covering broader phases of the cyber kill chain. Crucially, the analysis also extends to the observed adoption of LLMs by malicious actors, underscoring the inherent dual-use challenge of this technology within the security landscape. The unexpected effectiveness of LLMs in this context is elucidated by several key factors: the strong alignment between penetration testing's reliance on pattern-matching and LLMs' core strengths, their inherent capacity to manage uncertainty in dynamic environments, and cost-effective access to competent pre-trained models through LLM providers. The current landscape of LLM-aided penetration testing is categorized into interactive 'vibe-hacking' and the emergence of fully autonomous systems. The paper identifies and discusses significant obstacles impeding wider adoption and safe deployment. These include critical issues concerning model reliability and stability, paramount safety and security concerns, substantial monetary and ecological costs, implications for privacy and digital sovereignty, complex questions of accountability, and profound ethical dilemmas. This comprehensive review and analysis provides a foundation for discussion on future research directions and the development of robust safeguards at the intersection of AI and security.

Paperid: 1070, https://arxiv.org/pdf/2512.24925.pdf

Abstract:
Existing research on generative AI security is primarily driven by mutually reinforcing attack and defense methodologies grounded in empirical experience. This dynamic frequently gives rise to previously unknown attacks that can circumvent current detection and prevention. This necessitates the continual updating of security mechanisms. Constructing generative AI with provable security and theoretically controllable risk is therefore necessary. Consensus Sampling (CS) is a promising algorithm toward provably secure AI. It controls risk by leveraging overlap in model output probabilities. However, we find that CS relies on frequent abstention to avoid unsafe outputs, which reduces utility. Moreover, CS becomes highly vulnerable when unsafe models are maliciously manipulated. To address these issues, we propose a new primitive called Reliable Consensus Sampling (RCS), that traces acceptance probability to tolerate extreme adversarial behaviors, improving robustness. RCS also eliminates the need for abstention entirely. We further develop a feedback algorithm to continuously and dynamically enhance the safety of RCS. We provide theoretical guarantees that RCS maintains a controllable risk threshold. Extensive experiments show that RCS significantly improves robustness and utility while maintaining latency comparable to CS. We hope this work contributes to the development of provably secure generative AI.

Paperid: 1071, https://arxiv.org/pdf/2512.23171.pdf

Abstract:
Federated unlearning has become an attractive approach to address privacy concerns in collaborative machine learning, for situations when sensitive data is remembered by AI models during the machine learning process. It enables the removal of specific data influences from trained models, aligning with the growing emphasis on the "right to be forgotten." While extensively studied in horizontal federated learning, unlearning in vertical federated learning (VFL) remains challenging due to the distributed feature architecture. VFL unlearning includes sample unlearning that removes specific data points' influence and label unlearning that removes entire classes. Since different parties hold complementary features of the same samples, unlearning tasks require cross-party coordination, creating computational overhead and complexities from feature interdependencies. To address such challenges, we propose FedORA (Federated Optimization for data Removal via primal-dual Algorithm), designed for sample and label unlearning in VFL. FedORA formulates the removal of certain samples or labels as a constrained optimization problem solved using a primal-dual framework. Our approach introduces a new unlearning loss function that promotes classification uncertainty rather than misclassification. An adaptive step size enhances stability, while an asymmetric batch design, considering the prior influence of the remaining data on the model, handles unlearning and retained data differently to efficiently reduce computational costs. We provide theoretical analysis proving that the model difference between FedORA and Train-from-scratch is bounded, establishing guarantees for unlearning effectiveness. Experiments on tabular and image datasets demonstrate that FedORA achieves unlearning effectiveness and utility preservation comparable to Train-from-scratch with reduced computation and communication overhead.

Paperid: 1072, https://arxiv.org/pdf/2512.16439.pdf

Abstract:
Benefiting from the superior capabilities of large language models in natural language understanding and generation, Embeddings-as-a-Service (EaaS) has emerged as a successful commercial paradigm on the web platform. However, prior studies have revealed that EaaS is vulnerable to imitation attacks. Existing methods protect the intellectual property of EaaS through watermarking techniques, but they all ignore the most important properties of embedding: semantics, resulting in limited harmlessness and stealthiness. To this end, we propose SemMark, a novel semantic-based watermarking paradigm for EaaS copyright protection. SemMark employs locality-sensitive hashing to partition the semantic space and inject semantic-aware watermarks into specific regions, ensuring that the watermark signals remain imperceptible and diverse. In addition, we introduce the adaptive watermark weight mechanism based on the local outlier factor to preserve the original embedding distribution. Furthermore, we propose Detect-Sampling and Dimensionality-Reduction attacks and construct four scenarios to evaluate the watermarking method. Extensive experiments are conducted on four popular NLP datasets, and SemMark achieves superior verifiability, diversity, stealthiness, and harmlessness.

Paperid: 1073, https://arxiv.org/pdf/2512.16182.pdf

Abstract:
With the rapid development of cloud-based services, large language models (LLMs) have become increasingly accessible through various web platforms. However, this accessibility has also led to growing risks of model abuse. LLM watermarking has emerged as an effective approach to mitigate such misuse and protect intellectual property. Existing watermarking algorithms, however, primarily focus on defending against paraphrase attacks while overlooking piggyback spoofing attacks, which can inject harmful content, compromise watermark reliability, and undermine trust in attribution. To address this limitation, we propose DualGuard, the first watermarking algorithm capable of defending against both paraphrase and spoofing attacks. DualGuard employs the adaptive dual-stream watermarking mechanism, in which two complementary watermark signals are dynamically injected based on the semantic content. This design enables DualGuard not only to detect but also to trace spoofing attacks, thereby ensuring reliable and trustworthy watermark detection. Extensive experiments conducted across multiple datasets and language models demonstrate that DualGuard achieves excellent detectability, robustness, traceability, and text quality, effectively advancing the state of LLM watermarking for real-world applications.

Paperid: 1074, https://arxiv.org/pdf/2512.11272.pdf

Abstract:
Blockchain technology has experienced rapid growth and has been widely adopted across various sectors, including healthcare, finance, and energy. However, blockchain platforms remain vulnerable to a broad range of cyberattacks, particularly those aimed at exploiting transactions and smart contracts (SCs) to steal digital assets or compromise system integrity. To address this issue, we propose a novel and effective framework for detecting cyberattacks within blockchain systems. Our framework begins with a preprocessing tool that uses Natural Language Processing (NLP) techniques to transform key features of blockchain transactions into image representations. These images are then analyzed through vision-based analysis using Vision Transformers (ViT), a recent advancement in computer vision known for its superior ability to capture complex patterns and semantic relationships. By integrating NLP-based preprocessing with vision-based learning, our framework can detect a wide variety of attack types. Experimental evaluations on benchmark datasets demonstrate that our approach significantly outperforms existing state-of-the-art methods in terms of both accuracy (achieving 99.5%) and robustness in cyberattack detection for blockchain transactions and SCs.

Paperid: 1075, https://arxiv.org/pdf/2512.04260.pdf

Abstract:
Hypervisors are under threat by critical memory safety vulnerabilities, with pointer corruption being one of the most prevalent and severe forms. Existing exploitation frameworks depend on identifying highly-constrained structures in the host machine and accurately determining their runtime addresses, which is ineffective in hypervisor environments where such structures are rare and further obfuscated by Address Space Layout Randomization (ASLR). We instead observe that modern virtualization environments exhibit weak memory isolation -- guest memory is fully attacker-controlled yet accessible from the host, providing a reliable primitive for exploitation. Based on this observation, we present the first systematic characterization and taxonomy of Cross-Domain Attacks (CDA), a class of exploitation techniques that enable capability escalation through guest memory reuse. To automate this process, we develop a system that identifies cross-domain gadgets, matches them with corrupted pointers, synthesizes triggering inputs, and assembles complete exploit chains. Our evaluation on 15 real-world vulnerabilities across QEMU and VirtualBox shows that CDA is widely applicable and effective.

Paperid: 1076, https://arxiv.org/pdf/2511.21600.pdf

Abstract:
The rise of generative AI has enabled the production of high-fidelity synthetic tabular data across fields such as healthcare, finance, and public policy, raising growing concerns about data provenance and misuse. Watermarking offers a promising solution to address these concerns by ensuring the traceability of synthetic data, but existing methods face many limitations: they are computationally expensive due to reliance on large diffusion models, struggle with mixed discrete-continuous data, or lack robustness to post-modifications. To address them, we propose TAB-DRW, an efficient and robust post-editing watermarking scheme for generative tabular data. TAB-DRW embeds watermark signals in the frequency domain: it normalizes heterogeneous features via the Yeo-Johnson transformation and standardization, applies the discrete Fourier transform (DFT), and adjusts the imaginary parts of adaptively selected entries according to precomputed pseudorandom bits. To further enhance robustness and efficiency, we introduce a novel rank-based pseudorandom bit generation method that enables row-wise retrieval without incurring storage overhead. Experiments on five benchmark tabular datasets show that TAB-DRW achieves strong detectability and robustness against common post-processing attacks, while preserving high data fidelity and fully supporting mixed-type features.

Paperid: 1077, https://arxiv.org/pdf/2511.20602.pdf

Abstract:
Quantum Key Distribution (QKD) is revolutionizing cryptography by promising information-theoretic security through the immutable laws of quantum mechanics. Yet, the challenge of transforming these idealized security models into practical, resilient systems remains a pressing issue, especially as quantum computing evolves. In this review, we critically dissect and synthesize the latest advancements in QKD protocols and their security vulnerabilities, with a strong emphasis on rigorous security proofs. We actively categorize contemporary QKD schemes into three key classes: uncertainty principle-based protocols (e.g., BB84), hybrid architectures that enable secure direct communication (eg, three-stage protocol), and continuous-variable frameworks. We further include two modern classes of QKD protocols, namely Twin-field QKD and Device-Independent QKD, both of which were developed to have practical implementations over the last decade. Moreover, we highlight important experimental breakthroughs and innovative mitigation strategies, including the deployment of advanced Quantum Error Correction Codes (QECCs), that significantly enhance channel fidelity and system robustness. By mapping the current landscape, from sophisticated quantum attacks to state-of-the-art error correction methods, this review fills an important gap in the literature. To bring everything together, the relevance of this review concerning quantum augmented networks (QuANets) is also presented. This allows the readers to gain a comprehensive understanding of the security promises of quantum key distribution from theoretical proofs to experimental validations.

Paperid: 1078, https://arxiv.org/pdf/2511.14937.pdf

Abstract:
Large Language Models (LLMs) increasingly use persistent memory from past interactions to enhance personalization and task performance. However, this memory introduces critical risks when sensitive information is revealed in inappropriate contexts. We present CIMemories, a benchmark for evaluating whether LLMs appropriately control information flow from memory based on task context. CIMemories uses synthetic user profiles with over 100 attributes per user, paired with diverse task contexts in which each attribute may be essential for some tasks but inappropriate for others. Our evaluation reveals that frontier models exhibit up to 69% attribute-level violations (leaking information inappropriately), with lower violation rates often coming at the cost of task utility. Violations accumulate across both tasks and runs: as usage increases from 1 to 40 tasks, GPT-5's violations rise from 0.1% to 9.6%, reaching 25.1% when the same prompt is executed 5 times, revealing arbitrary and unstable behavior in which models leak different attributes for identical prompts. Privacy-conscious prompting does not solve this - models overgeneralize, sharing everything or nothing rather than making nuanced, context-dependent decisions. These findings reveal fundamental limitations that require contextually aware reasoning capabilities, not just better prompting or scaling.

Paperid: 1079, https://arxiv.org/pdf/2511.12313.pdf

Abstract:
The scalability of current quantum networks is limited due to noisy quantum components and high implementation costs, thereby limiting the security advantages that quantum networks provide over their classical counterparts. Quantum Augmented Networks (QuANets) address this by integrating quantum components in classical network infrastructure to improve robustness and end-to-end security. To enable such integration, Quantum Anonymous Notification (QAN) is a method to anonymously inform a receiver of an incoming quantum communication. Therefore, several quantum primitives will serve as core tools, namely, quantum voting, quantum anonymous protocols, quantum secret sharing, etc. However, all current quantum protocols can be compromised in the presence of several common channel noises. In this work, we propose an improved quantum anonymous notification (QAN) protocol that utilizes rotation operations on shared GHZ states to produce an anonymous notification in an n-user quantum-augmented network. We study the behavior of this modified QAN protocol under the dephasing noise model and observe stronger resilience to false notifications than earlier QAN approaches. The QAN framework is also proposed to be integrated with a machine-learning classifier, enhanced quantum-augmented network. Finally, we discuss how this notification layer integrates with QuANets so that receivers can allow switch-bypass handling of quantum payloads, reducing header-based information leakage and vulnerability to targeted interference at compromised switches.

Paperid: 1080, https://arxiv.org/pdf/2511.11896.pdf

Abstract:
The widespread reliance on open-source software dramatically increases the risk of vulnerability exploitation, underscoring the need for effective and scalable vulnerability detection (VD). Existing VD techniques, whether traditional machine learning-based or LLM-based approaches like prompt engineering, supervised fine-tuning, or off-policy preference optimization, remain fundamentally limited in their ability to perform context-aware analysis: They depend on fixed inputs or static preference datasets, cannot adaptively explore repository-level dependencies, and are constrained by function-level benchmarks that overlook critical vulnerability context. This paper introduces Vulnerability-Adaptive Policy Optimization (VULPO), an on-policy LLM reinforcement learning framework for context-aware VD. To support training and evaluation, we first construct ContextVul, a new dataset that augments high-quality function-level samples with lightweight method to extract repository-level context information. We then design multi-dimensional reward structuring that jointly captures prediction correctness, vulnerability localization accuracy, and the semantic relevance of vulnerability analysis, thereby guiding the model toward comprehensive contextual reasoning. To address the asymmetric difficulty of different vulnerability cases and mitigate reward hacking, VULPO incorporates label-level and sample-level difficulty-adaptive reward scaling, encouraging the model to explore challenging cases while maintaining balanced reward distribution. Extensive experiments demonstrate the superiority of our VULPO framework in context-aware VD: Our VULPO-4B substantially outperforms existing VD baselines based on prompt engineering and off-policy optimization, improving F1 by 85% over Qwen3-4B and achieving performance comparable to a 150x larger-scale model, DeepSeek-R1-0528.

Paperid: 1081, https://arxiv.org/pdf/2511.10714.pdf

Abstract:
Recent advances in Chain-of-Thought (CoT) prompting have substantially improved the reasoning capabilities of large language models (LLMs), but have also introduced their computational efficiency as a new attack surface. In this paper, we propose BadThink, the first backdoor attack designed to deliberately induce "overthinking" behavior in CoT-enabled LLMs while ensuring stealth. When activated by carefully crafted trigger prompts, BadThink manipulates the model to generate inflated reasoning traces - producing unnecessarily redundant thought processes while preserving the consistency of final outputs. This subtle attack vector creates a covert form of performance degradation that significantly increases computational costs and inference time while remaining difficult to detect through conventional output evaluation methods. We implement this attack through a sophisticated poisoning-based fine-tuning strategy, employing a novel LLM-based iterative optimization process to embed the behavior by generating highly naturalistic poisoned data. Our experiments on multiple state-of-the-art models and reasoning tasks show that BadThink consistently increases reasoning trace lengths - achieving an over 17x increase on the MATH-500 dataset - while remaining stealthy and robust. This work reveals a critical, previously unexplored vulnerability where reasoning efficiency can be covertly manipulated, demonstrating a new class of sophisticated attacks against CoT-enabled systems.

Paperid: 1082, https://arxiv.org/pdf/2511.08905.pdf

Abstract:
Given the high cost of large language model (LLM) training from scratch, safeguarding LLM intellectual property (IP) has become increasingly crucial. As the standard paradigm for IP ownership verification, LLM fingerprinting thus plays a vital role in addressing this challenge. Existing LLM fingerprinting methods verify ownership by extracting or injecting model-specific features. However, they overlook potential attacks during the verification process, leaving them ineffective when the model thief fully controls the LLM's inference process. In such settings, attackers may share prompt-response pairs to enable fingerprint unlearning or manipulate outputs to evade exact-match verification. We propose iSeal, the first fingerprinting method designed for reliable verification when the model thief controls the suspected LLM in an end-to-end manner. It injects unique features into both the model and an external module, reinforced by an error-correction mechanism and a similarity-based verification strategy. These components are resistant to verification-time attacks, including collusion-based fingerprint unlearning and response manipulation, backed by both theoretical analysis and empirical results. iSeal achieves 100 percent Fingerprint Success Rate (FSR) on 12 LLMs against more than 10 attacks, while baselines fail under unlearning and response manipulations.

Paperid: 1083, https://arxiv.org/pdf/2511.01393.pdf

Abstract:
As the Web3 ecosystem evolves toward a multi-chain architecture, cross-chain bridges have become critical infrastructure for enabling interoperability between diverse blockchain networks. However, while connecting isolated blockchains, the lack of cross-chain transaction pairing records introduces significant challenges for security analysis like cross-chain fund tracing, advanced vulnerability detection, and transaction graph-based analysis. To address this gap, we introduce ConneX, an automated and general-purpose system designed to accurately identify corresponding transaction pairs across both ends of cross-chain bridges. Our system leverages Large Language Models (LLMs) to efficiently prune the semantic search space by identifying semantically plausible key information candidates within complex transaction records. Further, it deploys a novel examiner module that refines these candidates by validating them against transaction values, effectively addressing semantic ambiguities and identifying the correct semantics. Extensive evaluations on a dataset of about 500,000 transactions from five major bridge platforms demonstrate that ConneX achieves an average F1 score of 0.9746, surpassing baselines by at least 20.05\%, with good efficiency that reduces the semantic search space by several orders of magnitude (1e10 to less than 100). Moreover, its successful application in tracing illicit funds (including a cross-chain transfer worth $1 million) in real-world hacking incidents underscores its practical utility for enhancing cross-chain security and transparency.

Paperid: 1084, https://arxiv.org/pdf/2510.23463.pdf

Abstract:
Federated Learning (FL) is a distributed learning paradigm that preserves privacy by eliminating the need to exchange raw data during training. In its prototypical edge instantiation with underlying wireless transmissions enabled by analog over-the-air computing (AirComp), referred to as \emph{over-the-air FL (AirFL)}, the inherent channel noise plays a unique role of \emph{frenemy} in the sense that it degrades training due to noisy global aggregation while providing a natural source of randomness for privacy-preserving mechanisms, formally quantified by \emph{differential privacy (DP)}. It remains, nevertheless, challenging to effectively harness such channel impairments, as prior arts, under assumptions of either simple channel models or restricted types of loss functions, mostly considering (local) DP enhancement with a single-round or non-convergent bound on privacy loss. In this paper, we study AirFL over multiple-access fading channels with a multi-antenna base station (BS) subject to user-level DP requirements. Despite a recent study, which claimed in similar settings that artificial noise (AN) must be injected to ensure DP in general, we demonstrate, on the contrary, that DP can be gained as a \emph{perk} even \emph{without} employing any AN. Specifically, we derive a novel bound on DP that converges under general bounded-domain assumptions on model parameters, along with a convergence bound with general smooth and non-convex loss functions. Next, we optimize over receive beamforming and power allocations to characterize the optimal convergence-privacy trade-offs, which also reveal explicit conditions in which DP is achievable without compromising training. Finally, our theoretical findings are validated by extensive numerical results.

Paperid: 1085, https://arxiv.org/pdf/2510.21133.pdf

Abstract:
Frontier Large Language Models (LLMs) pose unprecedented dual-use risks through the potential proliferation of chemical, biological, radiological, and nuclear (CBRN) weapons knowledge. We present the first comprehensive evaluation of 10 leading commercial LLMs against both a novel 200-prompt CBRN dataset and a 180-prompt subset of the FORTRESS benchmark, using a rigorous three-tier attack methodology. Our findings expose critical safety vulnerabilities: Deep Inception attacks achieve 86.0\% success versus 33.8\% for direct requests, demonstrating superficial filtering mechanisms; Model safety performance varies dramatically from 2\% (claude-opus-4) to 96\% (mistral-small-latest) attack success rates; and eight models exceed 70\% vulnerability when asked to enhance dangerous material properties. We identify fundamental brittleness in current safety alignment, where simple prompt engineering techniques bypass safeguards for dangerous CBRN information. These results challenge industry safety claims and highlight urgent needs for standardized evaluation frameworks, transparent safety metrics, and more robust alignment techniques to mitigate catastrophic misuse risks while preserving beneficial capabilities.

Paperid: 1086, https://arxiv.org/pdf/2510.20956.pdf

Abstract:
We discover a novel and surprising phenomenon of unintentional misalignment in reasoning language models (RLMs), which we call self-jailbreaking. Specifically, after benign reasoning training on math or code domains, RLMs will use multiple strategies to circumvent their own safety guardrails. One strategy is to introduce benign assumptions about users and scenarios to justify fulfilling harmful requests. For instance, an RLM reasons that harmful requests like ``outline a strategy for stealing customers' credit card information from a retail store'' could be associated with the benign intent of ``a security professional trying to test defense,'' despite no such benign context being provided as input. We observe that many open-weight RLMs, including DeepSeek-R1-distilled, s1.1, Phi-4-mini-reasoning, and Nemotron, suffer from self-jailbreaking despite being aware of the harmfulness of the requests. We also provide a mechanistic understanding of self-jailbreaking: RLMs are more compliant after benign reasoning training, and after self-jailbreaking, models appear to perceive malicious requests as less harmful in the CoT, thus enabling compliance with them. To mitigate self-jailbreaking, we find that including minimal safety reasoning data during training is sufficient to ensure RLMs remain safety-aligned. Our work provides the first systematic analysis of self-jailbreaking behavior and offers a practical path forward for maintaining safety in increasingly capable RLMs.

Paperid: 1087, https://arxiv.org/pdf/2510.20223.pdf

Abstract:
Multimodal large language models (MLLMs) have achieved remarkable progress, yet remain critically vulnerable to adversarial attacks that exploit weaknesses in cross-modal processing. We present a systematic study of multimodal jailbreaks targeting both vision-language and audio-language models, showing that even simple perceptual transformations can reliably bypass state-of-the-art safety filters. Our evaluation spans 1,900 adversarial prompts across three high-risk safety categories harmful content, CBRN (Chemical, Biological, Radiological, Nuclear), and CSEM (Child Sexual Exploitation Material) tested against seven frontier models. We explore the effectiveness of attack techniques on MLLMs, including FigStep-Pro (visual keyword decomposition), Intelligent Masking (semantic obfuscation), and audio perturbations (Wave-Echo, Wave-Pitch, Wave-Speed). The results reveal severe vulnerabilities: models with almost perfect text-only safety (0\% ASR) suffer >75\% attack success under perceptually modified inputs, with FigStep-Pro achieving up to 89\% ASR in Llama-4 variants. Audio-based attacks further uncover provider-specific weaknesses, with even basic modality transfer yielding 25\% ASR for technical queries. These findings expose a critical gap between text-centric alignment and multimodal threats, demonstrating that current safeguards fail to generalize across cross-modal attacks. The accessibility of these attacks, which require minimal technical expertise, suggests that robust multimodal AI safety will require a paradigm shift toward broader semantic-level reasoning to mitigate possible risks.

Paperid: 1088, https://arxiv.org/pdf/2510.18324.pdf

Abstract:
Host-based cryptomining malware, commonly known as cryptojackers, have gained notoriety for their stealth and the significant financial losses they cause in Linux-based cloud environments. Existing solutions often struggle with scalability due to high monitoring overhead, low detection accuracy against obfuscated behavior, and lack of integrated remediation. We present CryptoGuard, a lightweight hybrid solution that combines detection and remediation strategies to counter cryptojackers. To ensure scalability, CryptoGuard uses sketch- and sliding window-based syscall monitoring to collect behavior patterns with minimal overhead. It decomposes the classification task into a two-phase process, leveraging deep learning models to identify suspicious activity with high precision. To counter evasion techniques such as entry point poisoning and PID manipulation, CryptoGuard integrates targeted remediation mechanisms based on eBPF, a modern Linux kernel feature deployable on any compatible host. Evaluated on 123 real-world cryptojacker samples, it achieves average F1-scores of 96.12% and 92.26% across the two phases, and outperforms state-of-the-art baselines in terms of true and false positive rates, while incurring only 0.06% CPU overhead per host.

Paperid: 1089, https://arxiv.org/pdf/2510.17552.pdf

Abstract:
Control-flow hijacking attacks manipulate orchestration mechanisms in multi-agent systems into performing unsafe actions that compromise the system and exfiltrate sensitive information. Recently proposed defenses, such as LlamaFirewall, rely on alignment checks of inter-agent communications to ensure that all agent invocations are "related to" and "likely to further" the original objective. We start by demonstrating control-flow hijacking attacks that evade these defenses even if alignment checks are performed by advanced LLMs. We argue that the safety and functionality objectives of multi-agent systems fundamentally conflict with each other. This conflict is exacerbated by the brittle definitions of "alignment" and the checkers' incomplete visibility into the execution context. We then propose, implement, and evaluate ControlValve, a new defense inspired by the principles of control-flow integrity and least privilege. ControlValve (1) generates permitted control-flow graphs for multi-agent systems, and (2) enforces that all executions comply with these graphs, along with contextual rules (generated in a zero-shot manner) for each agent invocation.

Paperid: 1091, https://arxiv.org/pdf/2510.16923.pdf

Abstract:
Deep learning models deployed in safety critical applications like autonomous driving use simulations to test their robustness against adversarial attacks in realistic conditions. However, these simulations are non-differentiable, forcing researchers to create attacks that do not integrate simulation environmental factors, reducing attack success. To address this limitation, we introduce UNDREAM, the first software framework that bridges the gap between photorealistic simulators and differentiable renderers to enable end-to-end optimization of adversarial perturbations on any 3D objects. UNDREAM enables manipulation of the environment by offering complete control over weather, lighting, backgrounds, camera angles, trajectories, and realistic human and object movements, thereby allowing the creation of diverse scenes. We showcase a wide array of distinct physically plausible adversarial objects that UNDREAM enables researchers to swiftly explore in different configurable environments. This combination of photorealistic simulation and differentiable optimization opens new avenues for advancing research of physical adversarial attacks.

Paperid: 1092, https://arxiv.org/pdf/2510.13111.pdf

Abstract:
The run-time electromagnetic (EM) emanation of microprocessors presents a side-channel that leaks the confidentiality of the applications running on them. Many recent works have demonstrated successful attacks leveraging such side-channels to extract the confidentiality of diverse applications, such as the key of cryptographic algorithms and the hyperparameter of neural network models. This paper proposes ShuffleV, a microarchitecture defense strategy against EM Side-Channel Attacks (SCAs). ShuffleV adopts the moving target defense (MTD) philosophy, by integrating hardware units to randomly shuffle the execution order of program instructions and optionally insert dummy instructions, to nullify the statistical observation by attackers across repetitive runs. We build ShuffleV on the open-source RISC-V core and provide six design options, to suit different application scenarios. To enable rapid evaluation, we develop a ShuffleV simulator that can help users to (1) simulate the performance overhead for each design option and (2) generate an execution trace to validate the randomness of execution on their workload. We implement ShuffleV on a Xilinx PYNQ-Z2 FPGA and validate its performance with two representative victim applications against EM SCAs, AES encryption, and neural network inference. The experimental results demonstrate that ShuffleV can provide automatic protection for these applications, without any user intervention or software modification.

Paperid: 1093, https://arxiv.org/pdf/2510.12908.pdf

Abstract:
Federated learning (FL) enables organizations to collaboratively train models without sharing their datasets. Despite this advantage, recent studies show that both client updates and the global model can leak private information, limiting adoption in sensitive domains such as healthcare. Local differential privacy (LDP) offers strong protection by letting each participant privatize updates before transmission. However, existing LDP methods were designed for centralized training and introduce challenges in FL, including high resource demands that can cause client dropouts and the lack of reliable privacy guarantees under asynchronous participation. These issues undermine model generalizability, fairness, and compliance with regulations such as HIPAA and GDPR. To address them, we propose L-RDP, a DP method designed for LDP that ensures constant, lower memory usage to reduce dropouts and provides rigorous per-client privacy guarantees by accounting for intermittent participation.

Paperid: 1094, https://arxiv.org/pdf/2510.11414.pdf

Abstract:
The proliferation of autonomous AI agents within enterprise environments introduces a critical security challenge: managing access control for emergent, novel tasks for which no predefined policies exist. This paper introduces an advanced security framework that extends the Task-Based Access Control (TBAC) model by using a Large Language Model (LLM) as an autonomous, risk-aware judge. This model makes access control decisions not only based on an agent's intent but also by explicitly considering the inherent \textbf{risk associated with target resources} and the LLM's own \textbf{model uncertainty} in its decision-making process. When an agent proposes a novel task, the LLM judge synthesizes a just-in-time policy while also computing a composite risk score for the task and an uncertainty estimate for its own reasoning. High-risk or high-uncertainty requests trigger more stringent controls, such as requiring human approval. This dual consideration of external risk and internal confidence allows the model to enforce a more robust and adaptive version of the principle of least privilege, paving the way for safer and more trustworthy autonomous systems.

Paperid: 1095, https://arxiv.org/pdf/2510.10625.pdf

Abstract:
Determining which data samples were used to train a model-known as Membership Inference Attack (MIA)-is a well-studied and important problem with implications for data privacy. Black-box methods presume access only to the model's outputs and often rely on training auxiliary reference models. While they have shown strong empirical performance, they rely on assumptions that rarely hold in real-world settings: (i) the attacker knows the training hyperparameters; (ii) all available non-training samples come from the same distribution as the training data; and (iii) the fraction of training data in the evaluation set is known. In this paper, we demonstrate that removing these assumptions leads to a significant drop in the performance of black-box attacks. We introduce ImpMIA, a Membership Inference Attack that exploits the Implicit Bias of neural networks, hence removes the need to rely on any reference models and their assumptions. ImpMIA is a white-box attack -- a setting which assumes access to model weights and is becoming increasingly realistic given that many models are publicly available (e.g., via Hugging Face). Building on maximum-margin implicit bias theory, ImpMIA uses the Karush-Kuhn-Tucker (KKT) optimality conditions to identify training samples. This is done by finding the samples whose gradients most strongly reconstruct the trained model's parameters. As a result, ImpMIA achieves state-of-the-art performance compared to both black and white box attacks in realistic settings where only the model weights and a superset of the training data are available.

Paperid: 1096, https://arxiv.org/pdf/2510.10246.pdf

Abstract:
System passwords serve as critical credentials for user authentication and access control when logging into operating systems or applications. Upon entering a valid password, users pass verification to access system resources and execute corresponding operations. In recent years, frequent password cracking attacks targeting system passwords have posed a severe threat to information system security. To address this challenge, in-depth research into password cracking attack methods and defensive technologies holds significant importance. This paper conducts systematic research on system password security, focusing on analyzing typical password cracking methods such as brute force attacks, dictionary attacks, and rainbow table attacks, while evaluating the effectiveness of existing defensive measures. The experimental section utilizes common cryptanalysis tools, such as John the Ripper and Hashcat, to simulate brute force and dictionary attacks. Five test datasets, each generated using Message Digest Algorithm 5 (MD5), Secure Hash Algorithm 256-bit (SHA 256), and bcrypt hash functions, are analyzed. By comparing the overall performance of different hash algorithms and password complexity strategies against these attacks, the effectiveness of defensive measures such as salting and slow hashing algorithms is validated. Building upon this foundation, this paper further evaluates widely adopted defense mechanisms, including account lockout policies, multi-factor authentication, and risk adaptive authentication. By integrating experimental data with recent research findings, it analyzes the strengths and limitations of each approach while proposing feasible improvement recommendations and optimization strategies.

Paperid: 1097, https://arxiv.org/pdf/2510.09443.pdf

Abstract:
This paper investigates the impact of sanctions on Tornado Cash, a smart contract protocol designed to enhance transaction privacy. Following the U.S. Department of the Treasury's sanctions against Tornado Cash in August 2022, platform activity declined sharply. We document a significant and sustained reduction in transaction volume, user diversity, and overall protocol utilization after the sanctions were imposed. Our analysis draws on transaction data from three major blockchains: Ethereum, BNB Smart Chain, and Polygon. We further examine developments following the partial lifting and eventual removal of sanctions by the U.S. Office of Foreign Assets Control (OFAC) in March 2025. Although activity partially recovered, the rebound remained limited. The Tornado Cash case illustrates how regulatory interventions can affect decentralized protocols, while also highlighting the challenges of fully enforcing such measures in decentralized environments.

Paperid: 1098, https://arxiv.org/pdf/2510.09433.pdf

Abstract:
Tornado Cash is a decentralised mixer that uses cryptographic techniques to sever the on-chain trail between depositors and withdrawers. In practice, however, its anonymity can be undermined by user behaviour and operational quirks. We conduct the first cross-chain empirical study of Tornado Cash activity on Ethereum, BNB Smart Chain, and Polygon, introducing three clustering heuristics-(i) address-reuse, (ii) transactional-linkage, and (iii) a novel first-in-first-out (FIFO) temporal-matching rule. Together, these heuristics reconnect deposits to withdrawals and deanonymise a substantial share of recipients. Our analysis shows that 5.1 - 12.6% of withdrawals can already be traced to their originating deposits through address reuse and transactional linkage heuristics. Adding our novel First-In-First-Out (FIFO) temporal-matching heuristic lifts the linkage rate by a further 15 - 22 percentage points. Statistical tests confirm that these FIFO matches are highly unlikely to occur by chance. Comparable leakage across Ethereum, BNB Smart Chain, and Polygon indicates chain-agnostic user misbehaviour, rather than chain-specific protocol flaws. These results expose how quickly cryptographic guarantees can unravel in everyday use, underscoring the need for both disciplined user behaviour and privacy-aware protocol design. In total, our heuristics link over $2.3 billion in Tornado Cash withdrawals to identifiable deposits, exposing significant cracks in practical anonymity.

Paperid: 1099, https://arxiv.org/pdf/2510.06719.pdf

Abstract:
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by grounding them in external knowledge. However, its application in sensitive domains is limited by privacy risks. Existing private RAG methods typically rely on query-time differential privacy (DP), which requires repeated noise injection and leads to accumulated privacy loss. To address this issue, we propose DP-SynRAG, a framework that uses LLMs to generate differentially private synthetic RAG databases. Unlike prior methods, the synthetic text can be reused once created, thereby avoiding repeated noise injection and additional privacy costs. To preserve essential information for downstream RAG tasks, DP-SynRAG extends private prediction, which instructs LLMs to generate text that mimics subsampled database records in a DP manner. Experiments show that DP-SynRAG achieves superior performanec to the state-of-the-art private RAG systems while maintaining a fixed privacy budget, offering a scalable solution for privacy-preserving RAG.

Paperid: 1100, https://arxiv.org/pdf/2510.06544.pdf

Abstract:
As advances in synthetic voice generation accelerate, an increasing variety of fake voice generators have emerged, producing audio that is often indistinguishable from real human speech. This evolution poses new and serious threats across sectors where audio recordings serve as critical evidence. Although fake voice detectors are also advancing, the arms race between fake voice generation and detection has become more intense and complex. In this work, we present the first large-scale, cross-domain evaluation of fake voice detectors, benchmarking 8 state-of-the-art models against datasets synthesized by 20 different fake voice generation systems. To the best of our knowledge, this is the most comprehensive cross-domain assessment conducted to date. Our study reveals substantial security vulnerabilities in current fake voice detection systems, underscoring critical gaps in their real-world robustness. To advance the field, we propose a unified and effective metric that consolidates the diverse and often inconsistent evaluation criteria previously used across different studies. This metric enables standardized, straightforward comparisons of the robustness of fake voice detectors. We conclude by offering actionable recommendations for building more resilient fake voice detection technologies, with the broader goal of reinforcing the foundations of AI security and trustworthiness.

Paperid: 1101, https://arxiv.org/pdf/2510.05946.pdf

Abstract:
A sum-product network (SPN) is a graphical model that allows several types of probabilistic inference to be performed efficiently. In this paper, we propose a privacy-preserving protocol which tackles structure generation and parameter learning of SPNs. Additionally, we provide a protocol for private inference on SPNs, subsequent to training. To preserve the privacy of the participants, we derive our protocol based on secret sharing, which guarantees privacy in the honest-but-curious setting even when at most half of the parties cooperate to disclose the data. The protocol makes use of a forest of randomly generated SPNs, which is trained and weighted privately and can then be used for private inference on data points. Our experiments indicate that preserving the privacy of all participants does not decrease log-likelihood performance on both homogeneously and heterogeneously partitioned data. We furthermore show that our protocol's performance is comparable to current state-of-the-art SPN learners in homogeneously partitioned data settings. In terms of runtime and memory usage, we demonstrate that our implementation scales well when increasing the number of parties, comparing favorably to protocols for neural networks, when they are trained to reproduce the input-output behavior of SPNs.

Paperid: 1102, https://arxiv.org/pdf/2510.04987.pdf

Abstract:
Graph-based models learn rich code graph structural information and present superior performance on various code analysis tasks. However, the robustness of these models against adversarial example attacks in the context of vulnerability detection remains an open question. This paper proposes NatGVD, a novel attack methodology that generates natural adversarial vulnerable code to circumvent GNN-based and graph-aware transformer-based vulnerability detectors. NatGVD employs a set of code transformations that modify graph structure while preserving code semantics. Instead of injecting dead or unrelated code like previous works, NatGVD considers naturalness requirements: generated examples should not be easily recognized by humans or program analysis tools. With extensive evaluation of NatGVD on state-of-the-art vulnerability detection systems, the results reveal up to 53.04% evasion rate across GNN-based detectors and graph-aware transformer-based detectors. We also explore potential defense strategies to enhance the robustness of these systems against NatGVD.

Paperid: 1103, https://arxiv.org/pdf/2510.04261.pdf

Abstract:
Large language models (LLMs) have been widely deployed in Conversational AIs (CAIs), while exposing privacy and security threats. Recent research shows that LLM-based CAIs can be manipulated to extract private information from human users, posing serious security threats. However, the methods proposed in that study rely on a white-box setting that adversaries can directly modify the system prompt. This condition is unlikely to hold in real-world deployments. The limitation raises a critical question: can unprivileged attackers still induce such privacy risks in practical LLM-integrated applications? To address this question, we propose \textsc{VortexPIA}, a novel indirect prompt injection attack that induces privacy extraction in LLM-integrated applications under black-box settings. By injecting token-efficient data containing false memories, \textsc{VortexPIA} misleads LLMs to actively request private information in batches. Unlike prior methods, \textsc{VortexPIA} allows attackers to flexibly define multiple categories of sensitive data. We evaluate \textsc{VortexPIA} on six LLMs, covering both traditional and reasoning LLMs, across four benchmark datasets. The results show that \textsc{VortexPIA} significantly outperforms baselines and achieves state-of-the-art (SOTA) performance. It also demonstrates efficient privacy requests, reduced token consumption, and enhanced robustness against defense mechanisms. We further validate \textsc{VortexPIA} on multiple realistic open-source LLM-integrated applications, demonstrating its practical effectiveness.

Paperid: 1104, https://arxiv.org/pdf/2510.03559.pdf

Abstract:
UX professionals routinely conduct design reviews, yet privacy concerns are often overlooked -- not only due to limited tools, but more critically because of low intrinsic motivation. Limited privacy knowledge, weak empathy for unexpectedly affected users, and low confidence in identifying harms make it difficult to address risks. We present PrivacyMotiv, an LLM-powered system that supports privacy-oriented design diagnosis by generating speculative personas with UX user journeys centered on individuals vulnerable to privacy risks. Drawing on narrative strategies, the system constructs relatable and attention-drawing scenarios that show how ordinary design choices may cause unintended harms, expanding the scope of privacy reflection in UX. In a within-subjects study with professional UX practitioners (N=16), we compared participants' self-proposed methods with PrivacyMotiv across two privacy review tasks. Results show significant improvements in empathy, intrinsic motivation, and perceived usefulness. This work contributes a promising privacy review approach which addresses the motivational barriers in privacy-aware UX.

Paperid: 1105, https://arxiv.org/pdf/2510.00799.pdf

Abstract:
Most image watermarking systems focus on robustness, capacity, and imperceptibility while treating the embedded payload as meaningless bits. This bit-centric view imposes a hard ceiling on capacity and prevents watermarks from carrying useful information. We propose LatentSeal, which reframes watermarking as semantic communication: a lightweight text autoencoder maps full-sentence messages into a compact 256-dimensional unit-norm latent vector, which is robustly embedded by a finetuned watermark model and secured through a secret, invertible rotation. The resulting system hides full-sentence messages, decodes in real time, and survives valuemetric and geometric attacks. It surpasses prior state of the art in BLEU-4 and Exact Match on several benchmarks, while breaking through the long-standing 256-bit payload ceiling. It also introduces a statistically calibrated score that yields a ROC AUC score of 0.97-0.99, and practical operating points for deployment. By shifting from bit payloads to semantic latent vectors, LatentSeal enables watermarking that is not only robust and high-capacity, but also secure and interpretable, providing a concrete path toward provenance, tamper explanation, and trustworthy AI governance. Models, training and inference code, and data splits will be available upon publication.

Paperid: 1106, https://arxiv.org/pdf/2510.00572.pdf

Abstract:
Intrusion Detection Systems (IDS) face persistent challenges due to evolving cyberattacks, high-dimensional traffic data, and severe class imbalance in benchmark datasets such as NSL-KDD. To address these issues, we propose IntrusionX, a hybrid deep learning framework that integrates Convolutional Neural Networks (CNNs) for local feature extraction and Long Short-Term Memory (LSTM) networks for temporal modeling. The architecture is further optimized using the Squirrel Search Algorithm (SSA), enabling effective hyperparameter tuning while maintaining computational efficiency. Our pipeline incorporates rigorous preprocessing, stratified data splitting, and dynamic class weighting to enhance the detection of rare classes. Experimental evaluation on NSL-KDD demonstrates that IntrusionX achieves 98% accuracy in binary classification and 87% in 5-class classification, with significant improvements in minority class recall (U2R: 71%, R2L: 93%). The novelty of IntrusionX lies in its reproducible, imbalance-aware design with metaheuristic optimization.

Paperid: 1107, https://arxiv.org/pdf/2510.00317.pdf

Abstract:
The widespread adoption of open-source software (OSS) necessitates the mitigation of vulnerability risks. Most vulnerability detection (VD) methods are limited by inadequate contextual understanding, restrictive single-round interactions, and coarse-grained evaluations, resulting in undesired model performance and biased evaluation results. To address these challenges, we propose MAVUL, a novel multi-agent VD system that integrates contextual reasoning and interactive refinement. Specifically, a vulnerability analyst agent is designed to flexibly leverage tool-using capabilities and contextual reasoning to achieve cross-procedural code understanding and effectively mine vulnerability patterns. Through iterative feedback and refined decision-making within cross-role agent interactions, the system achieves reliable reasoning and vulnerability prediction. Furthermore, MAVUL introduces multi-dimensional ground truth information for fine-grained evaluation, thereby enhancing evaluation accuracy and reliability. Extensive experiments conducted on a pairwise vulnerability dataset demonstrate MAVUL's superior performance. Our findings indicate that MAVUL significantly outperforms existing multi-agent systems with over 62% higher pairwise accuracy and single-agent systems with over 600% higher average performance. The system's effectiveness is markedly improved with increased communication rounds between the vulnerability analyst agent and the security architect agent, underscoring the importance of contextual reasoning in tracing vulnerability flows and the crucial feedback role. Additionally, the integrated evaluation agent serves as a critical, unbiased judge, ensuring a more accurate and reliable estimation of the system's real-world applicability by preventing misleading binary comparisons.

Paperid: 1108, https://arxiv.org/pdf/2509.25525.pdf

Abstract:
Multimodal large language models (MLLMs) have demonstrated remarkable capabilities in processing and reasoning over diverse modalities, but their advanced abilities also raise significant privacy concerns, particularly regarding Personally Identifiable Information (PII) leakage. While relevant research has been conducted on single-modal language models to some extent, the vulnerabilities in the multimodal setting have yet to be fully investigated. In this work, we investigate these emerging risks with a focus on vision language models (VLMs), a representative subclass of MLLMs that covers the two modalities most relevant for PII leakage, vision and text. We introduce a concept-guided mitigation approach that identifies and modifies the model's internal states associated with PII-related content. Our method guides VLMs to refuse PII-sensitive tasks effectively and efficiently, without requiring re-training or fine-tuning. We also address the current lack of multimodal PII datasets by constructing various ones that simulate real-world scenarios. Experimental results demonstrate that the method can achieve an average refusal rate of 93.3% for various PII-related tasks with minimal impact on unrelated model performances. We further examine the mitigation's performance under various conditions to show the adaptability of our proposed method.

Paperid: 1109, https://arxiv.org/pdf/2509.24440.pdf

Abstract:
We evaluate the performance of two architectures for network-wide quantum key distribution (QKD): Relayed QKD, which relays keys over multi-link QKD paths for non-adjacent nodes, and Switched QKD, which uses optical switches to dynamically connect arbitrary QKD modules to form direct QKD links between them. An advantage of Switched QKD is that it distributes quantum keys end-to-end, whereas Relayed relies on trusted nodes. However, Switched depends on arbitrary matching of QKD modules. We first experimentally evaluate the performance of commercial DV-QKD modules; for each of three vendors we benchmark the performance in standard/matched module pairs and in unmatched pairs to emulate configurations in the Switched QKD network architecture. The analysis reveals that in some cases a notable variation in the generated secret key rate (SKR) between the matched and unmatched pairs is observed. Driven by these experimental findings, we conduct a comprehensive theoretical analysis that evaluates the network-wide performance of the two architectures. Our analysis is based on uniform ring networks, where we derive optimal key management configurations and analytical formulas for the achievable consumed SKR. We compare network performance under varying ring sizes, QKD link losses, QKD receivers' sensitivity and performance penalties of unmatched modules. Our findings indicate that Switched QKD performs better in dense rings (short distances, large node counts), while Relayed QKD is more effective in longer distances and large node counts. Moreover, we confirm that unmatched QKD modules penalties significantly impact the efficiency of Switched QKD architecture.

Paperid: 1110, https://arxiv.org/pdf/2509.24032.pdf

Abstract:
Rust is a modern systems programming language that ensures memory safety by enforcing ownership and borrowing rules at compile time. While the unsafe keyword allows programmers to bypass these restrictions, it introduces significant risks. Various approaches for isolating unsafe code to protect safe Rust from vulnerabilities have been proposed, yet these methods provide only fixed isolation boundaries and do not accommodate expressive policies that require sandboxing both safe and unsafe code. This paper presents SandCell for flexible and lightweight isolation in Rust by leveraging existing syntactic boundaries. SandCell allows programmers to specify which components to sandbox with minimal annotation effort, enabling fine-grained control over isolation. The system also introduces novel techniques to minimize overhead when transferring data between sandboxes. Our evaluation demonstrates SandCell's effectiveness in preventing vulnerabilities across various Rust applications while maintaining reasonable performance overheads.

Paperid: 1111, https://arxiv.org/pdf/2509.22684.pdf

Abstract:
Zero-Knowledge Proofs (ZKP) are protocols which construct cryptographic proofs to demonstrate knowledge of a secret input in a computation without revealing any information about the secret. ZKPs enable novel applications in private and verifiable computing such as anonymized cryptocurrencies and blockchain scaling and have seen adoption in several real-world systems. Prior work has accelerated ZKPs on GPUs by leveraging the inherent parallelism in core computation kernels like Multi-Scalar Multiplication (MSM). However, we find that a systematic characterization of execution bottlenecks in ZKPs, as well as their scalability on modern GPU architectures, is missing in the literature. This paper presents ZKProphet, a comprehensive performance study of Zero-Knowledge Proofs on GPUs. Following massive speedups of MSM, we find that ZKPs are bottlenecked by kernels like Number-Theoretic Transform (NTT), as they account for up to 90% of the proof generation latency on GPUs when paired with optimized MSM implementations. Available NTT implementations under-utilize GPU compute resources and often do not employ architectural features like asynchronous compute and memory operations. We observe that the arithmetic operations underlying ZKPs execute exclusively on the GPU's 32-bit integer pipeline and exhibit limited instruction-level parallelism due to data dependencies. Their performance is thus limited by the available integer compute units. While one way to scale the performance of ZKPs is adding more compute units, we discuss how runtime parameter tuning for optimizations like precomputed inputs and alternative data representations can extract additional speedup. With this work, we provide the ZKP community a roadmap to scale performance on GPUs and construct definitive GPU-accelerated ZKPs for their application requirements and available hardware resources.

Paperid: 1112, https://arxiv.org/pdf/2509.22256.pdf

Abstract:
Large language model (LLM)-based computer-use agents represent a convergence of AI and OS capabilities, enabling natural language to control system- and application-level functions. However, due to LLMs' inherent uncertainty issues, granting agents control over computers poses significant security risks. When agent actions deviate from user intentions, they can cause irreversible consequences. Existing mitigation approaches, such as user confirmation and LLM-based dynamic action validation, still suffer from limitations in usability, security, and performance. To address these challenges, we propose CSAgent, a system-level, static policy-based access control framework for computer-use agents. To bridge the gap between static policy and dynamic context and user intent, CSAgent introduces intent- and context-aware policies, and provides an automated toolchain to assist developers in constructing and refining them. CSAgent enforces these policies through an optimized OS service, ensuring that agent actions can only be executed under specific user intents and contexts. CSAgent supports protecting agents that control computers through diverse interfaces, including API, CLI, and GUI. We implement and evaluate CSAgent, which successfully defends against more than 99.36% of attacks while introducing only 6.83% performance overhead.

Paperid: 1113, https://arxiv.org/pdf/2509.21772.pdf

Abstract:
Phishing attacks are a significant societal threat, disproportionately harming vulnerable populations and eroding trust in essential digital services. Current defenses are often reactive, failing against modern evasive tactics like cloaking that conceal malicious content. To address this, we introduce PhishLumos, an adaptive multi-agent system that proactively mitigates entire attack campaigns. It confronts a core cybersecurity imbalance: attackers can easily scale operations, while defense remains an intensive expert task. Instead of being blocked by evasion, PhishLumos treats it as a critical signal to investigate the underlying infrastructure. Its Large Language Model (LLM)-powered agents uncover shared hosting, certificates, and domain registration patterns. On real-world data, our system identified 100% of campaigns in the median case, over a week before their confirmation by cybersecurity experts. PhishLumos demonstrates a practical shift from reactive URL blocking to proactive campaign mitigation, protecting users before they are harmed and making the digital world safer for all.

Paperid: 1114, https://arxiv.org/pdf/2509.20324.pdf

Abstract:
Retrieval-Augmented Generation (RAG) is an emerging approach in natural language processing that combines large language models (LLMs) with external document retrieval to produce more accurate and grounded responses. While RAG has shown strong potential in reducing hallucinations and improving factual consistency, it also introduces new privacy and security challenges that differ from those faced by traditional LLMs. Existing research has demonstrated that LLMs can leak sensitive information through training data memorization or adversarial prompts, and RAG systems inherit many of these vulnerabilities. At the same time, reliance of RAG on an external knowledge base opens new attack surfaces, including the potential for leaking information about the presence or content of retrieved documents, or for injecting malicious content to manipulate model behavior. Despite these risks, there is currently no formal framework that defines the threat landscape for RAG systems. In this paper, we address a critical gap in the literature by proposing, to the best of our knowledge, the first formal threat model for retrieval-RAG systems. We introduce a structured taxonomy of adversary types based on their access to model components and data, and we formally define key threat vectors such as document-level membership inference and data poisoning, which pose serious privacy and integrity risks in real-world deployments. By establishing formal definitions and attack models, our work lays the foundation for a more rigorous and principled understanding of privacy and security in RAG systems.

Paperid: 1115, https://arxiv.org/pdf/2509.18800.pdf

Abstract:
Android's open-source nature facilitates widespread smartphone accessibility, particularly in price-sensitive markets. System and vendor applications that come pre-installed on budget Android devices frequently operate with elevated privileges, yet they receive limited independent examination. To address this gap, we developed a framework that extracts APKs from physical devices and applies static analysis to identify privacy and security issues in embedded software. Our study examined 1,544 APKs collected from seven African smartphones. The analysis revealed that 145 applications (9%) disclose sensitive data, 249 (16%) expose critical components without sufficient safeguards, and many present additional risks: 226 execute privileged or dangerous commands, 79 interact with SMS messages (read, send, or delete), and 33 perform silent installation operations. We also uncovered a vendor-supplied package that appears to transmit device identifiers and location details to an external third party. These results demonstrate that pre-installed applications on widely distributed low-cost devices represent a significant and underexplored threat to user security and privacy.

Paperid: 1116, https://arxiv.org/pdf/2509.14706.pdf

Abstract:
With the rapid development of Internet technologies, web systems have become essential infrastructures for modern information exchange and business operations. However, alongside their expansion, numerous security vulnerabilities have emerged, making web security a critical research focus within the broader field of cybersecurity. These issues are closely related to data protection, privacy preservation, and business continuity, and systematic research on web security is crucial for mitigating malicious attacks and enhancing the reliability and robustness of network systems. This paper first reviews the OWASP Top 10, summarizing the types, causes, and impacts of common web vulnerabilities, and illustrates their exploitation mechanisms through representative cases. Building upon this, the Gruyere platform is adopted as an experimental subject for analyzing known vulnerabilities. The study presents detailed reproduction steps for specific vulnerabilities, proposes comprehensive remediation strategies, and further compares Gruyere's vulnerabilities with contemporary real-world cases. The findings suggest that, although Gruyere's vulnerabilities are relatively outdated, their underlying principles remain highly relevant for explaining a wide range of modern security flaws. Overall, this research demonstrates that web system security analysis based on Gruyere not only deepens the understanding of vulnerability mechanisms but also provides practical support for technological innovation and security defense.

Paperid: 1117, https://arxiv.org/pdf/2509.13634.pdf

Abstract:
Federated learning (FL) has gained popularity as a privacy-preserving method of training machine learning models on decentralized networks. However to ensure reliable operation of UAV-assisted FL systems, issues like as excessive energy consumption, communication inefficiencies, and security vulnerabilities must be solved. This paper proposes an innovative framework that integrates Digital Twin (DT) technology and Zero-Knowledge Federated Learning (zkFed) to tackle these challenges. UAVs act as mobile base stations, allowing scattered devices to train FL models locally and upload model updates for aggregation. By incorporating DT technology, our approach enables real-time system monitoring and predictive maintenance, improving UAV network efficiency. Additionally, Zero-Knowledge Proofs (ZKPs) strengthen security by allowing model verification without exposing sensitive data. To optimize energy efficiency and resource management, we introduce a dynamic allocation strategy that adjusts UAV flight paths, transmission power, and processing rates based on network conditions. Using block coordinate descent and convex optimization techniques, our method significantly reduces system energy consumption by up to 29.6% compared to conventional FL approaches. Simulation results demonstrate improved learning performance, security, and scalability, positioning this framework as a promising solution for next-generation UAV-based intelligent networks.

Paperid: 1118, https://arxiv.org/pdf/2509.12291.pdf

Abstract:
Distributed Denial of Service (DDoS) attacks pose a persistent threat to network security, requiring timely and scalable mitigation strategies. In this paper, we propose a novel collaborative architecture that integrates a P4-programmable data plane with an SDN control plane to enable real-time DDoS detection and response. At the core of our approach is a split early-exit neural network that performs partial inference in the data plane using a quantized Convolutional Neural Network (CNN), while deferring uncertain cases to a Gated Recurrent Unit (GRU) module in the control plane. This design enables high-speed classification at line rate with the ability to escalate more complex flows for deeper analysis. Experimental evaluation using real-world DDoS datasets demonstrates that our approach achieves high detection accuracy with significantly reduced inference latency and control plane overhead. These results highlight the potential of tightly coupled ML-P4-SDN systems for efficient, adaptive, and low-latency DDoS defense.

Paperid: 1119, https://arxiv.org/pdf/2509.12181.pdf

Abstract:
Online e-commerce scams, ranging from shopping scams to pet scams, globally cause millions of dollars in financial damage every year. In response, the security community has developed highly accurate detection systems able to determine if a website is fraudulent. However, finding candidate scam websites that can be passed as input to these downstream detection systems is challenging: relying on user reports is inherently reactive and slow, and proactive systems issuing search engine queries to return candidate websites suffer from low coverage and do not generalize to new scam types. In this paper, we present LOKI, a system designed to identify search engine queries likely to return a high fraction of fraudulent websites. LOKI implements a keyword scoring model grounded in Learning Under Privileged Information (LUPI) and feature distillation from Search Engine Result Pages (SERPs). We rigorously validate LOKI across 10 major scam categories and demonstrate a 20.58 times improvement in discovery over both heuristic and data-driven baselines across all categories. Leveraging a small seed set of only 1,663 known scam sites, we use the keywords identified by our method to discover 52,493 previously unreported scams in the wild. Finally, we show that LOKI generalizes to previously-unseen scam categories, highlighting its utility in surfacing emerging threats.

Paperid: 1120, https://arxiv.org/pdf/2509.11971.pdf

Abstract:
Simulating hostile attacks of physical autonomous systems can be a useful tool to examine their robustness to attack and inform vulnerability-aware design. In this work, we examine this through the lens of multi-robot patrol, by presenting a machine learning-based adversary model that observes robot patrol behavior in order to attempt to gain undetected access to a secure environment within a limited time duration. Such a model allows for evaluation of a patrol system against a realistic potential adversary, offering insight into future patrol strategy design. We show that our new model outperforms existing baselines, thus providing a more stringent test, and examine its performance against multiple leading decentralized multi-robot patrol strategies.

Paperid: 1121, https://arxiv.org/pdf/2509.11745.pdf

Abstract:
Digital watermarks can be embedded into AI-generated content (AIGC) by initializing the generation process with starting points sampled from a secret distribution. When combined with pseudorandom error-correcting codes, such watermarked outputs can remain indistinguishable from unwatermarked objects, while maintaining robustness under whitenoise. In this paper, we go beyond indistinguishability and investigate security under removal attacks. We demonstrate that indistinguishability alone does not necessarily guarantee resistance to adversarial removal. Specifically, we propose a novel attack that exploits boundary information leaked by the locations of watermarked objects. This attack significantly reduces the distortion required to remove watermarks -- by up to a factor of $15 \times$ compared to a baseline whitenoise attack under certain settings. To mitigate such attacks, we introduce a defense mechanism that applies a secret transformation to hide the boundary, and prove that the secret transformation effectively rendering any attacker's perturbations equivalent to those of a naive whitenoise adversary. Our empirical evaluations, conducted on multiple versions of Stable Diffusion, validate the effectiveness of both the attack and the proposed defense, highlighting the importance of addressing boundary leakage in latent-based watermarking schemes.

Paperid: 1122, https://arxiv.org/pdf/2509.11440.pdf

Abstract:
In recent years, Rowhammer has attracted significant attention from academia and industry alike. This technique, first published in 2014, flips bits in memory by repeatedly accessing neighbouring memory locations. Since its discovery, researchers have developed a substantial body of work exploiting Rowhammer and proposing countermeasures. These works demonstrate that Rowhammer can be mounted not only through native code, but also via remote code execution, such as JavaScript in browsers, and over networks. In this work, we uncover a previously unexplored Rowhammer vector. We present Thunderhammer, an attack that induces DRAM bitflips from malicious peripherals connected via PCIe or Thunderbolt (which tunnels PCIe). On modern DDR4 systems, we observe that triggering bitflips through PCIe requests requires precisely timed access patterns tailored to the target system. We design a custom device to reverse engineer critical architectural parameters that shape PCIe request scheduling, and to execute effective hammering access patterns. Leveraging this knowledge, we successfully demonstrate Rowhammer-induced bitflips in DDR4 memory modules via both PCIe slot connections and Thunderbolt ports tunnelling PCIe.

Paperid: 1123, https://arxiv.org/pdf/2509.11120.pdf

Abstract:
Autonomous vehicles, including self-driving cars, robotic ground vehicles, and drones, rely on complex sensor pipelines to ensure safe and reliable operation. However, these safety-critical systems remain vulnerable to adversarial sensor attacks that can compromise their performance and mission success. While extensive research has demonstrated various sensor attack techniques, critical gaps remain in understanding their feasibility in real-world, end-to-end systems. This gap largely stems from the lack of a systematic perspective on how sensor errors propagate through interconnected modules in autonomous systems when autonomous vehicles interact with the physical world. To bridge this gap, we present a comprehensive survey of autonomous vehicle sensor attacks across platforms, sensor modalities, and attack methods. Central to our analysis is the System Error Propagation Graph (SEPG), a structured demonstration tool that illustrates how sensor attacks propagate through system pipelines, exposing the conditions and dependencies that determine attack feasibility. With the aid of SEPG, our study distills seven key findings that highlight the feasibility challenges of sensor attacks and uncovers eleven previously overlooked attack vectors exploiting inter-module interactions, several of which we validate through proof-of-concept experiments. Additionally, we demonstrate how large language models (LLMs) can automate aspects of SEPG construction and cross-validate expert analysis, showcasing the promise of AI-assisted security evaluation.

Paperid: 1124, https://arxiv.org/pdf/2509.11035.pdf

Abstract:
Multi-agent debate (MAD) is an emerging approach to improving the reasoning capabilities of large language models (LLMs). Existing MAD methods rely on multiple rounds of interaction among agents to reach consensus, and the final output is selected by majority voting in the last round. However, this consensus-based design faces several limitations. First, multiple rounds of communication increases token overhead and limits scalability. Second, due to the inherent conformity of LLMs, agents that initially produce correct responses may be influenced by incorrect ones during the debate process, causing error propagation. Third, majority voting introduces randomness and unfairness in the decision-making phase, and can degrade the reasoning performance. To address these issues, we propose \textsc{Free-MAD}, a novel MAD framework that eliminates the need for consensus among agents. \textsc{Free-MAD} introduces a novel score-based decision mechanism that evaluates the entire debate trajectory rather than relying on the last round only. This mechanism tracks how each agent's reasoning evolves, enabling more accurate and fair outcomes. In addition, \textsc{Free-MAD} reconstructs the debate phase by introducing anti-conformity, a mechanism that enables agents to mitigate excessive influence from the majority. Experiments on eight benchmark datasets demonstrate that \textsc{Free-MAD} significantly improves reasoning performance while requiring only a single-round debate and thus reducing token costs. We also show that compared to existing MAD approaches, \textsc{Free-MAD} exhibits improved robustness in real-world attack scenarios.

Paperid: 1125, https://arxiv.org/pdf/2509.09103.pdf

Abstract:
Crop diseases pose significant threats to global food security, agricultural productivity, and sustainable farming practices, directly affecting farmers' livelihoods and economic stability. To address the growing need for effective crop disease management, AI-based disease alerting systems have emerged as promising tools by providing early detection and actionable insights for timely intervention. However, existing systems often overlook critical aspects such as data privacy, market pricing power, and farmer-friendly usability, leaving farmers vulnerable to privacy breaches and economic exploitation. To bridge these gaps, we propose AgriSentinel, the first Privacy-Enhanced Embedded-LLM Crop Disease Alerting System. AgriSentinel incorporates a differential privacy mechanism to protect sensitive crop image data while maintaining classification accuracy. Its lightweight deep learning-based crop disease classification model is optimized for mobile devices, ensuring accessibility and usability for farmers. Additionally, the system includes a fine-tuned, on-device large language model (LLM) that leverages a curated knowledge pool to provide farmers with specific, actionable suggestions for managing crop diseases, going beyond simple alerting. Comprehensive experiments validate the effectiveness of AgriSentinel, demonstrating its ability to safeguard data privacy, maintain high classification performance, and deliver practical, actionable disease management strategies. AgriSentinel offers a robust, farmer-friendly solution for automating crop disease alerting and management, ultimately contributing to improved agricultural decision-making and enhanced crop productivity.

Paperid: 1126, https://arxiv.org/pdf/2509.03306.pdf

Abstract:
Quantum computing is a disruptive technology that is expected to offer significant advantages in many critical fields (e.g. drug discovery and cryptography). The security of information processed by such machines is therefore paramount. Currently, modest Noisy Intermediate-Scale Quantum (NISQ) devices are available. The goal of this work is to identify a practical, heuristic methodology to evaluate security properties, such as secrecy and integrity, while using quantum processors owned by potentially untrustworthy providers.

Paperid: 1127, https://arxiv.org/pdf/2509.02856.pdf

Abstract:
Previous works in the differential privacy literature that allow users to choose their privacy levels typically operate under the heterogeneous differential privacy (HDP) framework with the simplifying assumption that user data and privacy levels are not correlated. Firstly, we demonstrate that the standard HDP framework falls short when user data and privacy demands are allowed to be correlated. Secondly, to address this shortcoming, we propose an alternate framework, Add-remove Heterogeneous Differential Privacy (AHDP), that jointly accounts for user data and privacy preference. We show that AHDP is robust to possible correlations between data and privacy. Thirdly, we formalize the guarantees of the proposed AHDP framework through an operational hypothesis testing perspective. The hypothesis testing setup may be of independent interest in analyzing other privacy frameworks as well. Fourthly, we show that there exists non-trivial AHDP mechanisms that notably do not require prior knowledge of the data-privacy correlations. We propose some such mechanisms and apply them to core statistical tasks such as mean estimation, frequency estimation, and linear regression. The proposed mechanisms are simple to implement with minimal assumptions and modeling requirements, making them attractive for real-world use. Finally, we empirically evaluate proposed AHDP mechanisms, highlighting their trade-offs using LLM-generated synthetic datasets, which we release for future research.

Paperid: 1128, https://arxiv.org/pdf/2509.00391.pdf

Abstract:
Gradient-based adversarial prompting, such as the Greedy Coordinate Gradient (GCG) algorithm, has emerged as a powerful method for jailbreaking large language models (LLMs). In this paper, we present a systematic appraisal of GCG and its annealing-augmented variant, T-GCG, across open-source LLMs of varying scales. Using Qwen2.5-0.5B, LLaMA-3.2-1B, and GPT-OSS-20B, we evaluate attack effectiveness on both safety-oriented prompts (AdvBench) and reasoning-intensive coding prompts. Our study reveals three key findings: (1) attack success rates (ASR) decrease with model size, reflecting the increasing complexity and non-convexity of larger models' loss landscapes; (2) prefix-based heuristics substantially overestimate attack effectiveness compared to GPT-4o semantic judgments, which provide a stricter and more realistic evaluation; and (3) coding-related prompts are significantly more vulnerable than adversarial safety prompts, suggesting that reasoning itself can be exploited as an attack vector. In addition, preliminary results with T-GCG show that simulated annealing can diversify adversarial search and achieve competitive ASR under prefix evaluation, though its benefits under semantic judgment remain limited. Together, these findings highlight the scalability limits of GCG, expose overlooked vulnerabilities in reasoning tasks, and motivate further development of annealing-inspired strategies for more robust adversarial evaluation.

Paperid: 1129, https://arxiv.org/pdf/2509.00300.pdf

Abstract:
As modern systems increasingly rely on GPUs for computationally intensive tasks such as machine learning acceleration, ensuring the integrity of GPU computation has become critically important. Recent studies have shown that GPU kernels are vulnerable to both traditional memory safety issues (e.g., buffer overflow attacks) and emerging microarchitectural threats (e.g., Rowhammer attacks), many of which manifest as anomalous execution behaviors observable through side-channel signals. However, existing golden model based validation approaches that rely on such signals are fragile, highly sensitive to interference, and do not scale well across GPU workloads with diverse scheduling behaviors. To address these challenges, we propose ShadowScope, a monitoring and validation framework that leverages a composable golden model. Instead of building a single monolithic reference, ShadowScope decomposes trusted kernel execution into modular, repeatable functions that encode key behavioral features. This composable design captures execution patterns at finer granularity, enabling robust validation that is resilient to noise, workload variation, and interference across GPU workloads. To further reduce reliance on noisy software-only monitoring, we introduce ShadowScope+, a hardware-assisted validation mechanism that integrates lightweight on-chip checks into the GPU pipeline. ShadowScope+ achieves high validation accuracy with an average runtime overhead of just 4.6%, while incurring minimal hardware and design complexity. Together, these contributions demonstrate that side-channel observability can be systematically repurposed into a practical defense for GPU kernel integrity.

Paperid: 1130, https://arxiv.org/pdf/2508.21219.pdf

Abstract:
Browser fingerprinting defenses have historically focused on detecting JavaScript(JS)-based tracking techniques. However, the widespread adoption of WebAssembly (WASM) introduces a potential blind spot, as adversaries can convert JS to WASM's low-level binary format to obfuscate malicious logic. This paper presents the first systematic evaluation of how such WASM-based obfuscation impacts the robustness of modern fingerprinting defenses. We develop an automated pipeline that translates real-world JS fingerprinting scripts into functional WASM-obfuscated variants and test them against two classes of defenses: state-of-the-art detectors in research literature and commercial, in-browser tools. Our findings reveal a notable divergence: detectors proposed in the research literature that rely on feature-based analysis of source code show moderate vulnerability, stemming from outdated datasets or a lack of WASM compatibility. In contrast, defenses such as browser extensions and native browser features remained completely effective, as their API-level interception is agnostic to the script's underlying implementation. These results highlight a gap between academic and practical defense strategies and offer insights into strengthening detection approaches against WASM-based obfuscation, while also revealing opportunities for more evasive techniques in future attacks.

Paperid: 1131, https://arxiv.org/pdf/2508.18832.pdf

Abstract:
We analyze the privacy guarantees of the Laplace mechanism releasing the histogram of a dataset through the lens of pointwise maximal leakage (PML). While differential privacy is commonly used to quantify the privacy loss, it is a context-free definition that does not depend on the data distribution. In contrast, PML enables a more refined analysis by incorporating assumptions about the data distribution. We show that when the probability of each histogram bin is bounded away from zero, stronger privacy protection can be achieved for a fixed level of noise. Our results demonstrate the advantage of context-aware privacy measures and show that incorporating assumptions about the data can improve privacy-utility tradeoffs.

Paperid: 1132, https://arxiv.org/pdf/2508.18318.pdf

Abstract:
Wind power data often suffers from missing values due to sensor faults and unstable transmission at edge sites. While federated learning enables privacy-preserving collaboration without sharing raw data, it remains vulnerable to anomalous updates and privacy leakage during parameter exchange. These challenges are amplified in open industrial environments, necessitating zero-trust mechanisms where no participant is inherently trusted. To address these challenges, this work proposes ZTFed-MAS2S, a zero-trust federated learning framework that integrates a multi-head attention-based sequence-to-sequence imputation model. ZTFed integrates verifiable differential privacy with non-interactive zero-knowledge proofs and a confidentiality and integrity verification mechanism to ensure verifiable privacy preservation and secure model parameters transmission. A dynamic trust-aware aggregation mechanism is employed, where trust is propagated over similarity graphs to enhance robustness, and communication overhead is reduced via sparsity- and quantization-based compression. MAS2S captures long-term dependencies in wind power data for accurate imputation. Extensive experiments on real-world wind farm datasets validate the superiority of ZTFed-MAS2S in both federated learning performance and missing data imputation, demonstrating its effectiveness as a secure and efficient solution for practical applications in the energy sector.

Paperid: 1133, https://arxiv.org/pdf/2508.17964.pdf

Abstract:
As blockchain technology continues to evolve, the security of smart contracts has increasingly drawn attention from both academia and industry. The Move language, with its unique resource model and linear type system, provides a solid foundation for the security of digital assets. However, smart contracts still face new security challenges due to developer programming errors and the potential risks associated with cross-module interactions. This paper systematically analyzes the limitations of existing security tools within the Move ecosystem and reveals their unique vulnerability patterns. To address these issues, it introduces MoveScanner, a static analysis tool based on a control flow graph and data flow analysis architecture. By incorporating cross-module call graph tracking, MoveScanner can effectively identify five key types of security vulnerabilities, including resource leaks, weak permission management, and arithmetic overflows. In terms of design, MoveScanner adheres to a modular principle, supports bytecode-level analysis and multi-chain adaptation, and introduces innovative resource trajectory tracking algorithms and capability matrix analysis methods, thereby significantly reducing the false positive rate. Empirical results show that MoveScanner achieved 88.2% detection accuracy in benchmark testing, filling the gap in security tools in the Move ecosystem. Furthermore, this paper identifies twelve new types of security risks based on the resource-oriented programming paradigm and provides a theoretical foundation and practical experience for the development of smart contract security mechanisms. Future work will focus on combining formal verification and dynamic analysis techniques to build a security protection framework covering the entire contract lifecycle

Paperid: 1134, https://arxiv.org/pdf/2508.16133.pdf

Abstract:
Sensors are fundamental to cyber-physical systems (CPS), enabling perception and control by transducing physical stimuli into digital measurements. However, despite growing research on physical attacks on sensors, our understanding of sensor hardware vulnerabilities remains fragmented due to the ad-hoc nature of this field. Moreover, the infinite attack signal space further complicates threat abstraction and defense. To address this gap, we propose a systematization framework, termed sensor out-of-band (OOB) vulnerabilities, that for the first time provides a comprehensive abstraction for sensor attack surfaces based on underlying physical principles. We adopt a bottom-up systematization methodology that analyzes OOB vulnerabilities across three levels. At the component level, we identify the physical principles and limitations that contribute to OOB vulnerabilities. At the sensor level, we categorize known attacks and evaluate their practicality. At the system level, we analyze how CPS features such as sensor fusion, closed-loop control, and intelligent perception impact the exposure and mitigation of OOB threats. Our findings offer a foundational understanding of sensor hardware security and provide guidance and future directions for sensor designers, security researchers, and system developers aiming to build more secure sensors and CPS.

Paperid: 1135, https://arxiv.org/pdf/2508.15778.pdf

Abstract:
Deep learning-based lane detection (LD) plays a critical role in autonomous driving and advanced driver assistance systems. However, its vulnerability to backdoor attacks presents a significant security concern. Existing backdoor attack methods on LD often exhibit limited practical utility due to the artificial and conspicuous nature of their triggers. To address this limitation and investigate the impact of more ecologically valid backdoor attacks on LD models, we examine the common data poisoning attack and introduce DBALD, a novel diffusion-based data poisoning framework for generating naturalistic backdoor triggers. DBALD comprises two key components: optimal trigger position finding and stealthy trigger generation. Given the insight that attack performance varies depending on the trigger position, we propose a heatmap-based method to identify the optimal trigger location, with gradient analysis to generate attack-specific heatmaps. A region-based editing diffusion process is then applied to synthesize visually plausible triggers within the most susceptible regions identified previously. Furthermore, to ensure scene integrity and stealthy attacks, we introduce two loss strategies: one for preserving lane structure and another for maintaining the consistency of the driving scene. Consequently, compared to existing attack methods, DBALD achieves both a high attack success rate and superior stealthiness. Extensive experiments on 4 mainstream LD models show that DBALD exceeds state-of-the-art methods, with an average success rate improvement of +10.87% and significantly enhanced stealthiness. The experimental results highlight significant practical challenges in ensuring model robustness against real-world backdoor threats in LD.

Paperid: 1136, https://arxiv.org/pdf/2508.11575.pdf

Abstract:
The growing adoption of machine learning in sensitive areas such as healthcare and defense introduces significant privacy and security challenges. These domains demand robust data protection, as models depend on large volumes of sensitive information for both training and inference. Fully Homomorphic Encryption (FHE) presents a compelling solution by enabling computations directly on encrypted data, maintaining confidentiality across the entire machine learning workflow. However, FHE inherently supports only linear operations, making it difficult to implement non-linear activation functions, essential components of modern neural networks. This work focuses on designing, implementing, and evaluating activation functions tailored for FHE-based machine learning. We investigate two commonly used functions: the Square function and Rectified Linear Unit (ReLU), using LeNet-5 and ResNet-20 architectures with the CKKS scheme from the OpenFHE library. For ReLU, we assess two methods: a conventional low-degree polynomial approximation and a novel scheme-switching technique that securely evaluates ReLU under FHE constraints. Our findings show that the Square function performs well in shallow networks like LeNet-5, achieving 99.4% accuracy with 128 seconds per image. In contrast, deeper models like ResNet-20 benefit more from ReLU. The polynomial approximation yields 83.8% accuracy with 1,145 seconds per image, while our scheme-switching method improves accuracy to 89.8%, albeit with a longer inference time of 1,697 seconds. These results underscore a critical trade-off in FHE-based ML: faster activation functions often reduce accuracy, whereas those preserving accuracy demand greater computational resources.

Paperid: 1137, https://arxiv.org/pdf/2508.09426.pdf

Abstract:
As artificial intelligence technology continues to advance, chatbots are becoming increasingly powerful. Among them, ChatGPT, launched by OpenAI, has garnered widespread attention globally due to its powerful natural language processing capabilities based on the GPT model, which enables it to engage in natural conversations with users, understand various forms of linguistic expressions, and generate useful information and suggestions. However, as its application scope expands, user demand grows, and malicious attacks related to it become increasingly frequent, the security threats and privacy risks faced by ChatGPT are gradually coming to the forefront. In this paper, the security of ChatGPT is mainly studied from two aspects, security threats and privacy risks. The article systematically analyzes various types of vulnerabilities involved in the above two types of problems and their causes. Briefly, we discuss the controversies that ChatGPT may cause at the ethical and moral levels. In addition, this paper reproduces several network attack and defense test scenarios by simulating the attacker's perspective and methodology. Simultaneously, it explores the feasibility of using ChatGPT for security vulnerability detection and security tool generation from the defender's perspective.

Paperid: 1138, https://arxiv.org/pdf/2508.08043.pdf

Abstract:
Virtual Reality (VR) techniques, serving as the bridge between the real and virtual worlds, have boomed and are widely used in manufacturing, remote healthcare, gaming, etc. Specifically, VR systems offer users immersive experiences that include both perceptions and actions. Various studies have demonstrated that attackers can manipulate VR software to influence users' interactions, including perception and actions. However, such attacks typically require strong access and specialized expertise. In this paper, we are the first to present a systematic analysis of physical attacks against VR systems and introduce False Reality, a new attack threat to VR devices without requiring access to or modification of their software. False Reality disturbs VR system services by tampering with sensor measurements, and further spoofing users' perception even inducing harmful actions, e.g., inducing dizziness or causing users to crash into obstacles, by exploiting perceptual and psychological effects. We formalize these threats through an attack pathway framework and validate three representative pathways via physical experiments and user studies on five commercial VR devices. Finally, we further propose a defense prototype to mitigate such threats. Our findings shall provide valuable insights for enhancing the security and resilience of future VR systems.

Paperid: 1139, https://arxiv.org/pdf/2508.05334.pdf

Abstract:
Academic credential fraud threatens educational integrity, especially in developing countries like Bangladesh, where verification methods are primarily manual and inefficient. To address this challenge, we present ShikkhaChain, a blockchain-powered certificate management platform designed to securely issue, verify, and revoke academic credentials in a decentralized and tamper-proof manner. Built on Ethereum smart contracts and utilizing IPFS for off-chain storage, the platform offers a transparent, scalable solution accessible through a React-based DApp with MetaMask integration. ShikkhaChain enables role-based access for governments, regulators, institutions, and public verifiers, allowing QR-based validation and on-chain revocation tracking. Our prototype demonstrates enhanced trust, reduced verification time, and improved international credibility for Bangladeshi degrees, promoting a more reliable academic and employment ecosystem.

Paperid: 1140, https://arxiv.org/pdf/2508.02035.pdf

Abstract:
Phishing attacks continue to evolve, with cloaking techniques posing a significant challenge to detection efforts. Cloaking allows attackers to display phishing sites only to specific users while presenting legitimate pages to security crawlers, rendering traditional detection systems ineffective. This research proposes PhishParrot, a novel crawling environment optimization system designed to counter cloaking techniques. PhishParrot leverages the contextual analysis capabilities of Large Language Models (LLMs) to identify potential patterns in crawling information, enabling the construction of optimal user profiles capable of bypassing cloaking mechanisms. The system accumulates information on phishing sites collected from diverse environments. It then adapts browser settings and network configurations to match the attacker's target user conditions based on information extracted from similar cases. A 21-day evaluation showed that PhishParrot improved detection accuracy by up to 33.8% over standard analysis systems, yielding 91 distinct crawling environments for diverse conditions targeted by attackers. The findings confirm that the combination of similar-case extraction and LLM-based context analysis is an effective approach for detecting cloaked phishing attacks.

Paperid: 1141, https://arxiv.org/pdf/2507.18302.pdf

Abstract:
Language Models (LMs) typically adhere to a "pre-training and fine-tuning" paradigm, where a universal pre-trained model can be fine-tuned to cater to various specialized domains. Low-Rank Adaptation (LoRA) has gained the most widespread use in LM fine-tuning due to its lightweight computational cost and remarkable performance. Because the proportion of parameters tuned by LoRA is relatively small, there might be a misleading impression that the LoRA fine-tuning data is invulnerable to Membership Inference Attacks (MIAs). However, we identify that utilizing the pre-trained model can induce more information leakage, which is neglected by existing MIAs. Therefore, we introduce LoRA-Leak, a holistic evaluation framework for MIAs against the fine-tuning datasets of LMs. LoRA-Leak incorporates fifteen membership inference attacks, including ten existing MIAs, and five improved MIAs that leverage the pre-trained model as a reference. In experiments, we apply LoRA-Leak to three advanced LMs across three popular natural language processing tasks, demonstrating that LoRA-based fine-tuned LMs are still vulnerable to MIAs (e.g., 0.775 AUC under conservative fine-tuning settings). We also applied LoRA-Leak to different fine-tuning settings to understand the resulting privacy risks. We further explore four defenses and find that only dropout and excluding specific LM layers during fine-tuning effectively mitigate MIA risks while maintaining utility. We highlight that under the "pre-training and fine-tuning" paradigm, the existence of the pre-trained model makes MIA a more severe risk for LoRA-based LMs. We hope that our findings can provide guidance on data privacy protection for specialized LM providers.

Paperid: 1142, https://arxiv.org/pdf/2507.18215.pdf

Abstract:
Information security is facing increasingly severe challenges, and traditional protection means are difficult to cope with complex and changing threats. In recent years, as an emerging intelligent technology, large language models (LLMs) have shown a broad application prospect in the field of information security. In this paper, we focus on the key role of LLM in information security, systematically review its application progress in malicious behavior prediction, network threat analysis, system vulnerability detection, malicious code identification, and cryptographic algorithm optimization, and explore its potential in enhancing security protection performance. Based on neural networks and Transformer architecture, this paper analyzes the technical basis of large language models and their advantages in natural language processing tasks. It is shown that the introduction of large language modeling helps to improve the detection accuracy and reduce the false alarm rate of security systems. Finally, this paper summarizes the current application results and points out that it still faces challenges in model transparency, interpretability, and scene adaptability, among other issues. It is necessary to explore further the optimization of the model structure and the improvement of the generalization ability to realize a more intelligent and accurate information security protection system.

Paperid: 1143, https://arxiv.org/pdf/2507.17324.pdf

Abstract:
Virtual Reality (VR) has emerged as a transformative technology across industries, yet its security weaknesses, including vulnerabilities, are underinvestigated. This study investigates 334 VR projects hosted on GitHub, examining 1,681 software security weaknesses to understand: what types of weaknesses are prevalent in VR software; when and how weaknesses are introduced; how long they have survived; and how they have been removed. Due to the limited availability of VR software security weaknesses in public databases (e.g., the National Vulnerability Database or NVD), we prepare the first systematic dataset of VR software security weaknesses by introducing a novel framework to collect such weaknesses from GitHub commit data. Our empirical study on the dataset leads to useful insights, including: (i) VR weaknesses are heavily skewed toward user interface weaknesses, followed by resource-related weaknesses; (ii) VR development tools pose higher security risks than VR applications; (iii) VR security weaknesses are often introduced at the VR software birth time.

Paperid: 1144, https://arxiv.org/pdf/2507.16887.pdf

Abstract:
The rapid advancement of pre-trained language models (PLMs) has demonstrated promising results for various code-related tasks. However, their effectiveness in detecting real-world vulnerabilities remains a critical challenge. % for the security community. While existing empirical studies evaluate PLMs for vulnerability detection (VD), their inadequate consideration in data preparation, evaluation setups, and experimental settings undermines the accuracy and comprehensiveness of evaluations. This paper introduces RevisitVD, an extensive evaluation of 17 PLMs spanning smaller code-specific PLMs and large-scale PLMs using newly constructed datasets. Specifically, we compare the performance of PLMs under both fine-tuning and prompt engineering, assess their effectiveness and generalizability across various training and testing settings, and analyze their robustness against code normalization, abstraction, and semantic-preserving transformations. Our findings reveal that, for VD tasks, PLMs incorporating pre-training tasks designed to capture the syntactic and semantic patterns of code outperform both general-purpose PLMs and those solely pre-trained or fine-tuned on large code corpora. However, these models face notable challenges in real-world scenarios, such as difficulties in detecting vulnerabilities with complex dependencies, handling perturbations introduced by code normalization and abstraction, and identifying semantic-preserving vulnerable code transformations. Also, the truncation caused by the limited context windows of PLMs can lead to a non-negligible amount of labeling errors. This study underscores the importance of thorough evaluations of model performance in practical scenarios and outlines future directions to help enhance the effectiveness of PLMs for realistic VD applications.

Paperid: 1145, https://arxiv.org/pdf/2507.15836.pdf

Abstract:
In this work we study black-box privacy auditing, where the goal is to lower bound the privacy parameter of a differentially private learning algorithm using only the algorithm's outputs (i.e., final trained model). For DP-SGD (the most successful method for training differentially private deep learning models), the canonical approach auditing uses membership inference-an auditor comes with a small set of special "canary" examples, inserts a random subset of them into the training set, and then tries to discern which of their canaries were included in the training set (typically via a membership inference attack). The auditor's success rate then provides a lower bound on the privacy parameters of the learning algorithm. Our main contribution is a method for optimizing the auditor's canary set to improve privacy auditing, leveraging recent work on metagradient optimization. Our empirical evaluation demonstrates that by using such optimized canaries, we can improve empirical lower bounds for differentially private image classification models by over 2x in certain instances. Furthermore, we demonstrate that our method is transferable and efficient: canaries optimized for non-private SGD with a small model architecture remain effective when auditing larger models trained with DP-SGD.

Paperid: 1146, https://arxiv.org/pdf/2507.14263.pdf

Abstract:
The Internet is poised to host billions to trillions of autonomous AI agents that negotiate, delegate, and migrate in milliseconds and workloads that will strain DNS-centred identity and discovery. In this paper, we describe the NANDA index architecture, which we envision as a means for discoverability, identifiability and authentication in the internet of AI agents. We present an architecture where a minimal lean index resolves to dynamic, cryptographically verifiable AgentFacts that supports multi-endpoint routing, load balancing, privacy-preserving access, and credentialed capability assertions. Our architecture design delivers five concrete guarantees: (1) A quilt-like index proposal that supports both NANDA-native agents as well as third party agents being discoverable via the index, (2) rapid global resolution for newly spawned AI agents, (3) sub-second revocation and key rotation, (4) schema-validated capability assertions, and (5) privacy-preserving discovery across organisational boundaries via verifiable, least-disclosure queries. We formalize the AgentFacts schema, specify a CRDT-based update protocol, and prototype adaptive resolvers. The result is a lightweight, horizontally scalable foundation that unlocks secure, trust-aware collaboration for the next generation of the Internet of AI agents, without abandoning existing web infrastructure.

Paperid: 1147, https://arxiv.org/pdf/2507.05558.pdf

Abstract:
Smart contract vulnerabilities have led to billions in losses, yet finding actionable exploits remains challenging. Traditional fuzzers rely on rigid heuristics and struggle with complex attacks, while human auditors are thorough but slow and don't scale. Large Language Models offer a promising middle ground, combining human-like reasoning with machine speed. However, early studies show that simply prompting LLMs generates unverified vulnerability speculations with high false positive rates. To address this, we present A1, an agentic system that transforms any LLM into an end-to-end exploit generator. A1 provides agents with six domain-specific tools for autonomous vulnerability discovery, from understanding contract behavior to testing strategies on real blockchain states. All outputs are concretely validated through execution, ensuring only profitable proof-of-concept exploits are reported. We evaluate A1 across 36 real-world vulnerable contracts on Ethereum and Binance Smart Chain. A1 achieves a 63% success rate on the VERITE benchmark. Across all successful cases, A1 extracts up to \$8.59 million per exploit and \$9.33 million total. Through 432 experiments across six LLMs, we show that most exploits emerge within five iterations, with costs ranging \$0.01-\$3.59 per attempt. Using Monte Carlo analysis of historical attacks, we demonstrate that immediate vulnerability detection yields 86-89% success probability, dropping to 6-21% with week-long delays. Our economic analysis reveals a troubling asymmetry: attackers achieve profitability at \$6,000 exploit values while defenders require \$60,000 -- raising fundamental questions about whether AI agents inevitably favor exploitation over defense.

Paperid: 1148, https://arxiv.org/pdf/2507.02699.pdf

Abstract:
The increasing capabilities of LLMs have led to the rapid proliferation of LLM agent apps, where developers enhance LLMs with access to external resources to support complex task execution. Among these, LLM email agent apps represent one of the widely used categories, as email remains a critical communication medium for users. LLM email agents are capable of managing and responding to email using LLM-driven reasoning and autonomously executing user instructions via external email APIs (e.g., send email). However, despite their growing deployment and utility, the security mechanism of LLM email agent apps remains underexplored. Currently, there is no comprehensive study into the potential security risk within these agent apps and their broader implications. In this paper, we conduct the first in-depth and systematic security study of LLM email agents. We propose the Email Agent Hijacking (EAH) attack, which overrides the original prompts of the email agent via external email resources, allowing attackers to gain control of the email agent remotely and further perform specific attack scenarios without user awareness. To facilitate the large-scale evaluation, we propose EAHawk, a pipeline to evaluate the EAH attack of LLM email agent apps. By EAHawk, we performed an empirical study spanning 14 representative LLM agent frameworks, 63 agent apps, 12 LLMs, and 20 email services, which led to the generation of 1,404 real-world email agent instances for evaluation. Experimental results indicate that all 1,404 instances were successfully hijacked; on average, only 2.03 attack attempts are required to control an email agent instance. Even worse, for some LLMs, the average number of attempts needed to achieve full agent control drops to as few as 1.23.

Paperid: 1149, https://arxiv.org/pdf/2507.01129.pdf

Abstract:
The spherical noise added to gradients in differentially private (DP) training undermines the performance of adaptive optimizers like AdaGrad and Adam, and hence many recent works have proposed algorithms to address this challenge. However, the empirical results in these works focus on simple tasks and models and the conclusions may not generalize to model training in practice. In this paper we survey several of these variants, and develop better theoretical intuition for them as well as perform empirical studies comparing them. We find that a common intuition of aiming for unbiased estimates of second moments of gradients in adaptive optimizers is misguided, and instead that a simple technique called scale-then-privatize (which does not achieve unbiased second moments) has more desirable theoretical behaviors and outperforms all other variants we study on a small-scale language model training task. We additionally argue that scale-then-privatize causes the noise addition to better match the application of correlated noise mechanisms which are more desirable to use in practice.

Paperid: 1150, https://arxiv.org/pdf/2507.00096.pdf

Abstract:
Alternative Assets tokenization is transforming non-traditional financial instruments are represented and traded on the web. However, ensuring trustworthiness in web-based tokenized ecosystems poses significant challenges, from verifying off-chain asset data to enforcing regulatory compliance. This paper proposes an AI-governed agent architecture that integrates intelligent agents with blockchain to achieve web-trustworthy tokenization of alternative assets. In the proposed architecture, autonomous agents orchestrate the tokenization process (asset verification, valuation, compliance checking, and lifecycle management), while an AI-driven governance layer monitors agent behavior and enforces trust through adaptive policies and cryptoeconomic incentives. We demonstrate that this approach enhances transparency, security, and compliance in asset tokenization, addressing key concerns around data authenticity and fraud. A case study on tokenizing real estate assets illustrates how the architecture mitigates risks (e.g., fraudulent listings and money laundering) through real-time AI anomaly detection and on-chain enforcement. Our evaluation and analysis suggest that combining AI governance with multi-agent systems and blockchain can significantly bolster trust in tokenized asset ecosystems. This work offers a novel framework for trustworthy asset tokenization on the web and provides insights for practitioners aiming to deploy secure, compliant tokenization platforms.

Paperid: 1151, https://arxiv.org/pdf/2512.24238.pdf

Abstract:
Many location-based services rely on a point-in-polygon test (PiP), checking whether a point or a trajectory lies inside a geographic zone. Since geometric operations are expensive in zero-knowledge proofs, privately performing the PiP test is challenging. In this paper, we answer the research questions of how different ways of encoding zones affect accuracy and proof cost by exploiting gridbased lookup tables under a fixed STARK execution model. Beyond a Boolean grid-based baseline that marks cells as in- or outside, we explore a distance-aware encoding approach that stores how far each cell is from a zone boundary and uses interpolation to reason within a cell. Our experiments on real-world data demonstrate that the proposed distance-aware approach achieves higher accuracy on coarse grids (max. 60%p accuracy gain) with only a moderate verification overhead (approximately 1.4x), making zone encoding the key lever for efficient zero-knowledge spatial checks.

Paperid: 1152, https://arxiv.org/pdf/2512.22789.pdf

Abstract:
Recent privacy regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) have established legal requirements for obtaining user consent regarding the collection, use, and sharing of personal data. These regulations emphasize that consent must be informed, freely given, specific, and unambiguous. However, there are still many violations, which highlight a gap between legal expectations and actual implementation. Consent mechanisms embedded in functional web forms across websites play a critical role in ensuring compliance with data protection regulations such as the GDPR and CCPA, as well as in upholding user autonomy and trust. However, current research has primarily focused on cookie banners and mobile app dialogs. These forms are diverse in structure, vary in legal basis, and are often difficult to locate or evaluate, creating a significant challenge for automated consent compliance auditing. In this work, we present Cosmic, a novel automated framework for detecting consent-related privacy violations in web forms. We evaluate our developed tool for auditing consent compliance in web forms, across 5,823 websites and 3,598 forms. Cosmic detects 3,384 violations on 94.1% of consent forms, covering key GDPR principles such as freely given consent, purpose disclosure, and withdrawal options. It achieves 98.6% and 99.1% TPR for consent and violation detection, respectively, demonstrating high accuracy and real-world applicability.

Paperid: 1153, https://arxiv.org/pdf/2512.22526.pdf

Abstract:
Modern cloud-based AI training relies on extensive telemetry and logs to ensure accountability. While these audit trails enable retrospective inspection, they struggle to address the inherent non-determinism of deep learning. Stochastic operations, such as dropout, create an ambiguity surface where attackers can mask malicious manipulations as natural random variance, granting them plausible deniability. Consequently, existing logging mechanisms cannot verify whether stochastic values were generated and applied honestly without exposing sensitive training data. To close this integrity gap, we introduce Verifiable Dropout, a privacy-preserving mechanism based on zero-knowledge proofs. We treat stochasticity not as an excuse but as a verifiable claim. Our approach binds dropout masks to a deterministic, cryptographically verifiable seed and proves the correct execution of the dropout operation. This design enables users to audit the integrity of stochastic training steps post-hoc, ensuring that randomness was neither biased nor cherry-picked, while strictly preserving the confidentiality of the model and data.

Paperid: 1154, https://arxiv.org/pdf/2512.21241.pdf

Abstract:
In hard-label black-box adversarial attacks, where only the top-1 predicted label is accessible, the prohibitive query complexity poses a major obstacle to practical deployment. In this paper, we focus on optimizing a representative class of attacks that search for the optimal ray direction yielding the minimum $\ell_2$-norm perturbation required to move a benign image into the adversarial region. Inspired by Nesterov's Accelerated Gradient (NAG), we propose a momentum-based algorithm, ARS-OPT, which proactively estimates the gradient with respect to a future ray direction inferred from accumulated momentum. We provide a theoretical analysis of its convergence behavior, showing that ARS-OPT enables more accurate directional updates and achieves faster, more stable optimization. To further accelerate convergence, we incorporate surrogate-model priors into ARS-OPT's gradient estimation, resulting in PARS-OPT with enhanced performance. The superiority of our approach is supported by theoretical guarantees under standard assumptions. Extensive experiments on ImageNet and CIFAR-10 demonstrate that our method surpasses 13 state-of-the-art approaches in query efficiency.

Paperid: 1155, https://arxiv.org/pdf/2512.18303.pdf

Abstract:
Current cybersecurity research increasingly acknowledges the human factor, yet remains fragmented, often treating user vulnerabilities as isolated and static traits. This paper introduces MORPHEUS, a holistic framework that operationalizes human-centric security as a dynamic and interconnected system. Grounded in the Cognition-Affect-Behavior (CAB) model and Attribution Theory, MORPHEUS consolidates 50 human factors influencing susceptibility to major cyberthreats, including phishing, malware, password management, and misconfigurations. Beyond factor identification, the framework systematically maps 295 documented interactions, revealing how cognitive, emotional, behavioral, and socio-organizational processes jointly shape security outcomes, and distills them into twelve recurring interaction mechanisms. MORPHEUS further links theory to practice through an inventory of 99 validated psychometric instruments, enabling empirical assessment and targeted intervention. We illustrate the framework's applicability through concrete operational scenarios, spanning risk diagnosis, training, and interface design. Overall, MORPHEUS provides a rigorous yet actionable foundation for advancing human-centered cybersecurity research and practice.

Paperid: 1156, https://arxiv.org/pdf/2512.15379.pdf

Abstract:
The success of machine learning for real-world robotic systems has created a new form of intellectual property: the trained policy. This raises a critical need for novel methods that verify ownership and detect unauthorized, possibly unsafe misuse. While watermarking is established in other domains, physical policies present a unique challenge: remote detection. Existing methods assume access to the robot's internal state, but auditors are often limited to external observations (e.g., video footage). This ``Physical Observation Gap'' means the watermark must be detected from signals that are noisy, asynchronous, and filtered by unknown system dynamics. We formalize this challenge using the concept of a \textit{glimpse sequence}, and introduce Colored Noise Coherency (CoNoCo), the first watermarking strategy designed for remote detection. CoNoCo embeds a spectral signal into the robot's motions by leveraging the policy's inherent stochasticity. To show it does not degrade performance, we prove CoNoCo preserves the marginal action distribution. Our experiments demonstrate strong, robust detection across various remote modalities, including motion capture and side-way/top-down video footage, in both simulated and real-world robot experiments. This work provides a necessary step toward protecting intellectual property in robotics, offering the first method for validating the provenance of physical policies non-invasively, using purely remote observations.

Paperid: 1157, https://arxiv.org/pdf/2512.14453.pdf

Abstract:
Companies adopt agile methodologies and DevOps to facilitate efficient development and deployment of software-intensive products. This, in turn, introduces challenges in relation to security standard compliance traditionally following a more linear workflow. This is especially a challenge for the engineering of products and services associated with critical infrastructures. To support companies in their transition towards DevOps, this paper presents an adaptation of DevOps according to security regulations and standards. We report on our longitudinal study at Siemens AG, consisting of several individual sub-studies in the inception, validation, and initial adoption of our framework based on RefA as well as the implications for practice. RefA is a prescriptive model of a security compliant DevOps lifecycle based on the IEC 62443-4-1 standard. The overall framework is aimed at professionals, not only security experts, being able to use it on implementing DevOps processes while remaining compliant with security norms. We demonstrate how RefA facilitates the transfer of security compliance knowledge to product development teams. This knowledge transfer supports the agility aim of ensuring that cross-functional teams have all the skills needed to deliver the compliant products.

Paperid: 1158, https://arxiv.org/pdf/2512.11143.pdf

Abstract:
While penetration testing plays a vital role in cybersecurity, achieving fully automated, hands-off-the-keyboard execution remains a significant research challenge. In this paper, we introduce the "Planner-Executor-Perceptor (PEP)" design paradigm and use it to systematically review existing work and identify the key challenges in this area. We also evaluate existing penetration testing systems, with a particular focus on the use of Large Language Model (LLM) agents for this task. The results show that the out-of-the-box Claude Code and Sonnet 4.5 exhibit superior penetration capabilities observed to date, substantially outperforming all prior systems. However, a detailed analysis of their testing processes reveals specific strengths and limitations; notably, LLM agents struggle with maintaining coherent long-horizon plans, performing complex reasoning, and effectively utilizing specialized tools. These limitations significantly constrain its overall capability, efficiency, and stability. To address these limitations, we propose CHECKMATE, a framework that integrates enhanced classical planning with LLM agents, providing an external, structured "brain" that mitigates the inherent weaknesses of LLM agents. Our evaluation shows that CHECKMATE outperforms the state-of-the-art system (Claude Code) in penetration capability, improving benchmark success rates by over 20%. In addition, it delivers substantially greater stability, cutting both time and monetary costs by more than 50%.

Paperid: 1159, https://arxiv.org/pdf/2512.10449.pdf

Abstract:
The landscape of scientific peer review is rapidly evolving with the integration of Large Language Models (LLMs). This shift is driven by two parallel trends: the widespread individual adoption of LLMs by reviewers to manage workload (the "Lazy Reviewer" hypothesis) and the formal institutional deployment of AI-powered assessment systems by conferences like AAAI and Stanford's Agents4Science. This study investigates the robustness of these "LLM-as-a-Judge" systems (both illicit and sanctioned) to adversarial PDF manipulation. Unlike general jailbreaks, we focus on a distinct incentive: flipping "Reject" decisions to "Accept," for which we develop a novel evaluation metric which we term as WAVS (Weighted Adversarial Vulnerability Score). We curated a dataset of 200 scientific papers and adapted 15 domain-specific attack strategies to this task, evaluating them across 13 Language Models, including GPT-5, Claude Haiku, and DeepSeek. Our results demonstrate that obfuscation strategies like "Maximum Mark Magyk" successfully manipulate scores, achieving alarming decision flip rates even in large-scale models. We will release our complete dataset and injection framework to facilitate more research on this topic.

Paperid: 1160, https://arxiv.org/pdf/2512.08204.pdf

Abstract:
Connected vehicles represent a key enabler of intelligent transportation systems, where vehicles are equipped with advanced communication, sensing, and computing technologies to interact not only with one another but also with surrounding infrastructures and the environment. Through continuous data exchange, such vehicles are capable of enhancing road safety, improving traffic efficiency, and ensuring more reliable mobility services. Further, when these capabilities are integrated with advanced automation technologies, the concept essentially evolves into connected and autonomous vehicles (CAVs). While connected vehicles primarily focus on seamless information sharing, autonomous vehicles are mainly dependent on advanced perception, decision-making, and control mechanisms to operate with minimal or without human intervention. However, as a result of connectivity, an adversary with malicious intentions might be able to compromise successfully by breaching the system components of CAVs. In this paper, we present an attack-tree based methodology for evaluating cyber security vulnerabilities in CAVs. In particular, we utilize the attack-defense tree formulation to systematically assess attack-leaf vulnerabilities, and before analyzing the vulnerability indices, we also define a measure of vulnerabilities, which is based on existing cyber security threats and corresponding defensive countermeasures.

Paperid: 1161, https://arxiv.org/pdf/2512.07725.pdf

Abstract:
This paper presents a systematic evaluation of the privacy behaviors and attributes of eight recent, popular browser agents. Browser agents are software that automate Web browsing using large language models and ancillary tooling. However, the automated capabilities that make browser agents powerful also make them high-risk points of failure. Both the kinds of tasks browser agents are designed to execute, along with the kinds of information browser agents are entrusted with to fulfill those tasks, mean that vulnerabilities in these tools can result in enormous privacy harm. This work presents a framework of five broad factors (totaling 15 distinct measurements) to measure the privacy risks in browser agents. Our framework assesses i. vulnerabilities in the browser agent's components, ii. how the browser agent protects against website behaviors, iii. whether the browser agent prevents cross-site tracking, iv. how the agent responds to privacy-affecting prompts, and v. whether the tool leaks personal information to sites. We apply our framework to eight browser agents and identify 30 vulnerabilities, ranging from disabled browser privacy features to "autocompleting" sensitive personal information in form fields. We have responsibly disclosed our findings, and plan to release our dataset and other artifacts.

Paperid: 1162, https://arxiv.org/pdf/2512.04785.pdf

Abstract:
AI agent-based systems are becoming increasingly integral to modern software architectures, enabling autonomous decision-making, dynamic task execution, and multimodal interactions through large language models (LLMs). However, these systems introduce novel and evolving security challenges, including prompt injection attacks, context poisoning, model manipulation, and opaque agent-to-agent communication, that are not effectively captured by traditional threat modeling frameworks. In this paper, we introduce ASTRIDE, an automated threat modeling platform purpose-built for AI agent-based systems. ASTRIDE extends the classical STRIDE framework by introducing a new threat category, A for AI Agent-Specific Attacks, which encompasses emerging vulnerabilities such as prompt injection, unsafe tool invocation, and reasoning subversion, unique to agent-based applications. To automate threat modeling, ASTRIDE combines a consortium of fine-tuned vision-language models (VLMs) with the OpenAI-gpt-oss reasoning LLM to perform end-to-end analysis directly from visual agent architecture diagrams, such as data flow diagrams(DFDs). LLM agents orchestrate the end-to-end threat modeling automation process by coordinating interactions between the VLM consortium and the reasoning LLM. Our evaluations demonstrate that ASTRIDE provides accurate, scalable, and explainable threat modeling for next-generation intelligent systems. To the best of our knowledge, ASTRIDE is the first framework to both extend STRIDE with AI-specific threats and integrate fine-tuned VLMs with a reasoning LLM to fully automate diagram-driven threat modeling in AI agent-based applications.

Paperid: 1163, https://arxiv.org/pdf/2512.04429.pdf

Abstract:
Recent advances in quantum-secure communication have highlighted the value of hybrid schemes that combine Quantum Key Distribution (QKD) with Post-Quantum Cryptography (PQC). Yet most existing hybrid designs omit realistic finite-key effects on QKD key rates and do not specify how to maintain security when both QKD and PQC primitives leak information through side-channels. These gaps limit the applicability of hybrid systems in practical, deployed networks. In this work, we advance a recently proposed hybrid QKD-PQC system by integrating tight finite-key security to the QKD primitive and improving the design for better scalability. This hybrid system employs an information-theoretically secure instruction sequence that determines the configurations of different primitives and thus ensures message confidentiality even when both the QKD and the PQC primitives are compromised. The novelty in our work lies in the implementation of the tightest finite-key security to date for the BBM92 protocol and the design improvements in the primitives of the hybrid system that ensure the processing time scales linearly with the size of secret instructions.

Paperid: 1164, https://arxiv.org/pdf/2512.03356.pdf

Abstract:
Large language models (LLMs) have become foundational in AI systems, yet they remain vulnerable to adversarial jailbreak attacks. These attacks involve carefully crafted prompts that bypass safety guardrails and induce models to produce harmful content. Detecting such malicious input queries is therefore critical for maintaining LLM safety. Existing methods for jailbreak detection typically involve fine-tuning LLMs as static safety LLMs using fixed training datasets. However, these methods incur substantial computational costs when updating model parameters to improve robustness, especially in the face of novel jailbreak attacks. Inspired by immunological memory mechanisms, we propose the Multi-Agent Adaptive Guard (MAAG) framework for jailbreak detection. The core idea is to equip guard with memory capabilities: upon encountering novel jailbreak attacks, the system memorizes attack patterns, enabling it to rapidly and accurately identify similar threats in future encounters. Specifically, MAAG first extracts activation values from input prompts and compares them to historical activations stored in a memory bank for quick preliminary detection. A defense agent then simulates responses based on these detection results, and an auxiliary agent supervises the simulation process to provide secondary filtering of the detection outcomes. Extensive experiments across five open-source models demonstrate that MAAG significantly outperforms state-of-the-art (SOTA) methods, achieving 98% detection accuracy and a 96% F1-score across a diverse range of attack scenarios.

Paperid: 1165, https://arxiv.org/pdf/2512.02318.pdf

Abstract:
This paper studies how multimodal large language models (MLLMs) undermine the security guarantees of visual CAPTCHA. We identify the attack surface where an adversary can cheaply automate CAPTCHA solving using off-the-shelf models. We evaluate 7 leading commercial and open-source MLLMs across 18 real-world CAPTCHA task types, measuring single-shot accuracy, success under limited retries, end-to-end latency, and per-solve cost. We further analyze the impact of task-specific prompt engineering and few-shot demonstrations on solver effectiveness. We reveal that MLLMs can reliably solve recognition-oriented and low-interaction CAPTCHA tasks at human-like cost and latency, whereas tasks requiring fine-grained localization, multi-step spatial reasoning, or cross-frame consistency remain significantly harder for current models. By examining the reasoning traces of such MLLMs, we investigate the underlying mechanisms of why models succeed/fail on specific CAPTCHA puzzles and use these insights to derive defense-oriented guidelines for selecting and strengthening CAPTCHA tasks. We conclude by discussing implications for platform operators deploying CAPTCHA as part of their abuse-mitigation pipeline.Code Availability (https://anonymous.4open.science/r/Captcha-465E/).

Paperid: 1166, https://arxiv.org/pdf/2512.01893.pdf

Abstract:
Phishing remains a persistent cybersecurity threat; however, developing scalable and effective user training is labor-intensive and challenging to maintain. Generative Artificial Intelligence offers an interesting opportunity, but empirical evidence on its instructional efficacy remains scarce. This paper provides an experimental validation of Large Language Models (LLMs) as autonomous engines for generating phishing resilience training. Across two controlled studies (N=480), we demonstrate that AI-generated content yields significant pre-post learning gains regardless of the specific prompting strategy employed. Study 1 (N=80) compares four prompting techniques, finding that even a straightforward "direct-profile" strategy--simply embedding user traits into the prompt--produces effective training material. Study 2 (N=400) investigates the scalability of this approach by testing personalization and training duration. Results show that complex psychometric personalization offers no measurable advantage over well-designed generic content, while longer training duration provides a modest boost in accuracy. These findings suggest that organizations can leverage LLMs to generate high-quality, effective training at scale without the need for complex user profiling, relying instead on the inherent capabilities of the model.

Paperid: 1167, https://arxiv.org/pdf/2512.00251.pdf

Abstract:
The increasing complexity of IoT edge networks presents significant challenges for anomaly detection, particularly in identifying sophisticated Denial-of-Service (DoS) attacks and zero-day exploits under highly dynamic and imbalanced traffic conditions. This paper proposes SD-CGAN, a Conditional Generative Adversarial Network framework enhanced with Sinkhorn Divergence, tailored for robust anomaly detection in IoT edge environments. The framework incorporates CTGAN-based synthetic data augmentation to address class imbalance and leverages Sinkhorn Divergence as a geometry-aware loss function to improve training stability and reduce mode collapse. The model is evaluated on exploitative attack subsets from the CICDDoS2019 dataset and compared against baseline deep learning and GAN-based approaches. Results show that SD-CGAN achieves superior detection accuracy, precision, recall, and F1-score while maintaining computational efficiency suitable for deployment in edge-enabled IoT environments.

Paperid: 1168, https://arxiv.org/pdf/2511.17799.pdf

Abstract:
Security bugs in the Linux kernel emerge endlessly and have attracted much attention. However, fixing security bugs in the Linux kernel could be incomplete due to human mistakes. Specifically, an incomplete fix fails to repair all the original security defects in the software, fails to properly repair the original security defects, or introduces new ones. In this paper, we study the fixes of incomplete security bugs in the Linux kernel for the first time, and reveal their characteristics, root causes, as well as detection. We first construct a dataset of incomplete security bug fixes in the Linux kernel and answer the following three questions. What are the characteristics of incomplete security bug fixes in the Linux kernel? What are the root causes behind them? How should they be detected to reduce security risks? We then have the three main insights in the following. (*Due to the notification of arXiv "The Abstract field cannot be longer than 1,920 characters", the appeared Abstract is shortened. For the full Abstract, please download the Article.)

Paperid: 1169, https://arxiv.org/pdf/2511.13246.pdf

Abstract:
This study proposes a novel approach to ensure the security of textual data transmission in a semantic communication system. In the proposed system, a sender transmits textual information to a receiver, while a potential eavesdropper attempts to intercept the information. At the sender side, the text is initially preprocessed, where each sentence is annotated with its corresponding topic, and subsequently extracted into a knowledge graph. To achieve the secure transmission of the knowledge graph, we propose a channel encryption scheme that integrates constellation diagonal transformation with multi-parameter weighted fractional Fourier transform (MP-WFRFT). At the receiver side, the textual data is first decrypted, and then recovered via a transformer model. Experimental results demonstrate that the proposed method reduces the probability of information compromise. The legitimate receiver achieves a Bilingual Evaluation Understudy (BLEU) score of 0.9, whereas the BLEU score of the eavesdropper remains below 0.3. Compared to the baselines, the proposed method can improve the security by up to 20%.

Paperid: 1170, https://arxiv.org/pdf/2511.12971.pdf

Abstract:
Decentralized finance (DeFi) is experiencing rapid expansion. However, prevalent code reuse and limited open-source contributions have introduced significant challenges to the blockchain ecosystem, including plagiarism and the propagation of vulnerable code. Consequently, an effective and accurate similarity detection method for EVM bytecode is urgently needed to identify similar contracts. Traditional binary similarity detection methods are typically based on instruction stream or control flow graph (CFG), which have limitations on EVM bytecode due to specific features like low-level EVM bytecode and heavily-reused basic blocks. Moreover, the highly-diverse Solidity Compiler (Solc) versions further complicate accurate similarity detection. Motivated by these challenges, we propose a novel EVM bytecode representation called Stable-Semantic Graph (SSG), which captures relationships between 'stable instructions' (special instructions identified by our study). Moreover, we implement a prototype, Esim, which embeds SSG into matrices for similarity detection using a heterogeneous graph neural network. Esim demonstrates high accuracy in SSG construction, achieving F1-scores of 100% for control flow and 95.16% for data flow, and its similarity detection performance reaches 96.3% AUC, surpassing traditional approaches. Our large-scale study, analyzing 2,675,573 smart contracts on six EVM-compatible chains over a one-year period, also demonstrates that Esim outperforms the SOTA tool Etherscan in vulnerability search.

Paperid: 1171, https://arxiv.org/pdf/2511.10847.pdf

Abstract:
High-precision clock synchronization is essential for a wide range of network-distributed applications. In the quantum space, these applications include communication, sensing, and positioning. However, current synchronization techniques are vulnerable to attacks, such as intercept-resend attacks, spoofing, and delay attacks. Here, we propose and experimentally demonstrate a new quantum secure time transfer (QSTT) system, subsequently used for clock synchronization, that largely negates such attacks. Novel to our system is the optimal use of self-generated quantum keys within the QSTT to information-theoretically secure the maximum amount of timing data; as well as the introduction, within a hybrid quantum/post-quantum architecture, of an information-theoretic secure obfuscated encryption sequence of the remaining timing data. With these enhancements, we argue that our new system represents the most robust implementation of QSTT to date.

Paperid: 1172, https://arxiv.org/pdf/2511.10554.pdf

Abstract:
Provenance-based intrusion detection is an increasingly popular application of graphical machine learning in cybersecurity, where system activities are modeled as provenance graphs to capture causality and correlations among potentially malicious actions. Graph Neural Networks (GNNs) have demonstrated strong performance in this setting. However, traditional statically-provisioned GNN inference architectures fall short in meeting two crucial demands of intrusion detection: (1) maintaining consistently low detection latency, and (2) handling highly irregular and bursty workloads. To holistically address these challenges, we present GraphFaaS, a serverless architecture tailored for GNN-based intrusion detection. GraphFaaS leverages the elasticity and agility of serverless computing to dynamically scale the GNN inference pipeline. We parallelize and adapt GNN workflows to a serverless environment, ensuring that the system can respond in real time to fluctuating workloads. By decoupling compute resources from static provisioning, GraphFaaS delivers stable inference latency, which is critical for dependable intrusion detection and timely incident response in cybersecurity operations. Preliminary evaluation shows GraphFaaS reduces average detection latency by 85% and coefficient of variation (CV) by 64% compared to the baseline.

Paperid: 1173, https://arxiv.org/pdf/2511.08282.pdf

Abstract:
Software services are crucial for reliable communication and networking; therefore, Site Reliability Engineering (SRE) is important to ensure these systems stay reliable and perform well in cloud-native environments. SRE leverages tools like Prometheus and Grafana to monitor system metrics, defining critical Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for maintaining high service standards. However, a significant challenge arises as many developers often lack in-depth understanding of these tools and the intricacies involved in defining appropriate SLIs and SLOs. To bridge this gap, we propose a novel SRE platform, called SRE-Llama, enhanced by Generative-AI, Federated Learning, Blockchain, and Non-Fungible Tokens (NFTs). This platform aims to automate and simplify the process of monitoring, SLI/SLO generation, and alert management, offering ease in accessibility and efficy for developers. The system operates by capturing metrics from cloud-native services and storing them in a time-series database, like Prometheus and Mimir. Utilizing this stored data, our platform employs Federated Learning models to identify the most relevant and impactful SLI metrics for different services and SLOs, addressing concerns around data privacy. Subsequently, fine-tuned Meta's Llama-3 LLM is adopted to intelligently generate SLIs, SLOs, error budgets, and associated alerting mechanisms based on these identified SLI metrics. A unique aspect of our platform is the encoding of generated SLIs and SLOs as NFT objects, which are then stored on a Blockchain. This feature provides immutable record-keeping and facilitates easy verification and auditing of the SRE metrics and objectives. The automation of the proposed platform is governed by the blockchain smart contracts. The proposed SRE-Llama platform prototype has been implemented with a use case featuring a customized Open5GS 5G Core.

Paperid: 1174, https://arxiv.org/pdf/2511.07441.pdf

Abstract:
AI agents can autonomously perform tasks and, often without explicit user consent, collect or disclose users' sensitive local data, which raises serious privacy concerns. Although AI agents' privacy policies may describe their intended data practices, there remains limited transparency and accountability about whether runtime behavior matches those policies. To close this gap, we introduce AudAgent, a visual framework that continuously monitors AI agents' data practices in real time and guards compliance with stated privacy policies. AudAgent consists of four components for automated privacy auditing of AI agents. (i) Policy parsing: an ensemble of LLMs translates natural-language privacy policies into a structured privacy-policy model, where cross-LLM voting guarantees confidence of the parsing results. (ii) Runtime annotation: a lightweight Presidio-based analyzer detects sensitive data and annotates how the data is used based on the context of the AI agent's operations and the privacy-policy model. (iii) Compliance auditing: ontology alignment and automata-based evaluation connect the policy model with runtime annotations, enabling on-the-fly compliance checks between the natural-language policy and observed unordered data practices of AI agents. (iv) User interface: a platform-independent implementation visualizes the real-time execution trace of AI agents along with potential privacy risks detected during auditing, providing user-friendly transparency and accountability. In addition to common formatted privacy policies, AudAgent also supports user-defined policies for fine-grained control and customization. We evaluate AudAgent on AI agents built upon mainstream programming frameworks such as AutoGen, experiments show that AudAgent effectively identifies potential privacy policy violations in real time.

Paperid: 1175, https://arxiv.org/pdf/2511.07049.pdf

Abstract:
Large-scale Video Foundation Models (VFMs) has significantly advanced various video-related tasks, either through task-specific models or Multi-modal Large Language Models (MLLMs). However, the open accessibility of VFMs also introduces critical security risks, as adversaries can exploit full knowledge of the VFMs to launch potent attacks. This paper investigates a novel and practical adversarial threat scenario: attacking downstream models or MLLMs fine-tuned from open-source VFMs, without requiring access to the victim task, training data, model query, and architecture. In contrast to conventional transfer-based attacks that rely on task-aligned surrogate models, we demonstrate that adversarial vulnerabilities can be exploited directly from the VFMs. To this end, we propose the Transferable Video Attack (TVA), a temporal-aware adversarial attack method that leverages the temporal representation dynamics of VFMs to craft effective perturbations. TVA integrates a bidirectional contrastive learning mechanism to maximize the discrepancy between the clean and adversarial features, and introduces a temporal consistency loss that exploits motion cues to enhance the sequential impact of perturbations. TVA avoids the need to train expensive surrogate models or access to domain-specific data, thereby offering a more practical and efficient attack strategy. Extensive experiments across 24 video-related tasks demonstrate the efficacy of TVA against downstream models and MLLMs, revealing a previously underexplored security vulnerability in the deployment of video models.

Paperid: 1176, https://arxiv.org/pdf/2511.06862.pdf

Abstract:
Ensuring compliance with Information Flow Security (IFS) is known to be challenging, especially for concurrent systems with large codebases such as multicore operating system (OS) kernels. Refinement, which verifies that an implementation preserves certain properties of a more abstract specification, is promising for tackling such challenges. However, in terms of refinement-based verification of security properties, existing techniques are still restricted to sequential systems or lack the expressiveness needed to capture complex security policies for concurrent systems. In this work, we present a generalized security-preserving refinement technique, particularly for verifying the IFS of concurrent systems governed by potentially complex security policies. We formalize the IFS properties for concurrent systems and present a refinement-based compositional approach to prove that the generalized security properties (e.g., intransitive noninterference) are preserved between implementation and abstraction. The key intuition enabling such reasoning, compared to previous refinement work, is to establish a step-mapping relation between the implementation and the abstraction, which is sufficient to ensure that every paired step (in the abstraction and the implementation, respectively) is either permitted or prohibited by the security policy. We apply our approach to verify two non-trivial case studies against a collection of security policies. Our proofs are fully mechanized in Isabelle/HOL, during which we identified that two covert channels previously reported in the ARINC 653 single-core standard also exist in the ARINC 653 multicore standard. We subsequently proved the correctness of the revised mechanism, showcasing the effectiveness of our approach.

Paperid: 1177, https://arxiv.org/pdf/2511.06212.pdf

Abstract:
The rapid expansion of the Internet of Things (IoT) is reshaping communication and operational practices across industries, but it also broadens the attack surface and increases susceptibility to security breaches. Artificial Intelligence has become a valuable solution in securing IoT networks, with Large Language Models (LLMs) enabling automated attack behavior analysis and mitigation suggestion in Network Intrusion Detection Systems (NIDS). Despite advancements, the use of LLMs in such systems further expands the attack surface, putting entire networks at risk by introducing vulnerabilities such as prompt injection and data poisoning. In this work, we attack an LLM-based IoT attack analysis and mitigation framework to test its adversarial robustness. We construct an attack description dataset and use it in a targeted data poisoning attack that applies word-level, meaning-preserving perturbations to corrupt the Retrieval-Augmented Generation (RAG) knowledge base of the framework. We then compare pre-attack and post-attack mitigation responses from the target model, ChatGPT-5 Thinking, to measure the impact of the attack on model performance, using an established evaluation rubric designed for human experts and judge LLMs. Our results show that small perturbations degrade LLM performance by weakening the linkage between observed network traffic features and attack behavior, and by reducing the specificity and practicality of recommended mitigations for resource-constrained devices.

Paperid: 1178, https://arxiv.org/pdf/2511.06056.pdf

Abstract:
Remote identity verification is essential for modern digital security; however, it remains highly vulnerable to sophisticated Presentation Attacks (PAs) that utilise forged or manipulated identity documents. Although Deep Learning (DL) has driven advances in Presentation Attack Detection (PAD), the field is fundamentally limited by a lack of data and the poor generalisation of models across various document types and new attack methods. This article presents a systematic literature review (SLR) conducted in accordance with the PRISMA methodology, aiming to analyse and synthesise the current state of AI-based PAD for identity documents from 2020 to 2025 comprehensively. Our analysis reveals a significant methodological evolution: a transition from standard Convolutional Neural Networks (CNNs) to specialised forensic micro-artefact analysis, and more recently, the adoption of large-scale Foundation Models (FMs), marking a substantial shift in the field. We identify a central paradox that hinders progress: a critical "Reality Gap" exists between models validated on extensive, private datasets and those assessed using limited public datasets, which typically consist of mock-ups or synthetic data. This gap limits the reproducibility of research results. Additionally, we highlight a "Synthetic Utility Gap," where synthetic data generation the primary academic response to data scarcity often fails to predict forensic utility. This can lead to model overfitting to generation artefacts instead of the actual attack. This review consolidates our findings, identifies critical research gaps, and provides a definitive reference framework that outlines a prescriptive roadmap for future research aimed at developing secure, robust, and globally generalizable PAD systems.

Paperid: 1179, https://arxiv.org/pdf/2511.05133.pdf

Abstract:
While Intent-Based Networking (IBN) promises operational efficiency through autonomous and abstraction-driven network management, a critical unaddressed issue lies in IBN's implicit trust in the integrity of intent ingested by the network. This inherent assumption of data reliability creates a blind spot exploitable by Man-in-the-Middle (MitM) attacks, where an adversary intercepts and alters intent before it is enacted, compelling the network to orchestrate malicious configurations. This study proposes a secured IBN (sIBN) system with data driven intrusion detection method designed to secure legitimate user intent from adversarial tampering. The proposed intent intrusion detection system uses a ML model applied for network behavioral anomaly detection to reveal temporal patterns of intent tampering. This is achieved by leveraging a set of original behavioral metrics and newly engineered time-aware features, with the model's hyperparameters fine-tuned through the randomized search cross-validation (RSCV) technique. Numerical results based on real-world data sets, show the effectiveness of sIBN, achieving the best performance across standard evaluation metrics, in both binary and multi classification tasks, while maintaining low error rates.

Paperid: 1180, https://arxiv.org/pdf/2511.04707.pdf

Abstract:
Recent advances in long-context language models (LMs) have enabled million-token inputs, expanding their capabilities across complex tasks like computer-use agents. Yet, the safety implications of these extended contexts remain unclear. To bridge this gap, we introduce NINJA (short for Needle-in-haystack jailbreak attack), a method that jailbreaks aligned LMs by appending benign, model-generated content to harmful user goals. Critical to our method is the observation that the position of harmful goals play an important role in safety. Experiments on standard safety benchmark, HarmBench, show that NINJA significantly increases attack success rates across state-of-the-art open and proprietary models, including LLaMA, Qwen, Mistral, and Gemini. Unlike prior jailbreaking methods, our approach is low-resource, transferable, and less detectable. Moreover, we show that NINJA is compute-optimal -- under a fixed compute budget, increasing context length can outperform increasing the number of trials in best-of-N jailbreak. These findings reveal that even benign long contexts -- when crafted with careful goal positioning -- introduce fundamental vulnerabilities in modern LMs.

Paperid: 1181, https://arxiv.org/pdf/2511.02780.pdf

Abstract:
Smart contracts operate in a highly adversarial environment, where vulnerabilities can lead to substantial financial losses. Thus, smart contracts are subject to security audits. In auditing, proof-of-concept (PoC) exploits play a critical role by demonstrating to the stakeholders that the reported vulnerabilities are genuine, reproducible, and actionable. However, manually creating PoCs is time-consuming, error-prone, and often constrained by tight audit schedules. We introduce POCO, an agentic framework that automatically generates executable PoC exploits from natural-language vulnerability descriptions written by auditors. POCO autonomously generates PoC exploits in an agentic manner by interacting with a set of code-execution tools in a Reason-Act-Observe loop. It produces fully executable exploits compatible with the Foundry testing framework, ready for integration into audit reports and other security tools. We evaluate POCO on a dataset of 23 real-world vulnerability reports. POCO consistently outperforms the prompting and workflow baselines, generating well-formed and logically correct PoCs. Our results demonstrate that agentic frameworks can significantly reduce the effort required for high-quality PoCs in smart contract audits. Our contribution provides readily actionable knowledge for the smart contract security community.

Paperid: 1182, https://arxiv.org/pdf/2510.25819.pdf

Abstract:
The rapid rise of AI agents presents urgent challenges in authentication, authorization, and identity management. Current agent-centric protocols (like MCP) highlight the demand for clarified best practices in authentication and authorization. Looking ahead, ambitions for highly autonomous agents raise complex long-term questions regarding scalable access control, agent-centric identities, AI workload differentiation, and delegated authority. This OpenID Foundation whitepaper is for stakeholders at the intersection of AI agents and access management. It outlines the resources already available for securing today's agents and presents a strategic agenda to address the foundational authentication, authorization, and identity problems pivotal for tomorrow's widespread autonomous systems.

Paperid: 1183, https://arxiv.org/pdf/2510.24408.pdf

Abstract:
As the core of the Internet infrastructure, the TCP/IP protocol stack undertakes the task of network data transmission. However, due to the complexity of the protocol and the uncertainty of cross-layer interaction, there are often inconsistencies between the implementation of the protocol stack code and the RFC standard. This inconsistency may not only lead to differences in protocol functions but also cause serious security vulnerabilities. At present, with the continuous expansion of protocol stack functions and the rapid iteration of RFC documents, it is increasingly important to detect and fix these inconsistencies. With the rise of large language models, researchers have begun to explore how to extract protocol specifications from RFC documents through these models, including protocol stack modeling, state machine extraction, text ambiguity analysis, and other related content. However, existing methods rely on predefined patterns or rule-based approaches that fail to generalize across different protocol specifications. Automated and scalable detection of these inconsistencies remains a significant challenge. In this study, we propose an automated analysis framework based on LLM and differential models. By modeling the iterative relationship of the protocol and based on the iterative update relationship of the RFC standard, we perform incremental code function analysis on different versions of kernel code implementations to automatically perform code detection and vulnerability analysis. We conduct extensive evaluations to validate the effectiveness of our framework, demonstrating its effectiveness in identifying potential vulnerabilities caused by RFC code inconsistencies.

Paperid: 1184, https://arxiv.org/pdf/2510.23389.pdf

Abstract:
The behaviour of neural network components must be proven correct before deployment in safety-critical systems. Unfortunately, existing neural network verification techniques cannot certify the absence of faults at the software level. In this paper, we show how to specify and verify that neural networks are safe, by explicitly reasoning about their floating-point implementation. In doing so, we construct NeuroCodeBench 2.0, a benchmark comprising 912 neural network verification examples that cover activation functions, common layers, and full neural networks of up to 170K parameters. Our verification suite is written in plain C and is compatible with the format of the International Competition on Software Verification (SV-COMP). Thanks to it, we can conduct the first rigorous evaluation of eight state-of-the-art software verifiers on neural network code. The results show that existing automated verification tools can correctly solve an average of 11% of our benchmark, while producing around 3% incorrect verdicts. At the same time, a historical analysis reveals that the release of our benchmark has already had a significantly positive impact on the latter.

Paperid: 1185, https://arxiv.org/pdf/2510.22963.pdf

Abstract:
LLM-powered agents often use prompt compression to reduce inference costs, but this introduces a new security risk. Compression modules, which are optimized for efficiency rather than safety, can be manipulated by adversarial inputs, causing semantic drift and altering LLM behavior. This work identifies prompt compression as a novel attack surface and presents CompressionAttack, the first framework to exploit it. CompressionAttack includes two strategies: HardCom, which uses discrete adversarial edits for hard compression, and SoftCom, which performs latent-space perturbations for soft compression. Experiments on multiple LLMs show up to 80% attack success and 98% preference flips, while remaining highly stealthy and transferable. Case studies in VSCode Cline and Ollama confirm real-world impact, and current defenses prove ineffective, highlighting the need for stronger protections.

Paperid: 1186, https://arxiv.org/pdf/2510.20056.pdf

Abstract:
The rapid growth of electric vehicles (EVs) has driven the development of roadway wireless charging technology, effectively extending EV driving range. However, wireless charging introduces significant cybersecurity challenges. Any receiver within the magnetic field can potentially extract energy, and previous research demonstrated that a hacker could detect the operating frequency and steal substantial power. However, our approach required time to track new frequencies or precise adjustments of inductance and capacitance, which would be less effective against potential rapid transmitter frequency changes or capacitance drift. As a solution, we enhanced the interceptor and enabled it to intrude as well as steal energy within just three cycles of the high-frequency signal. Moreover, it can work without any circuit parameters or look-up tables. The key innovation is synchronizing the receiver current with the phase of the magnetic sensor voltage. Through MATLAB / Simulink simulations, finite-element analysis, and experimental validation, we demonstrated that our improved method can steal over 76% of the power received by a fully resonant receiver under identical conditions. This attack demonstrates that simple frequency-changing power encryption offers limited protection against such threats.

Paperid: 1187, https://arxiv.org/pdf/2510.19856.pdf

Abstract:
In recent years, blockchain has experienced widespread adoption across various industries, becoming integral to numerous enterprise applications. Concurrently, the rise of generative AI and LLMs has transformed human-computer interactions, offering advanced capabilities in understanding and generating human-like text. The introduction of the MCP has further enhanced AI integration by standardizing communication between AI systems and external data sources. Despite these advancements, there is still no standardized method for seamlessly integrating LLM applications and blockchain. To address this concern, we propose "MCC: Model Context Contracts" a novel framework that enables LLMs to interact directly with blockchain smart contracts through MCP-like protocol. This integration allows AI agents to invoke blockchain smart contracts, facilitating more dynamic and context-aware interactions between users and blockchain networks. Essentially, it empowers users to interact with blockchain systems and perform transactions using queries in natural language. Within this proposed architecture, blockchain smart contracts can function as intelligent agents capable of recognizing user input in natural language and executing the corresponding transactions. To ensure that the LLM accurately interprets natural language inputs and maps them to the appropriate MCP functions, the LLM was fine-tuned using a custom dataset comprising user inputs paired with their corresponding MCP server functions. This fine-tuning process significantly improved the platform's performance and accuracy. To validate the effectiveness of MCC, we have developed an end-to-end prototype implemented on the Rahasak blockchain with the fine-tuned Llama-4 LLM. To the best of our knowledge, this research represents the first approach to using the concept of Model Context Protocol to integrate LLMs with blockchain.

Paperid: 1188, https://arxiv.org/pdf/2510.18448.pdf

Abstract:
The Proof-of-Concept (PoC) for a vulnerability is crucial in validating its existence, mitigating false positives, and illustrating the severity of the security threat it poses. However, research on PoCs significantly lags behind studies focusing on vulnerability data. This discrepancy can be directly attributed to several challenges, including the dispersion of real-world PoCs across multiple platforms, the diversity in writing styles, and the difficulty associated with PoC reproduction. To fill this gap, we conduct the first large-scale study on PoCs in the wild, assessing their report availability, completeness, reproducibility. Specifically, 1) to investigate PoC reports availability for CVE vulnerability, we collected an extensive dataset of 470,921 PoCs and their reports from 13 platforms, representing the broadest collection of publicly available PoCs to date. 2) To assess the completeness of PoC report at a fine-grained level, we proposed a component extraction method, which combines pattern-matching techniques with a fine-tuned BERT-NER model to extract 9 key components from PoC reports. 3) To evaluate the effectiveness of PoCs, we recruited 8 participants to manually reproduce 150 sampled vulnerabilities with 32 vulnerability types based on PoC reports, enabling an in-depth analysis of PoC reproducibility and the factors influencing it. Our findings reveal that 78.9% of CVE vulnerabilities lack available PoCs, and existing PoC reports typically miss about 30% of the essential components required for effective vulnerability understanding and reproduction, with various reasons identified for the failure to reproduce vulnerabilities using available PoC reports. Finally, we proposed actionable strategies for stakeholders to enhance the overall usability of vulnerability PoCs in strengthening software security.

Paperid: 1189, https://arxiv.org/pdf/2510.15083.pdf

Abstract:
The Synthetic Minority Over-sampling Technique (SMOTE) is one of the most widely used methods for addressing class imbalance and generating synthetic data. Despite its popularity, little attention has been paid to its privacy implications; yet, it is used in the wild in many privacy-sensitive applications. In this work, we conduct the first systematic study of privacy leakage in SMOTE: We begin by showing that prevailing evaluation practices, i.e., naive distinguishing and distance-to-closest-record metrics, completely fail to detect any leakage and that membership inference attacks (MIAs) can be instantiated with high accuracy. Then, by exploiting SMOTE's geometric properties, we build two novel attacks with very limited assumptions: DistinSMOTE, which perfectly distinguishes real from synthetic records in augmented datasets, and ReconSMOTE, which reconstructs real minority records from synthetic datasets with perfect precision and recall approaching one under realistic imbalance ratios. We also provide theoretical guarantees for both attacks. Experiments on eight standard imbalanced datasets confirm the practicality and effectiveness of these attacks. Overall, our work reveals that SMOTE is inherently non-private and disproportionately exposes minority records, highlighting the need to reconsider its use in privacy-sensitive applications.

Paperid: 1190, https://arxiv.org/pdf/2510.15068.pdf

Abstract:
Multimodal large language models (MLLMs) exhibit remarkable capabilities but remain susceptible to jailbreak attacks exploiting cross-modal vulnerabilities. In this work, we introduce a novel method that leverages sequential comic-style visual narratives to circumvent safety alignments in state-of-the-art MLLMs. Our method decomposes malicious queries into visually innocuous storytelling elements using an auxiliary LLM, generates corresponding image sequences through diffusion models, and exploits the models' reliance on narrative coherence to elicit harmful outputs. Extensive experiments on harmful textual queries from established safety benchmarks show that our approach achieves an average attack success rate of 83.5\%, surpassing prior state-of-the-art by 46\%. Compared with existing visual jailbreak methods, our sequential narrative strategy demonstrates superior effectiveness across diverse categories of harmful content. We further analyze attack patterns, uncover key vulnerability factors in multimodal safety mechanisms, and evaluate the limitations of current defense strategies against narrative-driven attacks, revealing significant gaps in existing protections.

Paperid: 1191, https://arxiv.org/pdf/2510.13361.pdf

Abstract:
Despite the rapid progress of neural networks, they remain highly vulnerable to adversarial examples, for which adversarial training (AT) is currently the most effective defense. While AT has been extensively studied, its practical applications expose two major limitations: natural accuracy tends to degrade significantly compared with standard training, and robustness does not transfer well across attacks crafted under different norm constraints. Unlike prior works that attempt to address only one issue within a single network, we propose to partition the overall generalization goal into multiple sub-tasks, each assigned to a dedicated base learner. By specializing in its designated objective, each base learner quickly becomes an expert in its field. In the later stages of training, we interpolate their parameters to form a knowledgeable global learner, while periodically redistributing the global parameters back to the base learners to prevent their optimization trajectories from drifting too far from the shared target. We term this framework Generalist and introduce three variants tailored to different application scenarios. Both theoretical analysis and extensive experiments demonstrate that Generalist achieves lower generalization error and significantly alleviates the trade-off problems compared with baseline methods. Our results suggest that Generalist provides a promising step toward developing fully robust classifiers in the future.

Paperid: 1192, https://arxiv.org/pdf/2510.10761.pdf

Abstract:
Decentralized communication is becoming an important use case within Web3. On Ethereum, users can repurpose the transaction input data field to embed natural-language messages, commonly known as Input Data Messages (IDMs). However, as IDMs gain wider adoption, there has been a growing volume of toxic content on-chain. This trend is concerning, as Ethereum provides no protocol-level support for content moderation. We propose two moderation frameworks for Ethereum IDMs: (i) BUILDERMOD, where builders perform semantic checks during block construction; and (ii) USERMOD, where users proactively obtain moderation proofs from external classifiers and embed them in transactions. Our evaluation reveals that BUILDERMOD incurs high block-time overhead, which limits its practicality. In contrast, USERMOD enables lower-latency validation and scales more effectively, making it a more practical approach in moderation-aware Ethereum environments. Our study lays the groundwork for protocol-level content governance in decentralized systems, and we hope it contributes to the development of a decentralized communication environment that is safe, trustworthy, and socially responsible.

Paperid: 1193, https://arxiv.org/pdf/2510.09462.pdf

Abstract:
AI control protocols serve as a defense mechanism to stop untrusted LLM agents from causing harm in autonomous settings. Prior work treats this as a security problem, stress testing with exploits that use the deployment context to subtly complete harmful side tasks, such as backdoor insertion. In practice, most AI control protocols are fundamentally based on LLM monitors, which can become a central point of failure. We study adaptive attacks by an untrusted model that knows the protocol and the monitor model, which is plausible if the untrusted model was trained with a later knowledge cutoff or can search for this information autonomously. We instantiate a simple adaptive attack vector by which the attacker embeds publicly known or zero-shot prompt injections in the model outputs. Using this tactic, frontier models consistently evade diverse monitors and complete malicious tasks on two main AI control benchmarks. The attack works universally against current protocols that rely on a monitor. Furthermore, the recent Defer-to-Resample protocol even backfires, as its resampling amplifies the prompt injection and effectively reframes it as a best-of-$n$ attack. In general, adaptive attacks on monitor models represent a major blind spot in current control protocols and should become a standard component of evaluations for future AI control mechanisms.

Paperid: 1194, https://arxiv.org/pdf/2510.09210.pdf

Abstract:
In recent years, data poisoning attacks have been increasingly designed to appear harmless and even beneficial, often with the intention of verifying dataset ownership or safeguarding private data from unauthorized use. However, these developments have the potential to cause misunderstandings and conflicts, as data poisoning has traditionally been regarded as a security threat to machine learning systems. To address this issue, it is imperative for harmless poisoning generators to claim ownership of their generated datasets, enabling users to identify potential poisoning to prevent misuse. In this paper, we propose the deployment of watermarking schemes as a solution to this challenge. We introduce two provable and practical watermarking approaches for data poisoning: {\em post-poisoning watermarking} and {\em poisoning-concurrent watermarking}. Our analyses demonstrate that when the watermarking length is $Θ(\sqrt{d}/ε_w)$ for post-poisoning watermarking, and falls within the range of $Θ(1/ε_w^2)$ to $O(\sqrt{d}/ε_p)$ for poisoning-concurrent watermarking, the watermarked poisoning dataset provably ensures both watermarking detectability and poisoning utility, certifying the practicality of watermarking under data poisoning attacks. We validate our theoretical findings through experiments on several attacks, models, and datasets.

Paperid: 1195, https://arxiv.org/pdf/2510.08918.pdf

Abstract:
Fuzzing has become a cornerstone technique for uncovering vulnerabilities and enhancing the security of OS kernels. However, state-of-the-art kernel fuzzers, including the de facto standard Syzkaller, struggle to generate valid syscall sequences that respect implicit Syscall Dependency Relations (SDRs). Consequently, many generated seeds either fail kernel validation or cannot penetrate deep execution paths, resulting in significant inefficiency. We hypothesize that SDRs can be effectively learned from both historic and present kernel execution data, and that incorporating these learned relations into fuzzing can substantially improve seed validity and diversity. To validate this, we propose an approach that utilizes the N-gram model to mine SDRs from the Dongting dataset-one of the largest Linux kernel execution datasets available-as well as from execution traces collected on the fly during fuzzing. The resulting model is used to continuously augment the Choice Table of Syzkaller to improve its seed generation and demonstrably increases the Shannon Entropy of the Choice Table throughout fuzzing, reflecting more empirically-grounded choices in expanding syscall sequences into valid and diverse seeds. In addition, we introduce a Random Walk strategy that instructs Syzkaller to construct seeds in a bidirectional manner to further diversify the generated seeds. We implement our approach in a prototype, Psyzkaller, built on top of Syzkaller. Experiments on three representative Linux kernel versions show that Psyzkaller improves Syzkaller's code coverage by 4.6%-7.0% in 48-hour fuzzing, while triggering 110.4%-187.2% more crashes. Moreover, our investigation shows that Psyzkaller discovered eight previously unknown kernel vulnerabilities, compared to only one found by Syzkaller.

Paperid: 1196, https://arxiv.org/pdf/2510.08797.pdf

Abstract:
AI-powered attacks on Learning with Errors (LWE), an important hard math problem in post-quantum cryptography, rival or outperform "classical" attacks on LWE under certain parameter settings. Despite the promise of this approach, a dearth of accessible data limits AI practitioners' ability to study and improve these attacks. Creating LWE data for AI model training is time- and compute-intensive and requires significant domain expertise. To fill this gap and accelerate AI research on LWE attacks, we propose the TAPAS datasets, a Toolkit for Analysis of Post-quantum cryptography using AI Systems. These datasets cover several LWE settings and can be used off-the-shelf by AI practitioners to prototype new approaches to cracking LWE. This work documents TAPAS dataset creation, establishes attack performance baselines, and lays out directions for future work.

Paperid: 1197, https://arxiv.org/pdf/2510.07533.pdf

Abstract:
Palm recognition has emerged as a dominant biometric authentication technology in critical infrastructure. These systems operate in either single-modal form, using palmprint or palmvein individually, or dual-modal form, fusing the two modalities. Despite this diversity, they share similar hardware architectures that inadvertently emit electromagnetic (EM) signals during operation. Our research reveals that these EM emissions leak palm biometric information, motivating us to develop EMPalm--an attack framework that covertly recovers both palmprint and palmvein images from eavesdropped EM signals. Specifically, we first separate the interleaved transmissions of the two modalities, identify and combine their informative frequency bands, and reconstruct the images. To further enhance fidelity, we employ a diffusion model to restore fine-grained biometric features unique to each domain. Evaluations on seven prototype and two commercial palm acquisition devices show that EMPalm can recover palm biometric information with high visual fidelity, achieving SSIM scores up to 0.79, PSNR up to 29.88 dB, and FID scores as low as 6.82 across all tested devices, metrics that collectively demonstrate strong structural similarity, high signal quality, and low perceptual discrepancy. To assess the practical implications of the attack, we further evaluate it against four state-of-the-art palm recognition models, achieving a model-wise average spoofing success rate of 65.30% over 6,000 samples from 100 distinct users.

Paperid: 1198, https://arxiv.org/pdf/2510.05052.pdf

Abstract:
The proliferation of powerful large language models (LLMs) has necessitated robust safety alignment, yet these models remain vulnerable to evolving adversarial attacks, including multi-turn jailbreaks that iteratively search for successful queries. Current defenses, primarily reactive and static, often fail to counter these search-based attacks. In this paper, we introduce ProAct, a novel proactive defense framework designed to disrupt and mislead autonomous jailbreaking processes. Our core idea is to intentionally provide adversaries with "spurious responses" that appear to be results of successful jailbreak attacks but contain no actual harmful content. These misleading responses provide false signals to the attacker's internal optimization loop, causing the adversarial search to terminate prematurely and effectively jailbreaking the jailbreak. By conducting extensive experiments across state-of-the-art LLMs, jailbreaking frameworks, and safety benchmarks, our method consistently and significantly reduces attack success rates by up to 92\%. When combined with other defense frameworks, it further reduces the success rate of the latest attack strategies to 0\%. ProAct represents an orthogonal defense strategy that can serve as an additional guardrail to enhance LLM safety against the most effective jailbreaking attacks.

Paperid: 1199, https://arxiv.org/pdf/2510.01173.pdf

Abstract:
In this work, we formulate and study the problem of image-editing detection and attribution: given a base image and a suspicious image, detection seeks to determine whether the suspicious image was derived from the base image using an AI editing model, while attribution further identifies the specific editing model responsible. Existing methods for detecting and attributing AI-generated images are insufficient for this problem, as they focus on determining whether an image was AI-generated/edited rather than whether it was edited from a particular base image. To bridge this gap, we propose EditTrack, the first framework for this image-editing detection and attribution problem. Building on four key observations about the editing process, EditTrack introduces a novel re-editing strategy and leverages carefully designed similarity metrics to determine whether a suspicious image originates from a base image and, if so, by which model. We evaluate EditTrack on five state-of-the-art editing models across six datasets, demonstrating that it consistently achieves accurate detection and attribution, significantly outperforming five baselines.

Paperid: 1200, https://arxiv.org/pdf/2509.25394.pdf

Abstract:
With the popularity of wireless charging, energy access protection and cybersecurity are gaining importance, especially in public places. Currently, the most common energy encryption method uses frequency and associated impedance variation. However, we have proven that this method is not reliable, since a hacker can detect the changing frequency and adjust the compensation. However, the previously presented system needed time to follow the updated frequency, while encryption systems may vary the frequency faster to avoid energy theft. Furthermore, the previous system required an additional sensor coil. To solve these problems, we optimized the attack and the associated system, which can intrude and steal energy within 0.2 ms. The key is the elimination of the time-consuming maximum receiver current regulation. Also, we use the main receiving coil rather than any additional sensor antenna to detect the magnetic field. Thus, the new hardware is even simpler. A simulation model and experimental results demonstrate the fast response speed of the attack on encrypted wireless power and steal 65% of the power. Overall, the applicability of the attack is highly improved and leaves less room for hardening the encryption. The results demonstrate that energy access protection needs to be given great attention.

Paperid: 1201, https://arxiv.org/pdf/2509.21786.pdf

Abstract:
With the development of Internet, privacy has become a close concern of users. Anonymous authentication plays an important role in privacy-preserving systems. $k$-times anonymous authentication ($k$-TAA) scheme allows members of a group to be authenticated anonymously by application providers up to $k$ times. Considering quantum computing attacks, lattice-based $k$-TAA was introduced. However, existing schemes do not support dynamically granting and revoking users. In this paper, we construct the first lattice-based dynamic $k$-TAA, which offers limited times anonymous authentication, dynamic member management, and post-quantum security. We present a concrete construction, and reduce its security to standard complexity assumptions. Notably, compared with existing lattice-based $k$-TAA, our scheme is efficient in terms of communication cost.

Paperid: 1202, https://arxiv.org/pdf/2509.20460.pdf

Abstract:
This paper addresses the problem of protecting network information from privacy system identification (SI) attacks when sharing cyber-physical system simulations. We model analyst observations of networked states as time-series outputs of a graph filter driven by differentially private (DP) nodal excitations, with the analyst aiming to infer the underlying graph shift operator (GSO). Unlike traditional SI, which estimates system parameters, we study the inverse problem: what assumptions prevent adversaries from identifying the GSO while preserving utility for legitimate analysis. We show that applying DP mechanisms to inputs provides formal privacy guarantees for the GSO, linking the $(ε,δ)$-DP bound to the spectral properties of the graph filter and noise covariance. More precisely, for DP Gaussian signals, the spectral characteristics of both the filter and noise covariance determine the privacy bound, with smooth filters and low-condition-number covariance yielding greater privacy.

Paperid: 1203, https://arxiv.org/pdf/2509.13563.pdf

Abstract:
Progressive Web Applications (PWAs) blend the advantages of web and native apps, offering features like offline access, push notifications, and installability. Beyond these, modern PWAs are increasingly granted system-level capabilities such as auto-start on login and shared context with native applications. However, their permission management remains poorly defined and inconsistently implemented across platforms and browsers. To investigate these gaps, we developed Permissioner, a cross-platform analysis tool, and conducted a systematic study of PWA permissions. Our analysis uncovered critical issues of inconsistency, incompleteness, and unclear boundaries in permission enforcement, leading to various attacks including permission leakage, device identification, and Permission API abuse. We further examined why some browsers resist adopting more granular permission controls, identifying trade-offs involving usability, compatibility, and platform limitations. Through collaboration with browser vendors, several issues reported in our findings were acknowledged and resolved, notably by Firefox and Chrome. Our work highlights the urgent need for a unified, robust permission model for PWAs and provides actionable guidance toward achieving this goal.

Paperid: 1204, https://arxiv.org/pdf/2509.13561.pdf

Abstract:
Progressive Web App (PWA) installation is critical for integrating web and mobile app functionalities, offering a seamless user experience. However, ensuring the security of the PWA installation lifecycle is essential for maintaining user trust and privacy. This paper introduces the GUARDIANPWA framework, a comprehensive approach to analyzing the PWA installation mechanism based on the CIA security principles (Confidentiality, Integrity, and Availability) and identifying areas where browser vendors fail to comply with these principles. Our study revealed 203 instances of non-compliance with security principles, highlighting how these irregularities in the PWA installation lifecycle can lead to potential violations of user privacy. For instance, in Firefox, PWAs installed in private mode incorrectly appear in normal mode, risking user confidentiality. Additionally, 29,465 PWAs are at risk because Samsung Internet does not display origins when PWAs navigate to third-party websites, undermining integrity. These findings were reported to browser vendors, leading to Firefox acknowledging four issues, resolving one, and planning to resolve two others. GUARDIANPWA supports developers by analyzing PWA manifest files for syntactic and semantic correctness, offering actionable recommendations, and helping to create PWAs that align with security best practices. By using GUARDIANPWA, developers and users can address critical security gaps and enhance compliance with CIA principles throughout the PWA installation lifecycle.

Paperid: 1205, https://arxiv.org/pdf/2509.10814.pdf

Abstract:
The prevalence of cryptographic API misuse (CAM) is compromising the effectiveness of cryptography and in turn the security of modern systems and applications. Despite extensive efforts to develop CAM detection tools, these tools typically rely on a limited set of predefined rules from human-curated knowledge. This rigid, rule-based approach hinders adaptation to evolving CAM patterns in real practices. We propose leveraging large language models (LLMs), trained on publicly available cryptography-related data, to automatically detect and classify CAMs in real-world code to address this limitation. Our method enables the development and continuous expansion of a CAM taxonomy, supporting developers and detection tools in tracking and understanding emerging CAM patterns. Specifically, we develop an LLM-agnostic prompt engineering method to guide LLMs in detecting CAM instances from C/C++, Java, Python, and Go code, and then classifying them into a hierarchical taxonomy. Using a data set of 3,492 real-world software programs, we demonstrate the effectiveness of our approach with mainstream LLMs, including GPT, Llama, Gemini, and Claude. It also allows us to quantitatively measure and compare the performance of these LLMs in analyzing CAM in realistic code. Our evaluation produced a taxonomy with 279 base CAM categories, 36 of which are not addressed by existing taxonomies. To validate its practical value, we encode 11 newly identified CAM types into detection rules and integrate them into existing tools. Experiments show that such integration expands the tools' detection capabilities.

Paperid: 1206, https://arxiv.org/pdf/2509.10655.pdf

Abstract:
While the widespread deployment of Large Language Models (LLMs) holds great potential for society, their vulnerabilities to adversarial manipulation and exploitation can pose serious safety, security, and ethical risks. As new threats continue to emerge, it becomes critically necessary to assess the landscape of LLMs' safety and security against evolving adversarial prompt techniques. To understand the behavior of LLMs, this research provides an empirical analysis and risk profile of nine prominent LLMs, Claude Opus 4, DeepSeek V3 (both open-source and online), Gemini 2.5 Flash, GPT-4o, Grok 3, Llama 4 Scout, Mistral 7B, and Qwen 3 1.7B, against 24 different security and safety categories. These LLMs are evaluated on their ability to produce harmful responses for adversarially crafted prompts (dataset has been made public) for a broad range of safety and security topics, such as promotion of violent criminal behavior, promotion of non-violent criminal activity, societal harms related to safety, illegal sexual content, dangerous code generation, and cybersecurity threats beyond code. Our study introduces the Risk Severity Index (RSI), an agile and scalable evaluation score, to quantify and compare the security posture and creating a risk profile of LLMs. As the LLM development landscape progresses, the RSI is intended to be a valuable metric for comparing the risks of LLMs across evolving threats. This research finds widespread vulnerabilities in the safety filters of the LLMs tested and highlights the urgent need for stronger alignment, responsible deployment practices, and model governance, particularly for open-access and rapidly iterated models.

Paperid: 1207, https://arxiv.org/pdf/2509.09989.pdf

Abstract:
The classification of network traffic using machine learning (ML) models is one of the primary mechanisms to address the security issues in IoT networks and/or IoT devices. However, the ML models often act as black-boxes that create a roadblock to take critical decision based on the model output. To address this problem, we design and develop a system, called rCamInspector, that employs Explainable AI (XAI) to provide reliable and trustworthy explanations to model output. rCamInspector adopts two classifiers, Flow Classifier - categorizes a flow into one of four classes, IoTCam, Conf, Share and Others, and SmartCam Classifier - classifies an IoTCam flow into one of six classes, Netatmo, Spy Clock, Canary, D3D, Ezviz, V380 Spy Bulb; both are IP address and transport port agnostic. rCamInspector is evaluated using 38GB of network traffic and our results show that XGB achieves the highest accuracy of 92% and 99% in the Flow and SmartCam classifiers respectively among eight supervised ML models. We analytically show that the traditional mutual information (MI) based feature importance cannot provide enough reliability on the model output of XGB in either classifiers. Using SHAP and LIME, we show that a separate set of features can be picked up to explain a correct prediction of XGB. For example, the feature Init Bwd Win Byts turns out to have the highest SHAP values to support the correct prediction of both IoTCam in Flow Classifier and Netatmo class in SmartCam Classifier. To evaluate the faithfulness of the explainers on our dataset, we show that both SHAP and LIME have a consistency of more than 0.7 and a sufficiency of 1.0. Comparing with existing works, we show that rCamInspector achieves a better accuracy (99%), precision (99%), and false negative rate (0.7%).

Paperid: 1208, https://arxiv.org/pdf/2509.09912.pdf

Abstract:
Peer review is the cornerstone of academic publishing, yet the process is increasingly strained by rising submission volumes, reviewer overload, and expertise mismatches. Large language models (LLMs) are now being used as "reviewer aids," raising concerns about their fairness, consistency, and robustness against indirect prompt injection attacks. This paper presents a systematic evaluation of LLMs as academic reviewers. Using a curated dataset of 1,441 papers from ICLR 2023 and NeurIPS 2022, we evaluate GPT-5-mini against human reviewers across ratings, strengths, and weaknesses. The evaluation employs structured prompting with reference paper calibration, topic modeling, and similarity analysis to compare review content. We further embed covert instructions into PDF submissions to assess LLMs' susceptibility to prompt injection. Our findings show that LLMs consistently inflate ratings for weaker papers while aligning more closely with human judgments on stronger contributions. Moreover, while overarching malicious prompts induce only minor shifts in topical focus, explicitly field-specific instructions successfully manipulate specific aspects of LLM-generated reviews. This study underscores both the promises and perils of integrating LLMs into peer review and points to the importance of designing safeguards that ensure integrity and trust in future review processes.

Paperid: 1209, https://arxiv.org/pdf/2509.09158.pdf

Abstract:
Protocol fuzzing is a scalable and cost-effective technique for identifying security vulnerabilities in deployed Internet of Things devices. During their operational phase, IoT devices often run lightweight servers to handle user interactions, such as video streaming or image capture in smart cameras. Implementation flaws in transport or application-layer security mechanisms can expose IoT devices to a range of threats, including unauthorized access and data leakage. This paper addresses the challenge of uncovering such vulnerabilities by leveraging protocol fuzzing techniques that inject crafted transport and application-layer packets into IoT communications. We present a mutation-based fuzzing tool, named IoTFuzzSentry, to identify specific non-trivial vulnerabilities in commercial IoT devices. We further demonstrate how these vulnerabilities can be exploited in real-world scenarios. We integrated our fuzzing tool into a well-known testing tool Cotopaxi and evaluated it with commercial-off-the-shelf IoT devices such as IP cameras and Smart Plug. Our evaluation revealed vulnerabilities categorized into 4 types (IoT Access Credential Leakage, Sneak IoT Live Video Stream, Creep IoT Live Image, IoT Command Injection) and we show their exploits using three IoT devices. We have responsibly disclosed all these vulnerabilities to the respective vendors. So far, we have published two CVEs, CVE-2024-41623 and CVE-2024-42531, and one is awaiting. To extend the applicability, we have investigated the traffic of six additional IoT devices and our analysis shows that these devices can have similar vulnerabilities, due to the presence of a similar set of application protocols. We believe that IoTFuzzSentry has the potential to discover unconventional security threats and allow IoT vendors to strengthen the security of their commercialized IoT devices automatically with negligible overhead.

Paperid: 1210, https://arxiv.org/pdf/2509.08485.pdf

Abstract:
The majority of consumer IoT devices lack mechanisms for administrators to monitor and control them, hindering tailored security policies. A key challenge is identifying whether a new device, especially a streaming IoT camera, has joined the network. We present zCamInspector, a system for identifying known IoT cameras with supervised classifiers (zCamClassifier) and detecting zero-day cameras with one-class classifiers (zCamDetector). We analyzed ~40GB of traffic across three datasets: Set I (six commercial IoT cameras), Set II (five open-source IoT cameras, ~1.5GB), and Set III (four conferencing and two video-sharing applications as non-IoT traffic). From each, 62 flow-based features were extracted using CICFlowmeter. zCamInspector employs seven supervised models (ET, DT, RF, KNN, XGB, LKSVM, GNB) and four one-class models (OCSVM, SGDOCSVM, IF, DeepSVDD). Results show that XGB identifies IoT cameras with >99% accuracy and false negatives as low as 0.3%, outperforming state-of-the-art methods. For zero-day detection, accuracies reached 93.20% (OCSVM), 96.55% (SGDOCSVM), 78.65% (IF), and 92.16% (DeepSVDD). When all devices were treated as zero-day, DeepSVDD performed best with mean training/testing accuracies of 96.03%/74.51%. zCamInspector also achieved >95% accuracy for specific devices, such as Spy Clock cameras, demonstrating its robustness for identifying and detecting zero-day IoT cameras in diverse network environments.

Paperid: 1211, https://arxiv.org/pdf/2509.07804.pdf

Abstract:
E-health record (EHR) contains a vast amount of continuously growing medical data and enables medical institutions to access patient health data conveniently.This provides opportunities for medical data mining which has important applications in identifying high-risk patients and improving disease diagnosis, etc.Since EHR contains sensitive patient information, how to protect patient privacy and enable mining on EHR data is important and challenging.Traditional public key encryption (PKE) can protect patient privacy, but cannot support flexible selective computation on encrypted EHR data.Functional encryption (FE) allows authorised users to compute function values of encrypted data without releasing other information, hence supporting selective computation on encrypted data. Nevertheless, existing FE schemes do not support fine-grained revocation and update, so they are unsuitable for EHR system. In this paper,we first propose an inner-product functional encryption with fine-grained revocation (IPFE-FR) scheme, and then apply it to a flexible EHR sharing system. Our scheme possesses the following features:(1) a group manager can revoke a specific function computation of medical institutions on encrypted EHR data,instead of all function computation rights. (2) a revoked medical institution is not allowed to compute the function value of encrypted EHR data not only generated after the revocation, but also generated before the revocation. (3) secret keys issued to the same medical institution are bound together to prevent collusion attacks. The formal definition and security model of the IPFE-FR scheme are proposed.Furthermore, we present a concrete construction and reduce its security to the Learning with Errors (LWE) assumption which is quantum-resistant. Finally, the theoretical analysis and experimental implementation of our scheme are conducted to show its efficiency.

Paperid: 1212, https://arxiv.org/pdf/2509.07131.pdf

Abstract:
Blockchain and smart contracts have garnered significant interest in recent years as the foundation of a decentralized, trustless digital ecosystem, thereby eliminating the need for traditional centralized authorities. Despite their central role in powering Web3, their complexity still presents significant barriers for non-expert users. To bridge this gap, Artificial Intelligence (AI)-based agents have emerged as valuable tools for interacting with blockchain environments, supporting a range of tasks, from analyzing on-chain data and optimizing transaction strategies to detecting vulnerabilities within smart contracts. While interest in applying AI to blockchain is growing, the literature still lacks a comprehensive survey that focuses specifically on the intersection with AI agents. Most of the related work only provides general considerations, without focusing on any specific domain. This paper addresses this gap by presenting the first Systematization of Knowledge dedicated to AI-driven systems for blockchain, with a special focus on their security and privacy dimensions, shedding light on their applications, limitations, and future research directions.

Paperid: 1213, https://arxiv.org/pdf/2509.06368.pdf

Abstract:
The immersive nature of XR introduces a fundamentally different set of security and privacy (S&P) challenges due to the unprecedented user interactions and data collection that traditional paradigms struggle to mitigate. As the primary architects of XR applications, developers play a critical role in addressing novel threats. However, to effectively support developers, we must first understand how they perceive and respond to different threats. Despite the growing importance of this issue, there is a lack of in-depth, threat-aware studies that examine XR S&P from the developers' perspective. To fill this gap, we interviewed 23 professional XR developers with a focus on emerging threats in XR. Our study addresses two research questions aiming to uncover existing problems in XR development and identify actionable paths forward. By examining developers' perceptions of S&P threats, we found that: (1) XR development decisions (e.g., rich sensor data collection, user-generated content interfaces) are closely tied to and can amplify S&P threats, yet developers are often unaware of these risks, resulting in cognitive biases in threat perception; and (2) limitations in existing mitigation methods, combined with insufficient strategic, technical, and communication support, undermine developers' motivation, awareness, and ability to effectively address these threats. Based on these findings, we propose actionable and stakeholder-aware recommendations to improve XR S&P throughout the XR development process. This work represents the first effort to undertake a threat-aware, developer-centered study in the XR domain -- an area where the immersive, data-rich nature of the XR technology introduces distinctive challenges.

Paperid: 1214, https://arxiv.org/pdf/2509.03806.pdf

Abstract:
Dynamic Searchable Symmetric Encryption (DSSE) allows secure searches over a dynamic encrypted database but suffers from inherent information leakage. Existing passive attacks against DSSE rely on persistent leakage monitoring to infer leakage patterns, whereas this work targets intermittent observation - a more practical threat model. We propose Peekaboo - a new universal attack framework - and the core design relies on inferring the search pattern and further combining it with auxiliary knowledge and other leakage. We instantiate Peekaboo over the SOTA attacks, Sap (USENIX' 21) and Jigsaw (USENIX' 24), to derive their "+" variants (Sap+ and Jigsaw+). Extensive experiments demonstrate that our design achieves >0.9 adjusted rand index for search pattern recovery and 90% query accuracy vs. FMA's 30% (CCS' 23). Peekaboo's accuracy scales with observation rounds and the number of observed queries but also it resists SOTA countermeasures, with >40% accuracy against file size padding and >80% against obfuscation.

Paperid: 1215, https://arxiv.org/pdf/2509.02413.pdf

Abstract:
Enabling automated decision-making processes by leveraging data-driven analysis is a core goal of Decision Support Systems (DSSs). In multi-party scenarios where decisions rely on distributed and sensitive data, though, ensuring confidentiality, verifiability, transparency, integrity, and consistency at once remains an open challenge for DSSs. To tackle this multi-faceted problem, we propose the Secure Platform for Automated decision Rules via Trusted Applications (SPARTA) approach. By leveraging Trusted Execution Environments (TEEs) at its core, SPARTA ensures that the decision logic and the data remain protected. To guarantee transparency and consistency of the decision process, SPARTA encodes decision rules into verifiable software objects deployed within TEEs. To maintain the confidentiality of the outcomes while keeping the information integrity, SPARTA employs cryptography techniques on notarized data based on user-definable access policies. Based on experiments conducted on public benchmarks and synthetic data, we find our approach to be practically applicable and scalable.

Paperid: 1216, https://arxiv.org/pdf/2509.02372.pdf

Abstract:
Large Language Models (LLMs) have become critical to modern software development, but their reliance on uncurated web-scale datasets for training introduces a significant security risk: the absorption and reproduction of malicious content. To systematically evaluate this risk, we introduce Scam2Prompt, a scalable automated auditing framework that identifies the underlying intent of a scam site and then synthesizes innocuous, developer-style prompts that mirror this intent, allowing us to test whether an LLM will generate malicious code in response to these innocuous prompts. In a large-scale study of four production LLMs (GPT-4o, GPT-4o-mini, Llama-4-Scout, and DeepSeek-V3), we found that Scam2Prompt's innocuous prompts triggered malicious URL generation in 4.24% of cases. To test the persistence of this security risk, we constructed Innoc2Scam-bench, a benchmark of 1,559 innocuous prompts that consistently elicited malicious code from all four initial LLMs. When applied to seven additional production LLMs released in 2025, we found the vulnerability is not only present but severe, with malicious code generation rates ranging from 12.7% to 43.8%. Furthermore, existing safety measures like state-of-the-art guardrails proved insufficient to prevent this behavior, with an overall detection rate of less than 0.3%.

Paperid: 1217, https://arxiv.org/pdf/2508.20890.pdf

Abstract:
Large Language Models (LLMs) are increasingly integrated into real-world applications, from virtual assistants to autonomous agents. However, their flexibility also introduces new attack vectors-particularly Prompt Injection (PI), where adversaries manipulate model behavior through crafted inputs. As attackers continuously evolve with paraphrased, obfuscated, and even multi-task injection strategies, existing benchmarks are no longer sufficient to capture the full spectrum of emerging threats. To address this gap, we construct a new benchmark that systematically extends prior efforts. Our benchmark subsumes the two widely-used existing ones while introducing new manipulation techniques and multi-task scenarios, thereby providing a more comprehensive evaluation setting. We find that existing defenses, though effective on their original benchmarks, show clear weaknesses under our benchmark, underscoring the need for more robust solutions. Our key insight is that while attack forms may vary, the adversary's intent-injecting an unauthorized task-remains invariant. Building on this observation, we propose PromptSleuth, a semantic-oriented defense framework that detects prompt injection by reasoning over task-level intent rather than surface features. Evaluated across state-of-the-art benchmarks, PromptSleuth consistently outperforms existing defense while maintaining comparable runtime and cost efficiency. These results demonstrate that intent-based semantic reasoning offers a robust, efficient, and generalizable strategy for defending LLMs against evolving prompt injection threats.

Paperid: 1218, https://arxiv.org/pdf/2508.18947.pdf

Abstract:
The integration of Large Language Models (LLMs) into Security Operations Centres (SOCs) presents a transformative, yet still evolving, opportunity to reduce analyst workload through human-AI collaboration. However, their real-world application in SOCs remains underexplored. To address this gap, we present a longitudinal study of 3,090 analyst queries from 45 SOC analysts over 10 months. Our analysis reveals that analysts use LLMs as on-demand aids for sensemaking and context-building, rather than for making high-stakes determinations, preserving analyst decision authority. The majority of queries are related to interpreting low-level telemetry (e.g., commands) and refining technical communication through short (1-3 turn) interactions. Notably, 93% of queries align with established cybersecurity competencies (NICE Framework), underscoring the relevance of LLM use for SOC-related tasks. Despite variations in tasks and engagement, usage trends indicate a shift from occasional exploration to routine integration, with growing adoption and sustained use among a subset of analysts. We find that LLMs function as flexible, on-demand cognitive aids that augment, rather than replace, SOC expertise. Our study provides actionable guidance for designing context-aware, human-centred AI assistance in security operations, highlighting the need for further in-the-wild research on real-world analyst-LLM collaboration, challenges, and impacts.

Paperid: 1219, https://arxiv.org/pdf/2508.18148.pdf

Abstract:
Intrusion Detection Systems (IDS) play a crucial role in network security defense. However, a significant challenge for IDS in training detection models is the shortage of adequately labeled malicious samples. To address these issues, this paper introduces a novel semi-supervised framework \textbf{GANGRL-LLM}, which integrates Generative Adversarial Networks (GANs) with Large Language Models (LLMs) to enhance malicious code generation and SQL Injection (SQLi) detection capabilities in few-sample learning scenarios. Specifically, our framework adopts a collaborative training paradigm where: (1) the GAN-based discriminator improves malicious pattern recognition through adversarial learning with generated samples and limited real samples; and (2) the LLM-based generator refines the quality of malicious code synthesis using reward signals from the discriminator. The experimental results demonstrate that even with a limited number of labeled samples, our training framework is highly effective in enhancing both malicious code generation and detection capabilities. This dual enhancement capability offers a promising solution for developing adaptive defense systems capable of countering evolving cyber threats.

Paperid: 1220, https://arxiv.org/pdf/2508.17660.pdf

Abstract:
Voice deepfake attacks, which artificially impersonate human speech for malicious purposes, have emerged as a severe threat. Existing defenses typically inject noise into human speech to compromise voice encoders in speech synthesis models. However, these methods degrade audio quality and require prior knowledge of the attack approaches, limiting their effectiveness in diverse scenarios. Moreover, real-time audios, such as speech in virtual meetings and voice messages, are still exposed to voice deepfake threats. To overcome these limitations, we propose ClearMask, a noise-free defense mechanism against voice deepfake attacks. Unlike traditional approaches, ClearMask modifies the audio mel-spectrogram by selectively filtering certain frequencies, inducing a transferable voice feature loss without injecting noise. We then apply audio style transfer to further deceive voice decoders while preserving perceived sound quality. Finally, optimized reverberation is introduced to disrupt the output of voice generation models without affecting the naturalness of the speech. Additionally, we develop LiveMask to protect streaming speech in real-time through a universal frequency filter and reverberation generator. Our experimental results show that ClearMask and LiveMask effectively prevent voice deepfake attacks from deceiving speaker verification models and human listeners, even for unseen voice synthesis models and black-box API services. Furthermore, ClearMask demonstrates resilience against adaptive attackers who attempt to recover the original audio signal from the protected speech samples.

Paperid: 1221, https://arxiv.org/pdf/2508.15848.pdf

Abstract:
AI-generated text (AIGT) detection evasion aims to reduce the detection probability of AIGT, helping to identify weaknesses in detectors and enhance their effectiveness and reliability in practical applications. Although existing evasion methods perform well, they suffer from high computational costs and text quality degradation. To address these challenges, we propose Self-Disguise Attack (SDA), a novel approach that enables Large Language Models (LLM) to actively disguise its output, reducing the likelihood of detection by classifiers. The SDA comprises two main components: the adversarial feature extractor and the retrieval-based context examples optimizer. The former generates disguise features that enable LLMs to understand how to produce more human-like text. The latter retrieves the most relevant examples from an external knowledge base as in-context examples, further enhancing the self-disguise ability of LLMs and mitigating the impact of the disguise process on the diversity of the generated text. The SDA directly employs prompts containing disguise features and optimized context examples to guide the LLM in generating detection-resistant text, thereby reducing resource consumption. Experimental results demonstrate that the SDA effectively reduces the average detection accuracy of various AIGT detectors across texts generated by three different LLMs, while maintaining the quality of AIGT.

Paperid: 1222, https://arxiv.org/pdf/2508.15036.pdf

Abstract:
The transformer architecture has become a cornerstone of modern AI, fueling remarkable progress across applications in natural language processing, computer vision, and multimodal learning. As these models continue to scale explosively for performance, implementation efficiency remains a critical challenge. Mixture of Experts (MoE) architectures, selectively activating specialized subnetworks (experts), offer a unique balance between model accuracy and computational cost. However, the adaptive routing in MoE architectures, where input tokens are dynamically directed to specialized experts based on their semantic meaning inadvertently opens up a new attack surface for privacy breaches. These input-dependent activation patterns leave distinctive temporal and spatial traces in hardware execution, which adversaries could exploit to deduce sensitive user data. In this work, we propose MoEcho, discovering a side channel analysis based attack surface that compromises user privacy on MoE based systems. Specifically, in MoEcho, we introduce four novel architectural side channels on different computing platforms, including Cache Occupancy Channels and Pageout+Reload on CPUs, and Performance Counter and TLB Evict+Reload on GPUs, respectively. Exploiting these vulnerabilities, we propose four attacks that effectively breach user privacy in large language models (LLMs) and vision language models (VLMs) based on MoE architectures: Prompt Inference Attack, Response Reconstruction Attack, Visual Inference Attack, and Visual Reconstruction Attack. MoEcho is the first runtime architecture level security analysis of the popular MoE structure common in modern transformers, highlighting a serious security and privacy threat and calling for effective and timely safeguards when harnessing MoE based models for developing efficient large scale AI services.

Paperid: 1223, https://arxiv.org/pdf/2508.14699.pdf

Abstract:
Credit card fraud detection (CCFD) is a critical application of Machine Learning (ML) in the financial sector, where accurately identifying fraudulent transactions is essential for mitigating financial losses. ML models have demonstrated their effectiveness in fraud detection task, in particular with the tabular dataset. While adversarial attacks have been extensively studied in computer vision and deep learning, their impacts on the ML models, particularly those trained on CCFD tabular datasets, remains largely unexplored. These latent vulnerabilities pose significant threats to the security and stability of the financial industry, especially in high-value transactions where losses could be substantial. To address this gap, in this paper, we present a holistic framework that investigate the robustness of CCFD ML model against adversarial perturbations under different circumstances. Specifically, the gradient-based attack methods are incorporated into the tabular credit card transaction data in both black- and white-box adversarial attacks settings. Our findings confirm that tabular data is also susceptible to subtle perturbations, highlighting the need for heightened awareness among financial technology practitioners regarding ML model security and trustworthiness. Furthermore, the experiments by transferring adversarial samples from gradient-based attack method to non-gradient-based models also verify our findings. Our results demonstrate that such attacks remain effective, emphasizing the necessity of developing robust defenses for CCFD algorithms.

Paperid: 1224, https://arxiv.org/pdf/2508.10677.pdf

Abstract:
Effective incident response (IR) is critical for mitigating cyber threats, yet security teams are overwhelmed by alert fatigue, high false-positive rates, and the vast volume of unstructured Cyber Threat Intelligence (CTI) documents. While CTI holds immense potential for enriching security operations, its extensive and fragmented nature makes manual analysis time-consuming and resource-intensive. To bridge this gap, we introduce a novel Retrieval-Augmented Generation (RAG)-based framework that leverages Large Language Models (LLMs) to automate and enhance IR by integrating dynamically retrieved CTI. Our approach introduces a hybrid retrieval mechanism that combines NLP-based similarity searches within a CTI vector database with standardized queries to external CTI platforms, facilitating context-aware enrichment of security alerts. The augmented intelligence is then leveraged by an LLM-powered response generation module, which formulates precise, actionable, and contextually relevant incident mitigation strategies. We propose a dual evaluation paradigm, wherein automated assessment using an auxiliary LLM is systematically cross-validated by cybersecurity experts. Empirical validation on real-world and simulated alerts demonstrates that our approach enhances the accuracy, contextualization, and efficiency of IR, alleviating analyst workload and reducing response latency. This work underscores the potential of LLM-driven CTI fusion in advancing autonomous security operations and establishing a foundation for intelligent, adaptive cybersecurity frameworks.

Paperid: 1225, https://arxiv.org/pdf/2508.08836.pdf

Abstract:
Training large language models (LLMs) is resource-intensive and expensive, making protecting intellectual property (IP) for LLMs crucial. Recently, embedding fingerprints into LLMs has emerged as a prevalent method for establishing model ownership. However, existing back-door-based methods suffer from limited stealth and efficiency. To simultaneously address these issues, we propose EditMF, a training-free fingerprinting paradigm that achieves highly imperceptible fingerprint embedding with minimal computational overhead. Ownership bits are mapped to compact, semantically coherent triples drawn from an encrypted artificial knowledge base (e.g., virtual author-novel-protagonist facts). Causal tracing localizes the minimal set of layers influencing each triple, and a zero-space update injects the fingerprint without perturbing unrelated knowledge. Verification requires only a single black-box query and succeeds when the model returns the exact pre-embedded protagonist. Empirical results on LLaMA and Qwen families show that EditMF combines high imperceptibility with negligible model's performance loss, while delivering robustness far beyond LoRA-based fingerprinting and approaching that of SFT embeddings. Extensive experiments demonstrate that EditMF is an effective and low-overhead solution for secure LLM ownership verification.

Paperid: 1226, https://arxiv.org/pdf/2508.08438.pdf

Abstract:
Global KV-cache sharing has emerged as a key optimization for accelerating large language model (LLM) inference. However, it exposes a new class of timing side-channel attacks, enabling adversaries to infer sensitive user inputs via shared cache entries. Existing defenses, such as per-user isolation, eliminate leakage but degrade performance by up to 38.9% in time-to-first-token (TTFT), making them impractical for high-throughput deployment. To address this gap, we introduce SafeKV (Secure and Flexible KV Cache Sharing), a privacy-aware KV-cache management framework that selectively shares non-sensitive entries while confining sensitive content to private caches. SafeKV comprises three components: (i) a hybrid, multi-tier detection pipeline that integrates rule-based pattern matching, a general-purpose privacy detector, and context-aware validation; (ii) a unified radix-tree index that manages public and private entries across heterogeneous memory tiers (HBM, DRAM, SSD); and (iii) entropy-based access monitoring to detect and mitigate residual information leakage. Our evaluation shows that SafeKV mitigates 94% - 97% of timing-based side-channel attacks. Compared to per-user isolation method, SafeKV improves TTFT by up to 40.58% and throughput by up to 2.66X across diverse LLMs and workloads. SafeKV reduces cache-induced TTFT overhead from 50.41% to 11.74% on Qwen3-235B. By combining fine-grained privacy control with high cache reuse efficiency, SafeKV reclaims the performance advantages of global sharing while providing robust runtime privacy guarantees for LLM inference.

Paperid: 1227, https://arxiv.org/pdf/2508.07635.pdf

Abstract:
In this work, we present an experimental deployment of a new design for combined quantum key distribution (QKD) and post-quantum cryptography (PQC). Novel to our system is the dynamic obfuscation of the QKD-PQC sequence of operations, the number of operations, and parameters related to the operations; coupled to the integration of a GPS-free quantum synchronization protocol within the QKD process. We compare the performance and overhead of our QKD-PQC system relative to a standard QKD system with one-time pad encryption, demonstrating that our design can operate in real time with little additional overhead caused by the new security features. Since our system can offer additional defensive strategies against a wide spectrum of practical attacks that undermine deployed QKD, PQC, and certain combinations of these two primitives, we suggest that our design represents one of the most secure communication systems currently available. Given the dynamic nature of its obfuscation attributes, our new system can also be adapted in the field to defeat yet-to-be-discovered practical attacks.

Paperid: 1228, https://arxiv.org/pdf/2508.07263.pdf

Abstract:
With the rise of 3D Gaussian Splatting (3DGS), a variety of digital watermarking techniques, embedding either 1D bitstreams or 2D images, are used for copyright protection. However, the robustness of these watermarking techniques against potential attacks remains underexplored. This paper introduces the first universal black-box attack framework, the Group-based Multi-objective Evolutionary Attack (GMEA), designed to challenge these watermarking systems. We formulate the attack as a large-scale multi-objective optimization problem, balancing watermark removal with visual quality. In a black-box setting, we introduce an indirect objective function that blinds the watermark detector by minimizing the standard deviation of features extracted by a convolutional network, thus rendering the feature maps uninformative. To manage the vast search space of 3DGS models, we employ a group-based optimization strategy to partition the model into multiple, independent sub-optimization problems. Experiments demonstrate that our framework effectively removes both 1D and 2D watermarks from mainstream 3DGS watermarking methods while maintaining high visual fidelity. This work reveals critical vulnerabilities in existing 3DGS copyright protection schemes and calls for the development of more robust watermarking systems.

Paperid: 1229, https://arxiv.org/pdf/2508.06734.pdf

Abstract:
Graph-based malware classifiers can achieve over 94% accuracy on standard Android datasets, yet we find they suffer accuracy drops of up to 45% when evaluated on previously unseen malware variants from the same family - a scenario where strong generalization would typically be expected. This highlights a key limitation in existing approaches: both the model architectures and their structure-only representations often fail to capture deeper semantic patterns. In this work, we propose a robust semantic enrichment framework that enhances function call graphs with contextual features, including function-level metadata and, when available, code embeddings derived from large language models. The framework is designed to operate under real-world constraints where feature availability is inconsistent, and supports flexible integration of semantic signals. To evaluate generalization under realistic domain and temporal shifts, we introduce two new benchmarks: MalNet-Tiny-Common and MalNet-Tiny-Distinct, constructed using malware family partitioning to simulate cross-family generalization and evolving threat behavior. Experiments across multiple graph neural network backbones show that our method improves classification performance by up to 8% under distribution shift and consistently enhances robustness when integrated with adaptation-based methods. These results offer a practical path toward building resilient malware detection systems in evolving threat environments.

Paperid: 1230, https://arxiv.org/pdf/2508.02551.pdf

Abstract:
Location-based augmented reality (LB-AR) applications, such as PokÃ©mon Go, stream sub-second GPS updates to deliver responsive and immersive user experiences. However, this high-frequency location reporting introduces serious privacy risks. Protecting privacy in LB-AR is significantly more challenging than in traditional location-based services (LBS), as it demands real-time location protection with strong per-location and trajectory-level privacy guaranteed while maintaining low latency and high quality of service (QoS). Existing methods fail to meet these combined demands. To fill the gap, we present PrivAR, the first client-side privacy framework for real-time LB-AR. PrivAR introduces two lightweight mechanisms: (i) Planar Staircase Mechanism (PSM) which designs a staircase-shaped distribution to generate noisy location with strong per-location privacy and low expected error; and (ii) Thresholded Reporting with PSM (TR-PSM), a selective scheme that releases a noisy location update only when a displacement exceeds a private threshold, enabling many-to-one mappings for enhanced trace-level privacy while preserving high QoS. We present theoretical analysis, extensive experiments on two public datasets and our proprietary GeoTrace dataset, and validate PrivAR on a PokÃ©mon-Go-style prototype. Results show PrivAR improves QoS (Gamescore) by up to 50%, while increasing attacker error by 1.8x over baseline with an additional 0.06 milliseconds runtime overhead.

Paperid: 1231, https://arxiv.org/pdf/2508.02145.pdf

Abstract:
Smart contract upgrades are increasingly common due to their flexibility in modifying deployed contracts, such as fixing bugs or adding new functionalities. Meanwhile, upgrades compromise the immutability of contracts, introducing significant security concerns. While existing research has explored the security impacts of contract upgrades, these studies are limited in collection of upgrade behaviors and identification of insecurities. To address these limitations, we conduct a comprehensive study on the insecurities of upgrade behaviors. First, we build a dataset containing 83,085 upgraded contracts and 20,902 upgrade chains. To our knowledge, this is the first large-scale dataset about upgrade behaviors, revealing their diversity and exposing gaps in public disclosure. Next, we develop a taxonomy of insecurities based on 37 real-world security incidents, categorizing eight types of upgrade risks and providing the first complete view of upgrade-related insecurities. Finally, we survey public awareness of these risks and existing mitigations. Our findings show that four types of security risks are overlooked by the public and lack mitigation measures. We detect these upgrade risks through a preliminary study, identifying 31,407 related issues - a finding that raises significant concerns.

Paperid: 1232, https://arxiv.org/pdf/2508.02092.pdf

Abstract:
Large language models represent significant investments in computation, data, and engineering expertise, making them extraordinarily valuable intellectual assets. Nevertheless, these AI assets remain vulnerable to unauthorized redistribution and commercial exploitation through fine-tuning or black-box deployment. Current fingerprinting approaches face a fundamental trade-off: intrinsic methods require full parameter access, while backdoor-based techniques employ statistically anomalous triggers easily detected and filtered by adversaries. To address these limitations, we introduce FPEdit, a novel knowledge-editing framework that injects semantically coherent natural language fingerprints by modifying a sparse subset of model weights. This ensures stealthy and precise ownership encoding without degrading the core functionality. Extensive experiments show that FPEdit achieves $95$-$100\%$ fingerprint retention under both full-parameter fine-tuning and parameter-efficient adaptation, while preserving performance on 24 downstream benchmarks. Moreover, FPEdit remains robust under quantization, pruning, and stochastic decoding, and can embed 10 fingerprint pairs into LLaMA2-7B in under 10 minutes using less than 32 GB of GPU memory, a $70\%$ reduction in resource requirements compared to existing techniques. These advances establish FPEdit as the first fingerprinting approach to simultaneously achieve robustness against adaptation, resistance to detection, and preservation of model utility, providing a minimally invasive solution for reliable provenance verification of large language models in adversarial deployment scenarios.

Paperid: 1233, https://arxiv.org/pdf/2507.22177.pdf

Abstract:
Microelectronic systems are widely used in many sensitive applications (e.g., manufacturing, energy, defense). These systems increasingly handle sensitive data (e.g., encryption key) and are vulnerable to diverse threats, such as, power side-channel attacks, which infer sensitive data through dynamic power profile. In this paper, we present a novel framework, POLARIS for mitigating power side channel leakage using an Explainable Artificial Intelligence (XAI) guided masking approach. POLARIS uses an unsupervised process to automatically build a tailored training dataset and utilize it to train a masking model.The POLARIS framework outperforms state-of-the-art mitigation solutions (e.g., VALIANT) in terms of leakage reduction, execution time, and overhead across large designs.

Paperid: 1234, https://arxiv.org/pdf/2507.21193.pdf

Abstract:
Next generation Radio Access Networks (RANs) introduce programmability, intelligence, and near real-time control through intelligent controllers, enabling enhanced security within the RAN and across broader 5G/6G infrastructures. This paper presents a comprehensive survey highlighting opportunities, challenges, and research gaps for Large Language Models (LLMs)-assisted explainable (XAI) intrusion detection (IDS) for secure future RAN environments. Motivated by this, we propose an LLM interpretable anomaly-based detection system for distributed denial-of-service (DDoS) attacks using multivariate time series key performance measures (KPMs), extracted from E2 nodes, within the Near Real-Time RAN Intelligent Controller (Near-RT RIC). An LSTM-based model is trained to identify malicious User Equipment (UE) behavior based on these KPMs. To enhance transparency, we apply post-hoc local explainability methods such as LIME and SHAP to interpret individual predictions. Furthermore, LLMs are employed to convert technical explanations into natural-language insights accessible to non-expert users. Experimental results on real 5G network KPMs demonstrate that our framework achieves high detection accuracy (F1-score > 0.96) while delivering actionable and interpretable outputs.

Paperid: 1235, https://arxiv.org/pdf/2507.20417.pdf

Abstract:
Recent advances in synthetic speech have made audio deepfakes increasingly realistic, posing significant security risks. Existing detection methods that rely on a single modality, either raw waveform embeddings or spectral based features, are vulnerable to non spoof disturbances and often overfit to known forgery algorithms, resulting in poor generalization to unseen attacks. To address these shortcomings, we investigate hybrid fusion frameworks that integrate self supervised learning (SSL) based representations with handcrafted spectral descriptors (MFCC , LFCC, CQCC). By aligning and combining complementary information across modalities, these fusion approaches capture subtle artifacts that single feature approaches typically overlook. We explore several fusion strategies, including simple concatenation, cross attention, mutual cross attention, and a learnable gating mechanism, to optimally blend SSL features with fine grained spectral cues. We evaluate our approach on four challenging public benchmarks and report generalization performance. All fusion variants consistently outperform an SSL only baseline, with the cross attention strategy achieving the best generalization with a 38% relative reduction in equal error rate (EER). These results confirm that joint modeling of waveform and spectral views produces robust, domain agnostic representations for audio deepfake detection.

Paperid: 1236, https://arxiv.org/pdf/2507.19367.pdf

Abstract:
Firmware updates remain the primary line of defense for IoT devices; however, the update channel itself has become a well-established attack vector. Existing defenses mainly focus on securing monolithic firmware images, leaving module-level customization -a growing user demand-largely unprotected and insufficiently explored. To address this gap, we conduct a pilot study on the update workflows of 200 Linux-based IoT devices across 23 vendors, uncovering five previously undocumented vulnerabilities caused by customization practices. A broader analysis of update-related CVEs from 2020 to 2024 reveals that over half originate from customization-induced issues. These findings highlight a critical yet underexamined reality: as customization increases, so does the attack surface, while current defenses fail to keep pace. We propose IMUP (Integrity-Centric Modular Update Platform), the first framework to address two key challenges: constructing a trustworthy cross-module integrity chain and scaling update performance under mass customization. IMUP combines three techniques: per-module chameleon hashing for integrity, server-side proof-of-work offloading to reduce device overhead, and server-side caching to reuse module combinations, minimizing rebuild costs. Security analysis shows that even when 95 percent of secret keys are exposed, forging a valid image incurs over 300 times the cost of the legitimate server. Experiments on heterogeneous IoT devices demonstrate that IMUP reduces server-side generation time by 2.9 times and device downtime by 5.9 times compared to a package-manager baseline.

Paperid: 1237, https://arxiv.org/pdf/2507.17516.pdf

Abstract:
Large-scale data collection, from national censuses to IoT-enabled smart homes, routinely gathers dozens of attributes per individual. These multi-attribute datasets are vital for analytics but pose significant privacy risks. Local Differential Privacy (LDP) is a powerful tool to protect user data privacy by allowing users to locally perturb their records before releasing to an untrusted data aggregator. However, existing LDP mechanisms either split the privacy budget across all attributes or treat each attribute independently, ignoring natural inter-attribute correlations. This leads to excessive noise or fragmented budgets, resulting in significant utility loss, particularly in high-dimensional settings. To overcome these limitations, we propose Correlated Randomized Response (Corr-RR), a novel LDP mechanism that leverages correlations among attributes to substantially improve utility while maintaining rigorous LDP guarantees. Corr-RR allocates the full privacy budget to perturb a single, randomly selected attribute and reconstructs the remaining attributes using estimated interattribute dependencies, without incurring additional privacy cost. To enable this, Corr-RR operates in two phases: (1) a subset of users apply standard LDP mechanisms to estimate correlations, and (2) each remaining user perturbs one attribute and infers the others using the learned correlations. We theoretically prove that Corr-RR satisfies $Îµ$-LDP, and extensive experiments on synthetic and real-world datasets demonstrate that Corr-RR consistently outperforms state-of-the-art LDP mechanisms, particularly in scenarios with many attributes and strong inter-attribute correlations.

Paperid: 1238, https://arxiv.org/pdf/2507.16872.pdf

Abstract:
Model compression is crucial for minimizing memory storage and accelerating inference in deep learning (DL) models, including recent foundation models like large language models (LLMs). Users can access different compressed model versions according to their resources and budget. However, while existing compression operations primarily focus on optimizing the trade-off between resource efficiency and model performance, the privacy risks introduced by compression remain overlooked and insufficiently understood. In this work, through the lens of membership inference attack (MIA), we propose CompLeak, the first privacy risk evaluation framework examining three widely used compression configurations that are pruning, quantization, and weight clustering supported by the commercial model compression framework of Google's TensorFlow-Lite (TF-Lite) and Facebook's PyTorch Mobile. CompLeak has three variants, given available access to the number of compressed models and original model. CompLeakNR starts by adopting existing MIA methods to attack a single compressed model, and identifies that different compressed models influence members and non-members differently. When the original model and one compressed model are available, CompLeakSR leverages the compressed model as a reference to the original model and uncovers more privacy by combining meta information (e.g., confidence vector) from both models. When multiple compressed models are available with/without accessing the original model, CompLeakMR innovatively exploits privacy leakage info from multiple compressed versions to substantially signify the overall privacy leakage. We conduct extensive experiments on seven diverse model architectures (from ResNet to foundation models of BERT and GPT-2), and six image and textual benchmark datasets.

Paperid: 1239, https://arxiv.org/pdf/2507.12314.pdf

Abstract:
While reinforcement learning-trained Large Reasoning Models (LRMs, e.g., Deepseek-R1) demonstrate advanced reasoning capabilities in the evolving Large Language Models (LLMs) domain, their susceptibility to security threats remains a critical vulnerability. This weakness is particularly evident in Chain-of-Thought (CoT) generation processes, where adversarial methods like backdoor prompt attacks can systematically subvert the model's core reasoning mechanisms. The emerging Chain-of-Thought Attack (CoTA) reveals this vulnerability through exploiting prompt controllability, simultaneously degrading both CoT safety and task performance with low-cost interventions. To address this compounded security-performance vulnerability, we propose Thought Purity (TP): a defense paradigm that systematically strengthens resistance to malicious content while preserving operational efficacy. Our solution achieves this through three synergistic components: (1) a safety-optimized data processing pipeline (2) reinforcement learning-enhanced rule constraints (3) adaptive monitoring metrics. Our approach establishes the first comprehensive defense mechanism against CoTA vulnerabilities in reinforcement learning-aligned reasoning systems, significantly advancing the security-functionality equilibrium for next-generation AI architectures.

Paperid: 1240, https://arxiv.org/pdf/2507.10873.pdf

Abstract:
Host-based intrusion detection system (HIDS) is a key defense component to protect the organizations from advanced threats like Advanced Persistent Threats (APT). By analyzing the fine-grained logs with approaches like data provenance, HIDS has shown successes in capturing sophisticated attack traces. Despite the progresses embarked by the research community and industry, HIDS still frequently encounters backlash from their operators in the deployed environments, due to issues like high false-positive rate, inconsistent outcomes across environments and human-unfriendly detection results. Large Language Models (LLMs) have great potentials to advance the state of HIDS, given their extensive knowledge of attack techniques and their ability to detect anomalies through semantic analysis, anchored by recent studies. Yet, our preliminary analysis indicates that building an HIDS by naively prompting an LLM is unlikely to succeed. In this work, we explore the direction of building a customized LLM pipeline for HIDS and develop a system named SHIELD. SHIELD addresses challenges related to LLM's token limits, confusion of background noises, etc., by integrating a variety of techniques like event-level Masked Autoencoder (MAE) for attack window detection, attack evidence identification and expansion, Deterministic Data Augmentation (DDA) for profiling normal activities, and multi-purpose prompting that guides the LLM to conduct precise and interpretable attack investigations. Extensive experiments on three log datasets (DARPA-E3, NodLink-simulated-data and ATLASv2) show that SHIELD consistently achieves outstanding performance in comparison with 5 representative HIDS. These findings highlight the potential of LLMs as powerful tools for intrusion detection and pave the way for future research in this domain.

Paperid: 1241, https://arxiv.org/pdf/2507.07916.pdf

Abstract:
Phishing has become a prominent risk in modern cybersecurity, often used to bypass technological defences by exploiting predictable human behaviour. Warning dialogues are a standard mitigation measure, but the lack of explanatory clarity and static content limits their effectiveness. In this paper, we report on our research to assess the capacity of Large Language Models (LLMs) to generate clear, concise, and scalable explanations for phishing warnings. We carried out a large-scale between-subjects user study (N = 750) to compare the influence of warning dialogues supplemented with manually generated explanations against those generated by two LLMs, Claude 3.5 Sonnet and Llama 3.3 70B. We investigated two explanatory styles (feature-based and counterfactual) for their effects on behavioural metrics (click-through rate) and perceptual outcomes (e.g., trust, risk, clarity). The results indicate that well-constructed LLM-generated explanations can equal or surpass manually crafted explanations in reducing susceptibility to phishing; Claude-generated warnings exhibited particularly robust performance. Feature-based explanations were more effective for genuine phishing attempts, whereas counterfactual explanations diminished false-positive rates. Other variables such as workload, gender, and prior familiarity with warning dialogues significantly moderated warning effectiveness. These results indicate that LLMs can be used to automatically build explanations for warning users against phishing, and that such solutions are scalable, adaptive, and consistent with human-centred values.

Paperid: 1242, https://arxiv.org/pdf/2507.07871.pdf

Abstract:
Watermarking offers a promising solution for GenAI providers to establish the provenance of their generated content. A watermark is a hidden signal embedded in the generated content, whose presence can later be verified using a secret watermarking key. A security threat to GenAI providers are \emph{forgery attacks}, where malicious users insert the provider's watermark into generated content that was \emph{not} produced by the provider's models, potentially damaging their reputation and undermining trust. One potential defense to resist forgery is using multiple keys to watermark generated content. However, it has been shown that forgery attacks remain successful when adversaries can collect sufficiently many watermarked samples. We propose an improved multi-key watermarking method that resists all surveyed forgery attacks and scales independently of the number of watermarked samples collected by the adversary. Our method accepts content as genuinely watermarked only if \emph{exactly} one watermark is detected. We focus on the image and text modalities, but our detection method is modality-agnostic, since it treats the underlying watermarking method as a black-box. We derive theoretical bounds on forgery-resistance and empirically validate them using Mistral-7B. Our results show a decrease in forgery success from up to $100\%$ using single-key baselines to only $2\%$. While our method resists all surveyed attacks, we find that highly capable, adaptive attackers can still achieve success rates of up to $65\%$ if watermarked content generated using different keys is easily separable.

Paperid: 1243, https://arxiv.org/pdf/2507.06497.pdf

Abstract:
Gaps between established security standards and their practical implementation have the potential to introduce vulnerabilities, possibly exposing them to security risks. To effectively address and mitigate these security and compliance challenges, security risk management strategies are essential. However, it must adhere to well-established strategies and industry standards to ensure consistency, reliability, and compatibility both within and across organizations. In this paper, we introduce a new hybrid risk assessment framework called TELSAFE, which employs probabilistic modeling for quantitative risk assessment and eliminates the influence of expert opinion bias. The framework encompasses both qualitative and quantitative assessment phases, facilitating effective risk management strategies tailored to the unique requirements of organizations. A specific use case utilizing Common Vulnerabilities and Exposures (CVE)-related data demonstrates the framework's applicability and implementation in real-world scenarios, such as in the telecommunications industry.

Paperid: 1244, https://arxiv.org/pdf/2507.02976.pdf

Abstract:
Large Language Models (LLMs) and their agentic frameworks are increasingly adopted to automate software development tasks such as issue resolution and program repair. While prior work has identified security risks in LLM-generated code, most evaluations have focused on synthetic or isolated settings, leaving open questions about the security of these systems in real-world development contexts. In this study, we present the first large-scale security analysis of LLM-generated patches using 20,000+ issues from the SWE-bench dataset. We evaluate patches produced by a standalone LLM (Llama 3.3) and compare them to developer-written patches. We also assess the security of patches generated by three top-performing agentic frameworks (OpenHands, AutoCodeRover, HoneyComb) on a subset of our data. Finally, we analyze a wide range of code, issue, and project-level factors to understand the conditions under which LLMs and agents are most likely to generate insecure code. Our findings reveal that the standalone LLM introduces nearly 9x more new vulnerabilities than developers, with many of these exhibiting unique patterns not found in developers' code. Agentic workflows also generate a significant number of vulnerabilities, particularly when granting LLMs more autonomy, potentially increasing the likelihood of misinterpreting project context or task requirements. We find that vulnerabilities are more likely to occur in LLM patches associated with a higher number of files, more lines of generated code, and GitHub issues that lack specific code snippets or information about the expected code behavior and steps to reproduce. These results suggest that contextual factors play a critical role in the security of the generated code and point toward the need for proactive risk assessment methods that account for both code and issue-level information to complement existing vulnerability detection tools.

Paperid: 1245, https://arxiv.org/pdf/2507.02727.pdf

Abstract:
Local differential privacy (LDP) provides a rigorous and quantifiable privacy guarantee for personal data by introducing perturbation at the data source. However, quantifying the impact of these perturbations on classifier utility remains a theoretical challenge, particularly for complex or black-box classifiers. This paper presents a framework for theoretically quantifying classifier utility under LDP mechanisms. The key insight is that LDP perturbation is concentrated around the original data with a specific probability, transforming utility analysis of the classifier into its robustness analysis in this concentrated region. Our framework connects the concentration analysis of LDP mechanisms with the robustness analysis of classifiers. It treats LDP mechanisms as general distributional functions and classifiers as black-box functions, thus applicable to any LDP mechanism and classifier. A direct application of our utility quantification is guiding the selection of LDP mechanisms and privacy parameters for a given classifier. Notably, our analysis shows that a piecewise-based mechanism leads to better utility compared to alternatives in common scenarios. Using this framework alongside two novel refinement techniques, we conduct case studies on utility quantification for typical mechanism-classifier combinations. The results demonstrate that our theoretical utility quantification aligns closely with empirical observations, particularly when classifiers operate in lower-dimensional input spaces.

Paperid: 1246, https://arxiv.org/pdf/2512.22765.pdf

Abstract:
Signed networks, encoding both positive and negative interactions, are essential for modeling complex systems in social and financial domains. Sign prediction, which infers the sign of a target link, has wide-ranging practical applications. Traditional motif-based Naïve Bayes models assume that all neighboring nodes contribute equally to a target link's sign, overlooking the heterogeneous influence among neighbors and potentially limiting performance. To address this, we propose a generalizable sign prediction framework that explicitly models the heterogeneity. Specifically, we design two role functions to quantify the differentiated influence of neighboring nodes. We further extend this approach from a single motif to multiple motifs via two strategies. The generalized multiple motifs-based Naïve Bayes model linearly combines information from diverse motifs, while the Feature-driven Generalized Motif-based Naïve Bayes (FGMNB) model integrates high-dimensional motif features using machine learning. Extensive experiments on four real-world signed networks show that FGMNB consistently outperforms five state-of-the-art embedding-based baselines on three of these networks. Moreover, we observe that the most predictive motif structures differ across datasets, highlighting the importance of local structural patterns and offering valuable insights for motif-based feature engineering. Our framework provides an effective and theoretically grounded solution to sign prediction, with practical implications for enhancing trust and security in online platforms.

Paperid: 1247, https://arxiv.org/pdf/2512.21681.pdf

Abstract:
Retrieval-Augmented Code Generation (RACG) is increasingly adopted to enhance Large Language Models for software development, yet its security implications remain dangerously underexplored. This paper conducts the first systematic exploration of a critical and stealthy threat: backdoor attacks targeting the retriever component, which represents a significant supply-chain vulnerability. It is infeasible to assess this threat realistically, as existing attack methods are either too ineffective to pose a real danger or are easily detected by state-of-the-art defense mechanisms spanning both latent-space analysis and token-level inspection, which achieve consistently high detection rates. To overcome this barrier and enable a realistic analysis, we first developed VenomRACG, a new class of potent and stealthy attack that serves as a vehicle for our investigation. Its design makes poisoned samples statistically indistinguishable from benign code, allowing the attack to consistently maintain low detectability across all evaluated defense mechanisms. Armed with this capability, our exploration reveals a severe vulnerability: by injecting vulnerable code equivalent to only 0.05% of the entire knowledge base size, an attacker can successfully manipulate the backdoored retriever to rank the vulnerable code in its top-5 results in 51.29% of cases. This translates to severe downstream harm, causing models like GPT-4o to generate vulnerable code in over 40% of targeted scenarios, while leaving the system's general performance intact. Our findings establish that retriever backdooring is not a theoretical concern but a practical threat to the software development ecosystem that current defenses are blind to, highlighting the urgent need for robust security measures.

Paperid: 1248, https://arxiv.org/pdf/2512.16650.pdf

Abstract:
Large language models often face a three-way trade-off among detection accuracy, inference latency, and deployment cost when used in real-world safety-sensitive applications. This paper introduces Prefix Probing, a black-box harmful content detection method that compares the conditional log-probabilities of "agreement/execution" versus "refusal/safety" opening prefixes and leverages prefix caching to reduce detection overhead to near first-token latency. During inference, the method requires only a single log-probability computation over the probe prefixes to produce a harmfulness score and apply a threshold, without invoking any additional models or multi-stage inference. To further enhance the discriminative power of the prefixes, we design an efficient prefix construction algorithm that automatically discovers highly informative prefixes, substantially improving detection performance. Extensive experiments demonstrate that Prefix Probing achieves detection effectiveness comparable to mainstream external safety models while incurring only minimal computational cost and requiring no extra model deployment, highlighting its strong practicality and efficiency.

Paperid: 1249, https://arxiv.org/pdf/2512.12921.pdf

Abstract:
Artificial intelligence (AI) systems are being readily and rapidly adopted, increasingly permeating critical domains: from consumer platforms and enterprise software to networked systems with embedded agents. While this has unlocked potential for human productivity gains, the attack surface has expanded accordingly: threats now span content safety failures (e.g., harmful or deceptive outputs), model and data integrity compromise (e.g., poisoning, supply-chain tampering), runtime manipulations (e.g., prompt injection, tool and agent misuse), and ecosystem risks (e.g., orchestration abuse, multi-agent collusion). Existing frameworks such as MITRE ATLAS, National Institute of Standards and Technology (NIST) AI 100-2 Adversarial Machine Learning (AML) taxonomy, and OWASP Top 10s for Large Language Models (LLMs) and Agentic AI Applications provide valuable viewpoints, but each covers only slices of this multi-dimensional space. This paper presents Cisco's Integrated AI Security and Safety Framework ("AI Security Framework"), a unified, lifecycle-aware taxonomy and operationalization framework that can be used to classify, integrate, and operationalize the full range of AI risks. It integrates AI security and AI safety across modalities, agents, pipelines, and the broader ecosystem. The AI Security Framework is designed to be practical for threat identification, red-teaming, risk prioritization, and it is comprehensive in scope and can be extensible to emerging deployments in multimodal contexts, humanoids, wearables, and sensory infrastructures. We analyze gaps in prevailing frameworks, discuss design principles for our framework, and demonstrate how the taxonomy provides structure for understanding how modern AI systems fail, how adversaries exploit these failures, and how organizations can build defenses across the AI lifecycle that evolve alongside capability advancements.

Paperid: 1250, https://arxiv.org/pdf/2512.08320.pdf

Abstract:
Cyber-physical systems (CPSs) are used extensively in critical infrastructure, underscoring the need for anomaly detection systems that are able to catch even the most motivated attackers. Traditional anomaly detection techniques typically do `one-off' training on datasets crafted by experts or generated by fuzzers, potentially limiting their ability to generalize to unseen and more subtle attack strategies. Stopping at this point misses a key opportunity: a defender can actively challenge the attacker to find more nuanced attacks, which in turn can lead to more effective detection capabilities. Building on this concept, we propose Evo-Defender, an evolutionary framework that iteratively strengthens CPS defenses through a dynamic attacker-defender interaction. Evo-Defender includes a smart attacker that employs guided fuzzing to explore diverse, non-redundant attack strategies, while the self-evolving defender uses incremental learning to adapt to new attack patterns. We implement Evo-Defender on two realistic CPS testbeds: the Tennessee Eastman process and a Robotic Arm Assembly Workstation, injecting over 600 attack scenarios. In end-to-end attack detection experiments, Evo-Defender achieves up to 2.7% higher performance than state-of-the-art baselines on unseen scenarios, while utilizing training data more efficiently for faster and more robust detection.

Paperid: 1251, https://arxiv.org/pdf/2512.06556.pdf

Abstract:
The Model Context Protocol (MCP) enables Large Language Models to integrate external tools through structured descriptors, increasing autonomy in decision-making, task execution, and multi-agent workflows. However, this autonomy creates a largely overlooked security gap. Existing defenses focus on prompt-injection attacks and fail to address threats embedded in tool metadata, leaving MCP-based systems exposed to semantic manipulation. This work analyzes three classes of semantic attacks on MCP-integrated systems: (1) Tool Poisoning, where adversarial instructions are hidden in tool descriptors; (2) Shadowing, where trusted tools are indirectly compromised through contaminated shared context; and (3) Rug Pulls, where descriptors are altered after approval to subvert behavior. To counter these threats, we introduce a layered security framework with three components: RSA-based manifest signing to enforce descriptor integrity, LLM-on-LLM semantic vetting to detect suspicious tool definitions, and lightweight heuristic guardrails that block anomalous tool behavior at runtime. Through evaluation of GPT-4, DeepSeek, and Llama-3.5 across eight prompting strategies, we find that security performance varies widely by model architecture and reasoning method. GPT-4 blocks about 71 percent of unsafe tool calls, balancing latency and safety. DeepSeek shows the highest resilience to Shadowing attacks but with greater latency, while Llama-3.5 is fastest but least robust. Our results show that the proposed framework reduces unsafe tool invocation rates without model fine-tuning or internal modification.

Paperid: 1252, https://arxiv.org/pdf/2512.03461.pdf

Abstract:
Non-volatile memories (NVMs) offer negligible leakage power consumption, high integration density, and data retention, but their non-volatility also raises the risk of data exposure. Conventional encryption techniques such as the Advanced Encryption Standard (AES) incur large area overheads and performance penalties, motivating lightweight XOR-based in-situ encryption schemes with low area and power requirements. This work proposes an ultra-dense single-transistor encrypted cell using ferroelectric FET (FeFET) devices, which, to our knowledge, is the first to eliminate the two-memory-devices-per-encrypted-cell requirement in XOR-based schemes, enabling encrypted memory arrays to maintain the same number of storage devices as unencrypted arrays. The key idea is an in-memory single-FeFET XOR scheme, where the ciphertext is encoded in the device threshold voltage and leverages the direction-dependent current flow of the FeFET for single-cycle decryption; eliminating complementary bit storage also removes the need for two write cycles, allowing faster encryption. We extend the approach to multi-level-cell (MLC) FeFETs to store multiple bits per transistor. We validate the proposed idea through both simulation and experimental evaluations. Our analysis on a 128x128-bit array shows 2x higher encryption/decryption throughput than prior FeFET work and 45.2x/14.12x improvement over AES, while application-level evaluations using neural-network benchmarks demonstrate average latency reductions of 50% and 95% compared to prior FeFET-based and AES-based schemes, respectively.

Paperid: 1253, https://arxiv.org/pdf/2512.01295.pdf

Abstract:
This paper articulates short- and long-term research problems in AI agent security and privacy, using the lens of computer systems security. This approach examines end-to-end security properties of entire systems, rather than AI models in isolation. While we recognize that hardening a single model is useful, it is important to realize that it is often insufficient. By way of an analogy, creating a model that is always helpful and harmless is akin to creating software that is always helpful and harmless. The collective experience of decades of cybersecurity research and practice shows that this is insufficient. Rather, constructing an informed and realistic attacker model before building a system, applying hard-earned lessons from software security, and continuous improvement of security posture is a tried-and-tested approach to securing real computer systems. A key goal is to examine where research challenges arise when applying traditional security principles in the context of AI agents. A secondary goal of this report is to distill these ideas for AI and ML practitioners and researchers. We discuss the challenges of applying security principles to agentic computing, present 11 case studies of real attacks on agentic systems, and define a series of new research problems specific to the security of agentic systems.

Paperid: 1254, https://arxiv.org/pdf/2512.00272.pdf

Abstract:
Approximate machine unlearning aims to efficiently remove the influence of specific data points from a trained model, offering a practical alternative to full retraining. However, it introduces privacy risks: an adversary with access to pre- and post-unlearning models can exploit their differences for membership inference or data reconstruction. We show these vulnerabilities arise from two factors: large gradient norms of forget-set samples and the close proximity of unlearned parameters to the original model. To demonstrate their severity, we propose unlearning-specific membership inference and reconstruction attacks, showing that several state-of-the-art methods (e.g., NGP, SCRUB) remain vulnerable. To mitigate this leakage, we introduce WARP, a plug-and-play teleportation defense that leverages neural network symmetries to reduce forget-set gradient energy and increase parameter dispersion while preserving predictions. This reparameterization obfuscates the signal of forgotten data, making it harder for attackers to distinguish forgotten samples from non-members or recover them via reconstruction. Across six unlearning algorithms, our approach achieves consistent privacy gains, reducing adversarial advantage (AUC) by up to 64% in black-box and 92% in white-box settings, while maintaining accuracy on retained data. These results highlight teleportation as a general tool for reducing attack success in approximate unlearning.

Paperid: 1255, https://arxiv.org/pdf/2511.21990.pdf

Abstract:
This paper introduces a dynamic and actionable framework for securing agentic AI systems in enterprise deployment. We contend that safety and security are not merely fixed attributes of individual models but also emergent properties arising from the dynamic interactions among models, orchestrators, tools, and data within their operating environments. We propose a new way of identification of novel agentic risks through the lens of user safety. Although, for traditional LLMs and agentic models in isolation, safety and security has a clear separation, through the lens of safety in agentic systems, they appear to be connected. Building on this foundation, we define an operational agentic risk taxonomy that unifies traditional safety and security concerns with novel, uniquely agentic risks, including tool misuse, cascading action chains, and unintended control amplification among others. At the core of our approach is a dynamic agentic safety and security framework that operationalizes contextual agentic risk management by using auxiliary AI models and agents, with human oversight, to assist in contextual risk discovery, evaluation, and mitigation. We further address one of the most challenging aspects of safety and security of agentic systems: risk discovery through sandboxed, AI-driven red teaming. We demonstrate the framework effectiveness through a detailed case study of NVIDIA flagship agentic research assistant, AI-Q Research Assistant, showcasing practical, end-to-end safety and security evaluations in complex, enterprise-grade agentic workflows. This risk discovery phase finds novel agentic risks that are then contextually mitigated. We also release the dataset from our case study, containing traces of over 10,000 realistic attack and defense executions of the agentic workflow to help advance research in agentic safety.

Paperid: 1256, https://arxiv.org/pdf/2511.20290.pdf

Abstract:
Provenance-based threat hunting identifies Advanced Persistent Threats (APTs) on endpoints by correlating attack patterns described in Cyber Threat Intelligence (CTI) with provenance graphs derived from system audit logs. A fundamental challenge in this paradigm lies in the modality gap -- the structural and semantic disconnect between provenance graphs and CTI reports. Prior work addresses this by framing threat hunting as a graph matching task: 1) extracting attack graphs from CTI reports, and 2) aligning them with provenance graphs. However, this pipeline incurs severe \textit{information loss} during graph extraction and demands intensive manual curation, undermining scalability and effectiveness. In this paper, we present APT-CGLP, a novel cross-modal APT hunting system via Contrastive Graph-Language Pre-training, facilitating end-to-end semantic matching between provenance graphs and CTI reports without human intervention. First, empowered by the Large Language Model (LLM), APT-CGLP mitigates data scarcity by synthesizing high-fidelity provenance graph-CTI report pairs, while simultaneously distilling actionable insights from noisy web-sourced CTIs to improve their operational utility. Second, APT-CGLP incorporates a tailored multi-objective training algorithm that synergizes contrastive learning with inter-modal masked modeling, promoting cross-modal attack semantic alignment at both coarse- and fine-grained levels. Extensive experiments on four real-world APT datasets demonstrate that APT-CGLP consistently outperforms state-of-the-art threat hunting baselines in terms of accuracy and efficiency.

Paperid: 1257, https://arxiv.org/pdf/2511.10933.pdf

Abstract:
Robust invisible watermarking embeds hidden information in images such that the watermark can survive various manipulations. However, the emergence of powerful diffusion-based image generation and editing techniques poses a new threat to these watermarking schemes. In this paper, we investigate the intersection of diffusion-based image editing and robust image watermarking. We analyze how diffusion-driven image edits can significantly degrade or even fully remove embedded watermarks from state-of-the-art robust watermarking systems. Both theoretical formulations and empirical experiments are provided. We prove that as a image undergoes iterative diffusion transformations, the mutual information between the watermarked image and the embedded payload approaches zero, causing watermark decoding to fail. We further propose a guided diffusion attack algorithm that explicitly targets and erases watermark signals during generation. We evaluate our approach on recent deep learning-based watermarking schemes and demonstrate near-zero watermark recovery rates after attack, while maintaining high visual fidelity of the regenerated images. Finally, we discuss ethical implications of such watermark removal capablities and provide design guidelines for future watermarking strategies to be more resilient in the era of generative AI.

Paperid: 1258, https://arxiv.org/pdf/2511.09876.pdf

Abstract:
Dataset distillation (DD) compresses large datasets into smaller ones while preserving the performance of models trained on them. Although DD is often assumed to enhance data privacy by aggregating over individual examples, recent studies reveal that standard DD can still leak sensitive information from the original dataset due to the lack of formal privacy guarantees. Existing differentially private (DP)-DD methods attempt to mitigate this risk by injecting noise into the distillation process. However, they often fail to fully leverage the original dataset, resulting in degraded realism and utility. This paper introduces \libn, a novel framework that addresses the key limitations of current DP-DD by leveraging DP-generated data. Specifically, \lib initializes the distilled dataset with DP-generated data to enhance realism. Then, generated data refines the DP-feature matching technique to distill the original dataset under a small privacy budget, and trains an expert model to align the distilled examples with their class distribution. Furthermore, we design a privacy budget allocation strategy to determine budget consumption across DP components and provide a theoretical analysis of the overall privacy guarantees. Extensive experiments show that \lib significantly outperforms state-of-the-art DP-DD methods in terms of both dataset utility and robustness against membership inference attacks, establishing a new paradigm for privacy-preserving dataset distillation.

Paperid: 1259, https://arxiv.org/pdf/2511.06390.pdf

Abstract:
Large Language Models (LLMs) have rapidly advanced and are widely adopted across diverse fields. Due to the substantial computational cost and data requirements of training from scratch, many developers choose to fine-tune or modify existing open-source models. While most adhere to open-source licenses, some falsely claim original training despite clear derivation from public models. This raises pressing concerns about intellectual property protection and highlights the need for reliable methods to verify model provenance. In this paper, we propose GhostSpec, a lightweight yet effective method for verifying LLM lineage without access to training data or modification of model behavior. Our approach constructs compact and robust fingerprints by applying singular value decomposition (SVD) to invariant products of internal attention weight matrices, effectively capturing the structural identity of a model. Unlike watermarking or output-based methods, GhostSpec is fully data-free, non-invasive, and computationally efficient. It demonstrates strong robustness to sequential fine-tuning, pruning, block expansion, and even adversarial transformations. Extensive experiments show that GhostSpec can reliably trace the lineage of transformed models with minimal overhead. By offering a practical solution for model verification and reuse tracking, our method contributes to the protection of intellectual property and fosters a transparent, trustworthy ecosystem for large-scale language models.

Paperid: 1260, https://arxiv.org/pdf/2511.05598.pdf

Abstract:
Robust invisible watermarking aims to embed hidden messages into images such that they survive various manipulations while remaining imperceptible. However, powerful diffusion-based image generation and editing models now enable realistic content-preserving transformations that can inadvertently remove or distort embedded watermarks. In this paper, we present a theoretical and empirical analysis demonstrating that diffusion-based image editing can effectively break state-of-the-art robust watermarks designed to withstand conventional distortions. We analyze how the iterative noising and denoising process of diffusion models degrades embedded watermark signals, and provide formal proofs that under certain conditions a diffusion model's regenerated image retains virtually no detectable watermark information. Building on this insight, we propose a diffusion-driven attack that uses generative image regeneration to erase watermarks from a given image. Furthermore, we introduce an enhanced \emph{guided diffusion} attack that explicitly targets the watermark during generation by integrating the watermark decoder into the sampling loop. We evaluate our approaches on multiple recent deep learning watermarking schemes (e.g., StegaStamp, TrustMark, and VINE) and demonstrate that diffusion-based editing can reduce watermark decoding accuracy to near-zero levels while preserving high visual fidelity of the images. Our findings reveal a fundamental vulnerability in current robust watermarking techniques against generative model-based edits, underscoring the need for new watermarking strategies in the era of generative AI.

Paperid: 1261, https://arxiv.org/pdf/2511.03247.pdf

Abstract:
Open-weight models provide researchers and developers with accessible foundations for diverse downstream applications. We tested the safety and security postures of eight open-weight large language models (LLMs) to identify vulnerabilities that may impact subsequent fine-tuning and deployment. Using automated adversarial testing, we measured each model's resilience against single-turn and multi-turn prompt injection and jailbreak attacks. Our findings reveal pervasive vulnerabilities across all tested models, with multi-turn attacks achieving success rates between 25.86\% and 92.78\% -- representing a $2\times$ to $10\times$ increase over single-turn baselines. These results underscore a systemic inability of current open-weight models to maintain safety guardrails across extended interactions. We assess that alignment strategies and lab priorities significantly influence resilience: capability-focused models such as Llama 3.3 and Qwen 3 demonstrate higher multi-turn susceptibility, whereas safety-oriented designs such as Google Gemma 3 exhibit more balanced performance. The analysis concludes that open-weight models, while crucial for innovation, pose tangible operational and ethical risks when deployed without layered security controls. These findings are intended to inform practitioners and developers of the potential risks and the value of professional AI security solutions to mitigate exposure. Addressing multi-turn vulnerabilities is essential to ensure the safe, reliable, and responsible deployment of open-weight LLMs in enterprise and public domains. We recommend adopting a security-first design philosophy and layered protections to ensure resilient deployments of open-weight models.

Paperid: 1262, https://arxiv.org/pdf/2511.00737.pdf

Abstract:
While homomorphic encryption (HE) provides strong privacy protection, its high computational cost has restricted its application to simple tasks. Recently, hyperdimensional computing (HDC) applied to HE has shown promising performance for privacy-preserving machine learning (PPML). However, when applied to more realistic scenarios such as batch inference, the HDC-based HE has still very high compute time as well as high encryption and data transmission overheads. To address this problem, we propose HDC with encrypted parameters (EP-HDC), which is a novel PPML approach featuring client-side HE, i.e., inference is performed on a client using a homomorphically encrypted model. Our EP-HDC can effectively mitigate the encryption and data transmission overhead, as well as providing high scalability with many clients while providing strong protection for user data and model parameters. In addition to application examples for our client-side PPML, we also present design space exploration involving quantization, architecture, and HE-related parameters. Our experimental results using the BFV scheme and the Face/Emotion datasets demonstrate that our method can improve throughput and latency of batch inference by orders of magnitude over previous PPML methods (36.52~1068x and 6.45~733x, respectively) with less than 1% accuracy degradation.

Paperid: 1263, https://arxiv.org/pdf/2510.24072.pdf

Abstract:
This paper investigates how smart devices covertly capture private conversations and discusses in more in-depth the implications of this for youth privacy. Using a structured review guided by the PRISMA methodology, the analysis focuses on privacy concerns, data capture methods, data storage and sharing practices, and proposed technical mitigations. To structure and synthesize findings, we introduce the SCOUR framework, encompassing Surveillance mechanisms, Consent and awareness, Operational data flow, Usage and exploitation, and Regulatory and technical safeguards. Findings reveal that smart devices have been covertly capturing personal data, especially with smart toys and voice-activated smart gadgets built for youth. These issues are worsened by unclear data collection practices and insufficient transparency in smart device applications. Balancing privacy and utility in smart devices is crucial, as youth are becoming more aware of privacy breaches and value their personal data more. Strategies to improve regulatory and technical safeguards are also provided. The review identifies research gaps and suggests future directions. The limitations of this literature review are also explained. The findings have significant implications for policy development and the transparency of data collection for smart devices.

Paperid: 1264, https://arxiv.org/pdf/2510.19773.pdf

Abstract:
Membership inference attacks (MIAs) have emerged as the standard tool for evaluating the privacy risks of AI models. However, state-of-the-art attacks require training numerous, often computationally expensive, reference models, limiting their practicality. We present a novel approach for estimating model-level vulnerability, the TPR at low FPR, to membership inference attacks without requiring reference models. Empirical analysis shows loss distributions to be asymmetric and heavy-tailed and suggests that most points at risk from MIAs have moved from the tail (high-loss region) to the head (low-loss region) of the distribution after training. We leverage this insight to propose a method to estimate model-level vulnerability from the training and testing distribution alone: using the absence of outliers from the high-loss region as a predictor of the risk. We evaluate our method, the TNR of a simple loss attack, across a wide range of architectures and datasets and show it to accurately estimate model-level vulnerability to the SOTA MIA attack (LiRA). We also show our method to outperform both low-cost (few reference models) attacks such as RMIA and other measures of distribution difference. We finally evaluate the use of non-linear functions to evaluate risk and show the approach to be promising to evaluate the risk in large-language models.

Paperid: 1265, https://arxiv.org/pdf/2510.19295.pdf

Abstract:
The increasing reliance on AI-driven 5G/6G network infrastructures for mission-critical services highlights the need for reliability and resilience against sophisticated cyber-physical threats. These networks are highly exposed to novel attack surfaces due to their distributed intelligence, virtualized resources, and cross-domain integration. This paper proposes a fault-tolerant and resilience-aware framework that integrates AI-driven anomaly detection, adaptive routing, and redundancy mechanisms to mitigate cascading failures under cyber-physical attack conditions. A comprehensive validation is carried out using NS-3 simulations, where key performance indicators such as reliability, latency, resilience index, and packet loss rate are analyzed under various attack scenarios. The deduced results demonstrate that the proposed framework significantly improves fault recovery, stabilizes packet delivery, and reduces service disruption compared to baseline approaches.

Paperid: 1266, https://arxiv.org/pdf/2510.17311.pdf

Abstract:
Serverless computing has rapidly emerged as a prominent cloud paradigm, enabling developers to focus solely on application logic without the burden of managing servers or underlying infrastructure. Public serverless repositories have become key to accelerating the development of serverless applications. However, their growing popularity makes them attractive targets for adversaries. Despite this, the security posture of these repositories remains largely unexplored, exposing developers and organizations to potential risks. In this paper, we present the first comprehensive analysis of the security landscape of serverless components hosted in public repositories. We analyse 2,758 serverless components from five widely used public repositories popular among developers and enterprises, and 125,936 Infrastructure as Code (IaC) templates across three widely used IaC frameworks. Our analysis reveals systemic vulnerabilities including outdated software packages, misuse of sensitive parameters, exploitable deployment configurations, susceptibility to typo-squatting attacks and opportunities to embed malicious behaviour within compressed serverless components. Finally, we provide practical recommendations to mitigate these threats.

Paperid: 1267, https://arxiv.org/pdf/2510.17000.pdf

Abstract:
Adversarial attacks by malicious users that threaten the safety of large language models (LLMs) can be viewed as attempts to infer a target property $T$ that is unknown when an instruction is issued, and becomes knowable only after the model's reply is observed. Examples of target properties $T$ include the binary flag that triggers an LLM's harmful response or rejection, and the degree to which information deleted by unlearning can be restored, both elicited via adversarial instructions. The LLM reveals an \emph{observable signal} $Z$ that potentially leaks hints for attacking through a response containing answer tokens, thinking process tokens, or logits. Yet the scale of information leaked remains anecdotal, leaving auditors without principled guidance and defenders blind to the transparency--risk trade-off. We fill this gap with an information-theoretic framework that computes how much information can be safely disclosed, and enables auditors to gauge how close their methods come to the fundamental limit. Treating the mutual information $I(Z;T)$ between the observation $Z$ and the target property $T$ as the leaked bits per query, we show that achieving error $\varepsilon$ requires at least $\log(1/\varepsilon)/I(Z;T)$ queries, scaling linearly with the inverse leak rate and only logarithmically with the desired accuracy. Thus, even a modest increase in disclosure collapses the attack cost from quadratic to logarithmic in terms of the desired accuracy. Experiments on seven LLMs across system-prompt leakage, jailbreak, and relearning attacks corroborate the theory: exposing answer tokens alone requires about a thousand queries; adding logits cuts this to about a hundred; and revealing the full thinking process trims it to a few dozen. Our results provide the first principled yardstick for balancing transparency and security when deploying LLMs.

Paperid: 1268, https://arxiv.org/pdf/2510.16087.pdf

Abstract:
Security is becoming a pivotal point in cloud platforms. Several divisions, such as business organisations, health care, government, etc., have experienced cyber-attacks on their infrastructures. This research focuses on security issues within Continuous Integration and Deployment (CI/CD) pipelines in a cloud platform as a reaction to recent cyber breaches. This research proposes a blockchain-based solution to enhance CI/CD pipeline security. This research aims to develop a framework that leverages blockchain's distributed ledger technology and tamper-resistant features to improve CI/CD pipeline security. The goal is to emphasise secure software deployment by integrating threat modelling frameworks and adherence to coding standards. It also aims to employ tools to automate security testing to detect publicly disclosed vulnerabilities and flaws, such as an outdated version of Java Spring Framework, a JavaScript library from an unverified source, or a database library that allows SQL injection attacks in the deployed software through the framework.

Paperid: 1269, https://arxiv.org/pdf/2510.14312.pdf

Abstract:
A multi-agent system (MAS) powered by large language models (LLMs) can automate tedious user tasks such as meeting scheduling that requires inter-agent collaboration. LLMs enable nuanced protocols that account for unstructured private data, user constraints, and preferences. However, this design introduces new risks, including misalignment and attacks by malicious parties that compromise agents or steal user data. In this paper, we propose the Terrarium framework for fine-grained study on safety, privacy, and security in LLM-based MAS. We repurpose the blackboard design, an early approach in multi-agent systems, to create a modular, configurable testbed for multi-agent collaboration. We identify key attack vectors such as misalignment, malicious agents, compromised communication, and data poisoning. We implement three collaborative MAS scenarios with four representative attacks to demonstrate the framework's flexibility. By providing tools to rapidly prototype, evaluate, and iterate on defenses and designs, Terrarium aims to accelerate progress toward trustworthy multi-agent systems.

Paperid: 1270, https://arxiv.org/pdf/2510.13825.pdf

Abstract:
The A2AS framework is introduced as a security layer for AI agents and LLM-powered applications, similar to how HTTPS secures HTTP. A2AS enforces certified behavior, activates model self-defense, and ensures context window integrity. It defines security boundaries, authenticates prompts, applies security rules and custom policies, and controls agentic behavior, enabling a defense-in-depth strategy. The A2AS framework avoids latency overhead, external dependencies, architectural changes, model retraining, and operational complexity. The BASIC security model is introduced as the A2AS foundation: (B) Behavior certificates enable behavior enforcement, (A) Authenticated prompts enable context window integrity, (S) Security boundaries enable untrusted input isolation, (I) In-context defenses enable secure model reasoning, (C) Codified policies enable application-specific rules. This first paper in the series introduces the BASIC security model and the A2AS framework, exploring their potential toward establishing the A2AS industry standard.

Paperid: 1271, https://arxiv.org/pdf/2510.10407.pdf

Abstract:
GraphQL's flexible query model and nested data dependencies expose APIs to complex, context-dependent vulnerabilities that are difficult to uncover using conventional testing tools. Existing fuzzers either rely on random payload generation or rigid mutation heuristics, failing to adapt to the dynamic structures of GraphQL schemas and responses. We present PrediQL, the first retrieval-augmented, LLM-guided fuzzer for GraphQL APIs. PrediQL combines large language model reasoning with adaptive feedback loops to generate semantically valid and diverse queries. It models the choice of fuzzing strategy as a multi-armed bandit problem, balancing exploration of new query structures with exploitation of past successes. To enhance efficiency, PrediQL retrieves and reuses execution traces, schema fragments, and prior errors, enabling self-correction and progressive learning across test iterations. Beyond input generation, PrediQL integrates a context-aware vulnerability detector that uses LLM reasoning to analyze responses, interpreting data values, error messages, and status codes to identify issues such as injection flaws, access-control bypasses, and information disclosure. Our evaluation across open-source and benchmark GraphQL APIs shows that PrediQL achieves significantly higher coverage and vulnerability discovery rates compared to state-of-the-art baselines. These results demonstrate that combining retrieval-augmented reasoning with adaptive fuzzing can transform API security testing from reactive enumeration to intelligent exploration.

Paperid: 1272, https://arxiv.org/pdf/2510.08605.pdf

Abstract:
The rapid spread of misinformation on digital platforms threatens public discourse, emotional stability, and decision-making. While prior work has explored various adversarial attacks in misinformation detection, the specific transformations examined in this paper have not been systematically studied. In particular, we investigate language-switching across English, French, Spanish, Arabic, Hindi, and Chinese, followed by translation. We also study query length inflation preceding summarization and structural reformatting into multiple-choice questions. In this paper, we present a multilingual, multi-agent large language model framework with retrieval-augmented generation that can be deployed as a web plugin into online platforms. Our work underscores the importance of AI-driven misinformation detection in safeguarding online factual integrity against diverse attacks, while showcasing the feasibility of plugin-based deployment for real-world web applications.

Paperid: 1273, https://arxiv.org/pdf/2510.02960.pdf

Abstract:
For decades, the Controller Area Network (CAN) has served as the primary in-vehicle bus (IVB) and extended its use to many non-vehicular systems. Over the past years, CAN security has been intensively scrutinized, yielding extensive research literature. Despite its wealth, the literature lacks structured systematization, complicating efforts to assess attack severity, defense efficacy, identify security gaps, or root causes. This leaves non experts uncertain about the relevancy of specific attacks or defenses to their systems, inadvertently portraying CAN as irredeemably insecure. Further, the introduction of new IVB technologies--CAN evolutions, add-ons, and alternative buses--with heightened security claims risks fostering the misconception that merely adopting these technologies resolves CAN's security challenges. This paper systematizes existing CAN security knowledge, presenting a comprehensive taxonomy and assessment models of attackers, attacks, and defenses. It identifies replicable attacks and defense gaps, investigating their root causes as inherent, accidental, unique, or universal. It then extrapolates these insights to emerging IVB technologies by formally analyzing three emerging IVBs to identify shared root causes with CAN and assess their ability to close security gaps. The findings challenge common perceptions, demonstrating that CAN is more securable than perceived, that most insecurity root causes are shared across IVBs, and that merely adopting newer IVB technology does not solve persistent security issues. The paper concludes by highlighting future research directions to secure IVB communication down the road.

Paperid: 1274, https://arxiv.org/pdf/2510.02503.pdf

Abstract:
Cyberattacks on pipeline operational technology systems pose growing risks to energy infrastructure. This study develops a physics-informed simulation and optimization framework for analyzing cyber-physical threats in petroleum pipeline networks. The model integrates networked hydraulic dynamics, SCADA-based state estimation, model predictive control (MPC), and a bi-level formulation for stealthy false-data injection (FDI) attacks. Pipeline flow and pressure dynamics are modeled on a directed graph using nodal pressure evolution and edge-based Weymouth-type relations, including control-aware equipment such as valves and compressors. An extended Kalman filter estimates the full network state from partial SCADA telemetry. The controller computes pressure-safe control inputs via MPC under actuator constraints and forecasted demands. Adversarial manipulation is formalized as a bi-level optimization problem where an attacker perturbs sensor data to degrade throughput while remaining undetected by bad-data detectors. This attack-control interaction is solved via Karush-Kuhn-Tucker (KKT) reformulation, which results in a tractable mixed-integer quadratic program. Test gas pipeline case studies demonstrate the covert reduction of service delivery under attack. Results show that undetectable attacks can cause sustained throughput loss with minimal instantaneous deviation. This reveals the need for integrated detection and control strategies in cyber-physical infrastructure.

Paperid: 1275, https://arxiv.org/pdf/2510.01699.pdf

Abstract:
With the flourishing prosperity of generative models, manipulated facial images have become increasingly accessible, raising concerns regarding privacy infringement and societal trust. In response, proactive defense strategies embed adversarial perturbations into facial images to counter deepfake manipulation. However, existing methods often face a tradeoff between imperceptibility and defense effectiveness-strong perturbations may disrupt forgeries but degrade visual fidelity. Recent studies have attempted to address this issue by introducing additional visual loss constraints, yet often overlook the underlying gradient conflicts among losses, ultimately weakening defense performance. To bridge the gap, we propose a gradient-projection-based adversarial proactive defense (GRASP) method that effectively counters facial deepfakes while minimizing perceptual degradation. GRASP is the first approach to successfully integrate both structural similarity loss and low-frequency loss to enhance perturbation imperceptibility. By analyzing gradient conflicts between defense effectiveness loss and visual quality losses, GRASP pioneers the design of the gradient-projection mechanism to mitigate these conflicts, enabling balanced optimization that preserves image fidelity without sacrificing defensive performance. Extensive experiments validate the efficacy of GRASP, achieving a PSNR exceeding 40 dB, SSIM of 0.99, and a 100% defense success rate against facial attribute manipulations, significantly outperforming existing approaches in visual quality.

Paperid: 1276, https://arxiv.org/pdf/2509.23449.pdf

Abstract:
Binary code similarity detection is a core task in reverse engineering. It supports malware analysis and vulnerability discovery by identifying semantically similar code in different contexts. Modern methods have progressed from manually engineered features to vector representations. Hand-crafted statistics (e.g., operation ratios) are interpretable, but shallow and fail to generalize. Embedding-based methods overcome this by learning robust cross-setting representations, but these representations are opaque vectors that prevent rapid verification. They also face a scalability-accuracy trade-off, since high-dimensional nearest-neighbor search requires approximations that reduce precision. Current approaches thus force a compromise between interpretability, generalizability, and scalability. We bridge these gaps using a language model-based agent to conduct structured reasoning analysis of assembly code and generate features such as input/output types, side effects, notable constants, and algorithmic intent. Unlike hand-crafted features, they are richer and adaptive. Unlike embeddings, they are human-readable, maintainable, and directly searchable with inverted or relational indexes. Without any matching training, our method respectively achieves 42% and 62% for recall@1 in cross-architecture and cross-optimization tasks, comparable to embedding methods with training (39% and 34%). Combined with embeddings, it significantly outperforms the state-of-the-art, demonstrating that accuracy, scalability, and interpretability can coexist.

Paperid: 1277, https://arxiv.org/pdf/2509.22857.pdf

Abstract:
Recent work has made non-interactive privacy-preserving inference more practical by running deep Convolution Neural Network (CNN) with Fully Homomorphic Encryption (FHE). However, these methods remain limited by their reliance on bootstrapping, a costly FHE operation applied across multiple layers, severely slowing inference. They also depend on high-degree polynomial approximations of non-linear activations, which increase multiplicative depth and reduce accuracy by 2-5% compared to plaintext ReLU models. In this work, we focus on ResNets, a widely adopted benchmark architecture in privacy-preserving inference, and close the accuracy gap between their FHE-based non-interactive models and plaintext counterparts, while also achieving faster inference than existing methods. We use a quadratic polynomial approximation of ReLU, which achieves the theoretical minimum multiplicative depth for non-linear activations, along with a penalty-based training strategy. We further introduce structural optimizations such as node fusing, weight redistribution, and tower reuse. These optimizations reduce the required FHE levels in CNNs by nearly a factor of five compared to prior work, allowing us to run ResNet models under leveled FHE without bootstrapping. To further accelerate inference and recover accuracy typically lost with polynomial approximations, we introduce parameter clustering along with a joint strategy of data encoding layout and ensemble techniques. Experiments with ResNet-18, ResNet-20, and ResNet-32 on CIFAR-10 and CIFAR-100 show that our approach achieves up to 4x faster private inference than prior work with comparable accuracy to plaintext ReLU models.

Paperid: 1278, https://arxiv.org/pdf/2509.21821.pdf

Abstract:
Reverse Engineering (RE) is central to software security, enabling tasks such as vulnerability discovery and malware analysis, but it remains labor-intensive and requires substantial expertise. Earlier advances in deep learning start to automate parts of RE, particularly for malware detection and vulnerability classification. More recently, a rapidly growing body of work has applied Large Language Models (LLMs) to similar purposes. Their role compared to prior machine learning remains unclear, since some efforts simply adapt existing pipelines with minimal change while others seek to exploit broader reasoning and generative abilities. These differences, combined with varied problem definitions, methods, and evaluation practices, limit comparability, reproducibility, and cumulative progress. This paper systematizes the field by reviewing 44 research papers, including peer-reviewed publications and preprints, and 18 additional open-source projects that apply LLMs in RE. We propose a taxonomy that organizes existing work by objective, target, method, evaluation strategy, and data scale. Our analysis identifies strengths and limitations, highlights reproducibility and evaluation gaps, and examines emerging risks. We conclude with open challenges and future research directions that aim to guide more coherent and security-relevant applications of LLMs in RE.

Paperid: 1279, https://arxiv.org/pdf/2509.21392.pdf

Abstract:
As adversarial attacks continue to evolve, defense models face the risk of recurrent vulnerabilities, underscoring the importance of continuous adversarial training (CAT). Existing CAT approaches typically balance decision boundaries by either data replay or optimization strategy to constrain shared model parameters. However, due to the diverse and aggressive nature of adversarial examples, these methods suffer from catastrophic forgetting of previous defense knowledge after continual learning. In this paper, we propose a novel framework, called Dual-level Defense Routing or DDeR, that can autonomously select appropriate routers to integrate specific defense experts, thereby adapting to evolving adversarial attacks. Concretely, the first-level defense routing comprises multiple defense experts and routers, with each router dynamically selecting and combining suitable experts to process attacked features. Routers are independently incremented as continuous adversarial training progresses, and their selections are guided by an Adversarial Sentinel Network (ASN) in the second-level defense routing. To compensate for the inability to test due to the independence of routers, we further present a Pseudo-task Substitution Training (PST) strategy, which leverages distributional discrepancy in data to facilitate inter-router communication without storing historical data. Extensive experiments demonstrate that DDeR achieves superior continuous defense performance and classification accuracy compared to existing methods.

Paperid: 1280, https://arxiv.org/pdf/2509.20924.pdf

Abstract:
Large Language Models (LLMs) watermarking has shown promise in detecting AI-generated content and mitigating misuse, with prior work claiming robustness against paraphrasing and text editing. In this paper, we argue that existing evaluations are not sufficiently adversarial, obscuring critical vulnerabilities and overstating the security. To address this, we introduce adaptive robustness radius, a formal metric that quantifies watermark resilience against adaptive adversaries. We theoretically prove that optimizing the attack context and model parameters can substantially reduce this radius, making watermarks highly susceptible to paraphrase attacks. Leveraging this insight, we propose RLCracker, a reinforcement learning (RL)-based adaptive attack that erases watermarks while preserving semantic fidelity. RLCracker requires only limited watermarked examples and zero access to the detector. Despite weak supervision, it empowers a 3B model to achieve 98.5% removal success and an average 0.92 P-SP score on 1,500-token Unigram-marked texts after training on only 100 short samples. This performance dramatically exceeds 6.75% by GPT-4o and generalizes across five model sizes over ten watermarking schemes. Our results confirm that adaptive attacks are broadly effective and pose a fundamental threat to current watermarking defenses.

Paperid: 1281, https://arxiv.org/pdf/2509.20639.pdf

Abstract:
The widespread adoption of Large Language Models (LLMs) has revolutionized AI deployment, enabling autonomous and semi-autonomous applications across industries through intuitive language interfaces and continuous improvements in model development. However, the attendant increase in autonomy and expansion of access permissions among AI applications also make these systems compelling targets for malicious attacks. Their inherent susceptibility to security flaws necessitates robust defenses, yet no known approaches can prevent zero-day or novel attacks against LLMs. This places AI protection systems in a category similar to established malware protection systems: rather than providing guaranteed immunity, they minimize risk through enhanced observability, multi-layered defense, and rapid threat response, supported by a threat intelligence function designed specifically for AI-related threats. Prior work on LLM protection has largely evaluated individual detection models rather than end-to-end systems designed for continuous, rapid adaptation to a changing threat landscape. We present a production-grade defense system rooted in established malware detection and threat intelligence practices. Our platform integrates three components: a threat intelligence system that turns emerging threats into protections; a data platform that aggregates and enriches information while providing observability, monitoring, and ML operations; and a release platform enabling safe, rapid detection updates without disrupting customer workflows. Together, these components deliver layered protection against evolving LLM threats while generating training data for continuous model improvement and deploying updates without interrupting production.

Paperid: 1282, https://arxiv.org/pdf/2509.20166.pdf

Abstract:
Today's cyber defenders are overwhelmed by a deluge of security alerts, threat intelligence signals, and shifting business context, creating an urgent need for AI systems to enhance operational security work. While Large Language Models (LLMs) have the potential to automate and scale Security Operations Center (SOC) operations, existing evaluations do not fully assess the scenarios most relevant to real-world defenders. This lack of informed evaluation impacts both AI developers and those applying LLMs to SOC automation. Without clear insight into LLM performance in real-world security scenarios, developers lack a north star for development, and users cannot reliably select the most effective models. Meanwhile, malicious actors are using AI to scale cyber attacks, highlighting the need for open source benchmarks to drive adoption and community-driven improvement among defenders and model developers. To address this, we introduce CyberSOCEval, a new suite of open source benchmarks within CyberSecEval 4. CyberSOCEval includes benchmarks tailored to evaluate LLMs in two tasks: Malware Analysis and Threat Intelligence Reasoning--core defensive domains with inadequate coverage in current benchmarks. Our evaluations show that larger, more modern LLMs tend to perform better, confirming the training scaling laws paradigm. We also find that reasoning models leveraging test time scaling do not achieve the same boost as in coding and math, suggesting these models have not been trained to reason about cybersecurity analysis, and pointing to a key opportunity for improvement. Finally, current LLMs are far from saturating our evaluations, showing that CyberSOCEval presents a significant challenge for AI developers to improve cyber defense capabilities.

Paperid: 1283, https://arxiv.org/pdf/2509.19677.pdf

Abstract:
The rapid advancement of Large Language Models (LLMs) has enabled the generation of highly realistic synthetic data. We identify a new vulnerability, LLMs generating convincing career trajectories in fake resumes and explore effective detection methods. To address this challenge, we construct a dataset of machine-generated career trajectories using LLMs and various methods, and demonstrate that conventional text-based detectors perform poorly on structured career data. We propose CareerScape, a novel heterogeneous, hierarchical multi-layer graph framework that models career entities and their relations in a unified global graph built from genuine resumes. Unlike conventional classifiers that treat each instance independently, CareerScape employs a structure-aware framework that augments user-specific subgraphs with trusted neighborhood information from a global graph, enabling the model to capture both global structural patterns and local inconsistencies indicative of synthetic career paths. Experimental results show that CareerScape outperforms state-of-the-art baselines by 5.8-85.0% relatively, highlighting the importance of structure-aware detection for machine-generated content.

Paperid: 1284, https://arxiv.org/pdf/2509.19153.pdf

Abstract:
Ensuring the correctness of smart contracts is critical, as even subtle flaws can lead to severe financial losses. While bug detection tools able to spot common vulnerability patterns can serve as a first line of defense, most real-world exploits and losses stem from errors in the contract business logic. Formal verification tools such as SolCMC and the Certora Prover address this challenge, but their impact remains limited by steep learning curves and restricted specification languages. Recent works have begun to explore the use of large language models (LLMs) for security-related tasks such as vulnerability detection and test generation. Yet, a fundamental question remains open: can LLMs serve as verification oracles, capable of reasoning about arbitrary contract-specific properties? In this paper, we provide the first systematic evaluation of GPT-5, a state-of-the-art reasoning LLM, in this role. We benchmark its performance on a large dataset of verification tasks, compare its outputs against those of established formal verification tools, and assess its practical effectiveness in real-world auditing scenarios. Our study combines quantitative metrics with qualitative analysis, and shows that recent reasoning-oriented LLMs can be surprisingly effective as verification oracles, suggesting a new frontier in the convergence of AI and formal methods for secure smart contract development and auditing.

Paperid: 1285, https://arxiv.org/pdf/2509.18909.pdf

Abstract:
Trusted execution environments (TEEs) offer hardware-assisted means to protect code and data. However, as shown in numerous results over the years, attackers can use side-channels to leak data access patterns and even single-step the code. While the vendors are slowly introducing hardware-based countermeasures for some attacks, others will stay unaddressed. This makes a software-level countermeasure desirable, but current available solutions only address very specific attack vectors or have a narrow leakage model. In this work, we take a holistic view at the vulnerabilities of TEEs and design a tool named Obelix, which is the first to protect both code and data against a wide range of TEE attacks, from cache attacks over single-stepping to ciphertext side-channels. We analyze the practically achievable precision of state-of-the-art single-stepping tools, and present an algorithm which uses that knowledge to divide a program into uniform code blocks, that are indistinguishable for a strong attacker. By storing these blocks and the program data in oblivious RAM, the attacker cannot follow execution, effectively protecting both secret code and data. We describe how we automate our approach to make it available for developers who are unfamiliar with side-channels. As an obfuscation tool, Obelix comes with a considerable performance overhead, but compensates this with strong security guarantees and easy applicability without requiring any expert knowledge.

Paperid: 1286, https://arxiv.org/pdf/2509.17126.pdf

Abstract:
Rollups have become the de facto scalability solution for Ethereum, securing more than $55B in assets. They achieve scale by executing transactions on a Layer 2 ledger, while periodically posting data and finalizing state on the Layer 1, either optimistically or via validity proofs. Their fees must simultaneously reflect the pricing of three resources: L2 costs (e.g., execution), L1 DA, and underlying L1 gas costs for batch settlement and proof verification. In this work, we identify critical mis-pricings in existing rollup transaction fee mechanisms (TFMs) that allow for two powerful attacks. Firstly, an adversary can saturate the L2's DA batch capacity with compute-light data-heavy transactions, forcing low-gas transaction batches that enable both L2 DoS attacks, and finality-delay attacks. Secondly, by crafting prover killer transactions that maximize proving cycles relative to the gas charges, an adversary can effectively stall proof generation, delaying finality by hours and inflicting prover-side economic losses to the rollup at a minimal cost. We analyze the above attack vectors across the major Ethereum rollups, quantifying adversarial costs and protocol losses. We find that the first attack enables periodic DoS on rollups, lasting up to 30 minutes, at a cost below 2 ETH for most rollups. Moreover, we identify three rollups that are exposed to indefinite DoS at a cost of approximately 0.8 to 2.7 ETH per hour. The attack can be further modified to increase finalization delays by a factor of about 1.45x to 2.73x, compared to direct L1 blob-stuffing, depending on the rollup's parameters. Furthermore, we find that the prover killer attack induces a finalization latency increase of about 94x. Finally, we propose comprehensive mitigations to prevent these attacks and suggest how some practical uses of multi-dimensional rollup TFMs can rectify the identified mis-pricing attacks.

Paperid: 1287, https://arxiv.org/pdf/2509.16052.pdf

Abstract:
We analyze 15,097 blocks proposed for inclusion in Ethereum's blockchain over an 8-minute window on December 3, 2024, during which 38 blocks were added to the chain. We classify transactions as exclusive -- present only in blocks from a single builder -- or private -- absent from the public mempool but included in blocks from multiple builders. We find that exclusive transactions account for 84% of the total fees paid by transactions in winning blocks. Furthermore, we show that exclusivity cannot be fully explained by exclusive relationships between senders and builders: about 7% of all exclusive transactions included on-chain, by value, come from senders who route exclusively to a single builder. Analyzing transaction logs shows that some exclusive transactions are duplicates or variations of the same strategy, but even accounting for that, the share of the total fees paid by transactions in winning blocks is at least 77.2%. Taken together, our findings highlight that exclusive transactions are the dominant source of builder revenues.

Paperid: 1288, https://arxiv.org/pdf/2509.16038.pdf

Abstract:
Network Intrusion Detection Systems (NIDS) have been studied in research for almost four decades. Yet, despite thousands of papers claiming scientific advances, a non-negligible number of recent works suggest that the findings of prior literature may be questionable. At the root of such a disagreement is the well-known challenge of obtaining data representative of a real-world network-and, hence, usable for security assessments. We tackle such a challenge in this paper. We propose ConCap, a practical tool meant to facilitate experimental research on NIDS. Through ConCap, a researcher can set up an isolated and lightweight network environment and configure it to produce network-related data, such as packets or NetFlows, that are automatically labeled, hence ready for fine-grained experiments. ConCap is rooted on open-source software and is designed to foster experimental reproducibility across the scientific community by sharing just one configuration file. Through comprehensive experiments on 10 different network activities, further expanded via in-depth analyses of 21 variants of two specific activities and of 100 repetitions of four other ones, we empirically verify that ConCap produces network data resembling that of a real-world network. We also carry out experiments on well-known benchmark datasets as well as on a real "smart-home" network, showing that, from a cyber-detection viewpoint, ConCap's automatically-labeled NetFlows are functionally equivalent to those collected in other environments. Finally, we show that ConCap enables to safely reproduce sophisticated attack chains (e.g., to test/enhance existing NIDS). Altogether, ConCap is a solution to the "data problem" that is plaguing NIDS research.

Paperid: 1289, https://arxiv.org/pdf/2509.14334.pdf

Abstract:
The factorization norms of the lower-triangular all-ones $n \times n$ matrix, $Î³_2(M_{count})$ and $Î³_{F}(M_{count})$, play a central role in differential privacy as they are used to give theoretical justification of the accuracy of the only known production-level private training algorithm of deep neural networks by Google. Prior to this work, the best known upper bound on $Î³_2(M_{count})$ was $1 + \frac{\log n}Ï$ by Mathias (Linear Algebra and Applications, 1993), and the best known lower bound was $\frac{1}Ï(2 + \log(\frac{2n+1}{3})) \approx 0.507 + \frac{\log n}Ï$ (MatouÅ¡ek, Nikolov, Talwar, IMRN 2020), where $\log$ denotes the natural logarithm. Recently, Henzinger and Upadhyay (SODA 2025) gave the first explicit factorization that meets the bound of Mathias (1993) and asked whether there exists an explicit factorization that improves on Mathias' bound. We answer this question in the affirmative. Additionally, we improve the lower bound significantly. More specifically, we show that $$ 0.701 + \frac{\log n}Ï + o(1) \;\leq\; Î³_2(M_{count}) \;\leq\; 0.846 + \frac{\log n}Ï + o(1). $$ That is, we reduce the gap between the upper and lower bound to $0.14 + o(1)$. We also show that our factors achieve a better upper bound for $Î³_{F}(M_{count})$ compared to prior work, and we establish an improved lower bound: $$ 0.701 + \frac{\log n}Ï + o(1) \;\leq\; Î³_{F}(M_{count}) \;\leq\; 0.748 + \frac{\log n}Ï + o(1). $$ That is, the gap between the lower and upper bound provided by our explicit factorization is $0.047 + o(1)$.

Paperid: 1290, https://arxiv.org/pdf/2509.14035.pdf

Abstract:
Quantiles are key in distributed analytics, but computing them over sensitive data risks privacy. Local differential privacy (LDP) offers strong protection but lower accuracy than central DP, which assumes a trusted aggregator. Secure multi-party computation (MPC) can bridge this gap, but generic MPC solutions face scalability challenges due to large domains, complex secure operations, and multi-round interactions. We present Piquant$\varepsilon$, a system for privacy-preserving estimation of multiple quantiles in a distributed setting without relying on a trusted server. Piquant$\varepsilon$ operates under the malicious threat model and achieves accuracy of the central DP model. Built on the two-server model, Piquant$\varepsilon$ uses a novel strategy of releasing carefully chosen intermediate statistics, reducing MPC complexity while preserving end-to-end DP. Empirically, Piquant$\varepsilon$ estimates 5 quantiles on 1 million records in under a minute with domain size $10^9$, achieving up to $10^4$-fold higher accuracy than LDP, and up to $\sim 10\times$ faster runtime compared to baselines.

Paperid: 1291, https://arxiv.org/pdf/2509.13740.pdf

Abstract:
One of the biggest attack surfaces of embedded systems is their network interfaces, which enable communication with other devices. Unlike their general-purpose counterparts, embedded systems are designed for specialized use cases, resulting in unique and diverse communication stacks. Unfortunately, current approaches for evaluating the security of these embedded network stacks require manual effort or access to hardware, and they generally focus only on small parts of the embedded system. A promising alternative is firmware rehosting, which enables fuzz testing of the entire firmware by generically emulating the physical hardware. However, existing rehosting methods often struggle to meaningfully explore network stacks due to their complex, multi-layered input formats. This limits their ability to uncover deeply nested software faults. To address this problem, we introduce a novel method to automatically detect and handle the use of network protocols in firmware called Pemu. By automatically deducing the available network protocols, Pemu can transparently generate valid network packets that encapsulate fuzzing data, allowing the fuzzing input to flow directly into deeper layers of the firmware logic. Our approach thus enables a deeper, more targeted, and layer-by-layer analysis of firmware components that were previously difficult or impossible to test. Our evaluation demonstrates that Pemu consistently improves the code coverage of three existing rehosting tools for embedded network stacks. Furthermore, our fuzzer rediscovered several known vulnerabilities and identified five previously unknown software faults, highlighting its effectiveness in uncovering deeply nested bugs in network-exposed code.

Paperid: 1292, https://arxiv.org/pdf/2509.10948.pdf

Abstract:
Industrial robotic systems are central to automating smart manufacturing operations. Connected and automated factories face growing cybersecurity risks that can potentially cause interruptions and damages to physical operations. Among these attacks, data-integrity attacks often involve sophisticated exploitation of vulnerabilities that enable an attacker to access and manipulate the operational data and are hence difficult to detect with only existing intrusion detection or model-based detection. This paper addresses the challenges in utilizing existing side-channels to detect data-integrity attacks in robotic manufacturing processes by developing an online detection framework, ViSTR-GP, that cross-checks encoder-reported measurements against a vision-based estimate from an overhead camera outside the controller's authority. In this framework, a one-time interactive segmentation initializes SAM-Track to generate per-frame masks. A low-rank tensor-regression surrogate maps each mask to measurements, while a matrix-variate Gaussian process models nominal residuals, capturing temporal structure and cross-joint correlations. A frame-wise test statistic derived from the predictive distribution provides an online detector with interpretable thresholds. We validate the framework on a real-world robotic testbed with synchronized video frame and encoder data, collecting multiple nominal cycles and constructing replay attack scenarios with graded end-effector deviations. Results on the testbed indicate that the proposed framework recovers joint angles accurately and detects data-integrity attacks earlier with more frequent alarms than all baselines. These improvements are most evident in the most subtle attacks. These results show that plants can detect data-integrity attacks by adding an independent physical channel, bypassing the controller's authority, without needing complex instrumentation.

Paperid: 1293, https://arxiv.org/pdf/2509.10709.pdf

Abstract:
Sophisticated malware families exploit the openness of the Android platform to infiltrate IoT networks, enabling large-scale disruption, data exfiltration, and denial-of-service attacks. This systematic literature review (SLR) examines cutting-edge approaches to Android malware analysis with direct implications for securing IoT infrastructures. We analyze feature extraction techniques across static, dynamic, hybrid, and graph-based methods, highlighting their trade-offs: static analysis offers efficiency but is easily evaded through obfuscation; dynamic analysis provides stronger resistance to evasive behaviors but incurs high computational costs, often unsuitable for lightweight IoT devices; hybrid approaches balance accuracy with resource considerations; and graph-based methods deliver superior semantic modeling and adversarial robustness. This survey contributes a structured comparison of existing methods, exposes research gaps, and outlines a roadmap for future directions to enhance scalability, adaptability, and long-term security in IoT-driven Android malware detection.

Paperid: 1294, https://arxiv.org/pdf/2509.08383.pdf

Abstract:
Large language models (LLMs) power modern AI applications, but processing sensitive data on untrusted servers raises privacy concerns. Homomorphic encryption (HE) enables computation on encrypted data for secure inference. However, neural text generation requires decoding methods like argmax and sampling, which are non-polynomial and thus computationally expensive under encryption, creating a significant performance bottleneck. We introduce cutmax, an HE-friendly argmax algorithm that reduces ciphertext operations compared to prior methods, enabling practical greedy decoding under encryption. We also propose the first HE-compatible nucleus (top-p) sampling method, leveraging cutmax for efficient stochastic decoding with provable privacy guarantees. Both techniques are polynomial, supporting efficient inference in privacy-preserving settings. Moreover, their differentiability facilitates gradient-based sequence-level optimization as a polynomial alternative to straight-through estimators. We further provide strong theoretical guarantees for cutmax, proving it converges globally to a unique two-level fixed point, independent of the input values beyond the identity of the maximizer, which explains its rapid convergence in just a few iterations. Evaluations on realistic LLM outputs show latency reductions of 24x-35x over baselines, advancing secure text generation.

Paperid: 1295, https://arxiv.org/pdf/2509.07757.pdf

Abstract:
We use browsers daily to access all sorts of information. Because browsers routinely process scripts, media, and executable code from unknown sources, they form a critical security boundary between users and adversaries. A common attack vector is JavaScript, which exposes a large attack surface due to the sheer complexity of modern JavaScript engines. To mitigate these threats, modern engines increasingly adopt software-based fault isolation (SFI). A prominent example is Google's V8 heap sandbox, which represents the most widely deployed SFI mechanism, protecting billions of users across all Chromium-based browsers and countless applications built on Node$.$js and Electron. The heap sandbox splits the address space into two parts: one part containing trusted, security-sensitive metadata, and a sandboxed heap containing memory accessible to untrusted code. On a technical level, the sandbox enforces isolation by removing raw pointers and using translation tables to resolve references to trusted objects. Consequently, an attacker cannot corrupt trusted data even with full control of the sandboxed data, unless there is a bug in how code handles data from the sandboxed heap. Despite their widespread use, such SFI mechanisms have seen little security testing. In this work, we propose a new testing technique that models the security boundary of modern SFI implementations. Following the SFI threat model, we assume a powerful attacker who fully controls the sandbox's memory. We implement this by instrumenting memory loads originating in the trusted domain and accessing untrusted, attacker-controlled sandbox memory. We then inject faults into the loaded data, aiming to trigger memory corruption in the trusted domain. In a comprehensive evaluation, we identify 19 security bugs in V8 that enable an attacker to bypass the sandbox.

Paperid: 1296, https://arxiv.org/pdf/2509.06614.pdf

Abstract:
Blockchains face a scalability limitation, partly due to the throughput limitations of consensus protocols, especially when aiming to obtain a high degree of decentralization. Layer 2 Rollups (L2s) are a faster alternative to conventional blockchains. L2s perform most computations offchain using minimally blockchains (L1) under-the-hood to guarantee correctness. A sequencer is a service that receives offchain L2 transaction requests, batches these transactions, and commits compressed or hashed batches to L1. Using hashing needs less L1 space, which is beneficial for gas cost, but requires a data availability committee (DAC) service to translate hashes into their corresponding batches of transaction requests. The behavior of sequencers and DACs influence the evolution of the L2 blockchain, presenting a potential security threat and delaying L2 adoption. We propose in this paper fraud-proof mechanisms, arbitrated by L1 contracts, to detect and generate evidence of dishonest behavior of the sequencer and DAC. We study how these fraud-proofs limit the power of adversaries that control different number of sequencer and DACs members, and provide incentives for their honest behavior. We designed these fraud-proof mechanisms as two player games. Unlike the generic fraud-proofs in current L2s (designed to guarantee the correct execution of transactions), our fraud-proofs are over pred-etermined algorithms that verify the properties that determine the correctness of the DAC. Arbitrating over concrete algorithms makes our fraud-proofs more efficient, easier to understand, and simpler to prove correct. We provide as an artifact a mechanization in LEAN4 of our fraud-proof games, including (1) the verified strategies that honest players should play to win all games as well as (2) mechanisms to detect dishonest claims.

Paperid: 1297, https://arxiv.org/pdf/2509.06071.pdf

Abstract:
High-definition maps provide precise environmental information essential for prediction and planning in autonomous driving systems. Due to the high cost of labeling and maintenance, recent research has turned to online HD map construction using onboard sensor data, offering wider coverage and more timely updates for autonomous vehicles. However, the robustness of online map construction under adversarial conditions remains underexplored. In this paper, we present a systematic vulnerability analysis of online map construction models, which reveals that these models exhibit an inherent bias toward predicting symmetric road structures. In asymmetric scenes like forks or merges, this bias often causes the model to mistakenly predict a straight boundary that mirrors the opposite side. We demonstrate that this vulnerability persists in the real-world and can be reliably triggered by obstruction or targeted interference. Leveraging this vulnerability, we propose a novel two-stage attack framework capable of manipulating online constructed maps. First, our method identifies vulnerable asymmetric scenes along the victim AV's potential route. Then, we optimize the location and pattern of camera-blinding attacks and adversarial patch attacks. Evaluations on a public AD dataset demonstrate that our attacks can degrade mapping accuracy by up to 9.9%, render up to 44% of targeted routes unreachable, and increase unsafe planned trajectory rates, colliding with real-world road boundaries, by up to 27%. These attacks are also validated on a real-world testbed vehicle. We further analyze root causes of the symmetry bias, attributing them to training data imbalance, model architecture, and map element representation. To the best of our knowledge, this study presents the first vulnerability assessment of online map construction models and introduces the first digital and physical attack against them.

Paperid: 1298, https://arxiv.org/pdf/2509.05552.pdf

Abstract:
Secure norm computation is becoming increasingly important in many real-world learning applications. However, existing cryptographic systems often lack a general framework for securely computing the $L^p$-norm over private inputs held by different parties. These systems often treat secure norm computation as a black-box process, neglecting to design tailored cryptographic protocols that optimize performance. Moreover, they predominantly focus on the $L^2$-norm, paying little attention to other popular $L^p$-norms, such as $L^1$ and $L^\infty$, which are commonly used in practice, such as machine learning tasks and location-based services. To our best knowledge, we propose the first comprehensive framework for secure two-party $L^p$-norm computations ($L^1$, $L^2$, and $L^\infty$), denoted as \mbox{Crypto-$L^p$}, designed to be versatile across various applications. We have designed, implemented, and thoroughly evaluated our framework across a wide range of benchmarking applications, state-of-the-art (SOTA) cryptographic protocols, and real-world datasets to validate its effectiveness and practical applicability. In summary, \mbox{Crypto-$L^p$} outperforms prior works on secure $L^p$-norm computation, achieving $82\times$, $271\times$, and $42\times$ improvements in runtime while reducing communication overhead by $36\times$, $4\times$, and $21\times$ for $p=1$, $2$, and $\infty$, respectively. Furthermore, we take the first step in adapting our Crypto-$L^p$ framework for secure machine learning inference, reducing communication costs by $3\times$ compared to SOTA systems while maintaining comparable runtime and accuracy.

Paperid: 1299, https://arxiv.org/pdf/2509.04710.pdf

Abstract:
Differential privacy (DP) is a privacy-enhancement technology (PET) that receives prominent attention from the academia, industry, and government. One main development over the past decade has been the decentralization of DP, including local DP and shuffle DP. Despite that decentralized DP heavily relies on network communications for data collection,we found that: 1) no systematic study has surveyed the research opportunities at the intersection of networking and DP; 2) nor have there been significant efforts to develop DP mechanisms that are explicitly tailored for network environments. In this paper, we seek to address this gap by initiating a new direction of network-aware DP. We identified two focus areas where the network research can offer substantive contributions to the design and deployment of DP, related to network security and topology. Through this work, we hope to encourage more research that adapt/optimize DP's deployment in various network environments.

Paperid: 1300, https://arxiv.org/pdf/2509.03442.pdf

Abstract:
As IoT devices continue to proliferate, their reliability is increasingly constrained by security concerns. In response, researchers have developed diverse malware analysis techniques to detect and classify IoT malware. These techniques typically rely on extracting features at different levels from IoT applications, giving rise to a wide range of feature extraction methods. However, current approaches still face significant challenges when applied in practice. This survey provides a comprehensive review of feature extraction techniques for IoT malware analysis from multiple perspectives. We first examine static and dynamic feature extraction methods, followed by hybrid approaches. We then explore feature representation strategies based on graph learning. Finally, we compare the strengths and limitations of existing techniques, highlight open challenges, and outline promising directions for future research.

Paperid: 1301, https://arxiv.org/pdf/2509.01434.pdf

Abstract:
The expansion of Internet of Things (IoT) devices constantly generates heterogeneous data streams, driving demand for continuous, decentralized intelligence. Federated Lifelong Learning (FLL) provides an ideal solution by incorporating federated and lifelong learning to overcome catastrophic forgetting. The extended lifecycle of FLL in IoT systems increases their vulnerability to persistent attacks, and these risks may be obscured by performance degradation caused by spatial-temporal data heterogeneity. Moreover, this problem is exacerbated by the standard single-server architecture, as its single point of failure makes it difficult to maintain a reliable audit trail for long-term threats. Blockchain provides a tamper-proof foundation for trustworthy FLL systems. Nevertheless, directly applying blockchain to FLL significantly increases computational and retrieval costs with the expansion of the knowledge base, slowing down the training on IoT devices. To address these challenges, we propose LiFeChain, a lightweight blockchain for secure and efficient federated lifelong learning by providing a tamper-resistant ledger with minimal on-chain disclosure and bidirectional verification. To the best of our knowledge, LiFeChain is the first blockchain tailored for FLL. LiFeChain incorporates two complementary mechanisms: the proof-of-model-correlation (PoMC) consensus on the server, which couples learning and unlearning mechanisms to mitigate negative transfer, and segmented zero-knowledge arbitration (Seg-ZA) on the client, which detects and arbitrates abnormal committee behavior without compromising privacy. LiFeChain is designed as a plug-and-play component that can be seamlessly integrated into existing FLL algorithms. Experimental results demonstrate that LiFeChain not only enhances model performance against two long-term attacks but also sustains high efficiency and scalability.

Paperid: 1302, https://arxiv.org/pdf/2509.00770.pdf

Abstract:
This research proposes a real-time, adaptive decision-support framework for mitigating cyber incidents in cyber-physical systems, developed in response to an increasing reliance on these systems within critical infrastructure and evolving adversarial tactics. Existing decision-support systems often fall short in accounting for multi-agent, multi-path attacks and trade-offs between safety and operational continuity. To address this, our framework integrates hierarchical system modelling with Bayesian probabilistic reasoning, constructing Bayesian Network Graphs from system architecture and vulnerability data. Models are encoded using a Domain Specific Language to enhance computational efficiency and support dynamic updates. In our approach, we use a hybrid exposure probability estimation framework, which combines Exploit Prediction Scoring System and Common Vulnerability Scoring System scores via Bayesian confidence calibration to handle epistemic uncertainty caused by incomplete or heterogeneous vulnerability metadata. Mitigation recommendations are generated as countermeasure portfolios, refined using multi-objective optimisation to identify Pareto-optimal strategies balancing attack likelihood, impact severity, and system availability. To accommodate time- and resource-constrained incident response, frequency-based heuristics are applied to prioritise countermeasures across the optimised portfolios. The framework was evaluated through three representative cyber-physical attack scenarios, demonstrating its versatility in handling complex adversarial behaviours under real-time response constraints. The results affirm its utility in operational contexts and highlight the robustness of our proposed approach across diverse threat environments.

Paperid: 1303, https://arxiv.org/pdf/2508.21797.pdf

Abstract:
Industry 4.0's highly networked Machine Tool Controllers (MTCs) are prime targets for replay attacks that use outdated sensor data to manipulate actuators. Dynamic watermarking can reveal such tampering, but current schemes assume linear-Gaussian dynamics and use constant watermark statistics, making them vulnerable to the time-varying, partly proprietary behavior of MTCs. We close this gap with DynaMark, a reinforcement learning framework that models dynamic watermarking as a Markov decision process (MDP). It learns an adaptive policy online that dynamically adapts the covariance of a zero-mean Gaussian watermark using available measurements and detector feedback, without needing system knowledge. DynaMark maximizes a unique reward function balancing control performance, energy consumption, and detection confidence dynamically. We develop a Bayesian belief updating mechanism for real-time detection confidence in linear systems. This approach, independent of specific system assumptions, underpins the MDP for systems with linear dynamics. On a Siemens Sinumerik 828D controller digital twin, DynaMark achieves a reduction in watermark energy by 70% while preserving the nominal trajectory, compared to constant variance baselines. It also maintains an average detection delay equivalent to one sampling interval. A physical stepper-motor testbed validates these findings, rapidly triggering alarms with less control performance decline and exceeding existing benchmarks.

Paperid: 1304, https://arxiv.org/pdf/2508.21558.pdf

Abstract:
In this paper, we present a novel encrypted traffic classification model that operates directly on raw PCAP data without requiring prior assumptions about traffic type. Unlike existing methods, it is generalizable across multiple classification tasks and leverages inter-flow signals - an innovative representation that captures temporal correlations and packet volume distributions across flows. Experimental results show that our model outperforms well-established methods in nearly every classification task and across most datasets, achieving up to 99% accuracy in some cases, demonstrating its robustness and adaptability.

Paperid: 1305, https://arxiv.org/pdf/2508.21480.pdf

Abstract:
The increasing adoption of smart home devices and IoT-based security systems presents significant opportunities to enhance convenience, safety, and risk management for homeowners and service providers. However, secure onboarding-provisioning credentials and establishing trust with cloud platforms-remains a considerable challenge. Traditional onboarding methods often rely on centralized Public Key Infrastructure (PKI) models and manufacturer-controlled keys, which introduce security risks and limit the user's digital sovereignty. These limitations hinder the widespread deployment of scalable IoT solutions. This paper presents a novel onboarding framework that builds upon existing network-layer onboarding techniques and extends them to the application layer to address these challenges. By integrating consortium blockchain technology, we propose a decentralized onboarding mechanism that enhances transparency, security, and monitoring for smart home architectures. The architecture supports device registration, key revocation, access control management, and risk detection through event-driven alerts across dedicated blockchain channels and smart contracts. To evaluate the framework, we formally model the protocol using the Tamarin Prover under the Dolev-Yao adversary model. The analysis focuses on authentication, token integrity, key confidentiality, and resilience over public channels. A prototype implementation demonstrates the system's viability in smart home settings, with verification completing in 0.34 seconds, highlighting its scalability and suitability for constrained devices and diverse stakeholders. Additionally, performance evaluation shows that the blockchain-based approach effectively handles varying workloads, maintains high throughput and low latency, and supports near real-time IoT data processing.

Paperid: 1306, https://arxiv.org/pdf/2508.21323.pdf

Abstract:
We introduce PROVSEEK, an LLM-powered agentic framework for automated provenance-driven forensic analysis and threat intelligence extraction. PROVSEEK employs specialized toolchains to dynamically retrieve relevant context by generating precise, context-aware queries that fuse a vectorized threat report knowledge base with data from system provenance databases. The framework resolves provenance queries, orchestrates multiple role-specific agents to mitigate hallucinations, and synthesizes structured, ground-truth verifiable forensic summaries. By combining agent orchestration with Retrieval-Augmented Generation (RAG) and chain-of-thought (CoT) reasoning, PROVSEEK enables adaptive multi-step analysis that iteratively refines hypotheses, verifies supporting evidence, and produces scalable, interpretable forensic explanations of attack behaviors. By combining provenance data with agentic reasoning, PROVSEEK establishes a new paradigm for grounded agentic forecics to investigate APTs. We conduct a comprehensive evaluation on publicly available DARPA datasets, demonstrating that PROVSEEK outperforms retrieval-based methods for intelligence extraction task, achieving a 34% improvement in contextual precision/recall; and for threat detection task, PROVSEEK achieves 22%/29% higher precision/recall compared to both a baseline agentic AI approach and State-Of-The-Art (SOTA) Provenance-based Intrusion Detection System (PIDS).

Paperid: 1307, https://arxiv.org/pdf/2508.19115.pdf

Abstract:
Autonomous driving and V2X technologies have developed rapidly in the past decade, leading to improved safety and efficiency in modern transportation. These systems interact with extensive networks of vehicles, roadside infrastructure, and cloud resources to support their machine learning capabilities. However, the widespread use of machine learning in V2X systems raises issues over the privacy of the data involved. This is particularly concerning for smart-transit and driver safety applications which can implicitly reveal user locations or explicitly disclose medical data such as EEG signals. To resolve these issues, we propose SecureV2X, a scalable, multi-agent system for secure neural network inferences deployed between the server and each vehicle. Under this setting, we study two multi-agent V2X applications: secure drowsiness detection, and secure red-light violation detection. Our system achieves strong performance relative to baselines, and scales efficiently to support a large number of secure computation interactions simultaneously. For instance, SecureV2X is $9.4 \times$ faster, requires $143\times$ fewer computational rounds, and involves $16.6\times$ less communication on drowsiness detection compared to other secure systems. Moreover, it achieves a runtime nearly $100\times$ faster than state-of-the-art benchmarks in object detection tasks for red light violation detection.

Paperid: 1308, https://arxiv.org/pdf/2508.18453.pdf

Abstract:
Balancing robust security with strong privacy guarantees is critical for Risk-Based Adaptive Authentication (RBA), particularly in decentralized settings. Federated Learning (FL) offers a promising solution by enabling collaborative risk assessment without centralizing user data. However, existing FL approaches struggle with Non-Independent and Identically Distributed (Non-IID) user features, resulting in biased, unstable, and poorly generalized global models. This paper introduces FL-RBA2, a novel Federated Learning framework for Risk-Based Adaptive Authentication that addresses Non-IID challenges through a mathematically grounded similarity transformation. By converting heterogeneous user features (including behavioral, biometric, contextual, interaction-based, and knowledge-based modalities) into IID similarity vectors, FL-RBA2 supports unbiased aggregation and personalized risk modeling across distributed clients. The framework mitigates cold-start limitations via clustering-based risk labeling, incorporates Differential Privacy (DP) to safeguard sensitive information, and employs Message Authentication Codes (MACs) to ensure model integrity and authenticity. Federated updates are securely aggregated into a global model, achieving strong balance between user privacy, scalability, and adaptive authentication robustness. Rigorous game-based security proofs in the Random Oracle Model formally establish privacy, correctness, and adaptive security guarantees. Extensive experiments on keystroke, mouse, and contextual datasets validate FL-RBA2's effectiveness in high-risk user detection and its resilience to model inversion and inference attacks, even under strong DP constraints.

Paperid: 1309, https://arxiv.org/pdf/2508.16738.pdf

Abstract:
Zero-Knowledge Proofs (ZKPs) have emerged as powerful tools for secure and privacy-preserving computation. ZKPs enable one party to convince another of a statement's validity without revealing anything else. This capability has profound implications in many domains, including: machine learning, blockchain, image authentication, and electronic voting. Despite their potential, ZKPs have seen limited deployment because of their exceptionally high computational overhead, which manifests primarily during proof generation. To mitigate these overheads, a (growing) body of researchers has proposed hardware accelerators and GPU implementations for kernels and complete protocols. Prior art spans a wide variety of ZKP schemes that vary significantly in computational overhead, proof size, verifier cost, protocol setup, and trust. The latest, and widely used ZKP protocols are intentionally designed to balance these trade-offs. A particular challenge in modern ZKP systems is supporting complex, high-degree gates using the SumCheck protocol. We address this challenge with a novel programmable accelerator that efficiently handles arbitrary custom gates via SumCheck. Our accelerator achieves upwards of $1000\times$ geomean speedup over CPU-based SumChecks across a range of gate types. We integrate this unit into a full-system accelerator, zkPHIRE, which achieves $1486\times$ geomean speedup over CPU and $11.87\times$ speedup over the state-of-the-art at iso-area. zkPHIRE is the first accelerator to scale to problem sizes of $2^{30}$ nominal constraints while maintaining small proof sizes and programmability.

Paperid: 1310, https://arxiv.org/pdf/2508.16150.pdf

Abstract:
Membership Inference Attacks (MIAs) pose a significant privacy risk, as they enable adversaries to determine whether a specific data point was included in the training dataset of a model. While Machine Unlearning is primarily designed as a privacy mechanism to efficiently remove private data from a machine learning model without the need for full retraining, its impact on the susceptibility of models to MIA remains an open question. In this study, we systematically assess the vulnerability of models to MIA after applying state-of-art Machine Unlearning algorithms. Our analysis spans four diverse datasets (two from the image domain and two in tabular format), exploring how different unlearning approaches influence the exposure of models to membership inference. The findings highlight that while Machine Unlearning is not inherently a countermeasure against MIA, the unlearning algorithm and data characteristics can significantly affect a model's vulnerability. This work provides essential insights into the interplay between Machine Unlearning and MIAs, offering guidance for the design of privacy-preserving machine learning systems.

Paperid: 1311, https://arxiv.org/pdf/2508.15850.pdf

Abstract:
The increasing availability of publicly shared electrocardiogram (ECG) data raises critical privacy concerns, as its biometric properties make individuals vulnerable to linkage attacks. Unlike prior studies that assume idealized adversarial capabilities, we evaluate ECG privacy risks under realistic conditions where attackers operate with partial knowledge. Using data from 109 participants across diverse real-world datasets, our approach achieves 85% accuracy in re-identifying individuals in public datasets while maintaining a 14.2% overall misclassification rate at an optimal confidence threshold, with 15.6% of unknown individuals misclassified as known and 12.8% of known individuals misclassified as unknown. These results highlight the inadequacy of simple anonymization techniques in preventing re-identification, demonstrating that even limited adversarial knowledge enables effective identity linkage. Our findings underscore the urgent need for privacy-preserving strategies, such as differential privacy, access control, and encrypted computation, to mitigate re-identification risks while ensuring the utility of shared biosignal data in healthcare applications.

Paperid: 1312, https://arxiv.org/pdf/2508.12264.pdf

Abstract:
Publicly available large pretrained models (i.e., backbones) and lightweight adapters for parameter-efficient fine-tuning (PEFT) have become standard components in modern machine learning pipelines. However, preserving the privacy of both user inputs and fine-tuned adapters -- often trained on sensitive data -- during inference remains a significant challenge. Applying cryptographic techniques, such as multi-party computation (MPC), to PEFT settings still incurs substantial encrypted computation across both the backbone and adapter, mainly due to the inherent two-way communication between them. To address this limitation, we propose CryptPEFT, the first PEFT solution specifically designed for private inference scenarios. CryptPEFT introduces a novel one-way communication (OWC) architecture that confines encrypted computation solely to the adapter, significantly reducing both computational and communication overhead. To maintain strong model utility under this constraint, we explore the design space of OWC-compatible adapters and employ an automated architecture search algorithm to optimize the trade-off between private inference efficiency and model utility. We evaluated CryptPEFT using Vision Transformer backbones across widely used image classification datasets. Our results show that CryptPEFT significantly outperforms existing baselines, delivering speedups ranging from $20.62\times$ to $291.48\times$ in simulated wide-area network (WAN) and local-area network (LAN) settings. On CIFAR-100, CryptPEFT attains 85.47% accuracy with just 2.26 seconds of inference latency. These findings demonstrate that CryptPEFT offers an efficient and privacy-preserving solution for modern PEFT-based inference.

Paperid: 1313, https://arxiv.org/pdf/2508.11925.pdf

Abstract:
The need for detecting LLM-generated code necessitates watermarking systems capable of operating within its highly structured and syntactically constrained environment. To address this, we introduce CodeTracer, an innovative adaptive code watermarking framework underpinned by a novel reinforcement learning training paradigm. At its core, CodeTracer features a policy-driven approach that utilizes a parameterized model to intelligently bias token choices during next-token prediction. This strategy ensures that embedded watermarks maintain code functionality while exhibiting subtle yet statistically detectable deviations from typical token distributions. To facilitate policy learning, we devise a comprehensive reward system that seamlessly integrates execution feedback with watermark embedding signals, balancing process-level and outcome-level rewards. Additionally, we employ Gumbel Top-k reparameterization to enable gradient-based optimization of discrete watermarking decisions. Extensive comparative evaluations demonstrate CodeTracer's significant superiority over state-of-the-art baselines in both watermark detectability and the preservation of generated code's functionality.

Paperid: 1314, https://arxiv.org/pdf/2508.10044.pdf

Abstract:
This paper elaborates on an extensive security framework specifically designed for energy management systems (EMSs), which effectively tackles the dynamic environment of cybersecurity vulnerabilities and/or system problems (SPs), accomplished through the incorporation of novel methodologies. A comprehensive multi-point attack/error model is initially proposed to systematically identify vulnerabilities throughout the entire EMS data processing pipeline, including post state estimation (SE) stealth attacks, EMS database manipulation, and human-machine interface (HMI) display corruption according to the real-time database (RTDB) storage. This framework acknowledges the interconnected nature of modern attack vectors, which utilize various phases of supervisory control and data acquisition (SCADA) data flow. Then, generative AI (GenAI)-based anomaly detection systems (ADSs) for EMSs are proposed for the first time in the power system domain to handle the scenarios. Further, a set-of-mark generative intelligence (SoM-GI) framework, which leverages multimodal analysis by integrating visual markers with rules considering the GenAI capabilities, is suggested to overcome inherent spatial reasoning limitations. The SoM-GI methodology employs systematic visual indicators to enable accurate interpretation of segmented HMI displays and detect visual anomalies that numerical methods fail to identify. Validation on the IEEE 14-Bus system shows the framework's effectiveness across scenarios, while visual analysis identifies inconsistencies. This integrated approach combines numerical analysis with visual pattern recognition and linguistic rules to protect against cyber threats and system errors.

Paperid: 1315, https://arxiv.org/pdf/2508.08593.pdf

Abstract:
In digital substations, security events pose significant challenges to the sustained operation of power systems. To mitigate these challenges, the implementation of robust defense strategies is critically important. A thorough process of anomaly identification and detection in information and communication technology (ICT) frameworks is crucial to ensure secure and reliable communication and coordination between interconnected devices within digital substations. Hence, this paper addresses the critical cybersecurity challenges confronting IEC61850-based digital substations within modern smart grids, where the integration of advanced communication protocols, e.g., generic object-oriented substation event (GOOSE), has enhanced energy management and introduced significant vulnerabilities to cyberattacks. Focusing on the limitations of traditional anomaly detection systems (ADSs) in detecting threats, this research proposes a transformative approach by leveraging generative AI (GenAI) to develop robust ADSs. The primary contributions include the suggested advanced adversarial traffic mutation (AATM) technique to generate synthesized and balanced datasets for GOOSE messages, ensuring protocol compliance and enabling realistic zero-day attack pattern creation to address data scarcity. Then, the implementation of GenAI-based ADSs incorporating the task-oriented dialogue (ToD) processes has been explored for improved detection of attack patterns. Finally, a comparison of the GenAI-based ADS with machine learning (ML)-based ADSs has been implemented to showcase the outperformance of the GenAI-based frameworks considering the AATM-generated GOOSE datasets and standard/advanced performance evaluation metrics.

Paperid: 1316, https://arxiv.org/pdf/2508.05518.pdf

Abstract:
Differential Privacy (DP) is commonly employed to safeguard graph analysis or publishing. Distance, a critical factor in graph analysis, is typically handled using curator DP, where a trusted curator holds the complete neighbor lists of all vertices and answers queries privately. However, in many real-world scenarios, such a curator may not be present, posing a significant challenge for implementing differentially private distance queries under Local Differential Privacy (LDP). This paper proposes two approaches to address this challenge. The first approach generates a synthetic graph by randomizing responses and applies bitwise operations to reduce noise interference. However, like other synthetic graph methods, this approach suffers from low utility. To overcome this limitation, we propose a second approach, the first LDP method specifically designed for distance queries, which captures the global graph structure by continuously aggregating local distance vectors from neighboring vertices. This process enables the accurate updating of global distances. We demonstrate the effectiveness of our method through comprehensive theoretical analysis and experimental evaluations on real-world datasets.

Paperid: 1317, https://arxiv.org/pdf/2507.23432.pdf

Abstract:
In modern datasets, where single records can have multiple owners, enforcing user-level differential privacy requires capping each user's total contribution. This "contribution bounding" becomes a significant combinatorial challenge. Existing sequential algorithms for this task are computationally intensive and do not scale to the massive datasets prevalent today. To address this scalability bottleneck, we propose a novel and efficient distributed algorithm. Our approach models the complex ownership structure as a hypergraph, where users are vertices and records are hyperedges. The algorithm proceeds in rounds, allowing users to propose records in parallel. A record is added to the final dataset only if all its owners unanimously agree, thereby ensuring that no user's predefined contribution limit is violated. This method aims to maximize the size of the resulting dataset for high utility while providing a practical, scalable solution for implementing user-level privacy in large, real-world systems.

Paperid: 1318, https://arxiv.org/pdf/2507.17491.pdf

Abstract:
As 5G networks expand into critical infrastructure, secure and efficient user authentication is more important than ever. The 5G-AKA protocol, standardized by 3GPP in TS 33.501, is central to authentication in current 5G deployments. It provides mutual authentication, user privacy, and key secrecy. However, despite its adoption, 5G-AKA has known limitations in both security and performance. While it focuses on protecting privacy against passive attackers, recent studies show its vulnerabilities to active attacks. It also relies on a sequence number mechanism to prevent replay attacks, requiring perfect synchronization between the device and the core network. This stateful design adds complexity, causes desynchronization, and incurs extra communication overhead. More critically, 5G-AKA lacks Perfect Forward Secrecy (PFS), exposing past communications if long-term keys are compromised-an increasing concern amid sophisticated threats. This paper proposes an enhanced authentication protocol that builds on 5G-AKA's design while addressing its shortcomings. First, we introduce a stateless version that removes sequence number reliance, reducing complexity while staying compatible with existing SIM cards and infrastructure. We then extend this design to add PFS with minimal cryptographic overhead. Both protocols are rigorously analyzed using ProVerif, confirming their compliance with all major security requirements, including resistance to passive and active attacks, as well as those defined by 3GPP and academic studies. We also prototype both protocols and evaluate their performance against 5G-AKA and 5G-AKA' (USENIX'21). Our results show the proposed protocols offer stronger security with only minor computational overhead, making them practical, future-ready solutions for 5G and beyond.

Paperid: 1319, https://arxiv.org/pdf/2507.13407.pdf

Abstract:
With the rapid rise of generative AI and synthetic media, distinguishing AI-generated images from real ones has become crucial in safeguarding against misinformation and ensuring digital authenticity. Traditional watermarking techniques have shown vulnerabilities to adversarial attacks, undermining their effectiveness in the presence of attackers. We propose IConMark, a novel in-generation robust semantic watermarking method that embeds interpretable concepts into AI-generated images, as a first step toward interpretable watermarking. Unlike traditional methods, which rely on adding noise or perturbations to AI-generated images, IConMark incorporates meaningful semantic attributes, making it interpretable to humans and hence, resilient to adversarial manipulation. This method is not only robust against various image augmentations but also human-readable, enabling manual verification of watermarks. We demonstrate a detailed evaluation of IConMark's effectiveness, demonstrating its superiority in terms of detection accuracy and maintaining image quality. Moreover, IConMark can be combined with existing watermarking techniques to further enhance and complement its robustness. We introduce IConMark+SS and IConMark+TM, hybrid approaches combining IConMark with StegaStamp and TrustMark, respectively, to further bolster robustness against multiple types of image manipulations. Our base watermarking technique (IConMark) and its variants (+TM and +SS) achieve 10.8%, 14.5%, and 15.9% higher mean area under the receiver operating characteristic curve (AUROC) scores for watermark detection, respectively, compared to the best baseline on various datasets.

Paperid: 1320, https://arxiv.org/pdf/2507.13023.pdf

Abstract:
This paper provides a comprehensive empirical analysis of the economics and dynamics behind arbitrages between centralized and decentralized exchanges (CEX-DEX) on Ethereum. We refine heuristics to identify arbitrage transactions from on-chain data and introduce a robust empirical framework to estimate arbitrage revenue without knowing traders' actual behaviors on CEX. Leveraging an extensive dataset spanning 19 months from August 2023 to March 2025, we estimate a total of 233.8M USD extracted by 19 major CEX-DEX searchers from 7,203,560 identified CEX-DEX arbitrages. Our analysis reveals increasing centralization trends as three searchers captured three-quarters of both volume and extracted value. We also demonstrate that searchers' profitability is tied to their integration level with block builders and uncover exclusive searcher-builder relationships and their market impact. Finally, we correct the previously underestimated profitability of block builders who vertically integrate with a searcher. These insights illuminate the darkest corner of the MEV landscape and highlight the critical implications of CEX-DEX arbitrages for Ethereum's decentralization.

Paperid: 1321, https://arxiv.org/pdf/2507.08623.pdf

Abstract:
Quantum Machine Learning (QML) systems inherit vulnerabilities from classical machine learning while introducing new attack surfaces rooted in the physical and algorithmic layers of quantum computing. Despite a growing body of research on individual attack vectors - ranging from adversarial poisoning and evasion to circuit-level backdoors, side-channel leakage, and model extraction - these threats are often analyzed in isolation, with unrealistic assumptions about attacker capabilities and system environments. This fragmentation hampers the development of effective, holistic defense strategies. In this work, we argue that QML security requires more structured modeling of the attack surface, capturing not only individual techniques but also their relationships, prerequisites, and potential impact across the QML pipeline. We propose adapting kill chain models, widely used in classical IT and cybersecurity, to the quantum machine learning context. Such models allow for structured reasoning about attacker objectives, capabilities, and possible multi-stage attack paths - spanning reconnaissance, initial access, manipulation, persistence, and exfiltration. Based on extensive literature analysis, we present a detailed taxonomy of QML attack vectors mapped to corresponding stages in a quantum-aware kill chain framework that is inspired by the MITRE ATLAS for classical machine learning. We highlight interdependencies between physical-level threats (like side-channel leakage and crosstalk faults), data and algorithm manipulation (such as poisoning or circuit backdoors), and privacy attacks (including model extraction and training data inference). This work provides a foundation for more realistic threat modeling and proactive security-in-depth design in the emerging field of quantum machine learning.

Paperid: 1322, https://arxiv.org/pdf/2507.06256.pdf

Abstract:
This paper investigates the real-world vulnerabilities of audio-based large language models (ALLMs), such as Qwen2-Audio. We first demonstrate that an adversary can craft stealthy audio perturbations to manipulate ALLMs into exhibiting specific targeted behaviors, such as eliciting responses to wake-keywords (e.g., "Hey Qwen"), or triggering harmful behaviors (e.g. "Change my calendar event"). Subsequently, we show that playing adversarial background noise during user interaction with the ALLMs can significantly degrade the response quality. Crucially, our research illustrates the scalability of these attacks to real-world scenarios, impacting other innocent users when these adversarial noises are played through the air. Further, we discuss the transferrability of the attack, and potential defensive measures.

Paperid: 1323, https://arxiv.org/pdf/2507.06156.pdf

Abstract:
Blockchain bridges have become essential infrastructure for enabling interoperability across different blockchain networks, with more than $24B monthly bridge transaction volume. However, their growing adoption has been accompanied by a disproportionate rise in security breaches, making them the single largest source of financial loss in Web3. For cross-chain ecosystems to be robust and sustainable, it is essential to understand and address these vulnerabilities. In this study, we present a comprehensive systematization of blockchain bridge design and security. We define three bridge security priors, formalize the architectural structure of 13 prominent bridges, and identify 23 attack vectors grounded in real-world blockchain exploits. Using this foundation, we evaluate 43 representative attack scenarios and introduce a layered threat model that captures security failures across source chain, off-chain, and destination chain components. Our analysis at the static code and transaction network levels reveals recurring design flaws, particularly in access control, validator trust assumptions, and verification logic, and identifies key patterns in adversarial behavior based on transaction-level traces. To support future development, we propose a decision framework for bridge architecture design, along with defense mechanisms such as layered validation and circuit breakers. This work provides a data-driven foundation for evaluating bridge security and lays the groundwork for standardizing resilient cross-chain infrastructure.

Paperid: 1324, https://arxiv.org/pdf/2507.02956.pdf

Abstract:
Recent research has demonstrated that state-of-the-art LLMs and defenses remain susceptible to multi-turn jailbreak attacks. These attacks require only closed-box model access and are often easy to perform manually, posing a significant threat to the safe and secure deployment of LLM-based systems. We study the effectiveness of the Crescendo multi-turn jailbreak at the level of intermediate model representations and find that safety-aligned LMs often represent Crescendo responses as more benign than harmful, especially as the number of conversation turns increases. Our analysis indicates that at each turn, Crescendo prompts tend to keep model outputs in a "benign" region of representation space, effectively tricking the model into fulfilling harmful requests. Further, our results help explain why single-turn jailbreak defenses like circuit breakers are generally ineffective against multi-turn attacks, motivating the development of mitigations that address this generalization gap.

Paperid: 1325, https://arxiv.org/pdf/2507.01536.pdf

Abstract:
Local Energy Markets (LEMs), though pivotal to the energy transition, face growing cybersecurity threats due to their reliance on smart grid communication standards and vulnerable Internet-of-Things (IoT)-enabled devices. This is a critical issue because such vulnerabilities can be exploited to manipulate market operations, compromise participants' privacy, and destabilize power distribution networks. This work maps LEM communication flows to existing standards, highlights potential impacts of key identified vulnerabilities, and simulates cyberattack scenarios on a privacy-preserving LEM model to assess their impacts. Findings reveal how attackers could distort pricing and demand patterns. We finally present recommendations for researchers, industry developers, policymakers, and LEM stakeholders to secure future LEM deployments.

Paperid: 1326, https://arxiv.org/pdf/2512.24044.pdf

Abstract:
As large language models (LLMs) are increasingly deployed, ensuring their safe use is paramount. Jailbreaking, adversarial prompts that bypass model alignment to trigger harmful outputs, present significant risks, with existing studies reporting high success rates in evading common LLMs. However, previous evaluations have focused solely on the models, neglecting the full deployment pipeline, which typically incorporates additional safety mechanisms like content moderation filters. To address this gap, we present the first systematic evaluation of jailbreak attacks targeting LLM safety alignment, assessing their success across the full inference pipeline, including both input and output filtering stages. Our findings yield two key insights: first, nearly all evaluated jailbreak techniques can be detected by at least one safety filter, suggesting that prior assessments may have overestimated the practical success of these attacks; second, while safety filters are effective in detection, there remains room to better balance recall and precision to further optimize protection and user experience. We highlight critical gaps and call for further refinement of detection accuracy and usability in LLM safety systems.

Paperid: 1327, https://arxiv.org/pdf/2512.22894.pdf

Abstract:
Deceptive UI designs, widely instantiated across the web and commonly known as dark patterns, manipulate users into performing actions misaligned with their goals. In this paper, we show that dark patterns are highly effective in steering agent trajectories, posing a significant risk to agent robustness. To quantify this risk, we introduce DECEPTICON, an environment for testing individual dark patterns in isolation. DECEPTICON includes 700 web navigation tasks with dark patterns -- 600 generated tasks and 100 real-world tasks, designed to measure instruction-following success and dark pattern effectiveness. Across state-of-the-art agents, we find dark patterns successfully steer agent trajectories towards malicious outcomes in over 70% of tested generated and real-world tasks -- compared to a human average of 31%. Moreover, we find that dark pattern effectiveness correlates positively with model size and test-time reasoning, making larger, more capable models more susceptible. Leading countermeasures against adversarial attacks, including in-context prompting and guardrail models, fail to consistently reduce the success rate of dark pattern interventions. Our findings reveal dark patterns as a latent and unmitigated risk to web agents, highlighting the urgent need for robust defenses against manipulative designs.

Paperid: 1328, https://arxiv.org/pdf/2512.21358.pdf

Abstract:
"f differential privacy" (fDP) is a recent definition for privacy privacy which can offer improved predictions of "privacy loss". It has been used to analyse specific privacy mechanisms, such as the popular Gaussian mechanism. In this paper we show how fDP's foundation in statistical hypothesis testing implies equivalence to the channel model of Quantitative Information Flow. We demonstrate this equivalence by a Galois connection between two partially ordered sets. This equivalence enables novel general composition theorems for fDP, supporting improved analysis for complex privacy designs.

Paperid: 1329, https://arxiv.org/pdf/2512.20775.pdf

Abstract:
In this paper, we introduce Sark, a reference architecture implementing the Unforgeable, Stateful, and Oblivious (USO) asset system as described by Goodell, Toliver, and Nakib. We describe the motivation, design, and implementation of Sloop, a permissioned, crash fault-tolerant (CFT) blockchain that forms a subsystem of Sark, and the other core subsystems, Porters, which accumulate and roll-up commitments from Clients. We analyse the operation of the system using the 'CIA Triad': Confidentiality, Availability, and Integrity. We then introduce the concept of Integrity Locus and use it to address design trade-offs related to decentralization. Finally, we point to future work on Byzantine fault-tolerance (BFT), and mitigating the local centrality of Porters.

Paperid: 1330, https://arxiv.org/pdf/2512.19414.pdf

Abstract:
The automation of Cyber Threat Intelligence (CTI) relies heavily on Named Entity Recognition (NER) to extract critical entities from unstructured text. Currently, Large Language Models (LLMs) primarily address this task through retrieval-based In-Context Learning (ICL). This paper analyzes this mainstream paradigm, revealing a fundamental flaw: its success stems not from global semantic similarity but largely from the incidental overlap of entity types within retrieved examples. This exposes the limitations of relying on unreliable implicit induction. To address this, we propose TTPrompt, a framework shifting from implicit induction to explicit instruction. TTPrompt maps the core concepts of CTI's Tactics, Techniques, and Procedures (TTPs) into an instruction hierarchy: formulating task definitions as Tactics, guiding strategies as Techniques, and annotation guidelines as Procedures. Furthermore, to handle the adaptability challenge of static guidelines, we introduce Feedback-driven Instruction Refinement (FIR). FIR enables LLMs to self-refine guidelines by learning from errors on minimal labeled data, adapting to distinct annotation dialects. Experiments on five CTI NER benchmarks demonstrate that TTPrompt consistently surpasses retrieval-based baselines. Notably, with refinement on just 1% of training data, it rivals models fine-tuned on the full dataset. For instance, on LADDER, its Micro F1 of 71.96% approaches the fine-tuned baseline, and on the complex CTINexus, its Macro F1 exceeds the fine-tuned ACLM model by 10.91%.

Paperid: 1331, https://arxiv.org/pdf/2512.18345.pdf

Abstract:
Fully homomorphic encryption (FHE) enables secure computation on encrypted data, mitigating privacy concerns in cloud and edge environments. However, due to its high compute and memory demands, extensive acceleration research has been pursued across diverse hardware platforms, especially GPUs. In this paper, we perform a microarchitectural analysis of CKKS, a popular FHE scheme, on modern GPUs. We focus on on-chip cache behavior, and show that the dominant kernels remain bound by memory bandwidth despite a high-bandwidth L2 cache, exposing a persistent memory wall. We further discover that the overall CKKS pipeline throughput is constrained by low per-kernel hardware utilization, caused by insufficient intra-kernel parallelism. Motivated by these findings, we introduce Theodosian, a set of complementary, memory-aware optimizations that improve cache efficiency and reduce runtime overheads. Our approach delivers consistent speedups across various CKKS workloads. On an RTX 5090, we reduce the bootstrapping latency for 32,768 complex numbers to 15.2ms with Theodosian, and further to 12.8ms with additional algorithmic optimizations, establishing new state-of-the-art GPU performance to the best of our knowledge.

Paperid: 1332, https://arxiv.org/pdf/2512.17613.pdf

Abstract:
Voting is a primary democratic activity through which voters select representatives or approve policies. Conventional paper ballot elections have several drawbacks that might compromise the fairness, effectiveness, and accessibility of the voting process. Therefore, there is an increasing need to design safer, effective, and easily accessible alternatives. E-Voting is one such solution that uses digital tools to simplify voting. Existing state-of-the-art designs for secure E-Voting are based on number-theoretic hardness assumptions. These designs are no longer secure due to quantum algorithms such as Shor's algorithm. We present the design and analysis of \textit{first} post-quantum secure end-to-end verifiable E-Voting protocol based on multivariate polynomials to address this issue. The security of our proposed design depends on the hardness of the MQ problem, which is an NP-hard problem. We present a simple yet efficient design involving only standard cryptographic primitives as building blocks.

Paperid: 1333, https://arxiv.org/pdf/2512.15335.pdf

Abstract:
Deep neural networks are widely deployed with quantization techniques to reduce memory and computational costs by lowering the numerical precision of their parameters. While quantization alters model parameters and their outputs, existing privacy analyses primarily focus on full-precision models, leaving a gap in understanding how bit-width reduction can affect privacy leakage. We present the first systematic study of the privacy-utility relationship in post-training quantization (PTQ), a versatile family of methods that can be applied to pretrained models without further training. Using membership inference attacks as our evaluation framework, we analyze three popular PTQ algorithms-AdaRound, BRECQ, and OBC-across multiple precision levels (4-bit, 2-bit, and 1.58-bit) on CIFAR-10, CIFAR-100, and TinyImageNet datasets. Our findings consistently show that low-precision PTQs can reduce privacy leakage. In particular, lower-precision models demonstrate up to an order of magnitude reduction in membership inference vulnerability compared to their full-precision counterparts, albeit at the cost of decreased utility. Additional ablation studies on the 1.58-bit quantization level show that quantizing only the last layer at higher precision enables fine-grained control over the privacy-utility trade-off. These results offer actionable insights for practitioners to balance efficiency, utility, and privacy protection in real-world deployments.

Paperid: 1334, https://arxiv.org/pdf/2512.15286.pdf

Abstract:
The increasing number of cyber threats and rapidly evolving tactics, as well as the high volume of data in recent years, have caused classical machine learning, rules, and signature-based defence strategies to fail, rendering them unable to keep up. An alternative, Quantum Machine Learning (QML), has recently emerged, making use of computations based on quantum mechanics. It offers better encoding and processing of high-dimensional structures for certain problems. This survey provides a comprehensive overview of QML techniques relevant to the domain of security, such as Quantum Neural Networks (QNNs), Quantum Support Vector Machines (QSVMs), Variational Quantum Circuits (VQCs), and Quantum Generative Adversarial Networks (QGANs), and discusses the contributions of this paper in relation to existing research in the field and how it improves over them. It also maps these methods across supervised, unsupervised, and generative learning paradigms, and to core cybersecurity tasks, including intrusion and anomaly detection, malware and botnet classification, and encrypted-traffic analytics. It also discusses their application in the domain of cloud computing security, where QML can enhance secure and scalable operations. Many limitations of QML in the domain of cybersecurity have also been discussed, along with the directions for addressing them.

Paperid: 1335, https://arxiv.org/pdf/2512.14741.pdf

Abstract:
Backdoor attacks embed malicious behaviors into Large Language Models (LLMs), enabling adversaries to trigger harmful outputs or bypass safety controls. However, the persistence of the implanted backdoors under user-driven post-deployment continual fine-tuning has been rarely examined. Most prior works evaluate the effectiveness and generalization of implanted backdoors only at releasing and empirical evidence shows that naively injected backdoor persistence degrades after updates. In this work, we study whether and how implanted backdoors persist through a multi-stage post-deployment fine-tuning. We propose P-Trojan, a trigger-based attack algorithm that explicitly optimizes for backdoor persistence across repeated updates. By aligning poisoned gradients with those of clean tasks on token embeddings, the implanted backdoor mapping is less likely to be suppressed or forgotten during subsequent updates. Theoretical analysis shows the feasibility of such persistent backdoor attacks after continual fine-tuning. And experiments conducted on the Qwen2.5 and LLaMA3 families of LLMs, as well as diverse task sequences, demonstrate that P-Trojan achieves over 99% persistence while preserving clean-task accuracy. Our findings highlight the need for persistence-aware evaluation and stronger defenses in realistic model adaptation pipelines.

Paperid: 1336, https://arxiv.org/pdf/2512.14166.pdf

Abstract:
The rapid evolution of Large Language Models (LLMs) into autonomous agents has led to the adoption of the Model Context Protocol (MCP) as a standard for discovering and invoking external tools. While this architecture decouples the reasoning engine from tool execution to enhance scalability, it introduces a significant privacy surface: third-party MCP servers, acting as semi-honest intermediaries, can observe detailed tool interaction logs outside the user's trusted boundary. In this paper, we first identify and formalize a novel privacy threat termed Intent Inversion, where a semi-honest MCP server attempts to reconstruct the user's private underlying intent solely by analyzing legitimate tool calls. To systematically assess this vulnerability, we propose IntentMiner, a framework that leverages Hierarchical Information Isolation and Three-Dimensional Semantic Analysis, integrating tool purpose, call statements, and returned results, to accurately infer user intent at the step level. Extensive experiments demonstrate that IntentMiner achieves a high degree of semantic alignment (over 85%) with original user queries, significantly outperforming baseline approaches. These results highlight the inherent privacy risks in decoupled agent architectures, revealing that seemingly benign tool execution logs can serve as a potent vector for exposing user secrets.

Paperid: 1337, https://arxiv.org/pdf/2512.12914.pdf

Abstract:
Large Language Models (LLMs) are often fine-tuned to adapt their general-purpose knowledge to specific tasks and domains such as cyber threat intelligence (CTI). Fine-tuning is mostly done through proprietary datasets that may contain sensitive information. Owners expect their fine-tuned model to not inadvertently leak this information to potentially adversarial end users. Using CTI as a use case, we demonstrate that data-extraction attacks can recover sensitive information from fine-tuned models on CTI reports, underscoring the need for mitigation. Retraining the full model to eliminate this leakage is computationally expensive and impractical. We propose an alternative approach, which we call privacy alignment, inspired by safety alignment in LLMs. Just like safety alignment teaches the model to abide by safety constraints through a few examples, we enforce privacy alignment through few-shot supervision, integrating a privacy classifier and a privacy redactor, both handled by the same underlying LLM. We evaluate our system, called CTIGuardian, using GPT-4o mini and Mistral-7B Instruct models, benchmarking against Presidio, a named entity recognition (NER) baseline. Results show that CTIGuardian provides a better privacy-utility trade-off than NER based models. While we demonstrate its effectiveness on a CTI use case, the framework is generic enough to be applicable to other sensitive domains.

Paperid: 1338, https://arxiv.org/pdf/2512.10361.pdf

Abstract:
Serverless computing has revolutionized cloud computing by offering users an efficient, cost-effective way to develop and deploy applications without managing infrastructure details. However, serverless cloud users remain vulnerable to various types of attacks, including micro-architectural side-channel attacks. These attacks typically rely on the physical co-location of victim and attacker instances, and attackers need to exploit cloud schedulers to achieve co-location with victims. Therefore, it is crucial to study vulnerabilities in serverless cloud schedulers and assess the security of different serverless scheduling algorithms. This study addresses the gap in understanding and constructing co-location attacks in serverless clouds. We present a comprehensive methodology to uncover exploitable features in serverless scheduling algorithms and to devise strategies for constructing co-location attacks via normal user interfaces. In our experiments, we successfully reveal exploitable vulnerabilities and achieve instance co-location on prevalent open-source infrastructures and Microsoft Azure Functions. We also present a mitigation strategy, the Double-Dip scheduler, to defend against co-location attacks in serverless clouds. Our work highlights critical areas for security enhancements in current cloud schedulers, offering insights to fortify serverless computing environments against potential co-location attacks.

Paperid: 1339, https://arxiv.org/pdf/2512.08417.pdf

Abstract:
Large Language Models (LLMs) have been integrated into many applications (e.g., web agents) to perform more sophisticated tasks. However, LLM-empowered applications are vulnerable to Indirect Prompt Injection (IPI) attacks, where instructions are injected via untrustworthy external data sources. This paper presents Rennervate, a defense framework to detect and prevent IPI attacks. Rennervate leverages attention features to detect the covert injection at a fine-grained token level, enabling precise sanitization that neutralizes IPI attacks while maintaining LLM functionalities. Specifically, the token-level detector is materialized with a 2-step attentive pooling mechanism, which aggregates attention heads and response tokens for IPI detection and sanitization. Moreover, we establish a fine-grained IPI dataset, FIPI, to be open-sourced to support further research. Extensive experiments verify that Rennervate outperforms 15 commercial and academic IPI defense methods, achieving high precision on 5 LLMs and 6 datasets. We also demonstrate that Rennervate is transferable to unseen attacks and robust against adaptive adversaries.

Paperid: 1340, https://arxiv.org/pdf/2512.07495.pdf

Abstract:
On-device machine learning (ML) introduces new security concerns about model privacy. Storing valuable trained ML models on user devices exposes them to potential extraction by adversaries. The current mainstream solution for on-device model protection is storing the weights and conducting inference within Trusted Execution Environments (TEEs). However, due to limited trusted memory that cannot accommodate the whole model, most existing approaches employ a partitioning strategy, dividing a model into multiple slices that are loaded into the TEE sequentially. This frequent interaction between untrusted and trusted worlds dramatically increases inference latency, sometimes by orders of magnitude. In this paper, we propose Amulet, a fast TEE-shielded on-device inference framework for ML model protection. Amulet incorporates a suite of obfuscation methods specifically designed for common neural network architectures. After obfuscation by the TEE, the entire transformed model can be securely stored in untrusted memory, allowing the inference process to execute directly in untrusted memory with GPU acceleration. For each inference request, only two rounds of minimal-overhead interaction between untrusted and trusted memory are required to process input samples and output results. We also provide theoretical proof from an information-theoretic perspective that the obfuscated model does not leak information about the original weights. We comprehensively evaluated Amulet using diverse model architectures ranging from ResNet-18 to GPT-2. Our approach incurs inference latency only 2.8-4.8x that of unprotected models with negligible accuracy loss, achieving an 8-9x speedup over baseline methods that execute inference entirely within TEEs, and performing approximately 2.2x faster than the state-of-the-art obfuscation-based method.

Paperid: 1341, https://arxiv.org/pdf/2512.01574.pdf

Abstract:
Private information retrieval (PIR) is an essential cryptographic protocol for privacy-preserving applications, enabling a client to retrieve a record from a server's database without revealing which record was requested. Single-server PIR based on homomorphic encryption has particularly gained immense attention for its ease of deployment and reduced trust assumptions. However, single-server PIR remains impractical due to its high computational and memory bandwidth demands. Specifically, reading the entirety of large databases from storage, such as SSDs, severely limits its performance. To address this, we propose IVE, an accelerator for single-server PIR with a systematic extension that enables practical retrieval from large databases using DRAM. Recent advances in DRAM capacity allow PIR for large databases to be served entirely from DRAM, removing its dependence on storage bandwidth. Although the memory bandwidth bottleneck still remains, multi-client batching effectively amortizes database access costs across concurrent requests to improve throughput. However, client-specific data remains a bottleneck, whose bandwidth requirements ultimately limits performance. IVE overcomes this by employing a large on-chip scratchpad with an operation scheduling algorithm that maximizes data reuse, further boosting throughput. Additionally, we introduce sysNTTU, a versatile functional unit that enhances area efficiency without sacrificing performance. We also propose a heterogeneous memory system architecture, which enables a linear scaling of database sizes without a throughput degradation. Consequently, IVE achieves up to 1,275x higher throughput compared to prior PIR hardware solutions.

Paperid: 1342, https://arxiv.org/pdf/2511.22434.pdf

Abstract:
The deep learning (DL) has been penetrating daily life in many domains, how to keep the DL model inference secure and sample privacy in an encrypted environment has become an urgent and increasingly important issue for various security-critical applications. To date, several approaches have been proposed based on the Residue Number System variant of the Cheon-Kim-Kim-Song (RNS-CKKS) scheme. However, they all suffer from high latency, which severely limits the applications in real-world tasks. Currently, the research on encrypted inference in deep CNNs confronts three main bottlenecks: i) the time and storage costs of convolution calculation; ii) the time overhead of huge bootstrapping operations; and iii) the consumption of circuit multiplication depth. Towards these three challenges, we in this paper propose an efficient and effective mechanism FastFHE to accelerate the model inference while simultaneously retaining high inference accuracy over fully homomorphic encryption. Concretely, our work elaborates four unique novelties. First, we propose a new scalable ciphertext data-packing scheme to save the time and storage consumptions. Second, we work out a depthwise-separable convolution fashion to degrade the computation load of convolution calculation. Third, we figure out a BN dot-product fusion matrix to merge the ciphertext convolutional layer with the batch-normalization layer without incurring extra multiplicative depth. Last but not least, we adopt the low-degree Legendre polynomial to approximate the nonlinear smooth activation function SiLU under the guarantee of tiny accuracy error before and after encrypted inference. Finally, we execute multi-facet experiments to verify the efficiency and effectiveness of our proposed approach.

Paperid: 1343, https://arxiv.org/pdf/2511.22416.pdf

Abstract:
The advent of quantum computing threats classical cryptographic mechanisms, demanding new strategies for securing communication networks. Since real-world networks cannot be fully Quantum Key Distribution (QKD)-enabled due to infrastructure constraints, practical security solutions must support hybrid operation. This paper presents an adaptive security framework that enables quantum-safe communications across real-world heterogeneous networks by combining QKD and Post-Quantum Cryptography (PQC). Building upon a hierarchical key management architecture with Virtual Key Management Systems (vKMS) and a centralized Quantum Security Controller (QuSeC), the framework dynamically assigns security levels based on node capabilities. By transitioning between pure QKD, hybrid, and PQC modes, it ensures end-to-end quantum-safe protection regardless of the underlying node capabilities. The framework has been implemented and validated on a Kubernetes-based containerized testbed, demonstrating robust operation and performance across all scenarios. Results highlight its potential to support the gradual integration of quantum-safe technologies into existing infrastructures, paving the way toward fully quantum-safe communication networks.

Paperid: 1344, https://arxiv.org/pdf/2511.22044.pdf

Abstract:
In the realm of black-box jailbreak attacks on large language models (LLMs), the feasibility of constructing a narrow safety proxy, a lightweight model designed to predict the attack success rate (ASR) of adversarial prompts, remains underexplored. This work investigates the distillability of an LLM's core security logic. We propose a novel framework that incorporates an improved outline filling attack to achieve dense sampling of the model's security boundaries. Furthermore, we introduce a ranking regression paradigm that replaces standard regression and trains the proxy model to predict which prompt yields a higher ASR. Experimental results show that our proxy model achieves an accuracy of 91.1 percent in predicting the relative ranking of average long response (ALR), and 69.2 percent in predicting ASR. These findings confirm the predictability and distillability of jailbreak behaviors, and demonstrate the potential of leveraging such distillability to optimize black-box attacks.

Paperid: 1345, https://arxiv.org/pdf/2511.20992.pdf

Abstract:
Behavior Cloning (BC) is a popular framework for training sequential decision policies from expert demonstrations via supervised learning. As these policies are increasingly being deployed in the real world, their robustness and potential vulnerabilities are an important concern. In this work, we perform the first analysis of the efficacy of clean-label backdoor attacks on BC policies. Our backdoor attacks poison a dataset of demonstrations by injecting a visual trigger to create a spurious correlation that can be exploited at test time. We evaluate how policy vulnerability scales with the fraction of poisoned data, the strength of the trigger, and the trigger type. We also introduce a novel entropy-based test-time trigger attack that substantially degrades policy performance by identifying critical states where test-time triggering of the backdoor is expected to be most effective at degrading performance. We empirically demonstrate that BC policies trained on even minimally poisoned datasets exhibit deceptively high, near-baseline task performance despite being highly vulnerable to backdoor trigger attacks during deployment. Our results underscore the urgent need for more research into the robustness of BC policies, particularly as large-scale datasets are increasingly used to train policies for real-world cyber-physical systems. Videos and code are available at https://sites.google.com/view/dataset-poisoning-in-bc.

Paperid: 1346, https://arxiv.org/pdf/2511.14129.pdf

Abstract:
Fine-grained identification of IDS-flagged suspicious traffic is crucial in cybersecurity. In practice, cyber threats evolve continuously, making the discovery of novel malicious traffic a critical necessity as well as the identification of known classes. Recent studies have advanced this goal with deep models, but they often rely on task-specific architectures that limit transferability and require per-dataset tuning. In this paper we introduce MalRAG, the first LLM driven retrieval-augmented framework for open-set malicious traffic identification. MalRAG freezes the LLM and operates via comprehensive traffic knowledge construction, adaptive retrieval, and prompt engineering. Concretely, we construct a multi-view traffic database by mining prior malicious traffic from content, structural, and temporal perspectives. Furthermore, we introduce a Coverage-Enhanced Retrieval Algorithm that queries across these views to assemble the most probable candidates, thereby improving the inclusion of correct evidence. We then employ Traffic-Aware Adaptive Pruning to select a variable subset of these candidates based on traffic-aware similarity scores, suppressing incorrect matches and yielding reliable retrieved evidence. Moreover, we develop a suite of guidance prompts where task instruction, evidence referencing, and decision guidance are integrated with the retrieved evidence to improve LLM performance. Across diverse real-world datasets and settings, MalRAG delivers state-of-the-art results in both fine-grained identification of known classes and novel malicious traffic discovery. Ablation and deep-dive analyses further show that MalRAG effective leverages LLM capabilities yet achieves open-set malicious traffic identification without relying on a specific LLM.

Paperid: 1347, https://arxiv.org/pdf/2511.13641.pdf

Abstract:
Replay and rollback attacks threaten cloud application integrity by reintroducing authentic yet stale data through an untrusted storage interface to compromise application decision-making. Prior security frameworks mitigate these attacks by enforcing forward-only state transitions (state continuity) with hardware-backed mechanisms, but they categorically treat all rollback as malicious and thus preclude legitimate rollbacks used for operational recovery from corruption or misconfiguration. We present Rebound, a general-purpose security framework that preserves rollback protection while enabling policy-authorized legitimate rollbacks of application binaries, configuration, and data. Key to Rebound is a reference monitor that mediates state transitions, enforces authorization policy, guarantees atomicity of state updates and rollbacks, and emits a tamper-evident log that provides transparency to applications and auditors. We formally prove Rebound's security properties and show through an application case study -- with software deployment workflows in GitLab CI -- that it enables robust control over binary, configuration, and raw data versioning with low end-to-end overhead.

Paperid: 1348, https://arxiv.org/pdf/2511.12652.pdf

Abstract:
Boolean functions with strong cryptographic properties, such as high nonlinearity and algebraic degree, are important for the security of stream and block ciphers. These functions can be designed using algebraic constructions or metaheuristics. This paper examines the use of Evolutionary Algorithms (EAs) to evolve homogeneous bent Boolean functions, that is, functions whose algebraic normal form contains only monomials of the same degree and that are maximally nonlinear. We introduce the notion of density of homogeneous bent functions, facilitating the algorithmic design that results in finding quadratic and cubic bent functions in different numbers of variables.

Paperid: 1349, https://arxiv.org/pdf/2511.12423.pdf

Abstract:
Text-attributed graphs (TAGs), which combine structural and textual node information, are ubiquitous across many domains. Recent work integrates Large Language Models (LLMs) with Graph Neural Networks (GNNs) to jointly model semantics and structure, resulting in more general and expressive models that achieve state-of-the-art performance on TAG benchmarks. However, this integration introduces dual vulnerabilities: GNNs are sensitive to structural perturbations, while LLM-derived features are vulnerable to prompt injection and adversarial phrasing. While existing adversarial attacks largely perturb structure or text independently, we find that uni-modal attacks cause only modest degradation in LLM-enhanced GNNs. Moreover, many existing attacks assume unrealistic capabilities, such as white-box access or direct modification of graph data. To address these gaps, we propose GRAPHTEXTACK, the first black-box, multi-modal{, poisoning} node injection attack for LLM-enhanced GNNs. GRAPHTEXTACK injects nodes with carefully crafted structure and semantics to degrade model performance, operating under a realistic threat model without relying on model internals or surrogate models. To navigate the combinatorial, non-differentiable search space of connectivity and feature assignments, GRAPHTEXTACK introduces a novel evolutionary optimization framework with a multi-objective fitness function that balances local prediction disruption and global graph influence. Extensive experiments on five datasets and two state-of-the-art LLM-enhanced GNN models show that GRAPHTEXTACK significantly outperforms 12 strong baselines.

Paperid: 1350, https://arxiv.org/pdf/2511.09696.pdf

Abstract:
The rapid growth of smart devices such as phones, wearables, IoT sensors, and connected vehicles has led to an explosion of continuous time series data that offers valuable insights in healthcare, transportation, and more. However, this surge raises significant privacy concerns, as sensitive patterns can reveal personal details. While traditional differential privacy (DP) relies on trusted servers, local differential privacy (LDP) enables users to perturb their own data. However, traditional LDP methods perturb time series data by adding user-specific noise but exhibit vulnerabilities. For instance, noise applied within fixed time windows can be canceled during aggregation (e.g., averaging), enabling adversaries to infer individual statistics over time, thereby eroding privacy guarantees. To address these issues, we introduce a Cooperative Local Differential Privacy (CLDP) mechanism that enhances privacy by distributing noise vectors across multiple users. In our approach, noise is collaboratively generated and assigned so that when all users' perturbed data is aggregated, the noise cancels out preserving overall statistical properties while protecting individual privacy. This cooperative strategy not only counters vulnerabilities inherent in time-window-based methods but also scales effectively for large, real-time datasets, striking a better balance between data utility and privacy in multiuser environments.

Paperid: 1351, https://arxiv.org/pdf/2511.08944.pdf

Abstract:
Backdoor attacks pose a critical threat to machine learning models, causing them to behave normally on clean data but misclassify poisoned data into a poisoned class. Existing defenses often attempt to identify and remove backdoor neurons based on Trigger-Activated Changes (TAC) which is the activation differences between clean and poisoned data. These methods suffer from low precision in identifying true backdoor neurons due to inaccurate estimation of TAC values. In this work, we propose a novel backdoor removal method by accurately reconstructing TAC values in the latent representation. Specifically, we formulate the minimal perturbation that forces clean data to be classified into a specific class as a convex quadratic optimization problem, whose optimal solution serves as a surrogate for TAC. We then identify the poisoned class by detecting statistically small $L^2$ norms of perturbations and leverage the perturbation of the poisoned class in fine-tuning to remove backdoors. Experiments on CIFAR-10, GTSRB, and TinyImageNet demonstrated that our approach consistently achieves superior backdoor suppression with high clean accuracy across different attack types, datasets, and architectures, outperforming existing defense methods.

Paperid: 1352, https://arxiv.org/pdf/2511.08660.pdf

Abstract:
The integration of Artificial Intelligence (AI) in Network Intrusion Detection Systems (NIDS) is a promising approach to tackle the increasing sophistication of cyberattacks. However, since Machine Learning (ML) and Deep Learning (DL) models rely heavily on the quality of their training data, the lack of diverse and up-to-date datasets hinders their generalization capability to detect malicious activity in previously unseen network traffic. This study presents an experimental validation of the reliability of the GeNIS dataset for AI-based NIDS, to serve as a baseline for future benchmarks. Five feature selection methods, Information Gain, Chi-Squared Test, Recursive Feature Elimination, Mean Absolute Deviation, and Dispersion Ratio, were combined to identify the most relevant features of GeNIS and reduce its dimensionality, enabling a more computationally efficient detection. Three decision tree ensembles and two deep neural networks were trained for both binary and multiclass classification tasks. All models reached high accuracy and F1-scores, and the ML ensembles achieved slightly better generalization while remaining more efficient than DL models. Overall, the obtained results indicate that the GeNIS dataset supports intelligent intrusion detection and cyberattack classification with time-based and quantity-based behavioral features.

Paperid: 1353, https://arxiv.org/pdf/2511.08345.pdf

Abstract:
To ensure that Machine Learning (ML) models can perform a robust detection and classification of cyberattacks, it is essential to train them with high-quality datasets with relevant features. However, it can be difficult to accurately represent the complex traffic patterns of an attack, especially in Internet-of-Things (IoT) networks. This paper studies the impact that seemingly similar features created by different network traffic flow exporters can have on the generalization and robustness of ML models. In addition to the original CSV files of the Bot-IoT, IoT-23, and CICIoT23 datasets, the raw network packets of their PCAP files were analysed with the HERA tool, generating new labelled flows and extracting consistent features for new CSV versions. To assess the usefulness of these new flows for intrusion detection, they were compared with the original versions and were used to fine-tune multiple models. Overall, the results indicate that directly analysing and preprocessing PCAP files, instead of just using the commonly available CSV files, enables the computation of more relevant features to train bagging and gradient boosting decision tree ensembles. It is important to continue improving feature extraction and feature selection processes to make different datasets more compatible and enable a trustworthy evaluation and comparison of the ML models used in cybersecurity solutions.

Paperid: 1354, https://arxiv.org/pdf/2511.03898.pdf

Abstract:
Large language models (LLMs) are now widely used to draft and refactor code, but code that works is not necessarily secure. We evaluate secure code generation using the Instruct Prime, which eliminated compliance-required prompts and cue contamination, and evaluate five instruction-tuned code LLMs using a zero-shot baseline and a three-round reflexion prompting approach. Security is measured using the Insecure Code Detector (ICD), and results are reported by measuring Repair, Regression, and NetGain metrics, considering the programming language and CWE family. Our findings show that insecurity remains common at the first round: roughly 25-33% of programs are insecure at a zero-shot baseline (t0 ). Weak cryptography/config-dependent bugs are the hardest to avoid while templated ones like XSS, code injection, and hard-coded secrets are handled more reliably. Python yields the highest secure rates; C and C# are the lowest, with Java, JS, PHP, and C++ in the middle. Reflexion prompting improves security for all models, improving average accuracy from 70.74% at t0 to 79.43% at t3 , with the largest gains in the first round followed by diminishing returns. The trends with Repair, Regression, and NetGain metrics show that applying one to two rounds produces most of the benefits. A replication package is available at https://doi.org/10.5281/zenodo.17065846.

Paperid: 1355, https://arxiv.org/pdf/2511.02868.pdf

Abstract:
Blockchain systems face persistent challenges of scalability, latency, and energy inefficiency. Existing consensus protocols such as Proof-of-Work (PoW) and Proof-of-Stake (PoS) either consume excessive resources or risk centralization. This paper proposes \textit{Proof-of-Spiking-Neurons (PoSN)}, a neuromorphic consensus protocol inspired by spiking neural networks. PoSN encodes transactions as spike trains, elects leaders through competitive firing dynamics, and finalizes blocks via neural synchronization, enabling parallel and event-driven consensus with minimal energy overhead. A hybrid system architecture is implemented on neuromorphic platforms, supported by simulation frameworks such as Nengo and PyNN. Experimental results show significant gains in energy efficiency, throughput, and convergence compared to PoB and PoR. PoSN establishes a foundation for sustainable, adaptive blockchains suitable for IoT, edge, and large-scale distributed systems.

Paperid: 1356, https://arxiv.org/pdf/2511.02042.pdf

Abstract:
Rare events such as financial crashes, climate extremes, and biological anomalies are notoriously difficult to model due to their scarcity and heavy-tailed distributions. Classical deep generative models often struggle to capture these rare occurrences, either collapsing low-probability modes or producing poorly calibrated uncertainty estimates. In this work, we propose the Quantum-Enhanced Generative Model (QEGM), a hybrid classical-quantum framework that integrates deep latent-variable models with variational quantum circuits. The framework introduces two key innovations: (1) a hybrid loss function that jointly optimizes reconstruction fidelity and tail-aware likelihood, and (2) quantum randomness-driven noise injection to enhance sample diversity and mitigate mode collapse. Training proceeds via a hybrid loop where classical parameters are updated through backpropagation while quantum parameters are optimized using parameter-shift gradients. We evaluate QEGM on synthetic Gaussian mixtures and real-world datasets spanning finance, climate, and protein structure. Results demonstrate that QEGM reduces tail KL divergence by up to 50 percent compared to state-of-the-art baselines (GAN, VAE, Diffusion), while improving rare-event recall and coverage calibration. These findings highlight the potential of QEGM as a principled approach for rare-event prediction, offering robustness beyond what is achievable with purely classical methods.

Paperid: 1357, https://arxiv.org/pdf/2511.00336.pdf

Abstract:
The rapid growth of Internet of Medical Things (IoMT) devices has resulted in significant security risks, particularly the risk of malware attacks on resource-constrained devices. Conventional deep learning methods are impractical due to resource limitations, while Federated Learning (FL) suffers from high communication overhead and vulnerability to non-IID (heterogeneous) data. In this paper, we propose a split learning (SL) based framework for IoT malware detection through image-based classification. By dividing the neural network training between the clients and an edge server, the framework reduces computational burden on resource-constrained clients while ensuring data privacy. We formulate a joint optimization problem that balances computation cost and communication efficiency by using a game-theoretic approach for attaining better training performance. Experimental evaluations show that the proposed framework outperforms popular FL methods in terms of accuracy (+6.35%), F1-score (+5.03%), high convergence speed (+14.96%), and less resource consumption (33.83%). These results establish the potential of SL as a scalable and secure paradigm for next-generation IoT security.

Paperid: 1358, https://arxiv.org/pdf/2510.24985.pdf

Abstract:
Forget and Rewire (FaR) methodology has demonstrated strong resilience against Bit-Flip Attacks (BFAs) on Transformer-based models by obfuscating critical parameters through dynamic rewiring of linear layers. However, the application of FaR introduces non-negligible performance and memory overheads, primarily due to the runtime modification of activation pathways and the lack of hardware-level optimization. To overcome these limitations, we propose FaRAccel, a novel hardware accelerator architecture implemented on FPGA, specifically designed to offload and optimize FaR operations. FaRAccel integrates reconfigurable logic for dynamic activation rerouting, and lightweight storage of rewiring configurations, enabling low-latency inference with minimal energy overhead. We evaluate FaRAccel across a suite of Transformer models and demonstrate substantial reductions in FaR inference latency and improvement in energy efficiency, while maintaining the robustness gains of the original FaR methodology. To the best of our knowledge, this is the first hardware-accelerated defense against BFAs in Transformers, effectively bridging the gap between algorithmic resilience and efficient deployment on real-world AI platforms.

Paperid: 1359, https://arxiv.org/pdf/2510.22620.pdf

Abstract:
AI agents powered by large language models (LLMs) are being deployed at scale, yet we lack a systematic understanding of how the choice of backbone LLM affects agent security. The non-deterministic sequential nature of AI agents complicates security modeling, while the integration of traditional software with AI components entangles novel LLM vulnerabilities with conventional security risks. Existing frameworks only partially address these challenges as they either capture specific vulnerabilities only or require modeling of complete agents. To address these limitations, we introduce threat snapshots: a framework that isolates specific states in an agent's execution flow where LLM vulnerabilities manifest, enabling the systematic identification and categorization of security risks that propagate from the LLM to the agent level. We apply this framework to construct the $\operatorname{b}^3$ benchmark, a security benchmark based on 194331 unique crowdsourced adversarial attacks. We then evaluate 31 popular LLMs with it, revealing, among other insights, that enhanced reasoning capabilities improve security, while model size does not correlate with security. We release our benchmark, dataset, and evaluation code to facilitate widespread adoption by LLM providers and practitioners, offering guidance for agent developers and incentivizing model developers to prioritize backbone security improvements.

Paperid: 1360, https://arxiv.org/pdf/2510.19574.pdf

Abstract:
As object detection models are increasingly deployed in cyber-physical systems such as autonomous vehicles (AVs) and surveillance platforms, ensuring their security against adversarial threats is essential. While prior work has explored adversarial attacks in the image domain, those attacks in the video domain remain largely unexamined, especially in the no-box setting. In this paper, we present α-Cloak, the first no-box adversarial attack on object detectors that operates entirely through the alpha channel of RGBA videos. α-Cloak exploits the alpha channel to fuse a malicious target video with a benign video, resulting in a fused video that appears innocuous to human viewers but consistently fools object detectors. Our attack requires no access to model architecture, parameters, or outputs, and introduces no perceptible artifacts. We systematically study the support for alpha channels across common video formats and playback applications, and design a fusion algorithm that ensures visual stealth and compatibility. We evaluate α-Cloak on five state-of-the-art object detectors, a vision-language model, and a multi-modal large language model (Gemini-2.0-Flash), demonstrating a 100% attack success rate across all scenarios. Our findings reveal a previously unexplored vulnerability in video-based perception systems, highlighting the urgent need for defenses that account for the alpha channel in adversarial settings.

Paperid: 1361, https://arxiv.org/pdf/2510.18728.pdf

Abstract:
Large Language Models (LLMs) remain vulnerable to multi-turn jailbreak attacks. We introduce HarmNet, a modular framework comprising ThoughtNet, a hierarchical semantic network; a feedback-driven Simulator for iterative query refinement; and a Network Traverser for real-time adaptive attack execution. HarmNet systematically explores and refines the adversarial space to uncover stealthy, high-success attack paths. Experiments across closed-source and open-source LLMs show that HarmNet outperforms state-of-the-art methods, achieving higher attack success rates. For example, on Mistral-7B, HarmNet achieves a 99.4% attack success rate, 13.9% higher than the best baseline. Index terms: jailbreak attacks; large language models; adversarial framework; query refinement.

Paperid: 1362, https://arxiv.org/pdf/2510.18715.pdf

Abstract:
International students (IntlS) in the US refer to foreign students who acquire student visas to study in the US, primarily in higher education. As IntlS arrive in the US, they face several challenges, such as adjusting to a new country and culture, securing housing remotely, and arranging finances for tuition and personal expenses. These experiences, coupled with recent events such as visa revocations and the cessation of new visas, compound IntlS' risk of being targeted by and falling victim to online scams. While prior work has investigated IntlS' security and privacy, as well as general end users' reactions to online scams, research on how IntlS are uniquely impacted by scams remains largely absent. To address this gap, we conduct a two-phase user study comprising surveys (n=48) and semi-structured interviews (n=9). We investigate IntlS' exposure and interactions with scams, post-exposure actions such as reporting, and their perceptions of the usefulness of existing prevention resources and the barriers to following prevention advice. We find that IntlS are often targeted by scams (e.g., attackers impersonating government officials) and fear legal implications or deportation, which directly impacts their interactions with scams (e.g., they may prolong engagement with a scammer due to a sense of urgency). Interestingly, we also find that IntlS may lack awareness of - or access to - reliable resources that inform them about scams or guide them in reporting incidents to authorities. In fact, they may also face unique barriers in enacting scam prevention advice, such as avoiding reporting financial losses, since IntlS are required to demonstrate financial ability to stay in the US. The findings produced by our study help synthesize guidelines for stakeholders to better aid IntlS in reacting to scams.

Paperid: 1363, https://arxiv.org/pdf/2510.18645.pdf

Abstract:
Networked Control Systems (NCSs) are integral in critical infrastructures such as power grids, transportation networks, and production systems. Ensuring the resilient operation of these large-scale NCSs against cyber-attacks is crucial for societal well-being. Over the past two decades, extensive research has been focused on developing metrics to quantify the vulnerabilities of NCSs against attacks. Once the vulnerabilities are quantified, mitigation strategies can be employed to enhance system resilience. This article provides a comprehensive overview of methods developed for assessing NCS vulnerabilities and the corresponding mitigation strategies. Furthermore, we emphasize the importance of probabilistic risk metrics to model vulnerabilities under adversaries with imperfect process knowledge. The article concludes by outlining promising directions for future research.

Paperid: 1364, https://arxiv.org/pdf/2510.18477.pdf

Abstract:
Large Language Models (LLMs) have shown great promise in automating data analytics tasks by interpreting natural language queries and generating multi-operation execution plans. However, existing LLM-agent-based analytics frameworks operate under the assumption of centralized data access, offering little to no privacy protection. In contrast, federated analytics (FA) enables privacy-preserving computation across distributed data sources, but lacks support for natural language input and requires structured, machine-readable queries. In this work, we present LAFA, the first system that integrates LLM-agent-based data analytics with FA. LAFA introduces a hierarchical multi-agent architecture that accepts natural language queries and transforms them into optimized, executable FA workflows. A coarse-grained planner first decomposes complex queries into sub-queries, while a fine-grained planner maps each subquery into a Directed Acyclic Graph of FA operations using prior structural knowledge. To improve execution efficiency, an optimizer agent rewrites and merges multiple DAGs, eliminating redundant operations and minimizing computational and communicational overhead. Our experiments demonstrate that LAFA consistently outperforms baseline prompting strategies by achieving higher execution plan success rates and reducing resource-intensive FA operations by a substantial margin. This work establishes a practical foundation for privacy-preserving, LLM-driven analytics that supports natural language input in the FA setting.

Paperid: 1365, https://arxiv.org/pdf/2510.18113.pdf

Abstract:
As users increasingly turn to large language model (LLM) based web agents to automate online tasks, agents may encounter dark patterns: deceptive user interface designs that manipulate users into making unintended decisions. Although dark patterns primarily target human users, their potentially harmful impacts on LLM-based generalist web agents remain unexplored. In this paper, we present the first study that investigates the impact of dark patterns on the decision-making process of LLM-based generalist web agents. To achieve this, we introduce LiteAgent, a lightweight framework that automatically prompts agents to execute tasks while capturing comprehensive logs and screen-recordings of their interactions. We also present TrickyArena, a controlled environment comprising web applications from domains such as e-commerce, streaming services, and news platforms, each containing diverse and realistic dark patterns that can be selectively enabled or disabled. Using LiteAgent and TrickyArena, we conduct multiple experiments to assess the impact of both individual and combined dark patterns on web agent behavior. We evaluate six popular LLM-based generalist web agents across three LLMs and discover that when there is a single dark pattern present, agents are susceptible to it an average of 41% of the time. We also find that modifying dark pattern UI attributes through visual design changes or HTML code adjustments and introducing multiple dark patterns simultaneously can influence agent susceptibility. This study emphasizes the need for holistic defense mechanisms in web agents, encompassing both agent-specific protections and broader web safety measures.

Paperid: 1366, https://arxiv.org/pdf/2510.17904.pdf

Abstract:
The proficiency of Large Language Models (LLMs) in processing structured data and adhering to syntactic rules is a capability that drives their widespread adoption but also makes them paradoxically vulnerable. In this paper, we investigate this vulnerability through BreakFun, a jailbreak methodology that weaponizes an LLM's adherence to structured schemas. BreakFun employs a three-part prompt that combines an innocent framing and a Chain-of-Thought distraction with a core "Trojan Schema"--a carefully crafted data structure that compels the model to generate harmful content, exploiting the LLM's strong tendency to follow structures and schemas. We demonstrate this vulnerability is highly transferable, achieving an average success rate of 89% across 13 foundational and proprietary models on JailbreakBench, and reaching a 100% Attack Success Rate (ASR) on several prominent models. A rigorous ablation study confirms this Trojan Schema is the attack's primary causal factor. To counter this, we introduce the Adversarial Prompt Deconstruction guardrail, a defense that utilizes a secondary LLM to perform a "Literal Transcription"--extracting all human-readable text to isolate and reveal the user's true harmful intent. Our proof-of-concept guardrail demonstrates high efficacy against the attack, validating that targeting the deceptive schema is a viable mitigation strategy. Our work provides a look into how an LLM's core strengths can be turned into critical weaknesses, offering a fresh perspective for building more robustly aligned models.

Paperid: 1367, https://arxiv.org/pdf/2510.17862.pdf

Abstract:
Code agents are increasingly trusted to autonomously fix bugs on platforms such as GitHub, yet their security evaluation focuses almost exclusively on functional correctness. In this paper, we reveal a novel type of threat to real-world code agents: Functionally Correct yet Vulnerable (FCV) patches, which pass all test cases but contain vulnerable code. With our proposed FCV-Attack, which can be deliberately crafted by malicious attackers or implicitly introduced by benign developers, we show that SOTA LLMs (e.g., ChatGPT and Claude) and agent scaffolds (e.g., SWE-agent and OpenHands) are all vulnerable to this FCV threat; across 12 agent-model combinations on SWE-Bench, the attack only requires black-box access and a single query to the code agent to perform the attack. For example, for CWE-538 (information exposure vulnerability), the FCV-Attack attains an attack success rate of $40.7\%$ on GPT-5 Mini + OpenHands. Our results reveal an important security threat overlooked by current evaluation paradigms and urge the development of security-aware defenses for code agents.

Paperid: 1368, https://arxiv.org/pdf/2510.17633.pdf

Abstract:
Large Audio-Language Models (LALMs) are becoming essential as a powerful multimodal backbone for real-world applications. However, recent studies show that audio inputs can more easily elicit harmful responses than text, exposing new risks toward deployment. While safety alignment has made initial advances in LLMs and Large Vision-Language Models (LVLMs), we find that vanilla adaptation of these approaches to LALMs faces two key limitations: 1) LLM-based steering fails under audio input due to the large distributional gap between activations, and 2) prompt-based defenses induce over-refusals on benign-speech queries. To address these challenges, we propose Safe-Ablated Refusal Steering (SARSteer), the first inference-time defense framework for LALMs. Specifically, SARSteer leverages text-derived refusal steering to enforce rejection without manipulating audio inputs and introduces decomposed safe-space ablation to mitigate over-refusal. Extensive experiments demonstrate that SARSteer significantly improves harmful-query refusal while preserving benign responses, establishing a principled step toward safety alignment in LALMs.

Paperid: 1369, https://arxiv.org/pdf/2510.16044.pdf

Abstract:
Identification of anomalous events within system logs constitutes a pivotal element within the frame- work of cybersecurity defense strategies. However, this process faces numerous challenges, including the management of substantial data volumes, the distribution of anomalies, and the precision of con- ventional methods. To address this issue, the present paper puts forward a proposal for an intelligent detection method for system logs based on Genera- tive Pre-trained Transformers (GPT). The efficacy of this approach is attributable to a combination of structured input design and a Focal Loss op- timization strategy, which collectively result in a substantial enhancement of the performance of log anomaly detection. The initial approach involves the conversion of raw logs into event ID sequences through the use of the Drain parser. Subsequently, the Focal Loss loss function is employed to address the issue of class imbalance. The experimental re- sults demonstrate that the optimized GPT-2 model significantly outperforms the unoptimized model in a range of key metrics, including precision, recall, and F1 score. In specific tasks, comparable or superior performance has been demonstrated to that of the GPT-3.5 API.

Paperid: 1370, https://arxiv.org/pdf/2510.15173.pdf

Abstract:
Voice interfaces are increasingly used in high stakes domains such as mobile banking, smart home security, and hands free healthcare. Meanwhile, modern generative models have made high quality voice forgeries inexpensive and easy to create, eroding confidence in voice authentication alone. To strengthen protection against such attacks, we present a second authentication factor that combines acoustic evidence with the unique motion patterns of a speaker's lower face. By placing lightweight inertial sensors around the mouth to capture mouth opening and evolving lower facial geometry, our system records a distinct motion signature with strong discriminative power across individuals. We built a prototype and recruited 43 participants to evaluate the system under four conditions seated, walking on level ground, walking on stairs, and speaking with different language backgrounds (native vs. non native English). Across all scenarios, our approach consistently achieved a median equal error rate (EER) of 0.01 or lower, indicating that mouth movement data remain robust under variations in gait, posture, and spoken language. We discuss specific use cases where this second line of defense could provide tangible security benefits to voice authentication systems.

Paperid: 1371, https://arxiv.org/pdf/2510.15133.pdf

Abstract:
File encrypting ransomware increasingly employs intermittent encryption techniques, encrypting only parts of files to evade classical detection methods. These strategies, exemplified by ransomware families like BlackCat, complicate file structure based detection techniques due to diverse file formats exhibiting varying traits under partial encryption. This paper provides a systematic empirical characterization of byte level statistics under intermittent encryption across common file types, establishing a comprehensive baseline of how partial encryption impacts data structure. We specialize a classical KL divergence upper bound on a tailored mixture model of intermittent encryption, yielding filetype specific detectability ceilings for histogram-based detectors. Leveraging insights from this analysis, we empirically evaluate convolutional neural network (CNN) based detection methods using realistic intermittent encryption configurations derived from leading ransomware variants. Our findings demonstrate that localized analysis via chunk level CNNs consistently outperforms global analysis methods, highlighting their practical effectiveness and establishing a robust baseline for future detection systems.

Paperid: 1372, https://arxiv.org/pdf/2510.14233.pdf

Abstract:
Modern Network Intrusion Detection Systems generate vast volumes of low-level alerts, yet these outputs remain semantically fragmented, requiring labor-intensive manual correlation with high-level adversarial behaviors. Existing solutions for automating this mapping-rule-based systems and machine learning classifiers-suffer from critical limitations: rule-based approaches fail to adapt to novel attack variations, while machine learning methods lack contextual awareness and treat tactic-technique mapping as a syntactic matching problem rather than a reasoning task. Although Large Language Models have shown promise in cybersecurity tasks, preliminary experiments reveal that existing LLM-based methods frequently hallucinate technique names or produce decontextualized mappings due to their single-step classification approach. To address these challenges, we introduce RHINO, a novel framework that decomposes LLM-based attack analysis into three interpretable phases mirroring human reasoning: (1) behavioral abstraction, where raw logs are translated into contextualized narratives; (2) multi-role collaborative inference, generating candidate techniques by evaluating behavioral evidence against MITRE ATT&CK knowledge; and (3) validation, cross-referencing predictions with official MITRE definitions to rectify hallucinations. RHINO bridges the semantic gap between low-level observations and adversarial intent while improving output reliability through structured reasoning. We evaluate RHINO on three benchmarks across four backbone models. RHINO achieved high accuracy, with model performance ranging from 86.38% to 88.45%, resulting in relative gains from 24.25% to 76.50% across different models. Our results demonstrate that RHINO significantly enhances the interpretability and scalability of threat analysis, offering a blueprint for deploying LLMs in operational security settings.

Paperid: 1373, https://arxiv.org/pdf/2510.14066.pdf

Abstract:
Uncooperative unmanned aerial vehicles (UAVs) pose emerging threats to critical infrastructure and border protection by operating as rogue user equipment (UE) within cellular networks, consuming resources, creating interference, and potentially violating restricted airspaces. This paper presents minimal features of the operating space, yet an end-to-end simulation framework to analyze detect-to-mitigate latency of such intrusions in a hybrid terrestrial-non-terrestrial (LEO satellite) 5G system. The system model includes terrestrial gNBs, satellite backhaul (with stochastic outages), and a detection logic (triggered by handover instability and signal quality variance). A lockdown mechanism is invoked upon detection, with optional local fallback to cap mitigation delays. Monte Carlo sweeps across UAV altitudes, speeds, and satellite outage rates yield several insights. First, satellite backhaul outages can cause arbitrarily long mitigation delays, yet, to meet fallback deadlines, they need to be effectively bounded. Second, while handover instability was hypothesized, our results show that extra handovers have a negligible effect within the range of parameters we considered. The main benefit of resilience from fallback comes from the delay in limiting mitigation. Third, patrol UEs experience negligible collateral impact, with handover rates close to terrestrial baselines. Stress scenarios further highlight that fallback is indispensable in preventing extreme control-plane and physical security vulnerabilities: Without fallback, prolonged outages in the satellite backhaul delay lockdown commands, allowing rogue UAVs to linger inside restricted corridors for several seconds longer. These results underscore the importance of complementing non-terrestrial links with local control to ensure robust and timely response against uncooperative UAV intrusions.

Paperid: 1374, https://arxiv.org/pdf/2510.12629.pdf

Abstract:
In modern containerized cloud environments, the adoption of RDMA (Remote Direct Memory Access) has expanded to reduce CPU overhead and enable high-performance data exchange. Achieving this requires strong performance isolation to ensure that one container's RDMA workload does not degrade the performance of others, thereby maintaining critical security assurances. However, existing isolation techniques are difficult to apply effectively due to the complexity of microarchitectural resource management within RDMA NICs (RNICs). This paper experimentally analyzes two types of resource exhaustion attacks on NVIDIA BlueField-3: (i) state saturation attacks and (ii) pipeline saturation attacks. Our results show that state saturation attacks can cause up to a 93.9% loss in bandwidth, a 1,117x increase in latency, and a 115% rise in cache misses for victim containers, while pipeline saturation attacks lead to severe link-level congestion and significant amplification, where small verb requests result in disproportionately high resource consumption. To mitigate these threats and restore predictable security assurances, we propose HT-Verbs, a threshold-driven framework based on real-time per-container RDMA verb telemetry and adaptive resource classification that partitions RNIC resources into hot, warm, and cold tiers and throttles abusive workloads without requiring hardware modifications.

Paperid: 1375, https://arxiv.org/pdf/2510.12469.pdf

Abstract:
Confidential Virtual Machines (CVMs) protect data in use by running workloads inside hardware-isolated environments. In doing so, they also inherit the limitations of the underlying hardware. Trusted Execution Environments (TEEs), which enforce this isolation, explicitly exclude adversaries with physical access from their threat model. Commercial TEEs, e.g., Intel TDX, thus assume infrastructure providers do not physically exploit hardware and serve as safeguards instead. This creates a tension: tenants must trust provider integrity at the hardware layer, yet existing remote attestation offers no way to verify that CVMs actually run on physically trusted platforms, leaving today's CVM deployments unable to demonstrate that their guarantees align with the TEE vendor's threat model. We bridge this confidence gap with Data Center Execution Assurance (DCEA), a design generating "Proofs of Cloud". DCEA binds a CVM to its underlying platform using vTPM-anchored measurements, ensuring CVM launch evidence and TPM quotes refer to the same physical chassis. This takes advantage of the fact that data centers are often identifiable via TPMs. Our approach applies to CVMs accessing vTPMs and running on top of software stacks fully controlled by the cloud provider, as well as single-tenant bare-metal deployments with discrete TPMs. We trust providers for integrity (certificate issuance), but not for the confidentiality of CVM-visible state. DCEA enables remote verification of a CVM's platform origin and integrity, mitigating attacks like replay and attestation proxying. We include a candidate implementation on Google Cloud and Intel TDX that leverages Intel TXT for trusted launch. Our design refines CVMs' threat model and provides a practical path for deploying high-assurance, confidential workloads in minimally trusted environments.

Paperid: 1376, https://arxiv.org/pdf/2510.12414.pdf

Abstract:
Steganographic schemes dedicated to generated images modify the seed vector in the latent space to embed a message, whereas most steganalysis methods attempt to detect the embedding in the image space. This paper proposes to perform steganalysis in the latent space by modeling the statistical distribution of the norm of the latent vector. Specifically, we analyze the practical security of a scheme proposed by Hu et. al. for latent diffusion models, which is both robust and practically undetectable when steganalysis is performed on generated images. We show that after embedding, the Stego (latent) vector is distributed on a hypersphere while the Cover vector is i.i.d. Gaussian. By going from the image space to the latent space, we show that it is possible to model the norm of the vector in the latent space under the Cover or Stego hypothesis as Gaussian distributions with different variances. A Likelihood Ratio Test is then derived to perform pooled steganalysis. The impact of the potential knowledge of the prompt and the number of diffusion steps, is also studied. Additionally, we also show how, by randomly sampling the norm of the latent vector before generation, the initial Stego scheme becomes undetectable in the latent space.

Paperid: 1377, https://arxiv.org/pdf/2510.11804.pdf

Abstract:
The Tor network provides users with strong anonymity by routing their internet traffic through multiple relays. While Tor encrypts traffic and hides IP addresses, it remains vulnerable to traffic analysis attacks such as the website fingerprinting (WF) attack, achieving increasingly high fingerprinting accuracy even under open-world conditions. In response, researchers have proposed a variety of defenses, ranging from adaptive padding, traffic regularization, and traffic morphing to adversarial perturbation, that seek to obfuscate or reshape traffic traces. However, these defenses often entail trade-offs between privacy, usability, and system performance. Despite extensive research, a comprehensive survey unifying WF datasets, attack methodologies, and defense strategies remains absent. This paper fills that gap by systematically categorizing existing WF research into three key domains: datasets, attack models, and defense mechanisms. We provide an in-depth comparative analysis of techniques, highlight their strengths and limitations under diverse threat models, and discuss emerging challenges such as multi-tab browsing and coarse-grained traffic features. By consolidating prior work and identifying open research directions, this survey serves as a foundation for advancing stronger privacy protection in Tor.

Paperid: 1378, https://arxiv.org/pdf/2510.11398.pdf

Abstract:
Retrieval-Augmented Generation (RAG) increases the reliability and trustworthiness of the LLM response and reduces hallucination by eliminating the need for model retraining. It does so by adding external data into the LLM's context. We develop a new class of black-box attack, RAG-Pull, that inserts hidden UTF characters into queries or external code repositories, redirecting retrieval toward malicious code, thereby breaking the models' safety alignment. We observe that query and code perturbations alone can shift retrieval toward attacker-controlled snippets, while combined query-and-target perturbations achieve near-perfect success. Once retrieved, these snippets introduce exploitable vulnerabilities such as remote code execution and SQL injection. RAG-Pull's minimal perturbations can alter the model's safety alignment and increase preference towards unsafe code, therefore opening up a new class of attacks on LLMs.

Paperid: 1380, https://arxiv.org/pdf/2510.09700.pdf

Abstract:
Smart homes are increasingly populated with heterogeneous Internet of Things (IoT) devices that interact continuously with users and the environment. This diversity introduces critical challenges in device identification, authentication, and security, where fingerprinting techniques have emerged as a key approach. In this survey, we provide a comprehensive analysis of IoT fingerprinting specifically in the context of smart homes, examining methods for device and their event detection, classification, and intrusion prevention. We review existing techniques, e.g., network traffic analysis or machine learning-based schemes, highlighting their applicability and limitations in home environments characterized by resource-constrained devices, dynamic usage patterns, and privacy requirements. Furthermore, we discuss fingerprinting system deployment challenges like scalability, interoperability, and energy efficiency, as well as emerging opportunities enabled by generative AI and federated learning. Finally, we outline open research directions that can advance reliable and privacy-preserving fingerprinting for next-generation smart home ecosystems.

Paperid: 1381, https://arxiv.org/pdf/2510.09682.pdf

Abstract:
The code generation capabilities of Large Language Models (LLMs) have transformed the field of software development. However, this advancement also presents significant security challenges, as LLM-generated code often contains vulnerabilities. One direction of research strengthens LLMs by injecting or refining security knowledge through curated datasets, model tuning, or static analyzers. While effective in certain settings, these methods can be resource-intensive, less adaptable to zero-day vulnerabilities, and often inapplicable to proprietary models. To address these challenges, we introduce GRASP, which explores a new direction that focuses on structured reasoning over Secure Coding Practices(SCPs) rather than additional training or external feedback. GRASP comprises two key ideas: (1) an SCP graph that organizes SCPs into a Directed Acyclic Graph (DAG) capturing dependencies and relationships, and (2) a graph-based reasoning process that systematically guides LLMs through relevant SCPs for code generation. This design enables interpretable, model-agnostic, and scalable security improvements, particularly for previously unseen vulnerabilities. Our evaluation shows that GRASP consistently achieves Security Rates (SR) exceeding 80% across multiple LLMs, and delivers up to 88% improvements over baselines on zero-day vulnerabilities.

Paperid: 1382, https://arxiv.org/pdf/2510.09260.pdf

Abstract:
Recent work has shown that RLHF is highly susceptible to backdoor attacks, poisoning schemes that inject malicious triggers in preference data. However, existing methods often rely on static, rare-token-based triggers, limiting their effectiveness in realistic scenarios. In this paper, we develop GREAT, a novel framework for crafting generalizable backdoors in RLHF through emotion-aware trigger synthesis. Specifically, GREAT targets harmful response generation for a vulnerable user subgroup characterized by both semantically violent requests and emotionally angry triggers. At the core of GREAT is a trigger identification pipeline that operates in the latent embedding space, leveraging principal component analysis and clustering techniques to identify the most representative triggers. To enable this, we present Erinyes, a high-quality dataset of over $5000$ angry triggers curated from GPT-4.1 using a principled, hierarchical, and diversity-promoting approach. Experiments on benchmark RLHF datasets demonstrate that GREAT significantly outperforms baseline methods in attack success rates, especially for unseen trigger scenarios, while largely preserving the response quality on benign inputs.

Paperid: 1383, https://arxiv.org/pdf/2510.08829.pdf

Abstract:
The increasing adoption of LLM agents with access to numerous tools and sensitive data significantly widens the attack surface for indirect prompt injections. Due to the context-dependent nature of attacks, however, current defenses are often ill-calibrated as they cannot reliably differentiate malicious and benign instructions, leading to high false positive rates that prevent their real-world adoption. To address this, we present a novel approach inspired by the fundamental principle of computer security: data should not contain executable instructions. Instead of sample-level classification, we propose a token-level sanitization process, which surgically removes any instructions directed at AI systems from tool outputs, capturing malicious instructions as a byproduct. In contrast to existing safety classifiers, this approach is non-blocking, does not require calibration, and is agnostic to the context of tool outputs. Further, we can train such token-level predictors with readily available instruction-tuning data only, and don't have to rely on unrealistic prompt injection examples from challenges or of other synthetic origin. In our experiments, we find that this approach generalizes well across a wide range of attacks and benchmarks like AgentDojo, BIPIA, InjecAgent, ASB and SEP, achieving a 7-10x reduction of attack success rate (ASR) (34% to 3% on AgentDojo), without impairing agent utility in both benign and malicious settings.

Paperid: 1384, https://arxiv.org/pdf/2510.06994.pdf

Abstract:
This paper presents the vision, scientific contributions, and technical details of RedTWIZ: an adaptive and diverse multi-turn red teaming framework, to audit the robustness of Large Language Models (LLMs) in AI-assisted software development. Our work is driven by three major research streams: (1) robust and systematic assessment of LLM conversational jailbreaks; (2) a diverse generative multi-turn attack suite, supporting compositional, realistic and goal-oriented jailbreak conversational strategies; and (3) a hierarchical attack planner, which adaptively plans, serializes, and triggers attacks tailored to specific LLM's vulnerabilities. Together, these contributions form a unified framework -- combining assessment, attack generation, and strategic planning -- to comprehensively evaluate and expose weaknesses in LLMs' robustness. Extensive evaluation is conducted to systematically assess and analyze the performance of the overall system and each component. Experimental results demonstrate that our multi-turn adversarial attack strategies can successfully lead state-of-the-art LLMs to produce unsafe generations, highlighting the pressing need for more research into enhancing LLM's robustness.

Paperid: 1385, https://arxiv.org/pdf/2510.06565.pdf

Abstract:
With the rapid progress of LLMs, high quality generative text has become widely available as a cover for text steganography. However, prevailing methods rely on hand-crafted or pre-specified strategies and struggle to balance efficiency, imperceptibility, and security, particularly at high embedding rates. Accordingly, we propose Auto-Stega, an agent-driven self-evolving framework that is the first to realize self-evolving steganographic strategies by automatically discovering, composing, and adapting strategies at inference time; the framework operates as a closed loop of generating, evaluating, summarizing, and updating that continually curates a structured strategy library and adapts across corpora, styles, and task constraints. A decoding LLM recovers the information under the shared strategy. To handle high embedding rates, we introduce PC-DNTE, a plug-and-play algorithm that maintains alignment with the base model's conditional distribution at high embedding rates, preserving imperceptibility while enhancing security. Experimental results demonstrate that at higher embedding rates Auto-Stega achieves superior performance with gains of 42.2\% in perplexity and 1.6\% in anti-steganalysis performance over SOTA methods.

Paperid: 1386, https://arxiv.org/pdf/2510.06445.pdf

Abstract:
The rapid shift from passive LLMs to autonomous LLM-agents marks a new paradigm in cybersecurity. While these agents can act as powerful tools for both offensive and defensive operations, the very agentic context introduces a new class of inherent security risks. In this work we present the first holistic survey of the agentic security landscape, structuring the field around three interdependent pillars: Applications, Threats, and Defenses. We provide a comprehensive taxonomy of over 150 papers, explaining how agents are used, the vulnerabilities they possess, and the countermeasures designed to protect them. A detailed cross-cutting analysis shows emerging trends in agent architecture while revealing critical research gaps in model and modality coverage.

Paperid: 1387, https://arxiv.org/pdf/2510.06212.pdf

Abstract:
The no-cloning theorem can be used as a basis for quantum money constructions which guarantee unconditionally unforgeable currency. Existing schemes, however, either (i) require long-term quantum memory and quantum communication between the user and the bank in order to verify the validity of a bill or (ii) fail to protect user privacy due to the uniqueness of each bill issued by the bank, which can allow its usage to be tracked. We introduce a construction of single-use quantum money that gives users the ability to detect whether the issuing authority is tracking them, employing an auditing procedure for which we prove unconditional security. Bill validation is classical, and hence does not require long-term quantum memory or quantum communication, making the protocol relatively practical to deploy. We discuss potential applications beyond money, including anonymous one-time pads and voting.

Paperid: 1388, https://arxiv.org/pdf/2510.05028.pdf

Abstract:
One of the most fundamental problems in the field of hypothesis testing is the identity testing problem: whether samples from some unknown distribution $\mathcal{G}$ are actually from some explicit distribution $\mathcal{D}$. It is known that when the distribution $\mathcal{D}$ has support $[N]$, the optimal sample complexity for the identity testing problem is roughly $O(\sqrt{N})$. However, many distributions of interest, including those which can be sampled efficiently, have exponential support size, and therefore the optimal identity tester also requires exponential samples. In this paper, we bypass this lower bound by considering restricted settings. The above $O(\sqrt{N})$ sample complexity identity tester is constructed so that it is not fooled by any (even inefficiently-sampled) distributions. However, in most applications, the distributions under consideration are efficiently sampleable, and therefore it is enough to consider only identity testers that are not fooled by efficiently-sampled distributions. In that case, we can focus on efficient verification with efficient identity testers. We investigate relations between efficient verifications of classical/quantum distributions and classical/quantum cryptography, and show the following results: (i) Every quantumly samplable distribution is verifiable with a $\mathbf{P^{PP}}$ algorithm. (ii) If one-way functions exist, then no sufficiently random classically samplable distribution is efficiently verifiable. (iii) If one-way functions do not exist, then every classically samplable distribution is efficiently verifiable. (iv) If QEFID pairs exist, then there exists a quantumly samplable distribution which is not efficiently verifiable. (v) If one-way puzzles do not exist, then it is possible to verify sampling-based quantum advantage with a efficient quantum computer.

Paperid: 1389, https://arxiv.org/pdf/2510.03254.pdf

Abstract:
Adversarial machine learning concerns situations in which learners face attacks from active adversaries. Such scenarios arise in applications such as spam email filtering, malware detection and fake image generation, where security methods must be actively updated to keep up with the everimproving generation of malicious data. Pessimistic Bilevel optimisation has been shown to be an effective method of training resilient classifiers against such adversaries. By modelling these scenarios as a game between the learner and the adversary, we anticipate how the adversary will modify their data and then train a resilient classifier accordingly. However, since existing pessimistic bilevel approaches feature an unrestricted adversary, the model is vulnerable to becoming overly pessimistic and unrealistic. When finding the optimal solution that defeats the classifier, it is possible that the adversary's data becomes nonsensical and loses its intended nature. Such an adversary will not properly reflect reality, and consequently, will lead to poor classifier performance when implemented on real-world data. By constructing a constrained pessimistic bilevel optimisation model, we restrict the adversary's movements and identify a solution that better reflects reality. We demonstrate through experiments that this model performs, on average, better than the existing approach.

Paperid: 1390, https://arxiv.org/pdf/2510.03219.pdf

Abstract:
In the rapidly evolving landscape of 5G technology, the adoption of cloud-based infrastructure for the deployment of 5G services has become increasingly common. Using a service-based architecture, critical 5G components, such as the Access and Mobility Management Function (AMF), Session Management Function (SMF), and User Plane Function (UPF), now run as containerized pods on Kubernetes clusters. Although this approach improves scalability, flexibility, and resilience, it also introduces new security challenges, particularly to ensure the integrity and trustworthiness of these components. Current 5G security specifications (for example, 3GPP TS 33.501) focus on communication security and assume that network functions remain trustworthy after authentication, consequently lacking mechanisms to continuously validate the integrity of NVFs at runtime. To close this gap, and to align with Zero Trust principles of 'never trust, always verify', we present a TPM 2.0-based continuous remote attestation solution for core 5G components deployed on Kubernetes. Our approach uses the Linux Integrity Measurement Architecture (IMA) and a Trusted Platform Module (TPM) to provide hardware-based runtime validation. We integrate the open-source Keylime framework with a custom IMA template that isolates pod-level measurements, allowing per-pod integrity verification. A prototype on a k3s cluster (consisting of 1 master, 2 worker nodes) was implemented to attest to core functions, including AMF, SMF and UPF. The experimental results show that the system detects unauthorized modifications in real time, labels each pod's trust state, and generates detailed audit logs. This work provides hardware-based continuous attestation for cloud native and edge deployments, strengthening the resilience of 5G as critical infrastructure in multi-vendor and mission-critical scenarios of 5G.

Paperid: 1391, https://arxiv.org/pdf/2510.00517.pdf

Abstract:
Differential Attention (DA) has been proposed as a refinement to standard attention, suppressing redundant or noisy context through a subtractive structure and thereby reducing contextual hallucination. While this design sharpens task-relevant focus, we show that it also introduces a structural fragility under adversarial perturbations. Our theoretical analysis identifies negative gradient alignment-a configuration encouraged by DA's subtraction-as the key driver of sensitivity amplification, leading to increased gradient norms and elevated local Lipschitz constants. We empirically validate this Fragile Principle through systematic experiments on ViT/DiffViT and evaluations of pretrained CLIP/DiffCLIP, spanning five datasets in total. These results demonstrate higher attack success rates, frequent gradient opposition, and stronger local sensitivity compared to standard attention. Furthermore, depth-dependent experiments reveal a robustness crossover: stacking DA layers attenuates small perturbations via depth-dependent noise cancellation, though this protection fades under larger attack budgets. Overall, our findings uncover a fundamental trade-off: DA improves discriminative focus on clean inputs but increases adversarial vulnerability, underscoring the need to jointly design for selectivity and robustness in future attention mechanisms.

Paperid: 1392, https://arxiv.org/pdf/2509.25430.pdf

Abstract:
The widespread availability of cellular devices introduces new threat vectors that allow users or attackers to bypass security policies and physical barriers and bring unauthorized devices into sensitive areas. These threats can arise from user non-compliance or deliberate actions aimed at data exfiltration/infiltration via hidden devices, drones, etc. We identify a critical gap in this context: the absence of low-latency systems for high-quality and instantaneous monitoring of cellular transmissions. Such low-latency systems are crucial to allow for timely detection, decision (e.g., geofencing or localization), and disruption of unauthorized communication in sensitive areas. Operator-based monitoring systems, built for purposes such as people counting or tracking, lack real-time capability, require cooperation across multiple operators, and thus are hard to deploy. Operator-independent monitoring approaches proposed in the literature either lack low-latency capabilities or do not scale. We propose LTag, the first low-latency, operator-independent and scalable system designed to monitor cellular connections across all operators prior to any user data transmission. LTag consists of several downlink sniffers and a distributed network of uplink sniffers that measure both downlink protocol information and uplink signal characteristics at multiple locations to gain a detailed spatial image of uplink signals. LTag aggregates the recorded information, processes it, and provides a decision about the connection all prior to connection establishment of a UE. To evaluate LTag, we deployed it in the context of geofencing, where LTag was able to determine if the signals originate from inside or outside of an area within 2.3 ms of the initial base station-to-device message, therefore enabling prompt and targeted suppression of communication before any user data was transmitted.

Paperid: 1393, https://arxiv.org/pdf/2509.24955.pdf

Abstract:
Proposer anonymity in Proof-of-Stake (PoS) blockchains is a critical concern due to the risk of targeted attacks such as malicious denial-of-service (DoS) and censorship attacks. While several Secret Single Leader Election (SSLE) mechanisms have been proposed to address these threats, their practical impact and trade-offs remain insufficiently explored. In this work, we present a unified experimental framework for evaluating SSLE mechanisms under adversarial conditions, grounded in a simplified yet representative model of Ethereum's PoS consensus layer. The framework includes configurable adversaries capable of launching targeted DoS and censorship attacks, including coordinated strategies that simultaneously compromise groups of validators. We simulate and compare key protection mechanisms - Whisk, and homomorphic sortition. To the best of our knowledge, this is the first comparative study to examine adversarial DoS scenarios involving multiple attackers under diverse protection mechanisms. Our results show that while both designs offer strong protection against targeted DoS attacks on the leader, neither defends effectively against coordinated attacks on validator groups. Moreover, Whisk simplifies a DoS attack by narrowing the target set from all validators to a smaller list of known candidates. Homomorphic sortition, despite its theoretical strength, remains impractical due to the complexity of cryptographic operations over large validator sets.

Paperid: 1394, https://arxiv.org/pdf/2509.24174.pdf

Abstract:
The Domain Name System (DNS) is involved in practically all web activity, translating easy-to-remember domain names into Internet Protocol (IP) addresses. Due to its central role on the Internet, DNS exposes user web activity in detail. The privacy challenge is honest-but-curious DNS servers/resolvers providing the translation/lookup service. In particular, with the majority of DNS queries handled by public DNS resolvers, the organizations running them can track, collect, and analyze massive user activity data. Existing solutions that encrypt DNS traffic between clients and resolvers are insufficient, as the resolver itself is the privacy threat. While DNS query relays separate duties among multiple entities, to limit the data accessible by each entity, they cannot prevent colluding entities from sharing user traffic logs. To achieve near-zero-trust DNS privacy compatible with the existing DNS infrastructure, we propose LLUAD: it locally stores a Popularity List, the most popular DNS records, on user devices, formed in a privacy-preserving manner based on user interests. In this way, LLUAD can both improve privacy and reduce access times to web content. The Popularity List is proactively retrieved from a (curious) public server that continually updates and refreshes the records based on user popularity votes, while efficiently broadcasting record updates/changes to adhere to aggressive load-balancing schemes (i.e., name servers actively load-balancing user connections by changing record IP addresses). User votes are anonymized using a novel, efficient, and highly scalable client-driven Voting Mix Network - with packet lengths independent of the number of hops, centrally enforced limit on number of votes cast per user, and robustness against poor client participation - to ensure a geographically relevant and correctly/securely instantiated Popularity List.

Paperid: 1395, https://arxiv.org/pdf/2509.24153.pdf

Abstract:
The Domain Name System (DNS) is central to all Internet user activity, resolving accessed domain names into Internet Protocol (IP) addresses. As a result, curious DNS resolvers can learn everything about Internet users' interests. Public DNS resolvers are rising in popularity, offering low-latency resolution, high reliability, privacy-preserving policies, and support for encrypted DNS queries. However, client-resolver traffic encryption, increasingly deployed to protect users from eavesdroppers, does not protect users against curious resolvers. Similarly, privacy-preserving policies are based solely on written commitments and do not provide technical safeguards. Although DNS query relay schemes can separate duties to limit data accessible by each entity, they cannot prevent colluding entities from sharing user traffic logs. Thus, a key challenge remains: organizations operating public DNS resolvers, accounting for the majority of DNS resolutions, can potentially collect and analyze massive volumes of Internet user activity data. With DNS infrastructure that cannot be fully trusted, can we safeguard user privacy? We answer positively and advocate for a user-driven approach to reduce exposure to DNS services. We will discuss key ideas of the proposal, which aims to achieve a high level of privacy without sacrificing performance: maintaining low latency, network bandwidth, memory/storage overhead, and computational overhead.

Paperid: 1396, https://arxiv.org/pdf/2509.23834.pdf

Abstract:
Differential privacy (DP) has become the gold standard for preserving individual privacy in data analysis. However, an implicit yet fundamental assumption underlying these rigorous privacy guarantees is the correct implementation and execution of DP mechanisms. Several incidents of unintended privacy loss have occurred due to numerical issues and inappropriate configurations of DP software, which have been successfully exploited in privacy attacks. To better understand the seriousness of defective DP software, we ask the following question: is it possible to elevate these passive defects into active privacy attacks while maintaining covertness? To address this question, we present the Gaussian pancake mechanism (GPM), a novel mechanism that is computationally indistinguishable from the widely used Gaussian mechanism (GM), yet exhibits arbitrarily weaker statistical DP guarantees. This unprecedented separation enables a new class of backdoor attacks: by indistinguishably passing off as the authentic GM, GPM can covertly degrade statistical privacy. Unlike the unintentional privacy loss caused by GM's numerical issues, GPM is an adversarial yet undetectable backdoor attack against data privacy. We formally prove GPM's covertness, characterize its statistical leakage, and demonstrate a concrete distinguishing attack that can achieve near-perfect success rates under suitable parameter choices, both theoretically and empirically. Our results underscore the importance of using transparent, open-source DP libraries and highlight the need for rigorous scrutiny and formal verification of DP implementations to prevent subtle, undetectable privacy compromises in real-world systems.

Paperid: 1397, https://arxiv.org/pdf/2509.23101.pdf

Abstract:
Blockchain Business applications and cryptocurrencies such as enable secure, decentralized value transfer, yet their pseudonymous nature creates opportunities for illicit activity, challenging regulators and exchanges in anti money laundering (AML) enforcement. Detecting fraudulent transactions in blockchain networks requires models that can capture both structural and temporal dependencies while remaining resilient to noise, imbalance, and adversarial behavior. In this work, we propose an ensemble framework that integrates Graph Convolutional Networks (GCN), Graph Attention Networks (GAT), and Graph Isomorphism Networks (GIN) to enhance blockchain fraud detection. Using the real-world Elliptic dataset, our tuned soft voting ensemble achieves high recall of illicit transactions while maintaining a false positive rate below 1%, beating individual GNN models and baseline methods. The modular architecture incorporates quantum-ready design hooks, allowing seamless future integration of quantum feature mappings and hybrid quantum classical graph neural networks. This ensures scalability, robustness, and long-term adaptability as quantum computing technologies mature. Our findings highlight ensemble GNNs as a practical and forward-looking solution for real-time cryptocurrency monitoring, providing both immediate AML utility and a pathway toward quantum-enhanced financial security analytics.

Paperid: 1398, https://arxiv.org/pdf/2509.22126.pdf

Abstract:
This paper introduces a novel watermarking method for diffusion models. It is based on guiding the diffusion process using the gradient computed from any off-the-shelf watermark decoder. The gradient computation encompasses different image augmentations, increasing robustness to attacks against which the decoder was not originally robust, without retraining or fine-tuning. Our method effectively convert any \textit{post-hoc} watermarking scheme into an in-generation embedding along the diffusion process. We show that this approach is complementary to watermarking techniques modifying the variational autoencoder at the end of the diffusion process. We validate the methods on different diffusion models and detectors. The watermarking guidance does not significantly alter the generated image for a given seed and prompt, preserving both the diversity and quality of generation.

Paperid: 1399, https://arxiv.org/pdf/2509.14589.pdf

Authors:Taesoo Kim, HyungSeok Han, Soyeon Park, Dae R. Jeong, Dohyeok Kim, Dongkwan Kim, Eunsoo Kim, Jiho Kim, Joshua Wang, Kangsu Kim, Sangwoo Ji, Woosun Song, Hanqing Zhao, Andrew Chin, Gyejin Lee, Kevin Stevens, Mansour Alharthi, Yizhuo Zhai, Cen Zhang, Joonun Jang, Yeongjin Jang, Ammar Askar, Dongju Kim, Fabian Fleischer, Jeongin Cho, Junsik Kim, Kyungjoon Ko, Insu Yun, Sangdon Park, Dowoo Baik, Haein Lee, Hyeon Heo, Minjae Gwon, Minjae Lee, Minwoo Baek, Seunggi Min, Wonyoung Kim, Yonghwi Jin, Younggi Park, Yunjae Choi, Jinho Jung, Gwanhyun Lee, Junyoung Jang, Kyuheon Kim, Yeonghyeon Cha, Youngjoon Kim

Abstract:
We present ATLANTIS, the cyber reasoning system developed by Team Atlanta that won 1st place in the Final Competition of DARPA's AI Cyber Challenge (AIxCC) at DEF CON 33 (August 2025). AIxCC (2023-2025) challenged teams to build autonomous cyber reasoning systems capable of discovering and patching vulnerabilities at the speed and scale of modern software. ATLANTIS integrates large language models (LLMs) with program analysis -- combining symbolic execution, directed fuzzing, and static analysis -- to address limitations in automated vulnerability discovery and program repair. Developed by researchers at Georgia Institute of Technology, Samsung Research, KAIST, and POSTECH, the system addresses core challenges: scaling across diverse codebases from C to Java, achieving high precision while maintaining broad coverage, and producing semantically correct patches that preserve intended behavior. We detail the design philosophy, architectural decisions, and implementation strategies behind ATLANTIS, share lessons learned from pushing the boundaries of automated security when program analysis meets modern AI, and release artifacts to support reproducibility and future research.

Paperid: 1400, https://arxiv.org/pdf/2509.12221.pdf

Abstract:
Large language models (LLMs) enforce safety alignment to reliably refuse malicious requests, yet the same blanket safeguards also block legitimate uses in policing, defense, and other high-stakes settings. Earlier "refusal-direction" edits can bypass those layers, but they rely on a single vector that indiscriminately unlocks all hazardous topics, offering no semantic control. We introduce Mutually Exclusive Unlock Vectors (MEUV), a lightweight framework that factorizes the monolithic refusal direction into topic-aligned, nearly orthogonal vectors, each dedicated to one sensitive capability. MEUV is learned in a single epoch with a multi-task objective that blends a differential-ablation margin, cross-topic and orthogonality penalties, and several auxiliary terms. On bilingual malicious-prompt benchmarks, MEUV achieves an attack success rate of no less than 87% on Gemma-2-2B, LLaMA-3-8B, and Qwen-7B, yet cuts cross-topic leakage by up to 90% compared with the best single-direction baseline. Vectors trained in Chinese transfer almost unchanged to English (and vice versa), suggesting a language-agnostic refusal subspace. The results show that fine-grained, topic-level capability activation is achievable with minimal utility loss, paving the way for controlled LLMs deployment in security-sensitive domains.

Paperid: 1401, https://arxiv.org/pdf/2509.10568.pdf

Abstract:
This work provides a detailed specification of the Smart Grid Modelling Language (SG-ML), which is designed for the automated generation of smart grid cyber ranges. SG-ML is defined as a set of XML schemas that describe a smart grid's configuration in both machine-readable and human-friendly ways, thereby bridging the gap between system modelling and automated deployment. Unlike prior ad-hoc approaches to cyber range design, SG-ML provides a unified methodology that integrates both power system and cyber network representations. The SG-ML model can be customized by users to meet specific requirements, such as emulating physical or cyber topologies and configuring network devices. An SG-ML Processor then parses this configured model to instantiate the cyber range environment. The modelling language leverages established standards like the IEC 61850 Substation Configuration Language (SCL) and IEC 61131 PLCopen XML to define power system topology, cyber network topology, and device configurations. This approach allows for the reuse of existing assets, reducing the effort needed to create the SG-ML model. To address gaps not covered by these standards such as attack injection parameters, scenario-specific metadata, and additional network constraints, SG-ML introduces proprietary schemas that complement standard models. Overall, SG-ML enables reproducible, scalable, and automated generation of realistic smart grid cyber ranges for research, training, and security assessment.

Paperid: 1402, https://arxiv.org/pdf/2509.09970.pdf

Abstract:
Large Language Models (LLMs) show promise in generating firmware for embedded systems, but often introduce security flaws and fail to meet real-time performance constraints. This paper proposes a three-phase methodology that combines LLM-based firmware generation with automated security validation and iterative refinement in a virtualized environment. Using structured prompts, models like GPT-4 generate firmware for networking and control tasks, deployed on FreeRTOS via QEMU. These implementations are tested using fuzzing, static analysis, and runtime monitoring to detect vulnerabilities such as buffer overflows (CWE-120), race conditions (CWE-362), and denial-of-service threats (CWE-400). Specialized AI agents for Threat Detection, Performance Optimization, and Compliance Verification collaborate to improve detection and remediation. Identified issues are categorized using CWE, then used to prompt targeted LLM-generated patches in an iterative loop. Experiments show a 92.4\% Vulnerability Remediation Rate (37.3\% improvement), 95.8\% Threat Model Compliance, and 0.87 Security Coverage Index. Real-time metrics include 8.6ms worst-case execution time and 195Î¼s jitter. This process enhances firmware security and performance while contributing an open-source dataset for future research.

Paperid: 1403, https://arxiv.org/pdf/2509.09900.pdf

Abstract:
We give novel lifting theorems for security games in the quantum random oracle model (QROM) in Noisy Intermediate-Scale Quantum (NISQ) settings such as the hybrid query model, the noisy oracle and the bounded-depth models. We provide, for the first time, a hybrid lifting theorem for hybrid algorithms that can perform both quantum and classical queries, as well as a lifting theorem for quantum algorithms with access to noisy oracles or bounded quantum depth. At the core of our results lies a novel measure-and-reprogram framework, called hybrid coherent measure-and-reprogramming, tailored specifically for hybrid algorithms. Equipped with the lifting theorem, we are able to prove directly NISQ security and complexity results by calculating a single combinatorial quantity, relying solely on classical reasoning. As applications, we derive the first direct product theorems in the average case, in the hybrid setting-i.e., an enabling tool to determine the hybrid hardness of solving multi-instance security games. This allows us to derive in a straightforward manner the NISQ hardness of various security games, such as (i) the non-uniform hardness of salted games, (ii) the hardness of specific cryptographic tasks such as the multiple instance version of one-wayness and collision-resistance, and (iii) uniform or non-uniform hardness of many other games.

Paperid: 1404, https://arxiv.org/pdf/2509.09896.pdf

Abstract:
We give a tighter lifting theorem for security games in the quantum random oracle model. At the core of our main result lies a novel measure-and-reprogram framework that we call coherent reprogramming. This framework gives a tighter lifting theorem for query complexity problems, that only requires purely classical reasoning. As direct applications of our lifting theorem, we first provide a quantum direct product theorem in the average case - i.e., an enabling tool to determine the hardness of solving multi-instance security games. This allows us to derive in a straightforward manner the hardness of various security games, for example (i) the non-uniform hardness of salted games, (ii) the hardness of specific cryptographic tasks such as the multiple instance version of one-wayness and collision-resistance, and (iii) uniform or non-uniform hardness of many other games.

Paperid: 1405, https://arxiv.org/pdf/2509.08804.pdf

Abstract:
The verification of differential privacy algorithms that employ Gaussian distributions is little understood. This paper tackles the challenge of verifying such programs by introducing a novel approach to approximating probability distributions of loop-free programs that sample from both discrete and continuous distributions with computable probability density functions, including Gaussian and Laplace. We establish that verifying $(Îµ,Î´)$-differential privacy for these programs is \emph{almost decidable}, meaning the problem is decidable for all values of $Î´$ except those in a finite set. Our verification algorithm is based on computing probabilities to any desired precision by combining integral approximations, and tail probability bounds. The proposed methods are implemented in the tool, DipApprox, using the FLINT library for high-precision integral computations, and incorporate optimizations to enhance scalability. We validate {\ourtool} on fundamental privacy-preserving algorithms, such as Gaussian variants of the Sparse Vector Technique and Noisy Max, demonstrating its effectiveness in both confirming privacy guarantees and detecting violations.

Paperid: 1406, https://arxiv.org/pdf/2509.08722.pdf

Abstract:
Privacy-preserving blockchain systems are essential for protecting transaction data, yet they must also provide auditability that enables auditors to recover participant identities and transaction amounts when warranted. Existing designs often compromise the independence of auditing and transactions, introducing extra interactions that undermine usability and scalability. Moreover, many auditable solutions depend on auditors serving as validators or recording nodes, which introduces risks to both data security and system reliability. To overcome these challenges, we propose SilentLedger, a privacy-preserving transaction system with auditing and complete non-interactivity. To support public verification of authorization, we introduce a renewable anonymous certificate scheme with formal semantics and a rigorous security model. SilentLedger further employs traceable transaction mechanisms constructed from established cryptographic primitives, enabling users to transact without interaction while allowing auditors to audit solely from on-chain data. We formally prove security properties including authenticity, anonymity, confidentiality, and soundness, provide a concrete instantiation, and evaluate performance under a standard 2-2 transaction model. Our implementation and benchmarks demonstrate that SilentLedger achieves superior performance compared with state-of-the-art solutions.

Paperid: 1407, https://arxiv.org/pdf/2509.07939.pdf

Abstract:
Recent advances in Large Language Models (LLMs) have driven interest in automating cybersecurity penetration testing workflows, offering the promise of faster and more consistent vulnerability assessment for enterprise systems. Existing LLM agents for penetration testing primarily rely on self-guided reasoning, which can produce inaccurate or hallucinated procedural steps. As a result, the LLM agent may undertake unproductive actions, such as exploiting unused software libraries or generating cyclical responses that repeat prior tactics. In this work, we propose a guided reasoning pipeline for penetration testing LLM agents that incorporates a deterministic task tree built from the MITRE ATT&CK Matrix, a proven penetration testing kll chain, to constrain the LLM's reaoning process to explicitly defined tactics, techniques, and procedures. This anchors reasoning in proven penetration testing methodologies and filters out ineffective actions by guiding the agent towards more productive attack procedures. To evaluate our approach, we built an automated penetration testing LLM agent using three LLMs (Llama-3-8B, Gemini-1.5, and GPT-4) and applied it to navigate 10 HackTheBox cybersecurity exercises with 103 discrete subtasks representing real-world cyberattack scenarios. Our proposed reasoning pipeline guided the LLM agent through 71.8\%, 72.8\%, and 78.6\% of subtasks using Llama-3-8B, Gemini-1.5, and GPT-4, respectively. Comparatively, the state-of-the-art LLM penetration testing tool using self-guided reasoning completed only 13.5\%, 16.5\%, and 75.7\% of subtasks and required 86.2\%, 118.7\%, and 205.9\% more model queries. This suggests that incorporating a deterministic task tree into LLM reasoning pipelines can enhance the accuracy and efficiency of automated cybersecurity assessments

Paperid: 1408, https://arxiv.org/pdf/2509.06402.pdf

Abstract:
On-device deep learning models have extensive real world demands. Deep learning compilers efficiently compile models into executables for deployment on edge devices, but these executables may face the threat of reverse engineering. Previous studies have attempted to decompile DNN executables, but they face challenges in handling compilation optimizations and analyzing quantized compiled models. In this paper, we present NeuroDeX to unlock diverse support in decompiling DNN executables. NeuroDeX leverages the semantic understanding capabilities of LLMs along with dynamic analysis to accurately and efficiently perform operator type recognition, operator attribute recovery and model reconstruction. NeuroDeX can recover DNN executables into high-level models towards compilation optimizations, different architectures and quantized compiled models. We conduct experiments on 96 DNN executables across 12 common DNN models. Extensive experimental results demonstrate that NeuroDeX can decompile non-quantized executables into nearly identical high-level models. NeuroDeX can recover functionally similar high-level models for quantized executables, achieving an average top-1 accuracy of 72%. NeuroDeX offers a more comprehensive and effective solution compared to previous DNN executables decompilers.

Paperid: 1409, https://arxiv.org/pdf/2509.05698.pdf

Abstract:
High-level natural language knowledge in CTI reports, such as the ATT&CK framework, is beneficial to counter APT attacks. However, how to automatically apply the high-level knowledge in CTI reports in realistic attack detection systems, such as provenance analysis systems, is still an open problem. The challenge stems from the semantic gap between the knowledge and the low-level security logs: while the knowledge in CTI reports is written in natural language, attack detection systems can only process low-level system events like file accesses or network IP manipulations. Manual approaches can be labor-intensive and error-prone. In this paper, we propose KnowHow, a CTI-knowledge-driven online provenance analysis approach that can automatically apply high-level attack knowledge from CTI reports written in natural languages to detect low-level system events. The core of KnowHow is a novel attack knowledge representation, gIoC, that represents the subject, object, and actions of attacks. By lifting system identifiers, such as file paths, in system events to natural language terms, KnowHow can match system events to gIoC and further match them to techniques described in natural languages. Finally, based on the techniques matched to system events, KnowHow reasons about the temporal logic of attack steps and detects potential APT attacks in system events. Our evaluation shows that KnowHow can accurately detect all 16 APT campaigns in the open-source and industrial datasets, while existing approaches all introduce large numbers of false positives. Meanwhile, our evaluation also shows that KnowHow reduces at most 90% of node-level false positives while having a higher node-level recall and is robust against several unknown attacks and mimicry attacks.

Paperid: 1410, https://arxiv.org/pdf/2509.04070.pdf

Abstract:
A fault can occur naturally or intentionally. However, intentionally injecting faults into hardware accelerators of Post-Quantum Cryptographic (PQC) algorithms may leak sensitive information. This intentional fault injection in side-channel attacks compromises the reliability of PQC implementations. The recently NIST-standardized key encapsulation mechanism (KEM), Kyber may also leak information at the hardware implementation level. This work proposes three efficient and lightweight recomputation-based fault detection methods for Barrett Reduction in the Cooley-Tukey Butterfly Unit (CT-BU) of Kyber on a Field Programmable Gate Array (FPGA). The CT-BU and Barrett Reduction are fundamental components in structured lattice-based PQC algorithms, including Kyber, NTRU, Falcon, CRYSTALS-Dilithium, etc. This paper introduces a new algorithm, Recomputation with Swapped Operand (RESWO), for fault detection. While Recomputation with Negated Operand (RENO) and Recomputation with Shifted Operand (RESO) are existing methods used in other PQC hardware algorithms. To the best of our knowledge, RENO and RESO have never been used in Barrett Reduction before. The proposed RESWO method consumes a similar number of slices compared to RENO and RESO. However, RESWO shows lesser delay compared to both RENO and RESO. The fault detection efficiency of RESWO, RENO, and RESO is nearly 100%.

Paperid: 1411, https://arxiv.org/pdf/2509.03367.pdf

Abstract:
Determining the optimal block size is crucial for achieving high throughput in blockchain systems. Many studies have focused on tuning various components, such as databases, network bandwidth, and consensus mechanisms. However, the impact of block size on system performance remains a topic of debate, often resulting in divergent views and even leading to new forks in blockchain networks. This research proposes a mathematical model to maximize performance by determining the ideal block size for Hyperledger Fabric, a prominent consortium blockchain. By leveraging machine learning and solving the model with a genetic algorithm, the proposed approach assesses how factors such as block size, transaction size, and network capacity influence the block processing time. The integration of an optimization solver enables precise adjustments to block size configuration before deployment, ensuring improved performance from the outset. This systematic approach aims to balance block processing efficiency, network latency, and system throughput, offering a robust solution to improve blockchain performance across diverse business contexts.

Paperid: 1412, https://arxiv.org/pdf/2509.03294.pdf

Abstract:
The increasing availability of personal data has enabled significant advances in fields such as machine learning, healthcare, and cybersecurity. However, this data abundance also raises serious privacy concerns, especially in light of powerful re-identification attacks and growing legal and ethical demands for responsible data use. Differential privacy (DP) has emerged as a principled, mathematically grounded framework for mitigating these risks. This review provides a comprehensive survey of DP, covering its theoretical foundations, practical mechanisms, and real-world applications. It explores key algorithmic tools and domain-specific challenges - particularly in privacy-preserving machine learning and synthetic data generation. The report also highlights usability issues and the need for improved communication and transparency in DP systems. Overall, the goal is to support informed adoption of DP by researchers and practitioners navigating the evolving landscape of data privacy.

Paperid: 1413, https://arxiv.org/pdf/2508.21654.pdf

Abstract:
Model stealing attacks endanger the confidentiality of machine learning models offered as a service. Although these models are kept secret, a malicious party can query a model to label data samples and train their own substitute model, violating intellectual property. While novel attacks in the field are continually being published, their design and evaluations are not standardised, making it challenging to compare prior works and assess progress in the field. This paper is the first to address this gap by providing recommendations for designing and evaluating model stealing attacks. To this end, we study the largest group of attacks that rely on training a substitute model -- those attacking image classification models. We propose the first comprehensive threat model and develop a framework for attack comparison. Further, we analyse attack setups from related works to understand which tasks and models have been studied the most. Based on our findings, we present best practices for attack development before, during, and beyond experiments and derive an extensive list of open research questions regarding the evaluation of model stealing attacks. Our findings and recommendations also transfer to other problem domains, hence establishing the first generic evaluation methodology for model stealing attacks.

Paperid: 1414, https://arxiv.org/pdf/2508.20412.pdf

Abstract:
The Model Context Protocol (MCP) is increasingly adopted to standardize the interaction between LLM agents and external tools. However, this trend introduces a new threat: Tool Poisoning Attacks (TPA), where tool metadata is poisoned to induce the agent to perform unauthorized operations. Existing defenses that primarily focus on behavior-level analysis are fundamentally ineffective against TPA, as poisoned tools need not be executed, leaving no behavioral trace to monitor. Thus, we propose MindGuard, a decision-level guardrail for LLM agents, providing provenance tracking of call decisions, policy-agnostic detection, and poisoning source attribution against TPA. While fully explaining LLM decision remains challenging, our empirical findings uncover a strong correlation between LLM attention mechanisms and tool invocation decisions. Therefore, we choose attention as an empirical signal for decision tracking and formalize this as the Decision Dependence Graph (DDG), which models the LLM's reasoning process as a weighted, directed graph where vertices represent logical concepts and edges quantify the attention-based dependencies. We further design robust DDG construction and graph-based anomaly analysis mechanisms that efficiently detect and attribute TPA attacks. Extensive experiments on real-world datasets demonstrate that MindGuard achieves 94\%-99\% average precision in detecting poisoned invocations, 95\%-100\% attribution accuracy, with processing times under one second and no additional token cost. Moreover, DDG can be viewed as an adaptation of the classical Program Dependence Graph (PDG), providing a solid foundation for applying traditional security policies at the decision level.

Paperid: 1415, https://arxiv.org/pdf/2508.20212.pdf

Abstract:
Applying deep learning to malware detection has drawn great attention due to its notable performance. With the increasing prevalence of cyberattacks targeting IoT devices, there is a parallel rise in the development of malware across various Instruction Set Architectures (ISAs). It is thus important to extend malware detection capacity to multiple ISAs. However, training a deep learning-based malware detection model usually requires a large number of labeled malware samples. The process of collecting and labeling sufficient malware samples to build datasets for each ISA is labor-intensive and time-consuming. To reduce the burden of data collection, we propose to leverage the ideas of Neural Machine Translation (NMT) and Normalizing Flows (NFs) for malware detection. Specifically, when dealing with malware in a certain ISA, we translate it to an ISA with sufficient malware samples (like X86-64). This allows us to apply a model trained on one ISA to analyze malware from another ISA. Our approach reduces the data collection effort by enabling malware detection across multiple ISAs using a model trained on a single ISA.

Paperid: 1416, https://arxiv.org/pdf/2508.19825.pdf

Abstract:
The privacy community has a long track record of investigating emerging types of web tracking techniques. Recent work has focused on compliance of web trackers with new privacy laws such as Europe's GDPR and California's CCPA. Despite the growing body of research documenting widespread lack of compliance with new privacy laws, there is a lack of robust enforcement. Different from prior work, we conduct a tech-law analysis to map decades-old U.S. laws about interception of electronic communications--so-called wiretapping--to web tracking. Bridging the tech-law gap for older wiretapping laws is important and timely because, in cases where legal harm to privacy is proven, they can provide statutory private right of action, are at the forefront of recent privacy enforcement, and could ultimately lead to a meaningful change in the web tracking landscape. In this paper, we focus on a particularly invasive tracking technique: the use of JavaScript event listeners by third-party trackers for real-time keystroke interception on websites. We use an instrumented web browser to crawl a sample of the top-million websites to investigate the use of event listeners that aligns with the criteria for wiretapping, according to U.S. wiretapping law at the federal level and in California. We find evidence that 38.52% websites installed third-party event listeners to intercept keystrokes, and that at least 3.18% websites transmitted intercepted information to a third-party server, which aligns with the criteria for wiretapping. We further find evidence that the intercepted information such as email addresses typed into form fields are used for unsolicited email marketing. Beyond our work that maps the intersection between technical measurement and U.S. wiretapping law, additional future legal research is required to determine when the wiretapping observed in our paper passes the threshold for illegality.

Paperid: 1417, https://arxiv.org/pdf/2508.18750.pdf

Abstract:
The development of blockchain technology has significantly enhanced the security and transparency of personal information and transaction records. Concurrent with the advancement of blockchain technology and the emergence of the digital currency ecosystem, the internet has evolved from a paradigm dominated by information flow to one driven by value flow. Consequently, the concept of token has gained widespread dissemination, and the electronic token under investigation in this thesis is a development of this concept. The application of electronic tokens has become pervasive with the development of the internet, but the functionality of these tokens is often limited, and issues related to trust remain significant challenges. This study proposes innovative solutions to address the deficiencies in traditional electronic token systems, including the issuance of tokens, the heterogeneity of issuance standards, and the lack of democracy. The solutions are based on distributed storage, anti-tampering mechanisms, consensus protocols, and transparent, traceable storage. Additionally, the study employs NFTs, smart contracts, and RSA blind signatures, among other key technologies, to construct a system based on blockchain technology. The process integrates the decentralised management and centralised operation models, aligning them with the national policy directives. The developed solution enables the full utilisation of blockchain technology's advantages while also fostering community participation. Consequently, it establishes a secure, legal, reliable, and dynamic electronic certification system.

Paperid: 1418, https://arxiv.org/pdf/2508.16843.pdf

Abstract:
Voice authentication has undergone significant changes from traditional systems that relied on handcrafted acoustic features to deep learning models that can extract robust speaker embeddings. This advancement has expanded its applications across finance, smart devices, law enforcement, and beyond. However, as adoption has grown, so have the threats. This survey presents a comprehensive review of the modern threat landscape targeting Voice Authentication Systems (VAS) and Anti-Spoofing Countermeasures (CMs), including data poisoning, adversarial, deepfake, and adversarial spoofing attacks. We chronologically trace the development of voice authentication and examine how vulnerabilities have evolved in tandem with technological advancements. For each category of attack, we summarize methodologies, highlight commonly used datasets, compare performance and limitations, and organize existing literature using widely accepted taxonomies. By highlighting emerging risks and open challenges, this survey aims to support the development of more secure and resilient voice authentication systems.

Paperid: 1419, https://arxiv.org/pdf/2508.16765.pdf

Abstract:
The interactive nature of Large Language Models (LLMs), which closely track user data and context, has prompted users to share personal and private information in unprecedented ways. Even when users opt out of allowing their data to be used for training, these privacy settings offer limited protection when LLM providers operate in jurisdictions with weak privacy laws, invasive government surveillance, or poor data security practices. In such cases, the risk of sensitive information, including Personally Identifiable Information (PII), being mishandled or exposed remains high. To address this, we propose the concept of an "LLM gatekeeper", a lightweight, locally run model that filters out sensitive information from user queries before they are sent to the potentially untrustworthy, though highly capable, cloud-based LLM. Through experiments with human subjects, we demonstrate that this dual-model approach introduces minimal overhead while significantly enhancing user privacy, without compromising the quality of LLM responses.

Paperid: 1420, https://arxiv.org/pdf/2508.14519.pdf

Abstract:
Security has always been a priority, for researchers, service providers and network operators when it comes to radio access networks (RAN). One wireless access approach that has captured attention is blockchain enabled RAN (B-RAN) due to its secure nature. This research introduces a framework that integrates blockchain technology into RAN while also addressing the limitations of state-of-the-art models. The proposed framework utilizes queuing and Markov chain theory to model the aspects of B-RAN. An extensive evaluation of the models performance is provided, including an analysis of timing factors and a focused assessment of its security aspects. The results demonstrate reduced latency and comparable security making the presented framework suitable for diverse application scenarios.

Paperid: 1421, https://arxiv.org/pdf/2508.13690.pdf

Abstract:
Biometric authentication using physiological signals offers a promising path toward secure and user-friendly access control in wearable devices. While electrocardiogram (ECG) signals have shown high discriminability, their intrusive sensing requirements and discontinuous acquisition limit practicality. Photoplethysmography (PPG), on the other hand, enables continuous, non-intrusive authentication with seamless integration into wrist-worn wearable devices. However, most prior work relies on high-frequency PPG (e.g., 75 - 500 Hz) and complex deep models, which incur significant energy and computational overhead, impeding deployment in power-constrained real-world systems. In this paper, we present the first real-world implementation and evaluation of a continuous authentication system on a smartwatch, We-Be Band, using low-frequency (25 Hz) multi-channel PPG signals. Our method employs a Bi-LSTM with attention mechanism to extract identity-specific features from short (4 s) windows of 4-channel PPG. Through extensive evaluations on both public datasets (PTTPPG) and our We-Be Dataset (26 subjects), we demonstrate strong classification performance with an average test accuracy of 88.11%, macro F1-score of 0.88, False Acceptance Rate (FAR) of 0.48%, False Rejection Rate (FRR) of 11.77%, and Equal Error Rate (EER) of 2.76%. Our 25 Hz system reduces sensor power consumption by 53% compared to 512 Hz and 19% compared to 128 Hz setups without compromising performance. We find that sampling at 25 Hz preserves authentication accuracy, whereas performance drops sharply at 20 Hz while offering only trivial additional power savings, underscoring 25 Hz as the practical lower bound. Additionally, we find that models trained exclusively on resting data fail under motion, while activity-diverse training improves robustness across physiological states.

Paperid: 1422, https://arxiv.org/pdf/2508.12470.pdf

Abstract:
The increased Internet of Medical Things IoMT and the Industrial Internet of Things IIoT interconnectivity has introduced complex cybersecurity challenges, exposing sensitive data, patient safety, and industrial operations to advanced cyber threats. To mitigate these risks, this paper introduces a novel transformer-based intrusion detection system IDS, termed BiGAT-ID a hybrid model that combines bidirectional gated recurrent units BiGRU, long short-term memory LSTM networks, and multi-head attention MHA. The proposed architecture is designed to effectively capture bidirectional temporal dependencies, model sequential patterns, and enhance contextual feature representation. Extensive experiments on two benchmark datasets, CICIoMT2024 medical IoT and EdgeIIoTset industrial IoT demonstrate the model's cross-domain robustness, achieving detection accuracies of 99.13 percent and 99.34 percent, respectively. Additionally, the model exhibits exceptional runtime efficiency, with inference times as low as 0.0002 seconds per instance in IoMT and 0.0001 seconds in IIoT scenarios. Coupled with a low false positive rate, BiGAT-ID proves to be a reliable and efficient IDS for deployment in real-world heterogeneous IoT environments

Paperid: 1423, https://arxiv.org/pdf/2508.12035.pdf

Abstract:
In recent years, generative artificial intelligence (GenAI) has demonstrated remarkable capabilities in high-stakes domains such as molecular science. However, challenges related to the verifiability and structural privacy of its outputs remain largely unresolved. This paper focuses on the task of molecular toxicity repair. It proposes a structure-private verification framework - ToxiEval-ZKP - which, for the first time, introduces zero-knowledge proof (ZKP) mechanisms into the evaluation process of this task. The system enables model developers to demonstrate to external verifiers that the generated molecules meet multidimensional toxicity repair criteria, without revealing the molecular structures themselves. To this end, we design a general-purpose circuit compatible with both classification and regression tasks, incorporating evaluation logic, Poseidon-based commitment hashing, and a nullifier-based replay prevention mechanism to build a complete end-to-end ZK verification system. Experimental results demonstrate that ToxiEval-ZKP facilitates adequate validation under complete structural invisibility, offering strong circuit efficiency, security, and adaptability, thereby opening up a novel paradigm for trustworthy evaluation in generative scientific tasks.

Paperid: 1424, https://arxiv.org/pdf/2508.11472.pdf

Abstract:
Insider threat detection aims to identify malicious user behavior by analyzing logs that record user interactions. Due to the lack of fine-grained behavior-level annotations, detecting specific behavior-level anomalies within user behavior sequences is challenging. Unsupervised methods face high false positive rates and miss rates due to the inherent ambiguity between normal and anomalous behaviors. In this work, we instead introduce weak labels of behavior sequences, which have lower annotation costs, i.e., the training labels (anomalous or normal) are at sequence-level instead of behavior-level, to enhance the detection capability for behavior-level anomalies by learning discriminative features. To achieve this, we propose a novel framework called Robust Multi-sphere Learning (RMSL). RMSL uses multiple hyper-spheres to represent the normal patterns of behaviors. Initially, a one-class classifier is constructed as a good anomaly-supervision-free starting point. Building on this, using multiple instance learning and adaptive behavior-level self-training debiasing based on model prediction confidence, the framework further refines hyper-spheres and feature representations using weak sequence-level labels. This approach enhances the model's ability to distinguish between normal and anomalous behaviors. Extensive experiments demonstrate that RMSL significantly improves the performance of behavior-level insider threat detection.

Paperid: 1425, https://arxiv.org/pdf/2508.09021.pdf

Abstract:
As large language models are increasingly deployed in sensitive environments, fingerprinting attacks pose significant privacy and security risks. We present a study of LLM fingerprinting from both offensive and defensive perspectives. Our attack methodology uses reinforcement learning to automatically optimize query selection, achieving better fingerprinting accuracy with only 3 queries compared to randomly selecting 3 queries from the same pool. Our defensive approach employs semantic-preserving output filtering through a secondary LLM to obfuscate model identity while maintaining semantic integrity. The defensive method reduces fingerprinting accuracy across tested models while preserving output quality. These contributions show the potential to improve fingerprinting tools capabilities while providing practical mitigation strategies against fingerprinting attacks.

Paperid: 1426, https://arxiv.org/pdf/2508.07944.pdf

Abstract:
Despite growing attention to deepfake speech detection, the aspects of bias and fairness remain underexplored in the speech domain. To address this gap, we introduce the Speaker Characteristics Deepfake (SCDF) dataset: a novel, richly annotated resource enabling systematic evaluation of demographic biases in deepfake speech detection. SCDF contains over 237,000 utterances in a balanced representation of both male and female speakers spanning five languages and a wide age range. We evaluate several state-of-the-art detectors and show that speaker characteristics significantly influence detection performance, revealing disparities across sex, language, age, and synthesizer type. These findings highlight the need for bias-aware development and provide a foundation for building non-discriminatory deepfake detection systems aligned with ethical and regulatory standards.

Paperid: 1427, https://arxiv.org/pdf/2508.06789.pdf

Abstract:
Federated Unlearning (FU) has emerged as a promising solution to respond to the right to be forgotten of clients, by allowing clients to erase their data from global models without compromising model performance. Unfortunately, researchers find that the parameter variations of models induced by FU expose clients' data information, enabling attackers to infer the label of unlearning data, while label inference attacks against FU remain unexplored. In this paper, we introduce and analyze a new privacy threat against FU and propose a novel label inference attack, ULIA, which can infer unlearning data labels across three FU levels. To address the unique challenges of inferring labels via the models variations, we design a gradient-label mapping mechanism in ULIA that establishes a relationship between gradient variations and unlearning labels, enabling inferring labels on accumulated model variations. We evaluate ULIA on both IID and non-IID settings. Experimental results show that in the IID setting, ULIA achieves a 100% Attack Success Rate (ASR) under both class-level and client-level unlearning. Even when only 1% of a user's local data is forgotten, ULIA still attains an ASR ranging from 93% to 62.3%.

Paperid: 1428, https://arxiv.org/pdf/2508.03588.pdf

Abstract:
Static analysis, a fundamental technique in Android app examination, enables the extraction of control flows, data flows, and inter-component communications (ICCs), all of which are essential for malware detection. However, existing methods struggle to leverage the semantic complementarity across different types of flows for representing program behaviors, and their context-unaware nature further hinders the accuracy of cross-flow semantic integration. We propose and implement MalFlows, a novel technique that achieves context-aware fusion of heterogeneous flow semantics for Android malware detection. Our goal is to leverage complementary strengths of the three types of flow-related information for precise app profiling. We adopt a heterogeneous information network (HIN) to model the rich semantics across these program flows. We further propose flow2vec, a context-aware HIN embedding technique that distinguishes the semantics of HIN entities as needed based on contextual constraints across different flows and learns accurate app representations through the joint use of multiple meta-paths. The representations are finally fed into a channel-attention-based deep neural network for malware classification. To the best of our knowledge, this is the first study to comprehensively aggregate the strengths of diverse flow-related information for assessing maliciousness within apps. We evaluate MalFlows on a large-scale dataset comprising over 20 million flow instances extracted from more than 31,000 real-world apps. Experimental results demonstrate that MalFlows outperforms representative baselines in Android malware detection, and meanwhile, validate the effectiveness of flow2vec in accurately learning app representations from the HIN constructed over the heterogeneous flows.

Paperid: 1429, https://arxiv.org/pdf/2508.02523.pdf

Abstract:
Technological advancements have revolutionized numerous industries, including transportation. While digitalization, automation, and connectivity have enhanced safety and efficiency, they have also introduced new vulnerabilities. With 95% of data breaches attributed to human error, promoting cybersecurity awareness in transportation is increasingly critical. Despite numerous cyberattacks on transportation systems worldwide, comprehensive and centralized records of these incidents remain scarce. To address this gap and enhance cyber awareness, this paper presents a large language model (LLM) based approach to extract and organize transportation related cyber incidents from publicly available datasets. A key contribution of this work is the use of generative AI to transform unstructured, heterogeneous cyber incident data into structured formats. Incidents were sourced from the Center for Strategic & International Studies (CSIS) List of Significant Cyber Incidents, the University of Maryland Cyber Events Database (UMCED), the European Repository of Cyber Incidents (EuRepoC), the Maritime Cyber Attack Database (MCAD), and the U.S. DOT Transportation Cybersecurity and Resiliency (TraCR) Examples of Cyber Attacks in Transportation (2018 to 2022). These were classified by a fine tuned LLM into five transportation modes: aviation, maritime, rail, road, and multimodal, forming a transportation specific cyber incident database. Another key contribution of this work is the development of a Retrieval Augmented Generation question answering system, designed to enhance accessibility and practical use by enabling users to query the curated database for specific details on transportation related cyber incidents. By leveraging LLMs for both data extraction and user interaction, this study contributes a novel, accessible tool for improving cybersecurity awareness in the transportation sector.

Paperid: 1430, https://arxiv.org/pdf/2508.01059.pdf

Abstract:
Large language models (LLMs) have shown remarkable success across many domains, yet their integration into cybersecurity applications remains limited due to a lack of general-purpose cybersecurity data, representational complexity, and safety and regulatory concerns. To address this gap, we previously introduced Foundation-Sec-8B, a cybersecurity-focused LLM suitable for fine-tuning on downstream tasks. That model, however, was not designed for chat-style interactions or instruction-following. In this report, we release Foundation-Sec-8B-Instruct: a model specifically trained for general-purpose cybersecurity dialogue. Built on Foundation-Sec-8B, it combines domain-specific knowledge with instruction-following, conversational capabilities, and alignment with human preferences to produce high-quality, relevant responses. Comprehensive evaluations show that Foundation-Sec-8B-Instruct outperforms Llama 3.1-8B-Instruct on a range of cybersecurity tasks while matching its instruction-following performance. It is also competitive with GPT-4o-mini on cyber threat intelligence and instruction-following tasks. We envision Foundation-Sec-8B-Instruct becoming an indispensable assistant in the daily workflows of cybersecurity professionals. We release the model publicly at https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Instruct.

Paperid: 1431, https://arxiv.org/pdf/2507.21591.pdf

Abstract:
Steganalysis methods based on deep learning (DL) often struggle with computational complexity and challenges in generalizing across different datasets. Incorporating a graph neural network (GNN) into steganalysis schemes enables the leveraging of relational data for improved detection accuracy and adaptability. This paper presents the first application of a Graph Neural Network (GNN), specifically the GraphSAGE architecture, for steganalysis of compressed voice over IP (VoIP) speech streams. The method involves straightforward graph construction from VoIP streams and employs GraphSAGE to capture hierarchical steganalysis information, including both fine grained details and high level patterns, thereby achieving high detection accuracy. Experimental results demonstrate that the developed approach performs well in uncovering quantization index modulation (QIM)-based steganographic patterns in VoIP signals. It achieves detection accuracy exceeding 98 percent even for short 0.5 second samples, and 95.17 percent accuracy under challenging conditions with low embedding rates, representing an improvement of 2.8 percent over the best performing state of the art methods. Furthermore, the model exhibits superior efficiency, with an average detection time as low as 0.016 seconds for 0.5-second samples an improvement of 0.003 seconds. This makes it efficient for online steganalysis tasks, providing a superior balance between detection accuracy and efficiency under the constraint of short samples with low embedding rates.

Paperid: 1432, https://arxiv.org/pdf/2507.18249.pdf

Abstract:
Digitalization of power grids have made them increasingly susceptible to cyber-attacks in the past decade. Iterative cybersecurity testing is indispensable to counter emerging attack vectors and to ensure dependability of critical infrastructure. Furthermore, these can be used to evaluate cybersecurity configuration, effectiveness of the cybersecurity measures against various attack vectors, as well as to train smart grid cybersecurity experts defending the system. Enabling extensive experiments narrows the gap between academic research and production environment. A high-fidelity cyber range is vital as it is often infeasible to conduct such experiments and training using production environment. However, the design and implementation of cyber range requires extensive domain knowledge of physical and cyber aspect of the infrastructure. Furthermore, costs incurred for setup and maintenance of cyber range are significant. Moreover, most existing smart grid cyber ranges are designed as a one-off, proprietary system, and are limited in terms of configurability, accessibility, portability, and reproducibility. To address these challenges, an automated Smart grid Cyber Range generation framework is presented in this paper. Initially a human-/machine-friendly, XML-based modeling language called Smart Grid Modeling Language was defined, which incorporates IEC 61850 System Configuration Language files. Subsequently, a toolchain to parse SG-ML model files and automatically instantiate a functional smart grid cyber range was developed. The developed SG-ML models can be easily shared and/or modified to reproduce or customize for any cyber range. The application of Auto-SGCR is demonstrated through case studies with large-scale substation models. The toolchain along with example SG-ML models have been open-sourced.

Paperid: 1433, https://arxiv.org/pdf/2507.17850.pdf

Abstract:
The deployment of large-scale software-based 5G core functions presents significant challenges due to their reliance on optimized and intelligent resource provisioning for their services. Many studies have focused on analyzing the impact of resource allocation for complex deployments using mathematical models, queue theories, or even Artificial Intelligence (AI). This paper elucidates the effects of chaotic workloads, generated by Distributed Denial of Service (DDoS) on different Network Functions (NFs) on User Equipment registration performance. Our findings highlight the necessity of diverse resource profiles to ensure Service-Level Agreement (SLA) compliance in large-scale 5G core deployments. Additionally, our analysis of packet capture approaches demonstrates the potential of kernel-based monitoring for scalable security threat defense. Finally, our empirical evaluation provides insights into the effective deployment of 5G NFs in complex scenarios.

Paperid: 1434, https://arxiv.org/pdf/2507.15163.pdf

Abstract:
Evolving security vulnerabilities and shifting operational conditions require frequent updates to network security policies. These updates include adjustments to incident response procedures and modifications to access controls, among others. Reinforcement learning methods have been proposed for automating such policy adaptations, but most of the methods in the research literature lack performance guarantees and adapt slowly to changes. In this paper, we address these limitations and present a method for computing security policies that is scalable, offers theoretical guarantees, and adapts quickly to changes. It assumes a model or simulator of the system and comprises three components: belief estimation through particle filtering, offline policy computation through aggregation, and online policy adaptation through rollout. Central to our method is a new feature-based aggregation technique, which improves scalability and flexibility. We analyze the approximation error of aggregation and show that rollout efficiently adapts policies to changes under certain conditions. Simulations and testbed results demonstrate that our method outperforms state-of-the-art methods on several benchmarks, including CAGE-2.

Paperid: 1435, https://arxiv.org/pdf/2507.14322.pdf

Abstract:
Federated Learning (FL) offers a paradigm for privacy-preserving collaborative AI, but its decentralized nature creates significant vulnerabilities to model poisoning attacks. While numerous static defenses exist, their effectiveness is highly context-dependent, often failing against adaptive adversaries or in heterogeneous data environments. This paper introduces FedStrategist, a novel meta-learning framework that reframes robust aggregation as a real-time, cost-aware control problem. We design a lightweight contextual bandit agent that dynamically selects the optimal aggregation rule from an arsenal of defenses based on real-time diagnostic metrics. Through comprehensive experiments, we demonstrate that no single static rule is universally optimal. We show that our adaptive agent successfully learns superior policies across diverse scenarios, including a ``Krum-favorable" environment and against a sophisticated "stealth" adversary designed to neutralize specific diagnostic signals. Critically, we analyze the paradoxical scenario where a non-robust baseline achieves high but compromised accuracy, and demonstrate that our agent learns a conservative policy to prioritize model integrity. Furthermore, we prove the agent's policy is controllable via a single "risk tolerance" parameter, allowing practitioners to explicitly manage the trade-off between performance and security. Our work provides a new, practical, and analyzable approach to creating resilient and intelligent decentralized AI systems.

Paperid: 1436, https://arxiv.org/pdf/2507.11500.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable generative capabilities. However, their susceptibility to misuse has raised significant safety concerns. While post-training safety alignment methods have been widely adopted, LLMs remain vulnerable to malicious instructions that can bypass safety constraints. Recent efforts have introduced inference-time safety reasoning (system-2 alignment), where LLMs conduct a reasoning process to perform safety verification before final response. We show, however, that these checks are driven by ad-hoc reasoning that diverges from the structured human process, where they first discern a user's true intent, then evaluate the associated risk based on the true intent. Consequently, these defenses remain vulnerable to sophisticated jailbreak prompts that cloak harmful goals in seemingly benign language. To build secure and safe LLMs, we propose a reasoning-based safety alignment framework, ARMOR, that replaces the ad-hoc chains of thought reasoning process with human-aligned, structured one. At inference, ARMOR (1) detects likely jailbreak strategies, (2) extracts the user's core intent while discarding deceptive instructions, and (3) applies a policy-grounded safety analysis to the purified request. ARMOR is evaluated on adaptive jailbreak attacks and multiple safety benchmarks, and a test-time scaling is conducted to further improve its performance. Results demonstrate that ARMOR significantly enhances the robustness against state-of-the-art adaptive jailbreak attacks and outperforms recent reasoning-based aligned models across various safety benchmarks.

Paperid: 1437, https://arxiv.org/pdf/2507.06500.pdf

Abstract:
Due to the broadcast nature of wireless communications, physical-layer security has attracted increasing concerns from both academia and industry. Artificial noise (AN), as one of the promising physical-layer security techniques, is capable of utilizing the spatial degree-of-freedom of channels to effectively enhance the security of wireless communications. In contrast to other physicallayer security techniques, the key distinguishing feature of AN is to generate specific interfering signals according to channel characteristics, increasing the secrecy capacity by reducing the wiretap channel capacity without affecting the legitimate channel capacity. Hence, this paper provides the latest survey of AN, including its evolution, modeling, backgrounds, applications, and future trends. Initially, we introduce the development, fundamentals, and backgrounds of AN. Subsequently, we highlight a comprehensive survey of the current state of research on various AN-empowered scenarios and AN-combined technologies. Finally, we discuss some technical challenges to tackle for AN-aided wireless security in the future.

Paperid: 1438, https://arxiv.org/pdf/2507.03773.pdf

Abstract:
Modern processors are equipped with single instruction multiple data (SIMD) instructions for fine-grained data parallelism. Compiler auto-vectorization techniques that target SIMD instructions face performance limitations due to insufficient information available at compile time, requiring programmers to manually manipulate SIMD instructions. SIMD intrinsics, a type of built-in function provided by modern compilers, enable programmers to manipulate SIMD instructions within high-level programming languages. Bugs in compilers for SIMD intrinsics can introduce potential threats to software security, producing unintended calculation results, data loss, program crashes, etc. To detect bugs in compilers for SIMD intrinsics, we propose RVISmith, a randomized fuzzer that generates well-defined C programs that include various invocation sequences of RVV (RISC-V Vector Extension) intrinsics. We design RVISmith to achieve the following objectives: (i) achieving high intrinsic coverage, (ii) improving sequence variety, and (iii) without known undefined behaviors. We implement RVISmith based on the ratified RVV intrinsic specification and evaluate our approach with three modern compilers: GCC, LLVM, and XuanTie. Experimental results show that RVISmith achieves 11.5 times higher intrinsic coverage than the state-of-the-art fuzzer for RVV intrinsics. By differential testing that compares results across different compilers, optimizations, and equivalent programs, we detect and report 13 previously unknown bugs of the three compilers under test to date. Of these bugs, 10 are confirmed and another 3 are fixed by the compiler developers.

Paperid: 1439, https://arxiv.org/pdf/2512.21737.pdf

Abstract:
This paper demonstrates a power analysis-based Side-Channel Analysis (SCA) attack on the SNOW-V encryption algorithm, which is a 5G mobile communication security standard candidate. Implemented on an STM32 microcontroller, power traces captured with a ChipWhisperer board were analyzed, with Test Vector Leakage Assessment (TVLA) confirming exploitable leakage. Profiling attacks using Linear Discriminant Analysis (LDA) and Fully Connected Neural Networks (FCN) achieved efficient key recovery, with FCN achieving > 5X lower minimum traces to disclosure (MTD) compared to the state-of-the-art Correlational Power Analysis (CPA) assisted with LDA. The results highlight the vulnerability of SNOW-V to machine learning-based SCA and the need for robust countermeasures.

Paperid: 1440, https://arxiv.org/pdf/2512.21663.pdf

Abstract:
Passwordless authentication has revolutionized the way we authenticate across various websites and services. FIDO2 Passkeys, is one of the most-widely adopted standards of passwordless authentication that promises phishing-resistance. However, like any other authentication system, passkeys require the user details to be saved on a centralized server, also known as Relying Party (RP) Server. This has led users to create a new passkey for every new online account. While this just works for a limited number of online accounts, the limited storage space of secure storage modules like TPM or a physical security key limits the number of passkeys a user can have. For example, Yubico Yubikey 5 (firmware 5.0 - 5.6) offers to store only 25 passkeys, while firmware 5.7+ allows to store upto 100 [1]. To overcome this problem, one of the widely adopted approaches is to use Federated Authentication with Single Sign On (SSO). This allows the user to create a passkey for the Identity Provider (IdP) and use the IdP to authenticate to all service providers. This proves to be a significant privacy risk since the IdP can potentially track users across different services. To overcome these limitations, this paper introduces a novel standard 'Verifiable Passkey' that allows the user to use Passkeys created for a Verifiable Credential issuer across any platform without risking privacy or user tracking.

Paperid: 1441, https://arxiv.org/pdf/2512.19945.pdf

Abstract:
Securing Internet of Things (IoT) firmware remains difficult due to proprietary binaries, stripped symbols, heterogeneous architectures, and limited access to executable code. Existing analysis methods, such as static analysis, symbolic execution, and fuzzing, depend on binary visibility and functional emulation, making them unreliable when firmware is encrypted or inaccessible. To address this limitation, we propose a binary-free, architecture-agnostic solution that estimates the likelihood of conceptual zero-day vulnerabilities using only high-level descriptors. The approach integrates a tri-LLM reasoning architecture combining a LLaMA-based configuration interpreter, a DeepSeek-based structural abstraction analyzer, and a GPT-4o semantic fusion model. The solution also incorporates LLM computational signatures, including latency patterns, uncertainty markers, and reasoning depth indicators, as well as an energy-aware symbolic load model, to enhance interpretability and operational feasibility. In addition, we formally derive the mathematical foundations of the reasoning pipeline, establishing monotonicity, divergence, and energy-risk coupling properties that theoretically justify the model's behavior. Simulation-based evaluation reveals that high exposure conditions increase the predicted zero-day likelihood by 20 to 35 percent across models, with GPT-4o demonstrating the strongest cross-layer correlations and the highest sensitivity. Energy and divergence metrics significantly predict elevated risk (p < 0.01), reinforcing the effectiveness of the proposed reasoning framework.

Paperid: 1442, https://arxiv.org/pdf/2512.19124.pdf

Abstract:
Online harassment, incitement to violence, racist behavior, and other harmful content on social media can damage social harmony and even break the law. Traditional blocklisting technologies can block malicious users, but this comes at the expense of identity privacy. The anonymous blocklisting has emerged as an effective mechanism to restrict the abuse of freedom of speech while protecting user identity privacy. However, the state-of-the-art anonymous blocklisting schemes suffer from either poor dynamism or low efficiency. In this paper, we propose $\mathsf{ShadowBlock}$, an efficient dynamic anonymous blocklisting scheme. Specifically, we utilize the pseudorandom function and cryptographic accumulator to construct the public blocklisting, enabling users to prove they are not on the blocklisting in an anonymous manner. To improve verification efficiency, we design an aggregation zero-knowledge proof mechanism that converts multiple verification operations into a single one. In addition, we leverage the accumulator's property to achieve efficient updates of the blocklisting, i.e., the original proof can be reused with minimal updates rather than regenerating the entire proof. Experiments show that $\mathsf{ShadowBlock}$ has better dynamics and efficiency than the existing schemes. Finally, the discussion on applications indicates that $\mathsf{ShadowBlock}$ also holds significant value and has broad prospects in emerging fields such as cross-chain identity management.

Paperid: 1443, https://arxiv.org/pdf/2512.17310.pdf

Abstract:
Pseudorandom error-correcting codes (PRC) is a novel cryptographic primitive proposed at CRYPTO 2024. Due to the dual capability of pseudorandomness and error correction, PRC has been recognized as a promising foundational component for watermarking AI-generated content. However, the security of PRC has not been thoroughly analyzed, especially with concrete parameters or even in the face of cryptographic attacks. To fill this gap, we present the first cryptanalysis of PRC. We first propose three attacks to challenge the undetectability and robustness assumptions of PRC. Among them, two attacks aim to distinguish PRC-based codewords from plain vectors, and one attack aims to compromise the decoding process of PRC. Our attacks successfully undermine the claimed security guarantees across all parameter configurations. Notably, our attack can detect the presence of a watermark with overwhelming probability at a cost of $2^{22}$ operations. We also validate our approach by attacking real-world large generative models such as DeepSeek and Stable Diffusion. To mitigate our attacks, we further propose three defenses to enhance the security of PRC, including parameter suggestions, implementation suggestions, and constructing a revised key generation algorithm. Our proposed revised key generation function effectively prevents the occurrence of weak keys. However, we highlight that the current PRC-based watermarking scheme still cannot achieve a 128-bit security under our parameter suggestions due to the inherent configurations of large generative models, such as the maximum output length of large language models.

Paperid: 1444, https://arxiv.org/pdf/2512.15510.pdf

Abstract:
I2P (Invisible Internet Project) is a popular anonymous communication network. While existing de-anonymization methods for I2P focus on identifying potential traffic patterns of target hidden services among extensive network traffic, they often fail to scale effectively across the large and diverse I2P network, which consists of numerous routers. In this paper, we introduce I2PERCEPTION a low-cost approach revealing the IP addresses of I2P hidden services. In I2PERCEPTION, attackers deploy floodfill routers to passively monitor I2P routers and collect their RouterInfo. We analyze the router information publication mechanism to accurately identify routers' join (i.e. on) and leave (i.e. off) behaviors, enabling fine-grained live behavior inference across the I2P network. Active probing is used to obtain the live behavior (i.e., on-off patterns) of a target hidden service hosted on one of the I2P routers. By correlating the live behaviors of the target hidden service and I2P routers over time, we narrow down the set of routers matching the hidden service's behavior, revealing the hidden service's true network identity for de-anonymization. Through the deployment of only 15 floodfill routers over the course of eight months, we validate the precision and effectiveness of our approach with extensive real-world experiments. Our results show that I2PERCEPTION successfully de-anonymizes all controlled hidden services.

Paperid: 1445, https://arxiv.org/pdf/2512.14742.pdf

Abstract:
Open Radio Access Networks (O-RAN) enhance modularity and telemetry granularity but also widen the cybersecurity attack surface across disaggregated control, user and management planes. We propose a hierarchical defense framework with three coordinated layers-anomaly detection, intrusion confirmation, and multiattack classification-each aligned with O-RAN's telemetry stack. Our approach integrates hybrid quantum computing and machine learning, leveraging amplitude- and entanglement-based feature encodings with deep and ensemble classifiers. We conduct extensive benchmarking across synthetic and real-world telemetry, evaluating encoding depth, architectural variants, and diagnostic fidelity. The framework consistently achieves near-perfect accuracy, high recall, and strong class separability. Multi-faceted evaluation across decision boundaries, probabilistic margins, and latent space geometry confirms its interpretability, robustness, and readiness for slice-aware diagnostics and scalable deployment in near-RT and non-RT RIC domains.

Paperid: 1446, https://arxiv.org/pdf/2512.12006.pdf

Abstract:
It is well known that encryption alone is not enough to protect data privacy. Access patterns, revealed when operations are performed, can also be leveraged in inference attacks. Oblivious RAM (ORAM) hides access patterns by making client requests oblivious. However, existing protocols are still limited in supporting concurrent clients and Byzantine fault tolerance (BFT). We present MVP-ORAM, the first wait-free ORAM protocol that supports concurrent fail-prone clients. In contrast to previous works, MVP-ORAM avoids using trusted proxies, which require additional security assumptions, and concurrency control mechanisms based on inter-client communication or distributed locks, which limit overall throughput and the capability of tolerating faulty clients. Instead, MVP-ORAM enables clients to perform concurrent requests and merge conflicting updates as they happen, satisfying wait-freedom, i.e., clients make progress independently of the performance or failures of other clients. Since wait and collision freedom are fundamentally contradictory goals that cannot be achieved simultaneously in an asynchronous concurrent ORAM service, we define a weaker notion of obliviousness that depends on the application workload and number of concurrent clients, and prove MVP-ORAM is secure in practical scenarios where clients perform skewed block accesses. By being wait-free, MVP-ORAM can be seamlessly integrated into existing confidential BFT data stores, creating the first BFT ORAM construction. We implement MVP-ORAM on top of a confidential BFT data store and show our prototype can process hundreds of 4KB accesses per second in modern clouds.

Paperid: 1447, https://arxiv.org/pdf/2512.05065.pdf

Abstract:
Personal language model-based agents are becoming more widespread for completing tasks on behalf of users; however, this raises serious privacy questions regarding whether these models will appropriately disclose user data. While prior work has evaluated language models on data-sharing scenarios based on general privacy norms, we focus on personalizing language models' privacy decisions, grounding their judgments directly in prior user privacy decisions. Our findings suggest that general privacy norms are insufficient for effective personalization of privacy decisions. Furthermore, we find that eliciting privacy judgments from the model through In-context Learning (ICL) is unreliable to due misalignment with the user's prior privacy judgments and opaque reasoning traces, which make it difficult for the user to interpret the reasoning behind the model's decisions. To address these limitations, we propose ARIEL (Agentic Reasoning with Individualized Entailment Logic), a framework that jointly leverages a language model and rule-based logic for structured data-sharing reasoning. ARIEL is based on formulating personalization of data sharing as an entailment, whether a prior user judgment on a data-sharing request implies the same judgment for an incoming request. Our experimental evaluations on advanced models and publicly-available datasets demonstrate that ARIEL can reduce the F1 score error by $\textbf{39.1%}$ over language model-based reasoning (ICL), demonstrating that ARIEL is effective at correctly judging requests where the user would approve data sharing. Overall, our findings suggest that combining LLMs with strict logical entailment is a highly effective strategy for enabling personalized privacy judgments for agents.

Paperid: 1448, https://arxiv.org/pdf/2512.03816.pdf

Abstract:
When using an LLM through an API provider, users expect the served model to remain consistent over time, a property crucial for the reliability of downstream applications and the reproducibility of research. Existing audit methods are too costly to apply at regular time intervals to the wide range of available LLM APIs. This means that model updates are left largely unmonitored in practice. In this work, we show that while LLM log probabilities (logprobs) are usually non-deterministic, they can still be used as the basis for cost-effective continuous monitoring of LLM APIs. We apply a simple statistical test based on the average value of each token logprob, requesting only a single token of output. This is enough to detect changes as small as one step of fine-tuning, making this approach more sensitive than existing methods while being 1,000x cheaper. We introduce the TinyChange benchmark as a way to measure the sensitivity of audit methods in the context of small, realistic model changes.

Paperid: 1449, https://arxiv.org/pdf/2512.03564.pdf

Abstract:
Diffusion models are renowned for their state-of-the-art performance in generating synthetic images. However, concerns related to safety, privacy, and copyright highlight the need for machine unlearning, which can make diffusion models forget specific training data and prevent the generation of sensitive or unwanted content. Current machine unlearning methods for diffusion models are primarily designed for conditional diffusion models and focus on unlearning specific data classes or features. Among these methods, finetuning-based machine unlearning methods are recognized for their efficiency and effectiveness, which update the parameters of pre-trained diffusion models by minimizing carefully designed loss functions. However, in this paper, we propose a novel attack named Diffusion Model Relearning Attack (DiMRA), which can reverse the finetuning-based machine unlearning methods, posing a significant vulnerability of this kind of technique. Without prior knowledge of the unlearning elements, DiMRA optimizes the unlearned diffusion model on an auxiliary dataset to reverse the unlearning, enabling the model to regenerate previously unlearned elements. To mitigate this vulnerability, we propose a novel machine unlearning method for diffusion models, termed as Diffusion Model Unlearning by Memorization (DiMUM). Unlike traditional methods that focus on forgetting, DiMUM memorizes alternative data or features to replace targeted unlearning data or features in order to prevent generating such elements. In our experiments, we demonstrate the effectiveness of DiMRA in reversing state-of-the-art finetuning-based machine unlearning methods for diffusion models, highlighting the need for more robust solutions. We extensively evaluate DiMUM, demonstrating its superior ability to preserve the generative performance of diffusion models while enhancing robustness against DiMRA.

Paperid: 1450, https://arxiv.org/pdf/2512.02625.pdf

Abstract:
Large language models (LLMs) excel at many general-purpose natural language processing tasks. However, their ability to perform deep reasoning and mathematical analysis, particularly for complex tasks as required in cryptography, remains poorly understood, largely due to the lack of suitable data for evaluation and training. To address this gap, we present CryptoQA, the first large-scale question-answering (QA) dataset specifically designed for cryptography. CryptoQA contains over two million QA pairs drawn from curated academic sources, along with contextual metadata that can be used to test the cryptographic capabilities of LLMs and to train new LLMs on cryptographic tasks. We benchmark 15 state-of-the-art LLMs on CryptoQA, evaluating their factual accuracy, mathematical reasoning, consistency, referencing, backward reasoning, and robustness to adversarial samples. In addition to quantitative metrics, we provide expert reviews that qualitatively assess model outputs and establish a gold-standard baseline. Our results reveal significant performance deficits of LLMs, particularly on tasks that require formal reasoning and precise mathematical knowledge. This shows the urgent need for LLM assistants tailored to cryptography research and development. We demonstrate that, by using CryptoQA, LLMs can be fine-tuned to exhibit better performance on cryptographic tasks.

Paperid: 1451, https://arxiv.org/pdf/2512.02276.pdf

Abstract:
Traffic classification (TC) plays a critical role in cybersecurity, particularly in IoT and embedded contexts, where inspection must often occur locally under tight hardware constraints. We use hardware-aware neural architecture search (HW-NAS) to derive lightweight TC models that are accurate, efficient, and deployable on edge platforms. Two input formats are considered: a flattened byte sequence and a 2D packet-wise time series; we examine how input structure affects adversarial vulnerability when using resource-constrained models. Robustness is assessed against white-box attacks, specifically Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD). On USTC-TFC2016, both HW-NAS models achieve over 99% clean-data accuracy while remaining within 65k parameters and 2M FLOPs. Yet under perturbations of strength 0.1, their robustness diverges: the flat model retains over 85% accuracy, while the time-series variant drops below 35%. Adversarial fine-tuning delivers robust gains, with flat-input accuracy exceeding 96% and the time-series variant recovering over 60 percentage points in robustness, all without compromising efficiency. The results underscore how input structure influences adversarial vulnerability, and show that even compact, resource-efficient models can attain strong robustness, supporting their practical deployment in secure edge-based TC.

Paperid: 1452, https://arxiv.org/pdf/2511.22681.pdf

Abstract:
Adversarial weight perturbation has emerged as a concerning threat to LLMs that either use training privileges or system-level access to inject adversarial corruption in model weights. With the emergence of innovative defensive solutions that place system- and algorithm-level checks and corrections in the input and weight spaces, these perturbations are increasingly susceptible to defenses. This work develops a novel perspective on Trojan attacks that generates an attacker-designed model output while leaving no attack traces on the inputs or weights. Such an attack space can be unlocked through corruption of the key-value (KV) cache. In this paper, we introduce CacheTrap, a novel Trojan attack that corrupts the value vectors stored in the KV cache. These vectors capture the dynamic activations for specific token positions and therefore constitute a natural surface for transient, inference-time trigger insertion. The transient nature of these KV values and their dependence on victim input imply additional constraints on our attack, such as a lack of knowledge of the victim's data or domain application, and, consequently, a lack of gradient information. The objective of the proposed CacheTrap is to develop a vulnerable KV bit-searching algorithm so that, once the attack employs the identified bit-flip as a trigger, the model generates targeted behavior, e.g., classifying inputs towards the target class. Moreover, CacheTrap is a data- and gradient-free attack which also has no impact on the model's utility. Our evaluation demonstrates that the proposed attack enables the first successful Trojan attack on LLMs with a single bit flip in the KV cache. In addition, the data-independent nature of the attack ensures that once the attacker identifies the vulnerable bit index, the location remains constant and can be transferred to a wide range of victim tasks/datasets/queries with no overhead.

Paperid: 1453, https://arxiv.org/pdf/2511.20709.pdf

Abstract:
Large language models (LLMs) and autonomous coding agents are increasingly used to generate software across a wide range of domains. Yet a core requirement remains unmet: ensuring that generated code is secure without compromising its functional correctness. Existing benchmarks and evaluations for secure code generation fall short-many measure only vulnerability reduction, disregard correctness preservation, or evaluate security and functionality on separate datasets, violating the fundamental need for simultaneous joint evaluation. We present DUALGAUGE, the first fully automated benchmarking framework designed to rigorously evaluate the security and correctness of LLM-generated code in unison. Given the lack of datasets enabling joint evaluation of secure code generation, we also present DUALGAUGE-BENCH, a curated benchmark suite of diverse coding tasks, each paired with manually validated test suites for both security and functionality, designed for full coverage of specification requirements. At the core of DUALGAUGE is an agentic program executor, which runs a program against given tests in sandboxed environments, and an LLM-based evaluator, which assesses both correctness and vulnerability behavior against expected outcomes. We rigorously evaluated and ensured the quality of DUALGAUGE-BENCH and the accuracy of DUALGAUGE, and applied DUALGAUGE to benchmarking ten leading LLMs on DUALGAUGE-BENCH across thousands of test scenarios. Our results reveal critical gaps in correct and secure code generation by these LLMs, for which our open-source system and datasets help accelerate progress via reproducible, scalable, and rigorous evaluation.

Paperid: 1454, https://arxiv.org/pdf/2511.19218.pdf

Abstract:
Large Language Models (LLMs) have developed rapidly in web services, delivering unprecedented capabilities while amplifying societal risks. Existing works tend to focus on either isolated jailbreak attacks or static defenses, neglecting the dynamic interplay between evolving threats and safeguards in real-world web contexts. To mitigate these challenges, we propose ACE-Safety (Adversarial Co-Evolution for LLM Safety), a novel framework that jointly optimize attack and defense models by seamlessly integrating two key innovative procedures: (1) Group-aware Strategy-guided Monte Carlo Tree Search (GS-MCTS), which efficiently explores jailbreak strategies to uncover vulnerabilities and generate diverse adversarial samples; (2) Adversarial Curriculum Tree-aware Group Policy Optimization (AC-TGPO), which jointly trains attack and defense LLMs with challenging samples via curriculum reinforcement learning, enabling robust mutual improvement. Evaluations across multiple benchmarks demonstrate that our method outperforms existing attack and defense approaches, and provides a feasible pathway for developing LLMs that can sustainably support responsible AI ecosystems.

Paperid: 1455, https://arxiv.org/pdf/2511.18235.pdf

Abstract:
The rapid growth of the Internet of Things (IoT) has given rise to highly diverse and interconnected ecosystems that are increasingly susceptible to sophisticated cyber threats. Conventional anomaly detection schemes often prioritize accuracy while overlooking computational efficiency and environmental impact, which limits their deployment in resource-constrained edge environments. This paper presents \textit{EcoDefender}, a sustainable hybrid anomaly detection framework that integrates \textit{Autoencoder(AE)}-based representation learning with \textit{Isolation Forest(IF)} anomaly scoring. Beyond empirical performance, EcoDefender is supported by a theoretical foundation that establishes formal guarantees for its stability, convergence, robustness, and energy-complexity coupling-thereby linking computational behavior to energy efficiency. Furthermore, experiments on realistic IoT traffic confirm these theoretical insights, achieving up to 94\% detection accuracy with an average CPU usage of only 22\%, 27 ms inference latency, and 30\% lower energy consumption compared to AE-only baselines. By embedding sustainability metrics directly into the security evaluation process, this work demonstrates that reliable anomaly detection and environmental responsibility can coexist within next-generation green IoT infrastructures, aligning with the United Nations Sustainable Development Goals (SDG 9: resilient infrastructure, SDG 13: climate action).

Paperid: 1456, https://arxiv.org/pdf/2511.17761.pdf

Abstract:
Intrusion Detection Systems (IDS) are critical to defending enterprise and industrial control environments, yet evaluating their effectiveness under realistic conditions remains an open challenge. Existing benchmarks rely on synthetic datasets (e.g., NSL-KDD, CICIDS2017) or scripted replay frameworks, which fail to capture adaptive adversary behavior. Even MITRE ATT&CK Evaluations, while influential, are host-centric and assume malware-driven compromise, thereby under-representing stealthy, multi-stage intrusions across IT and OT domains. We present StealthCup, a novel evaluation methodology that operationalizes IDS benchmarking as an evasion-focused Capture-the-Flag competition. Professional penetration testers engaged in multi-stage attack chains on a realistic IT/OT testbed, with scoring penalizing IDS detections. The event generated structured attacker writeups, validated detections, and PCAPs, host logs, and alerts. Our results reveal that out of 32 exercised attack techniques, 11 were not detected by any IDS configuration. Open-source systems (Wazuh, Suricata) produced high false-positive rates >90%, while commercial tools generated fewer false positives but also missed more attacks. Comparison with the Volt Typhoon APT advisory confirmed strong realism: all 28 applicable techniques were exercised, 19 appeared in writeups, and 9 in forensic traces. These findings demonstrate that StealthCup elicits attacker behavior closely aligned with state-sponsored TTPs, while exposing blind spots across both open-source and commercial IDS. The resulting datasets and methodology provide a reproducible foundation for future stealth-focused IDS evaluation.

Paperid: 1457, https://arxiv.org/pdf/2511.16192.pdf

Abstract:
As Law Enforcement Agencies advance in cryptocurrency forensics, criminal actors aiming to conceal illicit fund movements increasingly turn to "mixin" services or privacy-based cryptocurrencies. Monero stands out as a leading choice due to its strong privacy preserving and untraceability properties, making conventional blockchain analysis ineffective. Understanding the behavior and operational patterns of criminal actors within Monero is therefore challenging and it is essential to support future investigative strategies and disrupt illicit activities. In this work, we propose a case study in which we leverage a novel graph-based methodology to extract structural and temporal patterns from Monero transactions linked to already discovered criminal activities. By building Address-Ring-Transaction graphs from flagged transactions, we extract structural and temporal features and use them to train Machine Learning models capable of detecting similar behavioral patterns that could highlight criminal modus operandi. This represents a first partial step toward developing analytical tools that support investigative efforts in privacy-preserving blockchain ecosystems

Paperid: 1458, https://arxiv.org/pdf/2511.15463.pdf

Abstract:
Existing rug pull detectors assume a simple workflow: the deployer keeps liquidity pool (LP) tokens and performs one or a few large sells (within a day) that collapse the pool and cash out. In practice, however, many real-world exits violate these assumptions by splitting the attack across both time and actor dimensions: attackers break total extraction into many low-impact trades and route proceeds through multiple non-owner addresses, producing low-visibility drains. We formalize this family of attacks as the fragmented rug pull (FRP) and offer a compact recipe for a slow-stewed beef special: (i) keep the lid on (to preserve LP control so on-chain extraction remains feasible), (ii) chop thin slices (to split the total exit volume into many low-impact micro-trades that individually fall below impact thresholds), and (iii) pass the ladle (to delegate sells across multiple wallets so that each participant takes a small share of the extraction). Technically, we define three atomic predicate groups and show that their orthogonal combinations yield evasive strategies overlooked by prior heuristics (USENIX Sec 19, USENIX Sec 23). We validate the model with large-scale measurements. Our corpus contains 303,614 LPs, among which 105,434 are labeled as FRP pools. The labeled subset includes 34,192,767 pool-related transactions and 401,838 inflated-seller wallets, involving 1,501,408 unique interacting addresses. Notably, owner-wallet participation in inflated selling among FRP-flagged LPs has declined substantially (33.1% of cases), indicating a shift in scam behavior: the liquidity drain is no longer held on the owner wallet. We also detected 127,252 wallets acting as serial scammers when repeatedly engaging in inflated selling across multiple FRP LPs. Our empirical findings demonstrate that the evasive strategies we define are widespread and operationally significant.

Paperid: 1459, https://arxiv.org/pdf/2511.11914.pdf

Abstract:
As AI models are trained on ever-expanding datasets, the ability to remove the influence of specific data from trained models has become essential for privacy protection and regulatory compliance. Unlearning addresses this challenge by selectively removing parametric knowledge from the trained models without retraining from scratch, which is critical for resource-intensive models such as Large Language Models (LLMs). Existing unlearning methods often degrade model performance by removing more information than necessary when attempting to ''forget'' specific data. We introduce Forgetting-MarI, an LLM unlearning framework that provably removes only the additional (marginal) information contributed by the data to be unlearned, while preserving the information supported by the data to be retained. By penalizing marginal information, our method yields an explicit upper bound on the unlearn dataset's residual influence in the trained models, providing provable undetectability. Extensive experiments confirm that our approach outperforms current state-of-the-art unlearning methods, delivering reliable forgetting and better preserved general model performance across diverse benchmarks. This advancement represents an important step toward making AI systems more controllable and compliant with privacy and copyright regulations without compromising their effectiveness.

Paperid: 1460, https://arxiv.org/pdf/2511.11842.pdf

Abstract:
Transparency and security are both central to Responsible AI, but they may conflict in adversarial settings. We investigate the strategic effect of transparency for agents through the lens of transferable adversarial example attacks. In transferable adversarial example attacks, attackers maliciously perturb their inputs using surrogate models to fool a defender's target model. These models can be defended or undefended, with both players having to decide which to use. Using a large-scale empirical evaluation of nine attacks across 181 models, we find that attackers are more successful when they match the defender's decision; hence, obscurity could be beneficial to the defender. With game theory, we analyze this trade-off between transparency and security by modeling this problem as both a Nash game and a Stackelberg game, and comparing the expected outcomes. Our analysis confirms that only knowing whether a defender's model is defended or not can sometimes be enough to damage its security. This result serves as an indicator of the general trade-off between transparency and security, suggesting that transparency in AI systems can be at odds with security. Beyond adversarial machine learning, our work illustrates how game-theoretic reasoning can uncover conflicts between transparency and security.

Paperid: 1461, https://arxiv.org/pdf/2511.11292.pdf

Abstract:
Jasmin is a programming and verification framework for developing efficient, formally verified, cryptographic implementations. A main component of the framework is the Jasmin compiler, which empowers programmers to write efficient implementations of state-of-the-art cryptographic primitives, including post-quantum cryptographic standards. The Jasmin compiler is proven functionally correct in the Rocq prover. However, this functional correctness statement does not apply to nonterminating or probabilistic computations, which are essential features in cryptography. In this paper, we significantly enhance the guarantees of the compiler by showing, in the Rocq prover, that its front-end (25 out of 30 passes) preserves cryptographic security. To this end, we first define a Relational Hoare Logic tailored for compiler correctness proofs. We prove the soundness of our logic w.r.t. a new denotational semantics of Jasmin programs based on interaction trees. Secondly, we use our program logic to prove the functional correctness of the (unmodified) Jasmin compiler w.r.t. said semantics. Lastly, we formalize cryptographic security -- focusing on IND-CCA -- with interaction trees and prove that the Jasmin compiler preserves cryptographic security.

Paperid: 1462, https://arxiv.org/pdf/2511.10712.pdf

Abstract:
Model merging has emerged as an efficient technique for expanding large language models (LLMs) by integrating specialized expert models. However, it also introduces a new threat: model merging stealing, where free-riders exploit models through unauthorized model merging. Unfortunately, existing defense mechanisms fail to provide effective protection. Specifically, we identify three critical protection properties that existing methods fail to simultaneously satisfy: (1) proactively preventing unauthorized merging; (2) ensuring compatibility with general open-source settings; (3) achieving high security with negligible performance loss. To address the above issues, we propose MergeBarrier, a plug-and-play defense that proactively prevents unauthorized merging. The core design of MergeBarrier is to disrupt the Linear Mode Connectivity (LMC) between the protected model and its homologous counterparts, thereby eliminating the low-loss path required for effective model merging. Extensive experiments show that MergeBarrier effectively prevents model merging stealing with negligible accuracy loss.

Paperid: 1463, https://arxiv.org/pdf/2511.08902.pdf

Abstract:
Ultra-low latency, the hallmark of fifth-generation mobile communications (5G), imposes exacting timing demands on identification as well. Current cryptographic solutions introduce additional computational overhead, which results in heightened identification delays. Radio frequency fingerprint (RFF) identifies devices at the physical layer, blocking impersonation attacks while significantly reducing latency. Unfortunately, multipath channels compromise RFF accuracy, and existing channel-resilient methods demand feedback or processing across multiple time points, incurring extra signaling latency. To address this problem, the paper introduces a new RFF extraction technique that employs signals from multiple receiving antennas to address multipath issues without adding latency. Unlike single-domain methods, the Log-Linear Delta Ratio (LLDR) of co-temporal channel frequency responses (CFRs) from multiple antennas is employed to preserve discriminative RFF features, eliminating multi-time sampling and reducing acquisition time. To overcome the challenge of the reliance on minimal channel variation, the frequency band is segmented into sub-bands, and the LLDR is computed within each sub-band individually. Simulation results indicate that the proposed scheme attains a 96.13% identification accuracy for 30 user equipments (UEs) within a 20-path channel under a signal-to-noise ratio (SNR) of 20 dB. Furthermore, we evaluate the theoretical latency using the Roofline model, resulting in the air interface latency of 0.491 ms, which satisfies ultra-reliable and low-latency communications (URLLC) latency requirements.

Paperid: 1464, https://arxiv.org/pdf/2511.07841.pdf

Abstract:
As automation bot technology and Artificial Intelligence is evolving rapidly, conventional human verification techniques like voice CAPTCHAs and knowledge-based authentication are becoming less effective. Bots and scrapers with Artificial Intelligence (AI) capabilities can now detect and solve visual challenges, emulate human like typing patterns, and avoid most security tests, leading to high-volume threats like credential stuffing, account abuse, ad fraud, and automated scalping. This leaves a vital gap in identifying real human users versus advanced bots. We present a novel technique for distinguishing real human users based on hardware interaction signals to address this issue. In contrast to conventional approaches, our method leverages human interactions and a cryptographically attested User Presence (UP) flag from trusted hardware to verify genuine physical user engagement providing a secure and reliable way to distinguish authentic users from automated bots or scripted routines. The suggested approach was thoroughly assessed in terms of performance, usability, and security. The system demonstrated consistent throughput and zero request failures under prolonged concurrent user demand, indicating good operational reliability, efficient load handling, and the underlying architecture's robustness. These thorough analyses support the conclusion that the suggested system provides a safer, more effective, and easier-to-use substitute for current human verification methods.

Paperid: 1465, https://arxiv.org/pdf/2511.07577.pdf

Abstract:
Existing retrieval-augmented generation (RAG) systems typically use a centralized architecture, causing a high cost of data collection, integration, and management, as well as privacy concerns. There is a great need for a decentralized RAG system that enables foundation models to utilize information directly from data owners who maintain full control over their sources. However, decentralization brings a challenge: the numerous independent data sources vary significantly in reliability, which can diminish retrieval accuracy and response quality. To address this, our decentralized RAG system has a novel reliability scoring mechanism that dynamically evaluates each source based on the quality of responses it contributes to generate and prioritizes high-quality sources during retrieval. To ensure transparency and trust, the scoring process is securely managed through blockchain-based smart contracts, creating verifiable and tamper-proof reliability records without relying on a central authority. We evaluate our decentralized system with two Llama models (3B and 8B) in two simulated environments where six data sources have different levels of reliability. Our system achieves a +10.7\% performance improvement over its centralized counterpart in the real world-like unreliable data environments. Notably, it approaches the upper-bound performance of centralized systems under ideally reliable data environments. The decentralized infrastructure enables secure and trustworthy scoring management, achieving approximately 56\% marginal cost savings through batched update operations. Our code and system are open-sourced at github.com/yining610/Reliable-dRAG.

Paperid: 1466, https://arxiv.org/pdf/2510.24807.pdf

Abstract:
Privacy concerns have become increasingly critical in modern AI and data science applications, where sensitive information is collected, analyzed, and shared across diverse domains such as healthcare, finance, and mobility. While prior research has focused on protecting privacy in a single data release, many real-world systems operate under sequential or continuous data publishing, where the same or related data are released over time. Such sequential disclosures introduce new vulnerabilities, as temporal correlations across releases may enable adversaries to infer sensitive information that remains hidden in any individual release. In this paper, we investigate whether an attacker can compromise privacy in sequential data releases by exploiting dependencies between consecutive publications, even when each individual release satisfies standard privacy guarantees. To this end, we propose a novel attack model that captures these sequential dependencies by integrating a Hidden Markov Model with a reinforcement learning-based bi-directional inference mechanism. This enables the attacker to leverage both earlier and later observations in the sequence to infer private information. We instantiate our framework in the context of trajectory data, demonstrating how an adversary can recover sensitive locations from sequential mobility datasets. Extensive experiments on Geolife, Porto Taxi, and SynMob datasets show that our model consistently outperforms baseline approaches that treat each release independently. The results reveal a fundamental privacy risk inherent to sequential data publishing, where individually protected releases can collectively leak sensitive information when analyzed temporally. These findings underscore the need for new privacy-preserving frameworks that explicitly model temporal dependencies, such as time-aware differential privacy or sequential data obfuscation strategies.

Paperid: 1467, https://arxiv.org/pdf/2510.22726.pdf

Abstract:
SpoofTrackBench is a reproducible, modular benchmark for evaluating adversarial robustness in real-time localization and tracking (RTLS) systems under radar spoofing. Leveraging the Hampton University Skyler Radar Sensor dataset, we simulate drift, ghost, and mirror-type spoofing attacks and evaluate tracker performance using both Joint Probabilistic Data Association (JPDA) and Global Nearest Neighbor (GNN) architectures. Our framework separates clean and spoofed detection streams, visualizes spoof-induced trajectory divergence, and quantifies assignment errors via direct drift-from-truth metrics. Clustering overlays, injection-aware timelines, and scenario-adaptive visualizations enable interpretability across spoof types and configurations. Evaluation figures and logs are auto-exported for reproducible comparison. SpoofTrackBench sets a new standard for open, ethical benchmarking of spoof-aware tracking pipelines, enabling rigorous cross-architecture analysis and community validation.

Paperid: 1468, https://arxiv.org/pdf/2510.21783.pdf

Abstract:
Diffusion models have demonstrated powerful performance in generating high-quality images. A typical example is text-to-image generator like Stable Diffusion. However, their widespread use also poses potential privacy risks. A key concern is membership inference attacks, which attempt to determine whether a particular data sample was used in the model training process. We propose an efficient membership inference attack method against diffusion models. This method is based on the injection of slight noise and the evaluation of the aggregation degree of the noise distribution. The intuition is that the noise prediction patterns of diffusion models for training set samples and non-training set samples exhibit distinguishable differences.Specifically, we suppose that member images exhibit higher aggregation of predicted noise around a certain time step of the diffusion process. In contrast, the predicted noises of non-member images exhibit a more discrete characteristic around the certain time step. Compared with other existing methods, our proposed method requires fewer visits to the target diffusion model. We inject slight noise into the image under test and then determine its membership by analyzing the aggregation degree of the noise distribution predicted by the model. Empirical findings indicate that our method achieves superior performance across multiple datasets. At the same time, our method can also show better attack effects in ASR and AUC when facing large-scale text-to-image diffusion models, proving the scalability of our method.

Paperid: 1469, https://arxiv.org/pdf/2510.21353.pdf

Abstract:
Authentication systems have evolved a lot since the 1960s when Fernando Corbato first proposed the password-based authentication. In 2013, the FIDO Alliance proposed using secure hardware for authentication, thus marking a milestone in the passwordless authentication era [1]. Passwordless authentication with a possession-based factor often relied on hardware-backed cryptographic methods. FIDO2 being one an amalgamation of the W3C Web Authentication and FIDO Alliance Client to Authenticator Protocol is an industry standard for secure passwordless authentication with rising adoption for the same [2]. However, the current FIDO2 standards use ECDSA with SHA-256 (ES256), RSA with SHA-256 (RS256) and similar classical cryptographic signature algorithms. This makes it insecure against attacks involving large-scale quantum computers [3]. This study aims at exploring the usability of Module Lattice based Digital Signature Algorithm (ML-DSA), based on Crystals Dilithium as a post quantum cryptographic signature standard for FIDO2. The paper highlights the performance and security in comparison to keys with classical algorithms.

Paperid: 1470, https://arxiv.org/pdf/2510.21190.pdf

Abstract:
Large Language Models (LLMs) have advanced rapidly and now encode extensive world knowledge. Despite safety fine-tuning, however, they remain susceptible to adversarial prompts that elicit harmful content. Existing jailbreak techniques fall into two categories: white-box methods (e.g., gradient-based approaches such as GCG), which require model internals and are infeasible for closed-source APIs, and black-box methods that rely on attacker LLMs to search or mutate prompts but often produce templates that lack explainability and transferability. We introduce TrojFill, a black-box jailbreak that reframes unsafe instruction as a template-filling task. TrojFill embeds obfuscated harmful instructions (e.g., via placeholder substitution or Caesar/Base64 encoding) inside a multi-part template that asks the model to (1) reason why the original instruction is unsafe (unsafety reasoning) and (2) generate a detailed example of the requested text, followed by a sentence-by-sentence analysis. The crucial "example" component acts as a Trojan Horse that contains the target jailbreak content while the surrounding task framing reduces refusal rates. We evaluate TrojFill on standard jailbreak benchmarks across leading LLMs (e.g., ChatGPT, Gemini, DeepSeek, Qwen), showing strong empirical performance (e.g., 100% attack success on Gemini-flash-2.5 and DeepSeek-3.1, and 97% on GPT-4o). Moreover, the generated prompts exhibit improved interpretability and transferability compared with prior black-box optimization approaches. We release our code, sample prompts, and generated outputs to support future red-teaming research.

Paperid: 1471, https://arxiv.org/pdf/2510.20852.pdf

Abstract:
The Internet of Things (IoT) has recently proliferated in both size and complexity. Using multi-source and heterogeneous IoT data aids in providing efficient data analytics for a variety of prevalent and crucial applications. To address the privacy and security concerns raised by analyzing IoT data locally or in the cloud, distributed data analytics techniques were proposed to collect and analyze data in edge or fog devices. In this context, federated learning has been recommended as an ideal distributed machine/deep learning-based technique for edge/fog computing environments. Additionally, the data analytics results are time-sensitive; they should be generated with minimal latency and high reliability. As a result, reusing efficient architectures validated through a high number of challenging test cases would be advantageous. The work proposed here presents a solution using a microservices-based architecture that allows an IoT application to be structured as a collection of fine-grained, loosely coupled, and reusable entities. The proposed solution uses the promising capabilities of federated learning to provide intelligent microservices that ensure efficient, flexible, and extensible data analytics. This solution aims to deliver cloud calculations to the edge to reduce latency and bandwidth congestion while protecting the privacy of exchanged data. The proposed approach was validated through an IoT-malware detection and classification use case. MaleVis, a publicly available dataset, was used in the experiments to analyze and validate the proposed approach. This dataset included more than 14,000 RGB-converted images, comprising 25 malware classes and one benign class. The results showed that our proposed approach outperformed existing state-of-the-art methods in terms of detection and classification performance, with a 99.24%.

Paperid: 1472, https://arxiv.org/pdf/2510.20419.pdf

Abstract:
Aggregating Message Authentication Codes (MACs) promises to save valuable bandwidth in resource-constrained environments. The idea is simple: Instead of appending an authentication tag to each message in a communication stream, the integrity protection of multiple messages is aggregated into a single tag. Recent studies postulate, e.g., based on simulations, that these benefits also spread to wireless, and thus lossy, scenarios despite each lost packet typically resulting in the loss of integrity protection information for multiple messages. In this paper, we investigate these claims in a real deployment. Therefore, we first design a MAC aggregation extension for the Datagram Transport Layer Security (DTLS) 1.3 protocol. Afterward, we extensively evaluate the performance of MAC aggregation on a complete communication protocol stack on embedded hardware. We find that MAC aggregation can indeed increase goodput by up to 50% and save up to 17% of energy expenditure for the transmission of short messages, even in lossy channels.

Paperid: 1473, https://arxiv.org/pdf/2510.19491.pdf

Abstract:
Sealed-bid auctions ensure fair competition and efficient allocation but are often deployed on centralized infrastructure, enabling opaque manipulation. Public blockchains eliminate central control, yet their inherent transparency conflicts with the confidentiality required for sealed bidding. Prior attempts struggle to reconcile privacy, verifiability, and scalability without relying on trusted intermediaries, multi-round protocols, or expensive cryptography. We present a sealed-bid auction protocol that executes sensitive bidding logic on a Trusted Execution Environment (TEE)-backed confidential compute blockchain while retaining settlement and enforcement on a public chain. Bidders commit funds to enclave-generated escrow addresses, ensuring confidentiality and binding commitments. After the deadline, any party can trigger resolution: the confidential blockchain determines the winner through verifiable off-chain computation and issues signed settlement transactions for execution on the public chain. Our design provides security, privacy, and scalability without trusted third parties or protocol modifications. We implement it on SUAVE with Ethereum settlement, evaluate its scalability and trust assumptions, and demonstrate deployment with minimal integration on existing infrastructure

Paperid: 1474, https://arxiv.org/pdf/2510.18674.pdf

Abstract:
As large language models (LLMs) become progressively more embedded in clinical decision-support, documentation, and patient-information systems, ensuring their privacy and trustworthiness has emerged as an imperative challenge for the healthcare sector. Fine-tuning LLMs on sensitive electronic health record (EHR) data improves domain alignment but also raises the risk of exposing patient information through model behaviors. In this work-in-progress, we present an exploratory empirical study on membership inference vulnerabilities in clinical LLMs, focusing on whether adversaries can infer if specific patient records were used during model training. Using a state-of-the-art clinical question-answering model, Llemr, we evaluate both canonical loss-based attacks and a domain-motivated paraphrasing-based perturbation strategy that more realistically reflects clinical adversarial conditions. Our preliminary findings reveal limited but measurable membership leakage, suggesting that current clinical LLMs provide partial resistance yet remain susceptible to subtle privacy risks that could undermine trust in clinical AI adoption. These results motivate continued development of context-aware, domain-specific privacy evaluations and defenses such as differential privacy fine-tuning and paraphrase-aware training, to strengthen the security and trustworthiness of healthcare AI systems.

Paperid: 1475, https://arxiv.org/pdf/2510.17759.pdf

Abstract:
Vision-Language Models (VLMs) extend large language models with visual reasoning, but their multimodal design also introduces new, underexplored vulnerabilities. Existing multimodal red-teaming methods largely rely on brittle templates, focus on single-attack settings, and expose only a narrow subset of vulnerabilities. To address these limitations, we introduce VERA-V, a variational inference framework that recasts multimodal jailbreak discovery as learning a joint posterior distribution over paired text-image prompts. This probabilistic view enables the generation of stealthy, coupled adversarial inputs that bypass model guardrails. We train a lightweight attacker to approximate the posterior, allowing efficient sampling of diverse jailbreaks and providing distributional insights into vulnerabilities. VERA-V further integrates three complementary strategies: (i) typography-based text prompts that embed harmful cues, (ii) diffusion-based image synthesis that introduces adversarial signals, and (iii) structured distractors to fragment VLM attention. Experiments on HarmBench and HADES benchmarks show that VERA-V consistently outperforms state-of-the-art baselines on both open-source and frontier VLMs, achieving up to 53.75% higher attack success rate (ASR) over the best baseline on GPT-4o.

Paperid: 1476, https://arxiv.org/pdf/2510.17294.pdf

Abstract:
In this paper, we present a novel method for solving a class of quadratically constrained quadratic optimization problems using only additions and multiplications. This approach enables solving constrained optimization problems on private data since the operations involved are compatible with the capabilities of homomorphic encryption schemes. To solve the constrained optimization problem, a sequence of polynomial penalty functions of increasing degree is introduced, which are sufficiently steep at the boundary of the feasible set. Adding the penalty function to the original cost function creates a sequence of unconstrained optimization problems whose minimizer always lies in the admissible set and converges to the minimizer of the constrained problem. A gradient descent method is used to generate a sequence of iterates associated with these problems. For the algorithm, it is shown that the iterate converges to a minimizer of the original problem, and the feasible set is positively invariant under the iteration. Finally, the method is demonstrated on an illustrative cryptographic problem, finding the smaller value of two numbers, and the encrypted implementability is discussed.

Paperid: 1477, https://arxiv.org/pdf/2510.16122.pdf

Abstract:
Membership Inference Attacks (MIAs) pose a critical privacy threat by enabling adversaries to determine whether a specific sample was included in a model's training dataset. Despite extensive research on MIAs, systematic comparisons between generative and discriminative classifiers remain limited. This work addresses this gap by first providing theoretical motivation for why generative classifiers exhibit heightened susceptibility to MIAs, then validating these insights through comprehensive empirical evaluation. Our study encompasses discriminative, generative, and pseudo-generative text classifiers across varying training data volumes, evaluated on nine benchmark datasets. Employing a diverse array of MIA strategies, we consistently demonstrate that fully generative classifiers which explicitly model the joint likelihood $P(X,Y)$ are most vulnerable to membership leakage. Furthermore, we observe that the canonical inference approach commonly used in generative classifiers significantly amplifies this privacy risk. These findings reveal a fundamental utility-privacy trade-off inherent in classifier design, underscoring the critical need for caution when deploying generative classifiers in privacy-sensitive applications. Our results motivate future research directions in developing privacy-preserving generative classifiers that can maintain utility while mitigating membership inference vulnerabilities.

Paperid: 1478, https://arxiv.org/pdf/2510.16078.pdf

Abstract:
We present a practical match-on-card design for face verification in which compact 64/128-bit templates are produced off-card by PCA-ITQ and compared on-card via constant-time Hamming distance. We specify ISO/IEC 7816-4 and 14443-4 command APDUs with fixed-length payloads and decision-only status words (no score leakage), together with a minimal per-identity EEPROM map. Using real binary codes from a CelebA working set (55 identities, 412 images), we (i) derive operating thresholds from ROC/DET, (ii) replay enroll->verify transactions at those thresholds, and (iii) bound end-to-end time by pure link latency plus a small constant on-card budget. Even at the slowest contact rate (9.6 kbps), total verification time is 43.9 ms (64 b) and 52.3 ms (128 b); at 38.4 kbps both are <14 ms. At FAR = 1%, both code lengths reach TPR = 0.836, while 128 b lowers EER relative to 64 b. An optional +6 B helper (targeted symbol-level parity over empirically unstable bits) is latency-negligible. Overall, short binary templates, fixed-payload decision-only APDUs, and constant-time matching satisfy ISO/IEC transport constraints with wide timing margin and align with ISO/IEC 24745 privacy goals. Limitations: single-dataset evaluation and design-level (pre-hardware) timing; we outline AgeDB/CFP-FP and on-card microbenchmarks as next steps.

Paperid: 1479, https://arxiv.org/pdf/2510.14670.pdf

Abstract:
TITAN (Threat Intelligence Through Automated Navigation) is a framework that connects natural-language cyber threat queries with executable reasoning over a structured knowledge graph. It integrates a path planner model, which predicts logical relation chains from text, and a graph executor that traverses the TITAN Ontology to retrieve factual answers and supporting evidence. Unlike traditional retrieval systems, TITAN operates on a typed, bidirectional graph derived from MITRE, allowing reasoning to move clearly and reversibly between threats, behaviors, and defenses. To support training and evaluation, we introduce the TITAN Dataset, a corpus of 88209 examples (Train: 74258; Test: 13951) pairing natural language questions with executable reasoning paths and step by step Chain of Thought explanations. Empirical evaluations show that TITAN enables models to generate syntactically valid and semantically coherent reasoning paths that can be deterministically executed on the underlying graph.

Paperid: 1480, https://arxiv.org/pdf/2510.14480.pdf

Abstract:
Maximal Extractable Value (MEV) refers to a class of attacks to decentralized applications where the adversary profits by manipulating the ordering, inclusion, or exclusion of transactions in a blockchain. Decentralized Finance (DeFi) protocols are a primary target of these attacks, as their logic depends critically on transaction sequencing. To date, MEV attacks have already extracted billions of dollars in value, underscoring their systemic impact on blockchain security. Verifying the absence of MEV attacks requires determining suitable upper bounds, i.e. proving that no adversarial strategy can extract more value (if any) than expected by protocol designers. This problem is notoriously difficult: the space of adversarial strategies is extremely vast, making empirical studies and pen-and-paper reasoning insufficiently rigorous. In this paper, we present the first mechanized formalization of MEV in the Lean theorem prover. We introduce a methodology to construct machine-checked proofs of MEV bounds, providing correctness guarantees beyond what is possible with existing techniques. To demonstrate the generality of our approach, we model and analyse the MEV of two paradigmatic DeFi protocols. Notably, we develop the first machine-checked proof of the optimality of sandwich attacks in Automated Market Makers, a fundamental DeFi primitive.

Paperid: 1481, https://arxiv.org/pdf/2510.14218.pdf

Abstract:
As a valuable digital product, deep neural networks (DNNs) face increasingly severe threats to the intellectual property, making it necessary to develop effective technical measures to protect them. Trigger-based watermarking methods achieve copyright protection by embedding triggers into the host DNNs. However, the attacker may remove the watermark by pruning or fine-tuning. We model this interaction as a game under conditions of information asymmetry, namely, the defender embeds a secret watermark with private knowledge, while the attacker can only access the watermarked model and seek removal. We define strategies, costs, and utilities for both players, derive the attacker's optimal pruning budget, and establish an exponential lower bound on the accuracy of watermark detection after attack. Experimental results demonstrate the feasibility of the watermarked model, and indicate that sparse watermarking can resist removal with negligible accuracy loss. This study highlights the effectiveness of game-theoretic analysis in guiding the design of robust watermarking schemes for model copyright protection.

Paperid: 1482, https://arxiv.org/pdf/2510.13136.pdf

Abstract:
Indoor robotic systems within Cyber-Physical Systems (CPS) are increasingly exposed to Denial of Service (DoS) attacks that compromise localization, control and telemetry integrity. We propose a privacy-aware malware detection framework for indoor robotic systems, which leverages hybrid quantum computing and deep neural networks to counter DoS threats in CPS, while preserving privacy information. By integrating quantum-enhanced feature encoding with dropout-optimized deep learning, our architecture achieves up to 95.2% detection accuracy under privacy-constrained conditions. The system operates without handcrafted thresholds or persistent beacon data, enabling scalable deployment in adversarial environments. Benchmarking reveals robust generalization, interpretability and resilience against training instability through modular circuit design. This work advances trustworthy AI for secure, autonomous CPS operations.

Paperid: 1483, https://arxiv.org/pdf/2510.09952.pdf

Abstract:
Contemporary web application architectures involve many layers of proxy services that process traffic. Due to the complexity of HTTP and vendor design decisions, these proxies sometimes process a given request in different ways. Attackers can exploit these processing discrepancies to launch damaging attacks including web cache poisoning and request smuggling. Discrepancy attacks are surging, yet, there exists no systemic defense. In this work, we propose the first comprehensive defense to address this problem, called HTTP Request Synchronization. Our scheme uses standard HTTP extension mechanisms to augment each request with a complete processing history. It propagates this context through the traffic path detailing how each server hop has processed said request. Using this history, every proxy server can validate that their processing is consistent with all previous hops, eliminating discrepancy attacks. We implement our scheme for 5 popular proxy technologies, Apache, NGINX, HAProxy, Varnish, and Cloudflare, demonstrating its practical impact.

Paperid: 1484, https://arxiv.org/pdf/2510.09940.pdf

Abstract:
Deep learning-based radio frequency fingerprinting (RFFP) has become an enabling physical-layer security technology, allowing device identification and authentication through received RF signals. This technology, however, faces significant challenges when it comes to adapting to domain variations, such as time, location, environment, receiver and channel. For Bluetooth Low Energy (BLE) devices, addressing these challenges is particularly crucial due to the BLE protocol's frequency-hopping nature. In this work, and for the first time, we investigated the frequency hopping effect on RFFP of BLE devices, and proposed a novel, low-cost, domain-adaptive feature extraction method. Our approach improves the classification accuracy by up to 58\% across environments and up to 80\% across receivers compared to existing benchmarks.

Paperid: 1485, https://arxiv.org/pdf/2510.09773.pdf

Abstract:
Secret-key generation and agreement based on wireless channel reciprocity offers a promising avenue for securing IoT networks. However, existing approaches predominantly rely on the similarity of instantaneous channel measurement samples between communicating devices. This narrow view of reciprocity is often impractical, as it is highly susceptible to noise, asynchronous sampling, channel fading, and other system-level imperfections -- all of which significantly impair key generation performance. Furthermore, the quantization step common in traditional schemes introduces irreversible errors, further limiting efficiency. In this work, we propose a novel approach for secret-key generation by using wavelet scattering networks to extract robust and reciprocal CSI features. Dimensionality reduction is applied to uncover hidden cluster structures, which are then used to build hidden Markov models for efficient key agreement. Our approach eliminates the need for quantization and effectively captures channel randomness. It achieves a 5x improvement in key generation rate compared to traditional benchmarks, providing a secure and efficient solution for key generation in resource-constrained IoT environments.

Paperid: 1486, https://arxiv.org/pdf/2510.09699.pdf

Abstract:
Vision-Language Models (VLMs) have garnered significant attention for their remarkable ability to interpret and generate multimodal content. However, securing these models against jailbreak attacks continues to be a substantial challenge. Unlike text-only models, VLMs integrate additional modalities, introducing novel vulnerabilities such as image hijacking, which can manipulate the model into producing inappropriate or harmful responses. Drawing inspiration from text-based jailbreaks like the "Do Anything Now" (DAN) command, this work introduces VisualDAN, a single adversarial image embedded with DAN-style commands. Specifically, we prepend harmful corpora with affirmative prefixes (e.g., "Sure, I can provide the guidance you need") to trick the model into responding positively to malicious queries. The adversarial image is then trained on these DAN-inspired harmful texts and transformed into the text domain to elicit malicious outputs. Extensive experiments on models such as MiniGPT-4, MiniGPT-v2, InstructBLIP, and LLaVA reveal that VisualDAN effectively bypasses the safeguards of aligned VLMs, forcing them to execute a broad range of harmful instructions that severely violate ethical standards. Our results further demonstrate that even a small amount of toxic content can significantly amplify harmful outputs once the model's defenses are compromised. These findings highlight the urgent need for robust defenses against image-based attacks and offer critical insights for future research into the alignment and security of VLMs.

Paperid: 1487, https://arxiv.org/pdf/2510.09616.pdf

Abstract:
Industrial Control Systems (ICS) face growing cyber-physical attacks that exploit both network vulnerabilities and physical processes. Current anomaly detection methods rely on correlation-based analysis, which cannot separate true causal relationships from spurious associations. This limitation results in high false alarm rates and poor root cause analysis. We propose a novel Causal Digital Twin (CDT) framework for cyber-physical security in medium-scale ICS. Our method combines causal inference theory with digital twin modeling. The framework enables three types of causal reasoning: association for pattern detection, intervention for understanding system responses, and counterfactual analysis for attack prevention planning. We evaluate our framework on three industrial datasets: SWaT, WADI, and HAI, with validation through physical constraint compliance (90.8\%) and synthetic ground truth testing (structural Hamming distance 0.13). Results show significant improvements over seven baseline methods. Our CDT achieves F1-scores are $0.944 \pm 0.014$ for SWaT, $0.902 \pm 0.021$ for WADI, and $0.923 \pm 0.018$ for HAI with statistical significance ($p < 0.0024$, Bonferroni corrected). The framework reduces false positives by \SI{74}{\percent} and achieves \SI{78.4}{\percent} root cause analysis accuracy compared to \SI{48.7}{\percent} for existing methods. Counterfactual analysis enables defense strategies that reduce attack success by \SI{73.2}{\percent}. The system keeps real-time performance with \SI{3.2}{ms} latency, which is suitable for industrial deployment, while providing interpretable explanations for operators.

Paperid: 1488, https://arxiv.org/pdf/2510.08084.pdf

Abstract:
The rapid expansion of Internet of Things (IoT) devices has transformed industries and daily life by enabling widespread connectivity and data exchange. However, this increased interconnection has introduced serious security vulnerabilities, making IoT systems more exposed to sophisticated cyber attacks. This study presents a novel ensemble learning architecture designed to improve IoT attack detection. The proposed approach applies advanced machine learning techniques, specifically the Extra Trees Classifier, along with thorough preprocessing and hyperparameter optimization. It is evaluated on several benchmark datasets including CICIoT2023, IoTID20, BotNeTIoT L01, ToN IoT, N BaIoT, and BoT IoT. The results show excellent performance, achieving high recall, accuracy, and precision with very low error rates. These outcomes demonstrate the model efficiency and superiority compared to existing approaches, providing an effective and scalable method for securing IoT environments. This research establishes a solid foundation for future progress in protecting connected devices from evolving cyber threats.

Paperid: 1489, https://arxiv.org/pdf/2510.06975.pdf

Abstract:
There are very few SotA deception systems based on Large Language Models. The existing ones are limited only to simulating one type of service, mainly SSH shells. These systems - but also the deception technologies not based on LLMs - lack an extensive evaluation that includes human attackers. Generative AI has recently become a valuable asset for cybersecurity researchers and practitioners, and the field of cyber-deception is no exception. Researchers have demonstrated how LLMs can be leveraged to create realistic-looking honeytokens, fake users, and even simulated systems that can be used as honeypots. This paper presents an AI-based deception framework called VelLMes, which can simulate multiple protocols and services such as SSH Linux shell, MySQL, POP3, and HTTP. All of these can be deployed and used as honeypots, thus VelLMes offers a variety of choices for deception design based on the users' needs. VelLMes is designed to be attacked by humans, so interactivity and realism are key for its performance. We evaluate the generative capabilities and the deception capabilities. Generative capabilities were evaluated using unit tests for LLMs. The results of the unit tests show that, with careful prompting, LLMs can produce realistic-looking responses, with some LLMs having a 100% passing rate. In the case of the SSH Linux shell, we evaluated deception capabilities with 89 human attackers. The results showed that about 30% of the attackers thought that they were interacting with a real system when they were assigned an LLM-based honeypot. Lastly, we deployed 10 instances of the SSH Linux shell honeypot on the Internet to capture real-life attacks. Analysis of these attacks showed us that LLM honeypots simulating Linux shells can perform well against unstructured and unexpected attacks on the Internet, responding correctly to most of the issued commands.

Paperid: 1490, https://arxiv.org/pdf/2510.06692.pdf

Abstract:
Deep Neural Networks (DNNs) have attracted significant attention, and their internal models are now considered valuable intellectual assets. Extracting these internal models through access to a DNN is conceptually similar to extracting a secret key via oracle access to a block cipher. Consequently, cryptanalytic techniques, particularly differential-like attacks, have been actively explored recently. ReLU-based DNNs are the most commonly and widely deployed architectures. While early works (e.g., Crypto 2020, Eurocrypt 2024) assume access to exact output logits, which are usually invisible, more recent works (e.g., Asiacrypt 2024, Eurocrypt 2025) focus on the hard-label setting, where only the final classification result (e.g., "dog" or "car") is available to the attacker. Notably, Carlini et al. (Eurocrypt 2025) demonstrated that model extraction is feasible in polynomial time even under this restricted setting. In this paper, we first show that the assumptions underlying their attack become increasingly unrealistic as the attack-target depth grows. In practice, satisfying these assumptions requires an exponential number of queries with respect to the attack depth, implying that the attack does not always run in polynomial time. To address this critical limitation, we propose a novel attack method called CrossLayer Extraction. Instead of directly extracting the secret parameters (e.g., weights and biases) of a specific neuron, which incurs exponential cost, we exploit neuron interactions across layers to extract this information from deeper layers. This technique significantly reduces query complexity and mitigates the limitations of existing model extraction approaches.

Paperid: 1491, https://arxiv.org/pdf/2510.06023.pdf

Abstract:
In the context of Byzantine consensus problems such as Byzantine broadcast (BB) and Byzantine agreement (BA), the good-case setting aims to study the minimal possible latency of a BB or BA protocol under certain favorable conditions, namely the designated leader being correct (for BB), or all parties having the same input value (for BA). We provide a full characterization of the feasibility and impossibility of good-case latency, for both BA and BB, in the synchronous sleepy model. Surprisingly to us, we find irrational resilience thresholds emerging: 2-round good-case BB is possible if and only if at all times, at least $\frac{1}φ \approx 0.618$ fraction of the active parties are correct, where $φ= \frac{1+\sqrt{5}}{2} \approx 1.618$ is the golden ratio; 1-round good-case BA is possible if and only if at least $\frac{1}{\sqrt{2}} \approx 0.707$ fraction of the active parties are correct.

Paperid: 1492, https://arxiv.org/pdf/2510.05203.pdf

Abstract:
We consider a pair of causally independent processes, modelled as the tensor product of two channels, acting on a possibly correlated input to produce random outputs X and Y. We show that, assuming the processes produce a sufficient amount of randomness, one can extract uniform randomness from X and Y. This generalizes prior results, which assumed that X and Y are (conditionally) independent. Note that in contrast to the independence of quantum states, the independence of channels can be enforced through spacelike separation. As a consequence, our results allow for the generation of randomness under more practical and physically justifiable assumptions than previously possible. We illustrate this with the example of device-independent randomness amplification, where we can remove the constraint that the adversary only has access to classical side information about the source.

Paperid: 1493, https://arxiv.org/pdf/2510.05163.pdf

Abstract:
In the era of pervasive cyber threats and exponential growth in digital services, the inadequacy of single-factor authentication has become increasingly evident. Multi-Factor Authentication (MFA), which combines knowledge-based factors (passwords, PINs), possession-based factors (smart cards, tokens), and inherence-based factors (biometric traits), has emerged as a robust defense mechanism. Recent breakthroughs in deep learning have transformed the capabilities of biometric systems, enabling higher accuracy, resilience to spoofing, and seamless integration with hardware-based solutions. At the same time, smart card technologies have evolved to include on-chip biometric verification, cryptographic processing, and secure storage, thereby enabling compact and secure multi-factor devices. This survey presents a comprehensive synthesis of recent work (2019-2025) at the intersection of deep learning, biometrics, and smart card technologies for MFA. We analyze biometric modalities (face, fingerprint, iris, voice), review hardware-based approaches (smart cards, NFC, TPMs, secure enclaves), and highlight integration strategies for real-world applications such as digital banking, healthcare IoT, and critical infrastructure. Furthermore, we discuss the major challenges that remain open, including usability-security tradeoffs, adversarial attacks on deep learning models, privacy concerns surrounding biometric data, and the need for standardization in MFA deployment. By consolidating current advancements, limitations, and research opportunities, this survey provides a roadmap for designing secure, scalable, and user-friendly authentication frameworks.

Paperid: 1494, https://arxiv.org/pdf/2510.04056.pdf

Abstract:
Artificial Intelligence (AI) and more specifically Large Language Models (LLMs) have demonstrated exceptional progress in multiple areas including software engineering, however, their capability for vulnerability detection in the wild scenario and its corresponding reasoning remains underexplored. Prompting pre-trained LLMs in an effective way offers a computationally effective and scalable solution. Our contributions are (i)varied prompt designs for vulnerability detection and its corresponding reasoning in the wild. (ii)a real-world vector data store constructed from the National Vulnerability Database, that will provide real time context to vulnerability detection framework, and (iii)a scoring measure for combined measurement of accuracy and reasoning quality. Our contribution aims to examine whether LLMs are ready for wild deployment, thus enabling the reliable use of LLMs stronger for the development of secure software's.

Paperid: 1495, https://arxiv.org/pdf/2510.03831.pdf

Abstract:
Massive multiple-input multiple-output (MMIMO) is essential to modern wireless communication systems, like 5G and 6G, but it is vulnerable to active eavesdropping attacks. One type of such attack is the pilot contamination attack (PCA), where a malicious user copies pilot signals from an authentic user during uplink, intentionally interfering with the base station's (BS) channel estimation accuracy. In this work, we propose to use a Decision Tree (DT) algorithm for PCA detection at the BS in a multi-user system. We present a methodology to generate training data for the DT classifier and select the best DT according to their depth. Then, we simulate different scenarios that could be encountered in practice and compare the DT to a classical technique based on likelihood ratio testing (LRT) submitted to the same scenarios. The results revealed that a DT with only one level of depth is sufficient to outperform the LRT. The DT shows a good performance regarding the probability of detection in noisy scenarios and when the malicious user transmits with low power, in which case the LRT fails to detect the PCA. We also show that the reason for the good performance of the DT is its ability to compute a threshold that separates PCA data from non-PCA data better than the LRT's threshold. Moreover, the DT does not necessitate prior knowledge of noise power or assumptions regarding the signal power of malicious users, prerequisites typically essential for LRT and other hypothesis testing methodologies.

Paperid: 1496, https://arxiv.org/pdf/2510.03636.pdf

Abstract:
This study explored how in-context learning (ICL) in large language models can be disrupted by data poisoning attacks in the setting of public health sentiment analysis. Using tweets of Human Metapneumovirus (HMPV), small adversarial perturbations such as synonym replacement, negation insertion, and randomized perturbation were introduced into the support examples. Even these minor manipulations caused major disruptions, with sentiment labels flipping in up to 67% of cases. To address this, a Spectral Signature Defense was applied, which filtered out poisoned examples while keeping the data's meaning and sentiment intact. After defense, ICL accuracy remained steady at around 46.7%, and logistic regression validation reached 100% accuracy, showing that the defense successfully preserved the dataset's integrity. Overall, the findings extend prior theoretical studies of ICL poisoning to a practical, high-stakes setting in public health discourse analysis, highlighting both the risks and potential defenses for robust LLM deployment. This study also highlights the fragility of ICL under attack and the value of spectral defenses in making AI systems more reliable for health-related social media monitoring.

Paperid: 1497, https://arxiv.org/pdf/2510.03567.pdf

Abstract:
With the increasing adoption of Large Language Models (LLMs), more customization is needed to ensure privacy-preserving and safe generation. We address this objective from two critical aspects: unlearning of sensitive information and robustness to jail-breaking attacks. We investigate various constrained optimization formulations that address both aspects in a \emph{unified manner}, by finding the smallest possible interventions on LLM weights that either make a given vocabulary set unreachable or embed the LLM with robustness to tailored attacks by shifting part of the weights to a \emph{safer} region. Beyond unifying two key properties, this approach contrasts with previous work in that it doesn't require an oracle classifier that is typically not available or represents a computational overhead. Surprisingly, we find that the simplest point-wise constraint-based intervention we propose leads to better performance than max-min interventions, while having a lower computational cost. Comparison against state-of-the-art defense methods demonstrates superior performance of the proposed approach.

Paperid: 1498, https://arxiv.org/pdf/2510.02384.pdf

Abstract:
The rapid advancement of generative artificial intelligence (Gen-AI) has facilitated the effortless creation of high-quality images, while simultaneously raising critical concerns regarding intellectual property protection, authenticity, and accountability. Watermarking has emerged as a promising solution to these challenges by distinguishing AI-generated images from natural content, ensuring provenance, and fostering trustworthy digital ecosystems. This paper presents a comprehensive survey of the current state of AI-generated image watermarking, addressing five key dimensions: (1) formalization of image watermarking systems; (2) an overview and comparison of diverse watermarking techniques; (3) evaluation methodologies with respect to visual quality, capacity, and detectability; (4) vulnerabilities to malicious attacks; and (5) prevailing challenges and future directions. The survey aims to equip researchers with a holistic understanding of AI-generated image watermarking technologies, thereby promoting their continued development.

Paperid: 1499, https://arxiv.org/pdf/2509.26350.pdf

Abstract:
Integrating SDN and the IoT enhances network control and flexibility. DL-based AAD systems improve security by enabling real-time threat detection in SDN-IoT networks. However, these systems remain vulnerable to adversarial attacks that manipulate input data or exploit model weaknesses, significantly degrading detection accuracy. Existing research lacks a systematic analysis of adversarial vulnerabilities specific to DL-based AAD systems in SDN-IoT environments. This SoK study introduces a structured adversarial threat model and a comprehensive taxonomy of attacks, categorising them into data, model, and hybrid-level threats. Unlike previous studies, we systematically evaluate white, black, and grey-box attack strategies across popular benchmark datasets. Our findings reveal that adversarial attacks can reduce detection accuracy by up to 48.4%, with Membership Inference causing the most significant drop. C&W and DeepFool achieve high evasion success rates. However, adversarial training enhances robustness, and its high computational overhead limits the real-time deployment of SDN-IoT applications. We propose adaptive countermeasures, including real-time adversarial mitigation, enhanced retraining mechanisms, and explainable AI-driven security frameworks. By integrating structured threat models, this study offers a more comprehensive approach to attack categorisation, impact assessment, and defence evaluation than previous research. Our work highlights critical vulnerabilities in existing DL-based AAD models and provides practical recommendations for improving resilience, interpretability, and computational efficiency. This study serves as a foundational reference for researchers and practitioners seeking to enhance DL-based AAD security in SDN-IoT networks, offering a systematic adversarial threat model and conceptual defence evaluation based on prior empirical studies.

Paperid: 1500, https://arxiv.org/pdf/2509.25469.pdf

Abstract:
Blockchain technology has spawned a vast ecosystem of digital currencies with Central Bank Digital Currencies (CBDCs) -- digital forms of fiat currency -- being one of them. An important feature of digital currencies is facilitating transactions without network connectivity, which can enhance the scalability of cryptocurrencies and the privacy of CBDC users. However, in the case of CBDCs, this characteristic also introduces new regulatory challenges, particularly when it comes to applying established Anti-Money Laundering and Countering the Financing of Terrorism (AML/CFT) frameworks. This paper introduces a prototype for offline digital currency payments, equally applicable to cryptocurrencies and CBDCs, that leverages Secure Elements and digital credentials to address the tension of offline payment support with regulatory compliance. Performance evaluation results suggest that the prototype can be flexibly adapted to different regulatory environments, with a transaction latency comparable to real-life commercial payment systems. Furthermore, we conceptualize how the integration of Zero-Knowledge Proofs into our design could accommodate various tiers of enhanced privacy protection.

Paperid: 1501, https://arxiv.org/pdf/2509.25462.pdf

Abstract:
Mobile networks in the 5G and 6G era require to rethink how to manage security due to the introduction of new services, use cases, each with its own security requirements, while simultaneously expanding the threat landscape. Although automation has emerged as a key enabler to address complexity in networks, existing approaches lack the expressiveness to define and enforce complex, goal-driven, and measurable security requirements. In this paper, we propose the concept of differentiated security levels and leveraging intents as a management framework. We discuss the requirements and enablers to extend the currently defined intent-based management frameworks to pave the path for intent-based security management in mobile networks. Our approach formalizes both functional and non-functional security requirements and demonstrates how these can be expressed and modeled using an extended TM Forum (TMF) intent security ontology. We further discuss the required standardization steps to achieve intent-based security management. Our work aims at advance security automation, improve adaptability, and strengthen the resilience and security posture of the next-generation mobile networks.

Paperid: 1502, https://arxiv.org/pdf/2509.23427.pdf

Abstract:
Spam poses a growing threat to blockchain networks. Adversaries can easily create multiple accounts to flood transaction pools, inflating fees and degrading service quality. Existing defenses against spam, such as fee markets and staking requirements, primarily rely on economic deterrence, which fails to distinguish between malicious and legitimate users and often exclude low-value but honest activity. To address these shortcomings, we present StarveSpam, a decentralized reputation-based protocol that mitigates spam by operating at the transaction relay layer. StarveSpam combines local behavior tracking, peer scoring, and adaptive rate-limiting to suppress abusive actors, without requiring global consensus, protocol changes, or trusted infrastructure. We evaluate StarveSpam using real Ethereum data from a major NFT spam event and show that it outperforms existing fee-based and rule-based defenses, allowing each node to block over 95% of spam while dropping just 3% of honest traffic, and reducing the fraction of the network exposed to spam by 85% compared to existing rule-based methods. StarveSpam offers a scalable and deployable alternative to traditional spam defenses, paving the way toward more resilient and equitable blockchain infrastructure.

Paperid: 1503, https://arxiv.org/pdf/2509.23091.pdf

Abstract:
Federated learning (FL) with fully homomorphic encryption (FHE) effectively safeguards data privacy during model aggregation by encrypting local model updates before transmission, mitigating threats from untrusted servers or eavesdroppers in transmission. However, the computational burden and ciphertext expansion associated with homomorphic encryption can significantly increase resource and communication overhead. To address these challenges, we propose FedBit, a hardware/software co-designed framework optimized for the Brakerski-Fan-Vercauteren (BFV) scheme. FedBit employs bit-interleaved data packing to embed multiple model parameters into a single ciphertext coefficient, thereby minimizing ciphertext expansion and maximizing computational parallelism. Additionally, we integrate a dedicated FPGA accelerator to handle cryptographic operations and an optimized dataflow to reduce the memory overhead. Experimental results demonstrate that FedBit achieves a speedup of two orders of magnitude in encryption and lowers average communication overhead by 60.7%, while maintaining high accuracy.

Paperid: 1504, https://arxiv.org/pdf/2509.22154.pdf

Abstract:
Radio frequency fingerprint (RFF) is a promising device identification technology, with recent research shifting from robustness to security due to growing concerns over vulnerabilities. To date, while the security of RFF against basic spoofing such as MAC address tampering has been validated, its resilience to advanced mimicry remains unknown. To address this gap, we propose a collusion-driven impersonation attack that achieves RF-level mimicry, successfully breaking RFF identification systems across diverse environments. Specifically, the attacker synchronizes with a colluding receiver to match the centralized logarithmic power spectrum (CLPS) of the legitimate transmitter; once the colluder deems the CLPS identical, the victim receiver will also accept the forged fingerprint, completing RF-level spoofing. Given that the distribution of CLPS features is relatively concentrated and has a clear underlying structure, we design a spoofed signal generation network that integrates a variational autoencoder (VAE) with a multi-objective loss function to enhance the similarity and deceptive capability of the generated samples. We carry out extensive simulations, validating cross-channel attacks in environments that incorporate standard channel variations including additive white Gaussian noise (AWGN), multipath fading, and Doppler shift. The results indicate that the proposed attack scheme essentially maintains a success rate of over 95% under different channel conditions, revealing the effectiveness of this attack.

Paperid: 1505, https://arxiv.org/pdf/2509.20697.pdf

Abstract:
Random classical linear codes are widely believed to be hard to decode. While slightly sub-exponential time algorithms exist when the coding rate vanishes sufficiently rapidly, all known algorithms at constant rate require exponential time. By contrast, the complexity of decoding a random quantum stabilizer code has remained an open question for quite some time. This work closes the gap in our understanding of the algorithmic hardness of decoding random quantum versus random classical codes. We prove that decoding a random stabilizer code with even a single logical qubit is at least as hard as decoding a random classical code at constant rate--the maximally hard regime. This result suggests that the easiest random quantum decoding problem is at least as hard as the hardest random classical decoding problem, and shows that any sub-exponential algorithm decoding a typical stabilizer code, at any rate, would immediately imply a breakthrough in cryptography. More generally, we also characterize many other complexity-theoretic properties of stabilizer codes. While classical decoding admits a random self-reduction, we prove significant barriers for the existence of random self-reductions in the quantum case. This result follows from new bounds on Clifford entropies and Pauli mixing times, which may be of independent interest. As a complementary result, we demonstrate various other self-reductions which are in fact achievable, such as between search and decision. We also demonstrate several ways in which quantum phenomena, such as quantum degeneracy, force several reasonable definitions of stabilizer decoding--all of which are classically identical--to have distinct or non-trivially equivalent complexity.

Paperid: 1506, https://arxiv.org/pdf/2509.19539.pdf

Abstract:
Peer-to-peer (P2P) networks are a cornerstone of modern computing, and their security is an active area of research. Many defenses with strong security guarantees have been proposed; however, the most-recent survey is over a decade old. This paper delivers an updated review of recent theoretical advances that address classic threats, such as the Sybil and routing attacks, while highlighting how emerging trends -- such as machine learning, social networks, and dynamic systems -- pose new challenges and drive novel solutions. We evaluate the strengths and weaknesses of these solutions and suggest directions for future research.

Paperid: 1507, https://arxiv.org/pdf/2509.17778.pdf

Abstract:
We investigate the problem of covert quickest change detection in a continuous-time setting, where a Brownian motion experiences a drift change at an unknown time. Unlike classical formulations, we consider a covert adversary who adjusts the post-change drift $Î¼= Î¼(Î³)$ as a function of the false alarm constraint parameter $Î³$, with the goal of remaining undetected for as long as possible. Leveraging the exact expressions for the average detection delay (ADD) and average time to false alarm (AT2FA) known for the continuous-time CuSum procedure, we rigorously analyze how the asymptotic behavior of ADD evolves as $Î¼(Î³) \to 0$ with increasing $Î³$. Our results reveal that classical detection delay characterizations no longer hold in this regime. We derive sharp asymptotic expressions for the ADD under various convergence rates of $Î¼(Î³)$, identify precise conditions for maintaining covertness, and characterize the total damage inflicted by the adversary. We show that the adversary achieves maximal damage when the drift scales as $Î¼(Î³) = Î(1/\sqrtÎ³)$, marking a fundamental trade-off between stealth and impact in continuous-time detection systems.

Paperid: 1508, https://arxiv.org/pdf/2509.16620.pdf

Abstract:
The machine learning problem of model extraction was first introduced in 1991 and gained prominence as a cryptanalytic challenge starting with Crypto 2020. For over three decades, research in this field has primarily focused on ReLU-based neural networks. In this work, we take the first step towards the cryptanalytic extraction of PReLU neural networks, which employ more complex nonlinear activation functions than their ReLU counterparts. We propose a raw output-based parameter recovery attack for PReLU networks and extend it to more restrictive scenarios where only the top-m probability scores are accessible. Our attacks are rigorously evaluated through end-to-end experiments on diverse PReLU neural networks, including models trained on the MNIST dataset. To the best of our knowledge, this is the first practical demonstration of PReLU neural network extraction across three distinct attack scenarios.

Paperid: 1509, https://arxiv.org/pdf/2509.15847.pdf

Abstract:
To maximize performance, many modern blockchain systems rely on eventually-synchronous, Byzantine fault-tolerant (BFT) consensus protocols. Two protocol designs have emerged in this space: protocols that minimize latency using a leader that drives both data dissemination and consensus, and protocols that maximize throughput using a separate, asynchronous data dissemination layer. Recent protocols such as Partially-Synchronous Bullshark and Sailfish combine elements of both approaches by using a DAG to enable parallel data dissemination and a leader that paces DAG formation. This improves latency while achieving state-of-the-art throughput. Yet the latency of leader-based protocols is still better under moderate loads. We present Angelfish, a hybrid protocol that adapts smoothly across this design space, from leader-based to Sailfish-like DAG-based consensus. Angelfish lets a dynamically-adjusted subset of parties use best-effort broadcast to issue lightweight votes instead of reliably broadcasting costlier DAG vertices. This reduces communication, helps lagging nodes catch up, and lowers latency in practice compared to prior DAG-based protocols. Our empirical evaluation shows that Angelfish attains state-of-the-art peak throughput while matching the latency of leader-based protocols under moderate throughput, delivering the best of both worlds.

Paperid: 1510, https://arxiv.org/pdf/2509.15653.pdf

Abstract:
Quantum Computing (QC) introduces a transformative threat to digital security, with the potential to compromise widely deployed classical cryptographic systems. This survey offers a comprehensive and systematic examination of quantumsafe security for Cloud Computing (CC), focusing on the vulnerabilities, transition strategies, and mitigation mechanisms required to secure cloud infrastructures in the quantum era. We evaluated the landscape of quantum threats across the entire CC stack, demonstrating how quantum algorithms can undermine classical encryption and compromise cloud security at multiple architectural layers. Using a structured risk assessment methodology based on the STRIDE model, we evaluate quantum-induced attack vectors and their impact on cloud environments. To address these challenges, we propose a layered security framework that integrates hybrid cryptographic transition strategies, cryptographic agility, and proactive risk mitigation. We analyze the preparation and implementation approaches of the major Cloud Service Providers (CSPs), including AWS, Azure and GCP, synthesizing platform-specific initiatives toward Post-Quantum Cryptography (PQC). Furthermore, we provide a detailed evaluation of standardized PQC algorithms, exploring their resilience to side-channel and active attacks within cloud-native deployments. This survey serves as a strategic reference for cloud architects, policymakers, and researchers, offering actionable insights for navigating the complex transition to quantum-resilient cloud systems. We conclude by identifying six key future research directions: standardization and interoperability, performance and scalability, implementation security, integration with emerging technologies, systemic preparedness, and crypto-agile migration frameworks.

Paperid: 1511, https://arxiv.org/pdf/2509.14622.pdf

Abstract:
With the deployment of Large Language Models (LLMs) in interactive applications, online malicious intent detection has become increasingly critical. However, existing approaches fall short of handling diverse and complex user queries in real time. To address these challenges, we introduce ADRAG (Adversarial Distilled Retrieval-Augmented Guard), a two-stage framework for robust and efficient online malicious intent detection. In the training stage, a high-capacity teacher model is trained on adversarially perturbed, retrieval-augmented inputs to learn robust decision boundaries over diverse and complex user queries. In the inference stage, a distillation scheduler transfers the teacher's knowledge into a compact student model, with a continually updated knowledge base collected online. At deployment, the compact student model leverages top-K similar safety exemplars retrieved from the online-updated knowledge base to enable both online and real-time malicious query detection. Evaluations across ten safety benchmarks demonstrate that ADRAG, with a 149M-parameter model, achieves 98.5% of WildGuard-7B's performance, surpasses GPT-4 by 3.3% and Llama-Guard-3-8B by 9.5% on out-of-distribution detection, while simultaneously delivering up to 5.6x lower latency at 300 queries per second (QPS) in real-time applications.

Paperid: 1512, https://arxiv.org/pdf/2509.14282.pdf

Abstract:
The emergence of quantum computing poses significant risks to the security of modern communication networks as it breaks today's public-key cryptographic algorithms. Quantum Key Distribution (QKD) offers a promising solution by harnessing the principles of quantum mechanics to establish secure keys. However, practical QKD implementations remain vulnerable to hardware imperfections and advanced attacks such as Photon Number Splitting and Trojan-Horse attacks. In this work, we investigate the potential of using quantum machine learning (QML) to detect popular QKD attacks. In particular, we propose a Hybrid Quantum Long Short-Term Memory (QLSTM) model to improve the detection of common QKD attacks. By combining quantum-enhanced learning with classical deep learning, the model captures complex temporal patterns in QKD data, improving detection accuracy. To evaluate the proposed model, we introduce a realistic QKD dataset simulating normal QKD operations along with seven attack scenarios, Intercept-and-Resend, Photon-Number Splitting (PNS), Trojan-Horse attacks Random Number Generator (RNG), Detector Blinding, Wavelength-dependent Trojan Horse, and Combined attacks. The dataset includes quantum security metrics such as Quantum Bit Error Rate (QBER), measurement entropy, signal and decoy loss rates, and time-based metrics, ensuring an accurate representation of real-world conditions. Our results demonstrate promising performance of the quantum machine learning approach compared to traditional classical machine learning models, highlighting the potential of hybrid techniques to enhance the security of future quantum communication networks. The proposed Hybrid QLSTM model achieved an accuracy of 93.7.0\% after 50 training epochs, outperforming classical deep learning models such as LSTM, and CNN.

Paperid: 1513, https://arxiv.org/pdf/2509.13797.pdf

Abstract:
Since security was not among the original design goals of the Domain Name System (herein called Vanilla DNS), many secure DNS schemes have been proposed to enhance the security and privacy of the DNS resolution process. Some proposed schemes aim to replace the existing DNS infrastructure entirely, but none have succeeded in doing so. In parallel, numerous schemes focus on improving DNS security without modifying its fundamental two-stage structure. These efforts highlight the feasibility of addressing DNS security as two distinct but compatible stages. We survey DNS resolution process attacks and threats and develop a comprehensive threat model and attack taxonomy for their systematic categorization. This analysis results in the formulation of 14 desirable security, privacy, and availability properties to mitigate the identified threats. Using these properties, we develop an objective evaluation framework and apply it to comparatively analyze 12 secure DNS schemes surveyed in this work that aim to augment the properties of the DNS resolution process. Our evaluation reveals that no single scheme provides ideal protection across the entire resolution path. Instead, the schemes tend to address a subset of properties specific to individual stages. Since these schemes targeting different stages of DNS resolution are complementary and can operate together, combining compatible schemes offers a practical and effective approach to achieving comprehensive security in the DNS resolution process.

Paperid: 1514, https://arxiv.org/pdf/2509.13405.pdf

Abstract:
The security of quantum key distribution (QKD) is quantified by a parameter $\varepsilon>0$, which -- under well-defined physical assumptions -- can be bounded explicitly. This contrasts with computationally secure schemes, where security claims are only asymptotic (i.e., under standard complexity assumptions, one only knows that $\varepsilon \to 0$ as the key size grows, but has no explicit bound). Here we explain the definition and interpretation of $\varepsilon$-security. Adopting an axiomatic approach, we show that $\varepsilon$ can be understood as the maximum probability of a security failure. Finally, we review and address several criticisms of this definition that have appeared in the literature.

Paperid: 1515, https://arxiv.org/pdf/2509.13117.pdf

Abstract:
Motivated by software maintenance and the more recent concept of security debt, the paper presents a time series analysis of vulnerability patching of Red Hat's products and components between 1999 and 2024. According to the results based on segmented regression analysis, the amounts of vulnerable products and components have not been stable; a linear trend describes many of the series well. Nor do the amounts align well with trends characterizing vulnerabilities in general. There are also visible breakpoints indicating that the linear trend is not universally applicable and that the growing security debt may be stabilizing.

Paperid: 1516, https://arxiv.org/pdf/2509.11870.pdf

Abstract:
Federated Learning (FL) allows collaborative model training across distributed clients without sharing raw data, thus preserving privacy. However, the system remains vulnerable to privacy leakage from gradient updates and Byzantine attacks from malicious clients. Existing solutions face a critical trade-off among privacy preservation, Byzantine robustness, and computational efficiency. We propose a novel scheme that effectively balances these competing objectives by integrating homomorphic encryption with dimension compression based on the Johnson-Lindenstrauss transformation. Our approach employs a dual-server architecture that enables secure Byzantine defense in the ciphertext domain while dramatically reducing computational overhead through gradient compression. The dimension compression technique preserves the geometric relationships necessary for Byzantine defence while reducing computation complexity from $O(dn)$ to $O(kn)$ cryptographic operations, where $k \ll d$. Extensive experiments across diverse datasets demonstrate that our approach maintains model accuracy comparable to non-private FL while effectively defending against Byzantine clients comprising up to $40\%$ of the network.

Paperid: 1517, https://arxiv.org/pdf/2509.11398.pdf

Abstract:
A red team simulates adversary attacks to help defenders find effective strategies to defend their systems in a real-world operational setting. As more enterprise systems adopt AI, red-teaming will need to evolve to address the unique vulnerabilities and risks posed by AI systems. We take the position that AI systems can be more effectively red-teamed if AI red-teaming is recognized as a domain-specific evolution of cyber red-teaming. Specifically, we argue that existing Cyber Red Teams who adopt this framing will be able to better evaluate systems with AI components by recognizing that AI poses new risks, has new failure modes to exploit, and often contains unpatchable bugs that re-prioritize disclosure and mitigation strategies. Similarly, adopting a cybersecurity framing will allow existing AI Red Teams to leverage a well-tested structure to emulate realistic adversaries, promote mutual accountability with formal rules of engagement, and provide a pattern to mature the tooling necessary for repeatable, scalable engagements. In these ways, the merging of AI and Cyber Red Teams will create a robust security ecosystem and best position the community to adapt to the rapidly changing threat landscape.

Paperid: 1518, https://arxiv.org/pdf/2509.11249.pdf

Abstract:
Deep learning-based face recognition (FR) technology exacerbates privacy concerns in photo sharing. In response, the research community developed a suite of anti-FR methods to block identity extraction by unauthorized FR systems. Benefiting from quasi-imperceptible alteration, perturbation-based methods are well-suited for privacy protection of subject faces in photos, as they allow familiar persons to recognize subjects via naked eyes. However, we reveal that perturbation-based methods provide a false sense of privacy through theoretical analysis and experimental validation. Therefore, new alternative solutions should be found to protect subject faces. In this paper, we explore synthesis-based methods as a promising solution, whose challenge is to enable familiar persons to recognize subjects. To solve the challenge, we present a key insight: In most photo sharing scenarios, familiar persons recognize subjects through identity perception rather than meticulous face analysis. Based on the insight, we propose the first synthesis-based method dedicated to subject faces, i.e., PerceptFace, which can make identity unextractable yet perceptible. To enhance identity perception, a new perceptual similarity loss is designed for faces, reducing the alteration in regions of high sensitivity to human vision. As a synthesis-based method, PerceptFace can inherently provide reliable identity protection. Meanwhile, out of the confine of meticulous face analysis, PerceptFace focuses on identity perception from a more practical scenario, which is also enhanced by the designed perceptual similarity loss. Sufficient experiments show that PerceptFace achieves a superior trade-off between identity protection and identity perception compared to existing methods. We provide a public API of PerceptFace and believe that it has great potential to become a practical anti-FR tool.

Paperid: 1519, https://arxiv.org/pdf/2509.10895.pdf

Abstract:
SSH is an important protocol for secure remote shell access to servers on the Internet. At USENIX 2024, BÃ¤umer et al. presented the Terrapin attack on SSH, which relies on the attacker injecting optional messages during the key exchange. To mitigate this attack, SSH vendors adopted an extension developed by OpenSSH called strict key exchange ("strict KEX"). With strict KEX, optional messages are forbidden during the handshake, preventing the attack. In practice, this should simplify the state machine of an SSH handshake to a linear message flow similar to that of TLS. In this work, we analyze the design, implementation, and security of strict KEX in popular SSH servers, using black-box state learning, which can uncover the hidden state machine of an implementation. In practice, it is limited by the number of learned messages and the complexity of the state machine. Thus, learning the complete state machine of SSH is infeasible. Previous research on SSH, therefore, excluded optional messages, learning only a partial state machine. However, these messages are a critical part of the Terrapin attack. We propose to instead learn the complete state machine of the handshake phase of an SSH server, but with strict KEX enabled. We investigate the security of ten SSH implementations supporting strict KEX for up to five key exchange algorithms. In total, we learn 33 state machines, revealing significant differences in the implementations. We show that seven implementations violate the strict KEX specification and find two critical security vulnerabilities. One results in a rogue session attack in the proprietary Tectia SSH implementation. Another affects the official SSH implementation of the Erlang Open Telecom Platform, and enables unauthenticated remote code execution in the security context of the SSH server.

Paperid: 1520, https://arxiv.org/pdf/2509.09331.pdf

Abstract:
Administrators and developers use SSH client keys and signatures for authentication, for example, to access internet backbone servers or to commit new code on platforms like GitHub. However, unlike servers, SSH clients cannot be measured through internet scans. We close this gap in two steps. First, we collect SSH client public keys. Such keys are regularly published by their owners on open development platforms like GitHub and GitLab. We systematize previous non-academic work by subjecting these keys to various security tests in a longitudinal study. Second, in a series of black-box lab experiments, we analyze the implementations of algorithms for SSH client signatures in 24 popular SSH clients for Linux, Windows, and macOS. We extracted 31,622,338 keys from three public sources in two scans. Compared to previous work, we see a clear tendency to abandon RSA signatures in favor of EdDSA signatures. Still, in January 2025, we found 98 broken short keys, 139 keys generated from weak randomness, and 149 keys with common or small factors-the large majority of the retrieved keys exposed no weakness. Weak randomness can not only compromise a secret key through its public key, but also through signatures. It is well-known that a bias in random nonces in ECDSA can reveal the secret key through public signatures. For the first time, we show that the use of deterministic nonces in ECDSA can also be dangerous: The private signing key of a PuTTY client can be recovered from just 58 valid signatures if ECDSA with NIST curve P-521 is used. PuTTY acknowledged our finding in CVE-2024-31497, and they subsequently replaced the nonce generation algorithm.

Paperid: 1521, https://arxiv.org/pdf/2509.08727.pdf

Abstract:
Authors of cryptographic software are well aware that their code should not leak secrets through its timing behavior, and, until 2018, they believed that following industry-standard constant-time coding guidelines was sufficient. However, the revelation of the Spectre family of speculative execution attacks injected new complexities. To block speculative attacks, prior work has proposed annotating the program's source code to mark secret data, with hardware using this information to decide when to speculate (i.e., when only public values are involved) or not (when secrets are in play). While these solutions are able to track secret information stored on the heap, they suffer from limitations that prevent them from correctly tracking secrets on the stack, at a cost in performance. This paper introduces SecSep, a transformation framework that rewrites assembly programs so that they partition secret and public data on the stack. By moving from the source-code level to assembly rewriting, SecSep is able to address limitations of prior work. The key challenge in performing this assembly rewriting stems from the loss of semantic information through the lengthy compilation process. The key innovation of our methodology is a new variant of typed assembly language (TAL), Octal, which allows us to address this challenge. Assembly rewriting is driven by compile-time inference within Octal. We apply our technique to cryptographic programs and demonstrate that it enables secure speculation efficiently, incurring a low average overhead of $1.2\%$.

Paperid: 1522, https://arxiv.org/pdf/2509.08089.pdf

Abstract:
Federated Learning is a distributed learning technique in which multiple clients cooperate to train a machine learning model. Distributed settings facilitate backdoor attacks by malicious clients, who can embed malicious behaviors into the model during their participation in the training process. These malicious behaviors are activated during inference by a specific trigger. No defense against backdoor attacks has stood the test of time, especially against adaptive attackers, a powerful but not fully explored category of attackers. In this work, we first devise a new adaptive adversary that surpasses existing adversaries in capabilities, yielding attacks that only require one or two malicious clients out of 20 to break existing state-of-the-art defenses. Then, we present Hammer and Anvil, a principled defense approach that combines two defenses orthogonal in their underlying principle to produce a combined defense that, given the right set of parameters, must succeed against any attack. We show that our best combined defense, Krum+, is successful against our new adaptive adversary and state-of-the-art attacks.

Paperid: 1523, https://arxiv.org/pdf/2509.06921.pdf

Abstract:
Traditional Artificial Intelligence (AI) approaches in cybersecurity exhibit fundamental limitations: inadequate conceptual grounding leading to non-robustness against novel attacks; limited instructibility impeding analyst-guided adaptation; and misalignment with cybersecurity objectives. Neuro-Symbolic (NeSy) AI has emerged with the potential to revolutionize cybersecurity AI. However, there is no systematic understanding of this emerging approach. These hybrid systems address critical cybersecurity challenges by combining neural pattern recognition with symbolic reasoning, enabling enhanced threat understanding while introducing concerning autonomous offensive capabilities that reshape threat landscapes. In this survey, we systematically characterize this field by analyzing 127 publications spanning 2019-July 2025. We introduce a Grounding-Instructibility-Alignment (G-I-A) framework to evaluate these systems, focusing on both cyber defense and cyber offense across network security, malware analysis, and cyber operations. Our analysis shows advantages of multi-agent NeSy architectures and identifies critical implementation challenges including standardization gaps, computational complexity, and human-AI collaboration requirements that constrain deployment. We show that causal reasoning integration is the most transformative advancement, enabling proactive defense beyond correlation-based approaches. Our findings highlight dual-use implications where autonomous systems demonstrate substantial capabilities in zero-day exploitation while achieving significant cost reductions, altering threat dynamics. We provide insights and future research directions, emphasizing the urgent need for community-driven standardization frameworks and responsible development practices that ensure advancement serves defensive cybersecurity objectives while maintaining societal alignment.

Paperid: 1524, https://arxiv.org/pdf/2509.04941.pdf

Abstract:
Elliptic curve cryptography is better than traditional cryptography based on RSA and discrete logarithm of finite field in terms of efficiency and security. In this paper, we show how to exploit elliptic curve with high rank, which has not been used in cryptography before, to construct cryptographic schemes. Concretely we demonstrate how to construct public key signature scheme with hierarchy revocation based on elliptic curve with high rank, where the rank determines the height of the revocation tree. Although our construction is not very efficient in some sense, our construction shows elliptic curve with high rank is valuable and important for cryptographic usage. The technique and assumption presented can surely be used for other cryptographic constructions.

Paperid: 1525, https://arxiv.org/pdf/2509.01271.pdf

Abstract:
Advanced Persistent Threats (APTs) are prolonged, stealthy intrusions by skilled adversaries that compromise high-value systems to steal data or disrupt operations. Reconstructing complete attack chains from massive, heterogeneous logs is essential for effective attack investigation, yet existing methods suffer from poor platform generality, limited generalization to evolving tactics, and an inability to produce analyst-ready reports. Large Language Models (LLMs) offer strong semantic understanding and summarization capabilities, but in this domain they struggle to capture the long-range, cross-log dependencies critical for accurate reconstruction. To solve these problems, we present an LLM-empowered attack investigation framework augmented with a dynamically adaptable Kill-Chain-aligned threat knowledge base. We organizes attack-relevant behaviors into stage-aware knowledge units enriched with semantic annotations, enabling the LLM to iteratively retrieve relevant intelligence, perform causal reasoning, and progressively expand the investigation context. This process reconstructs multi-phase attack scenarios and generates coherent, human-readable investigation reports. Evaluated on 15 attack scenarios spanning single-host and multi-host environments across Windows and Linux (over 4.3M log events, 7.2 GB of data), the system achieves an average True Positive Rate (TPR) of 97.1% and an average False Positive Rate (FPR) of 0.2%, significantly outperforming the SOTA method ATLAS, which achieves an average TPR of 79.2% and an average FPR of 29.1%.

Paperid: 1526, https://arxiv.org/pdf/2509.01253.pdf

Abstract:
In contemporary cloud-based services, protecting users' sensitive data and ensuring the confidentiality of the server's model are critical. Fully homomorphic encryption (FHE) enables inference directly on encrypted inputs, but its practicality is hindered by expensive bootstrapping and inefficient approximations of non-linear activations. We introduce Safhire, a hybrid inference framework that executes linear layers under encryption on the server while offloading non-linearities to the client in plaintext. This design eliminates bootstrapping, supports exact activations, and significantly reduces computation. To safeguard model confidentiality despite client access to intermediate outputs, Safhire applies randomized shuffling, which obfuscates intermediate values and makes it practically impossible to reconstruct the model. To further reduce latency, Safhire incorporates advanced optimizations such as fast ciphertext packing and partial extraction. Evaluations on multiple standard models and datasets show that Safhire achieves 1.5X - 10.5X lower inference latency than Orion, a state-of-the-art baseline, with manageable communication overhead and comparable accuracy, thereby establishing the practicality of hybrid FHE inference.

Paperid: 1527, https://arxiv.org/pdf/2509.00043.pdf

Abstract:
Electronic devices and cables inadvertently emit RF emissions as a byproduct of signal processing and/or transmission. Labeled as electromagnetic emanations, they form an EM side-channel for data leakage. Previously, it was believed that such leakage could be contained within a facility since they are weak signals with a short transmission range. However, in the preliminary version of this work [1], we found that the traditional cable repairing process forms a tiny monopole antenna that helps emanations transmit over a long range. Experimentation with three types of cables revealed that emanations from repaired cables remain detectable even at >4 m and can penetrate a 14 cm thick concrete wall. In this extended version, we show that such emanation can be exploited at a long distance for information extraction by detecting keystrokes typed on a repaired USB keyboard. By collecting data for 70 different keystrokes at different distances from the target in 3 diverse environments (open space, a corridor outside an office room, and outside a building) and developing an efficient detection algorithm, ~100% keystroke detection accuracy has been achieved up to 12 m distance, which is the highest reported accuracy at such a long range for USB keyboards in the literature. The effect of two experimental factors, interference and human-body coupling, has been investigated thoroughly. Along with exploring the vulnerability, multi-layer external metal shielding during the repairing process as a possible remedy has been explored. This work exposes a new attack surface caused by hardware modification, its exploitation, and potential countermeasures.

Paperid: 1528, https://arxiv.org/pdf/2508.21715.pdf

Abstract:
Convolutional Neural Networks (CNNs) have become the foundation of modern computer vision, achieving unprecedented accuracy across diverse image recognition tasks. While these networks excel on in-distribution data, they remain vulnerable to adversarial perturbations imperceptible input modifications that cause misclassification with high confidence. However, existing detection methods either require expensive retraining, modify network architecture, or degrade performance on clean inputs. Here we show that adversarial perturbations create immediate, detectable entropy signatures in CNN activations that can be monitored without any model modification. Using parallel entropy monitoring on VGG-16, we demonstrate that adversarial inputs consistently shift activation entropy by 7% in early convolutional layers, enabling 90% detection accuracy with false positives and false negative rates below 20%. The complete separation between clean and adversarial entropy distributions reveals that CNNs inherently encode distribution shifts in their activation patterns. This work establishes that CNN reliability can be assessed through activation entropy alone, enabling practical deployment of self-diagnostic vision systems that detect adversarial inputs in real-time without compromising original model performance.

Paperid: 1529, https://arxiv.org/pdf/2508.21440.pdf

Abstract:
Remote Procedure Call (RPC) services have become a primary gateway for users to access public blockchains. While they offer significant convenience, RPC services also introduce critical privacy challenges that remain insufficiently examined. Existing deanonymization attacks either do not apply to blockchain RPC users or incur costs like transaction fees assuming an active network eavesdropper. In this paper, we propose a novel deanonymization attack that can link an IP address of a RPC user to this user's blockchain pseudonym. Our analysis reveals a temporal correlation between the timestamps of transaction confirmations recorded on the public ledger and those of TCP packets sent by the victim when querying transaction status. We assume a strong passive adversary with access to network infrastructure, capable of monitoring traffic at network border routers or Internet exchange points. By monitoring network traffic and analyzing public ledgers, the attacker can link the IP address of the TCP packet to the pseudonym of the transaction initiator by exploiting the temporal correlation. This deanonymization attack incurs zero transaction fee. We mathematically model and analyze the attack method, perform large-scale measurements of blockchain ledgers, and conduct real-world attacks to validate the attack. Our attack achieves a high success rate of over 95% against normal RPC users on various blockchain networks, including Ethereum, Bitcoin and Solana.

Paperid: 1530, https://arxiv.org/pdf/2508.16625.pdf

Abstract:
The performance of AI-based software vulnerability detection systems is often limited by their poor generalization to unknown codebases. In this research, we explore the impact of data quality and model architecture on the generalizability of vulnerability detection systems. By generalization we mean ability of high vulnerability detection performance across different C/C++ software projects not seen during training. Through a series of experiments, we demonstrate that improvements in dataset diversity and quality substantially enhance detection performance. Additionally, we compare multiple encoder-only and decoder-only models, finding that encoder based models outperform in terms of accuracy and generalization. Our model achieves 6.8% improvement in recall on the benchmark BigVul[1] dataset, also outperforming on unseen projects, hence showing enhanced generalizability. These results highlight the role of data quality and model selection in the development of robust vulnerability detection systems. Our findings suggest a direction for future systems having high cross-project effectiveness.

Paperid: 1531, https://arxiv.org/pdf/2508.15541.pdf

Abstract:
Federated learning (FL) has been widely adopted as a decentralized training paradigm that enables multiple clients to collaboratively learn a shared model without exposing their local data. As concerns over data privacy and regulatory compliance grow, machine unlearning, which aims to remove the influence of specific data from trained models, has become increasingly important in the federated setting to meet legal, ethical, or user-driven demands. However, integrating unlearning into FL introduces new challenges and raises largely unexplored security risks. In particular, adversaries may exploit the unlearning process to compromise the integrity of the global model. In this paper, we present the first backdoor attack in the context of federated unlearning, demonstrating that an adversary can inject backdoors into the global model through seemingly legitimate unlearning requests. Specifically, we propose BadFU, an attack strategy where a malicious client uses both backdoor and camouflage samples to train the global model normally during the federated training process. Once the client requests unlearning of the camouflage samples, the global model transitions into a backdoored state. Extensive experiments under various FL frameworks and unlearning strategies validate the effectiveness of BadFU, revealing a critical vulnerability in current federated unlearning practices and underscoring the urgent need for more secure and robust federated unlearning mechanisms.

Paperid: 1532, https://arxiv.org/pdf/2508.13425.pdf

Abstract:
Secure aggregation is a common technique in federated learning (FL) for protecting data privacy from both curious internal entities (clients or server) and external adversaries (eavesdroppers). However, in dynamic and resource-constrained environments such as low Earth orbit (LEO) satellite networks, traditional secure aggregation methods fall short in two aspects: (1) they assume continuous client availability while LEO satellite visibility is intermittent and irregular; (2) they consider privacy in each communication round but have overlooked the possible privacy leakage through multiple rounds. To address these limitations, we propose LTP-FLEO, an asynchronous FL framework that preserves long-term privacy (LTP) for LEO satellite networks. LTP-FLEO introduces (i) privacy-aware satellite partitioning, which groups satellites based on their predictable visibility to the server and enforces joint participation; (ii) model age balancing, which mitigates the adverse impact of stale model updates; and (iii) fair global aggregation, which treats satellites of different visibility durations in an equitable manner. Theoretical analysis and empirical validation demonstrate that LTP-FLEO effectively safeguards both model and data privacy across multi-round training, promotes fairness in line with satellite contributions, accelerates global convergence, and achieves competitive model accuracy.

Paperid: 1533, https://arxiv.org/pdf/2508.06574.pdf

Abstract:
Detecting fraud in modern supply chains is a growing challenge, driven by the complexity of global networks and the scarcity of labeled data. Traditional detection methods often struggle with class imbalance and limited supervision, reducing their effectiveness in real-world applications. This paper proposes a novel two-phase learning framework to address these challenges. In the first phase, the Isolation Forest algorithm performs unsupervised anomaly detection to identify potential fraud cases and reduce the volume of data requiring further analysis. In the second phase, a self-training Support Vector Machine (SVM) refines the predictions using both labeled and high-confidence pseudo-labeled samples, enabling robust semi-supervised learning. The proposed method is evaluated on the DataCo Smart Supply Chain Dataset, a comprehensive real-world supply chain dataset with fraud indicators. It achieves an F1-score of 0.817 while maintaining a false positive rate below 3.0%. These results demonstrate the effectiveness and efficiency of combining unsupervised pre-filtering with semi-supervised refinement for supply chain fraud detection under real-world constraints, though we acknowledge limitations regarding concept drift and the need for comparison with deep learning approaches.

Paperid: 1534, https://arxiv.org/pdf/2508.06073.pdf

Abstract:
Provenance graph-based intrusion detection systems are deployed on hosts to defend against increasingly severe Advanced Persistent Threat. Using Graph Neural Networks to detect these threats has become a research focus and has demonstrated exceptional performance. However, the widespread adoption of GNN-based security models is limited by their inherent black-box nature, as they fail to provide security analysts with any verifiable explanations for model predictions or any evidence regarding the model's judgment in relation to real-world attacks. To address this challenge, we propose ProvX, an effective explanation framework for exlaining GNN-based security models on provenance graphs. ProvX introduces counterfactual explanation logic, seeking the minimal structural subset within a graph predicted as malicious that, when perturbed, can subvert the model's original prediction. We innovatively transform the discrete search problem of finding this critical subgraph into a continuous optimization task guided by a dual objective of prediction flipping and distance minimization. Furthermore, a Staged Solidification strategy is incorporated to enhance the precision and stability of the explanations. We conducted extensive evaluations of ProvX on authoritative datasets. The experimental results demonstrate that ProvX can locate critical graph structures that are highly relevant to real-world attacks and achieves an average explanation necessity of 51.59\%, with these metrics outperforming current SOTA explainers. Furthermore, we explore and provide a preliminary validation of a closed-loop Detection-Explanation-Feedback enhancement framework, demonstrating through experiments that the explanation results from ProvX can guide model optimization, effectively enhancing its robustness against adversarial attacks.

Paperid: 1535, https://arxiv.org/pdf/2508.05696.pdf

Abstract:
Insider threat detection presents a significant challenge due to the deceptive nature of malicious behaviors, which often resemble legitimate user operations. However, existing approaches typically model system logs as flat event sequences, thereby failing to capture the inherent frequency dynamics and multiscale disturbance patterns embedded in user behavior. To address these limitations, we propose Log2Sig, a robust anomaly detection framework that transforms user logs into multivariate behavioral frequency signals, introducing a novel representation of user behavior. Log2Sig employs Multivariate Variational Mode Decomposition (MVMD) to extract Intrinsic Mode Functions (IMFs), which reveal behavioral fluctuations across multiple temporal scales. Based on this, the model further performs joint modeling of behavioral sequences and frequency-decomposed signals: the daily behavior sequences are encoded using a Mamba-based temporal encoder to capture long-term dependencies, while the corresponding frequency components are linearly projected to match the encoder's output dimension. These dual-view representations are then fused to construct a comprehensive user behavior profile, which is fed into a multilayer perceptron for precise anomaly detection. Experimental results on the CERT r4.2 and r5.2 datasets demonstrate that Log2Sig significantly outperforms state-of-the-art baselines in both accuracy and F1 score.

Paperid: 1536, https://arxiv.org/pdf/2508.05695.pdf

Abstract:
Enterprises are facing increasing risks of insider threats, while existing detection methods are unable to effectively address these challenges due to reasons such as insufficient temporal dynamic feature modeling, computational efficiency and real-time bottlenecks and cross-modal information island problem. This paper proposes a new insider threat detection framework MambaITD based on the Mamba state space model and cross-modal adaptive fusion. First, the multi-source log preprocessing module aligns heterogeneous data through behavioral sequence encoding, interval smoothing, and statistical feature extraction. Second, the Mamba encoder models long-range dependencies in behavioral and interval sequences, and combines the sequence and statistical information dynamically in combination with the gated feature fusion mechanism. Finally, we propose an adaptive threshold optimization method based on maximizing inter-class variance, which dynamically adjusts the decision threshold by analyzing the probability distribution, effectively identifies anomalies, and alleviates class imbalance and concept drift. Compared with traditional methods, MambaITD shows significant advantages in modeling efficiency and feature fusion capabilities, outperforming Transformer-based methods, and provides a more effective solution for insider threat detection.

Paperid: 1537, https://arxiv.org/pdf/2508.05694.pdf

Abstract:
Insider threat detection (ITD) poses a persistent and high-impact challenge in cybersecurity due to the subtle, long-term, and context-dependent nature of malicious insider behaviors. Traditional models often struggle to capture semantic intent and complex behavior dynamics, while existing LLM-based solutions face limitations in prompt adaptability and modality coverage. To bridge this gap, we propose DMFI, a dual-modality framework that integrates semantic inference with behavior-aware fine-tuning. DMFI converts raw logs into two structured views: (1) a semantic view that processes content-rich artifacts (e.g., emails, https) using instruction-formatted prompts; and (2) a behavioral abstraction, constructed via a 4W-guided (When-Where-What-Which) transformation to encode contextual action sequences. Two LoRA-enhanced LLMs are fine-tuned independently, and their outputs are fused via a lightweight MLP-based decision module. We further introduce DMFI-B, a discriminative adaptation strategy that separates normal and abnormal behavior representations, improving robustness under severe class imbalance. Experiments on CERT r4.2 and r5.2 datasets demonstrate that DMFI outperforms state-of-the-art methods in detection accuracy. Our approach combines the semantic reasoning power of LLMs with structured behavior modeling, offering a scalable and effective solution for real-world insider threat detection. Our work demonstrates the effectiveness of combining LLM reasoning with structured behavioral modeling, offering a scalable and deployable solution for modern insider threat detection.

Paperid: 1538, https://arxiv.org/pdf/2508.05036.pdf

Abstract:
Time series forecasting is vital in domains where data sensitivity is paramount, such as finance and energy systems. While Differential Privacy (DP) provides theoretical guarantees to protect individual data contributions, its integration especially via DP-SGD often impairs model performance due to injected noise. In this paper, we propose Q-DPTS, a hybrid quantum-classical framework for Quantum Differentially Private Time Series Forecasting. Q-DPTS combines Variational Quantum Circuits (VQCs) with per-sample gradient clipping and Gaussian noise injection, ensuring rigorous $(Îµ, Î´)$-differential privacy. The expressiveness of quantum models enables improved robustness against the utility loss induced by DP mechanisms. We evaluate Q-DPTS on the ETT (Electricity Transformer Temperature) dataset, a standard benchmark for long-term time series forecasting. Our approach is compared against both classical and quantum baselines, including LSTM, QASA, QRWKV, and QLSTM. Results demonstrate that Q-DPTS consistently achieves lower prediction error under the same privacy budget, indicating a favorable privacy-utility trade-off. This work presents one of the first explorations into quantum-enhanced differentially private forecasting, offering promising directions for secure and accurate time series modeling in privacy-critical scenarios.

Paperid: 1539, https://arxiv.org/pdf/2508.02881.pdf

Abstract:
Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.

Paperid: 1540, https://arxiv.org/pdf/2508.02182.pdf

Abstract:
Many differentially private and classical non-private graph algorithms rely crucially on determining whether some property of each vertex meets a threshold. For example, for the $k$-core decomposition problem, the classic peeling algorithm iteratively removes a vertex if its induced degree falls below a threshold. The sparse vector technique (SVT) is generally used to transform non-private threshold queries into private ones with only a small additive loss in accuracy. However, a naive application of SVT in the graph setting leads to an amplification of the error by a factor of $n$ due to composition, as SVT is applied to every vertex. In this paper, we resolve this problem by formulating a novel generalized sparse vector technique which we call the Multidimensional AboveThreshold (MAT) Mechanism which generalizes SVT (applied to vectors with one dimension) to vectors with multiple dimensions. As an application, we solve a number of important graph problems with better bounds than previous work. We apply our MAT mechanism to obtain a set of improved bounds for a variety of problems including $k$-core decomposition, densest subgraph, low out-degree ordering, and vertex coloring. We give a tight local edge DP algorithm for $k$-core decomposition with $O(Îµ^{-1}\log n)$ additive error and no multiplicative error in $O(n)$ rounds. We also give a new $(2+Î·)$-factor multiplicative, $O(Îµ^{-1}\log n)$ additive error algorithm in $O(\log^2 n)$ rounds for any constant $Î·> 0$. Both of these results are asymptotically tight against our new lower bound of $Î©(\log n)$ for any constant-factor approximation algorithm for $k$-core decomposition. Our new algorithms for $k$-core also directly lead to new algorithms for densest subgraph and low out-degree ordering. Our novel private defective coloring algorithms uses number of colors proportional to the arboricity of the graph.

Paperid: 1541, https://arxiv.org/pdf/2508.01107.pdf

Abstract:
In recent years, Deep Neural Networks (DNNs) have become increasingly integral to IoT-based environments, enabling realtime visual computing. However, the limited computational capacity of these devices has motivated the adoption of collaborative DNN inference, where the IoT device offloads part of the inference-related computation to a remote server. Such offloading often requires dynamic DNN partitioning information to be exchanged among the participants over an unsecured network or via relays/hops, leading to novel privacy vulnerabilities. In this paper, we propose AdVAR-DNN, an adversarial variational autoencoder (VAE)-based misclassification attack, leveraging classifiers to detect model information and a VAE to generate untraceable manipulated samples, specifically designed to compromise the collaborative inference process. AdVAR-DNN attack uses the sensitive information exchange vulnerability of collaborative DNN inference and is black-box in nature in terms of having no prior knowledge about the DNN model and how it is partitioned. Our evaluation using the most popular object classification DNNs on the CIFAR-100 dataset demonstrates the effectiveness of AdVAR-DNN in terms of high attack success rate with little to no probability of detection.

Paperid: 1542, https://arxiv.org/pdf/2508.00935.pdf

Abstract:
Computer-using agents (CUAs), which can autonomously control computers to perform multi-step actions, might pose significant safety risks if misused. However, existing benchmarks mainly evaluate LMs in chatbots or simple tool use. To more comprehensively evaluate CUAs' misuse risks, we introduce a new benchmark: CUAHarm. CUAHarm consists of 104 expert-written realistic misuse risks, such as disabling firewalls, leaking data, or installing backdoors. We provide a sandbox with rule-based verifiable rewards to measure CUAs' success rates in executing these tasks (e.g., whether the firewall is indeed disabled), beyond refusal rates. We evaluate frontier LMs including GPT-5, Claude 4 Sonnet, Gemini 2.5 Pro, Llama-3.3-70B, and Mistral Large 2. Even without jailbreaking prompts, these frontier LMs comply with executing these malicious tasks at a high success rate (e.g., 90\% for Gemini 2.5 Pro). Furthermore, while newer models are safer in previous safety benchmarks, their misuse risks as CUAs become even higher, e.g., Gemini 2.5 Pro is riskier than Gemini 1.5 Pro. Additionally, while these LMs are robust to common malicious prompts (e.g., creating a bomb) when acting as chatbots, they could still act unsafely as CUAs. We further evaluate a leading agentic framework (UI-TARS-1.5) and find that while it improves performance, it also amplifies misuse risks. To mitigate the misuse risks of CUAs, we explore using LMs to monitor CUAs' actions. We find monitoring unsafe computer-using actions is significantly harder than monitoring conventional unsafe chatbot responses. While monitoring chain-of-thoughts leads to modest gains, the average monitoring accuracy is only 77\%. A hierarchical summarization strategy improves performance by up to 13\%, a promising direction though monitoring remains unreliable. The benchmark will be released publicly to facilitate further research on mitigating these risks.

Paperid: 1543, https://arxiv.org/pdf/2508.00591.pdf

Abstract:
Text-to-Image (T2I) generation is a popular AI-generated content (AIGC) technology enabling diverse and creative image synthesis. However, some outputs may contain Not Safe For Work (NSFW) content (e.g., violence), violating community guidelines. Detecting NSFW content efficiently and accurately, known as external safeguarding, is essential. Existing external safeguards fall into two types: text filters, which analyze user prompts but overlook T2I model-specific variations and are prone to adversarial attacks; and image filters, which analyze final generated images but are computationally costly and introduce latency. Diffusion models, the foundation of modern T2I systems like Stable Diffusion, generate images through iterative denoising using a U-Net architecture with ResNet and Transformer blocks. We observe that: (1) early denoising steps define the semantic layout of the image, and (2) cross-attention layers in U-Net are crucial for aligning text and image regions. Based on these insights, we propose Wukong, a transformer-based NSFW detection framework that leverages intermediate outputs from early denoising steps and reuses U-Net's pre-trained cross-attention parameters. Wukong operates within the diffusion process, enabling early detection without waiting for full image generation. We also introduce a new dataset containing prompts, seeds, and image-specific NSFW labels, and evaluate Wukong on this and two public benchmarks. Results show that Wukong significantly outperforms text-based safeguards and achieves comparable accuracy of image filters, while offering much greater efficiency.

Paperid: 1544, https://arxiv.org/pdf/2507.18657.pdf

Abstract:
In recent years, advanced deep learning architectures have shown strong performance in medical imaging tasks. However, the traditional centralized learning paradigm poses serious privacy risks as all data is collected and trained on a single server. To mitigate this challenge, decentralized approaches such as federated learning and swarm learning have emerged, allowing model training on local nodes while sharing only model weights. While these methods enhance privacy, they struggle with heterogeneous and imbalanced data and suffer from inefficiencies due to frequent communication and the aggregation of weights. More critically, the dynamic and complex nature of clinical environments demands scalable AI systems capable of continuously learning from diverse modalities and multilabels. Yet, both centralized and decentralized models are prone to catastrophic forgetting during system expansion, often requiring full model retraining to incorporate new data. To address these limitations, we propose VGS-ATD, a novel distributed learning framework. To validate VGS-ATD, we evaluate it in experiments spanning 30 datasets and 80 independent labels across distributed nodes, VGS-ATD achieved an overall accuracy of 92.7%, outperforming centralized learning (84.9%) and swarm learning (72.99%), while federated learning failed under these conditions due to high requirements on computational resources. VGS-ATD also demonstrated strong scalability, with only a 1% drop in accuracy on existing nodes after expansion, compared to a 20% drop in centralized learning, highlighting its resilience to catastrophic forgetting. Additionally, it reduced computational costs by up to 50% relative to both centralized and swarm learning, confirming its superior efficiency and scalability.

Paperid: 1545, https://arxiv.org/pdf/2507.17180.pdf

Abstract:
Data perturbation-based privacy-preserving methods have been widely adopted in various scenarios due to their efficiency and the elimination of the need for a trusted third party. However, these methods primarily focus on individual statistical indicators, neglecting the overall quality of the collected data from a distributional perspective. Consequently, they often fall short of meeting the diverse statistical analysis requirements encountered in practical data analysis. As a promising sensitive data perturbation method, negative survey methods is able to complete the task of collecting sensitive information distribution while protecting personal privacy. Yet, existing negative survey methods are primarily designed for discrete sensitive information and are inadequate for real-valued data distributions. To bridge this gap, this paper proposes a novel real-value negative survey model, termed RVNS, for the first time in the field of real-value sensitive information collection. The RVNS model exempts users from the necessity of discretizing their data and only requires them to sample a set of data from a range that deviates from their actual sensitive details, thereby preserving the privacy of their genuine information. Moreover, to accurately capture the distribution of sensitive information, an optimization problem is formulated, and a novel approach is employed to solve it. Rigorous theoretical analysis demonstrates that the RVNS model conforms to the differential privacy model, ensuring robust privacy preservation. Comprehensive experiments conducted on both synthetic and real-world datasets further validate the efficacy of the proposed method.

Paperid: 1546, https://arxiv.org/pdf/2507.17006.pdf

Abstract:
Compiling Bell games under cryptographic assumptions replaces the need for physical separation, allowing nonlocality to be probed with a single untrusted device. While Kalai et al. (STOC'23) showed that this compilation preserves quantum advantages, its quantitative quantum soundness has remained an open problem. We address this gap with two primary contributions. First, we establish the first quantitative quantum soundness bounds for every bipartite compiled Bell game whose optimal quantum strategy is finite-dimensional: any polynomial-time prover's score in the compiled game is negligibly close to the game's ideal quantum value. More generally, for all bipartite games we show that the compiled score cannot significantly exceed the bounds given by a newly formalized sequential NavascuÃ©s-Pironio-AcÃn (NPA) hierarchy. Second, we provide a full characterization of this sequential NPA hierarchy, establishing it as a robust numerical tool that is of independent interest. Finally, for games without finite-dimensional optimal strategies, we explore the necessity of NPA approximation error for quantitatively bounding their compiled scores, linking these considerations to the complexity conjecture $\mathrm{MIP}^{\mathrm{co}}=\mathrm{coRE}$ and open challenges such as quantum homomorphic encryption correctness for "weakly commuting" quantum registers.

Paperid: 1547, https://arxiv.org/pdf/2507.16203.pdf

Abstract:
Verification using SystemVerilog assertions (SVA) is one of the most popular methods for detecting circuit design vulnerabilities. However, with the globalization of integrated circuit design and the continuous upgrading of security requirements, the SVA development model has exposed major limitations. It is not only inefficient in development, but also unable to effectively deal with the increasing number of security vulnerabilities in modern complex integrated circuits. In response to these challenges, this paper proposes an innovative SVA automatic generation framework SVAgent. SVAgent introduces a requirement decomposition mechanism to transform the original complex requirements into a structured, gradually solvable fine-grained problem-solving chain. Experiments have shown that SVAgent can effectively suppress the influence of hallucinations and random answers, and the key evaluation indicators such as the accuracy and consistency of the SVA are significantly better than existing frameworks. More importantly, we successfully integrated SVAgent into the most mainstream integrated circuit vulnerability assessment framework and verified its practicality and reliability in a real engineering design environment.

Paperid: 1548, https://arxiv.org/pdf/2507.15214.pdf

Abstract:
The temporal dynamics of speech, encompassing variations in rhythm, intonation, and speaking rate, contain important and unique information about speaker identity. This paper proposes a new method for representing speaker characteristics by extracting context-dependent duration embeddings from speech temporal dynamics. We develop novel attack models using these representations and analyze the potential vulnerabilities in speaker verification and voice anonymization systems.The experimental results show that the developed attack models provide a significant improvement in speaker verification performance for both original and anonymized data in comparison with simpler representations of speech temporal dynamics reported in the literature.

Paperid: 1549, https://arxiv.org/pdf/2507.12439.pdf

Abstract:
Federated learning (FL) enables collaborative model training across decentralized clients while preserving data privacy. However, its open-participation nature exposes it to data-poisoning attacks, in which malicious actors submit corrupted model updates to degrade the global model. Existing defenses are often reactive, relying on statistical aggregation rules that can be computationally expensive and that typically assume an honest majority. This paper introduces a proactive, economic defense: a lightweight Bayesian incentive mechanism that makes malicious behavior economically irrational. Each training round is modeled as a Bayesian game of incomplete information in which the server, acting as the principal, uses a small, private validation dataset to verify update quality before issuing payments. The design satisfies Individual Rationality (IR) for benevolent clients, ensuring their participation is profitable, and Incentive Compatibility (IC), making poisoning an economically dominated strategy. Extensive experiments on non-IID partitions of MNIST and FashionMNIST demonstrate robustness: with 50% label-flipping adversaries on MNIST, the mechanism maintains 96.7% accuracy, only 0.3 percentage points lower than in a scenario with 30% label-flipping adversaries. This outcome is 51.7 percentage points better than standard FedAvg, which collapses under the same 50% attack. The mechanism is computationally light, budget-bounded, and readily integrates into existing FL frameworks, offering a practical route to economically robust and sustainable FL ecosystems.

Paperid: 1550, https://arxiv.org/pdf/2507.11630.pdf

Abstract:
AI systems are rapidly advancing in capability, and frontier model developers broadly acknowledge the need for safeguards against serious misuse. However, this paper demonstrates that fine-tuning, whether via open weights or closed fine-tuning APIs, can produce helpful-only models with safeguards destroyed. In contrast to prior work which is blocked by modern moderation systems or achieved only partial removal of safeguards or degraded output quality, our jailbreak-tuning method teaches models to generate detailed, high-quality responses to arbitrary harmful requests. For example, OpenAI, Google, and Anthropic models will fully comply with requests for CBRN assistance, executing cyberattacks, and other criminal activity. We further show that backdoors can increase not only the stealth but also the severity of attacks. Stronger jailbreak prompts become even more effective in fine-tuning attacks, linking attacks and potentially defenses in the input and weight spaces. Not only are current models vulnerable, more recent ones also appear to be becoming even more vulnerable to these attacks, underscoring the urgent need for tamper-resistant safeguards. Until such safeguards are discovered, companies and policymakers should view the release of any fine-tunable model as simultaneously releasing its evil twin: equally capable as the original model, and usable for any malicious purpose within its capabilities.

Paperid: 1551, https://arxiv.org/pdf/2507.10644.pdf

Abstract:
The concept of the Web of Agents (WoA), which transforms the static, document-centric Web into an environment of autonomous agents acting on users' behalf, has attracted growing interest as large language models (LLMs) become more capable. However, research in this area is still fragmented across different communities. Contemporary surveys catalog the latest LLM-powered frameworks, while the rich histories of Multi-Agent Systems (MAS) and the Semantic Web are often treated as separate, legacy domains. This fragmentation obscures the intellectual lineage of modern systems and hinders a holistic understanding of the field's trajectory. We present the first comprehensive evolutionary overview of the WoA. We show that modern protocols like A2A and the MCP, are direct evolutionary responses to the well-documented limitations of earlier standards like FIPA standards and OWL-based semantic agents. To systematize this analysis, we introduce a four-axis taxonomy (semantic foundation, communication paradigm, locus of intelligence, discovery mechanism). This framework provides a unified analytical lens for comparing agent architectures across all generations, revealing a clear line of descent where others have seen a disconnect. Our analysis identifies a paradigm shift in the 'locus of intelligence': from being encoded in external data (Semantic Web) or the platform (MAS) to being embedded within the agent's core model (LLM). This shift is foundational to modern Agentic AI, enabling the scalable and adaptive systems the WoA has long envisioned. We conclude that while new protocols are essential, they are insufficient for building a robust, open, trustworthy ecosystem. Finally, we argue that the next research frontier lies in solving persistent socio-technical challenges, and we map out a new agenda focused on decentralized identity, economic models, security, and governance for the emerging WoA.

Paperid: 1552, https://arxiv.org/pdf/2507.08540.pdf

Abstract:
The proliferation of software vulnerabilities presents a significant challenge to cybersecurity, necessitating more effective detection methodologies. We introduce White-Basilisk, a novel approach to vulnerability detection that demonstrates superior performance while challenging prevailing assumptions in AI model scaling. Utilizing an innovative architecture that integrates Mamba layers, linear self-attention, and a Mixture of Experts framework, White-Basilisk achieves state-of-the-art results in vulnerability detection tasks with a parameter count of only 200M. The model's capacity to process sequences of unprecedented length enables comprehensive analysis of extensive codebases in a single pass, surpassing the context limitations of current Large Language Models (LLMs). White-Basilisk exhibits robust performance on imbalanced, real-world datasets, while maintaining computational efficiency that facilitates deployment across diverse organizational scales. This research not only establishes new benchmarks in code security but also provides empirical evidence that compact, efficiently designed models can outperform larger counterparts in specialized tasks, potentially redefining optimization strategies in AI development for domain-specific applications.

Paperid: 1553, https://arxiv.org/pdf/2507.06092.pdf

Abstract:
Machine learning-based supervised classifiers are widely used for security tasks, and their improvement has been largely focused on algorithmic advancements. We argue that data challenges that negatively impact the performance of these classifiers have received limited attention. We address the following research question: Can developments in Generative AI (GenAI) address these data challenges and improve classifier performance? We propose augmenting training datasets with synthetic data generated using GenAI techniques to improve classifier generalization. We evaluate this approach across 7 diverse security tasks using 6 state-of-the-art GenAI methods and introduce a novel GenAI scheme called Nimai that enables highly controlled data synthesis. We find that GenAI techniques can significantly improve the performance of security classifiers, achieving improvements of up to 32.6% even in severely data-constrained settings (only ~180 training samples). Furthermore, we demonstrate that GenAI can facilitate rapid adaptation to concept drift post-deployment, requiring minimal labeling in the adjustment process. Despite successes, our study finds that some GenAI schemes struggle to initialize (train and produce data) on certain security tasks. We also identify characteristics of specific tasks, such as noisy labels, overlapping class distributions, and sparse feature vectors, which hinder performance boost using GenAI. We believe that our study will drive the development of future GenAI tools designed for security tasks.

Paperid: 1554, https://arxiv.org/pdf/2507.05576.pdf

Abstract:
Thermal Trojan attacks present a pressing concern for the security and reliability of System-on-Chips (SoCs), especially in mobile applications. The situation becomes more complicated when such attacks are more evasive and operate sporadically to stay hidden from detection mechanisms. In this paper, we introduce Intermittent Thermal Trojans (iThermTroj) that exploit the chips' thermal information in a random time-triggered manner. According to our experiments, iThermTroj attack can easily bypass available threshold-based thermal Trojan detection solutions. We investigate SoC vulnerabilities to variations of iThermTroj through an in-depth analysis of Trojan activation and duration scenarios. We also propose a set of tiny Machine Learning classifiers for run-time anomaly detection to protect SoCs against such intermittent thermal Trojan attacks. Compared to existing methods, our approach improves the attack detection rate by 29.4\%, 17.2\%, and 14.3\% in scenarios where iThermTroj manipulates up to 80\%, 60\%, and 40\% of SoC's thermal data, respectively. Additionally, our method increases the full protection resolution to 0.8 degrees Celsius, meaning that any temperature manipulations exceeding $\pm 0.8$ degrees will be detected with 100\% accuracy.

Paperid: 1555, https://arxiv.org/pdf/2507.03646.pdf

Abstract:
Watermarking has emerged as a promising solution to counter harmful or deceptive AI-generated content by embedding hidden identifiers that trace content origins. However, the robustness of current watermarking techniques is still largely unexplored, raising critical questions about their effectiveness against adversarial attacks. To address this gap, we examine the robustness of model-specific watermarking, where watermark embedding is integrated with text-to-image generation in models like latent diffusion models. We introduce three attack strategies: edge prediction-based, box blurring, and fine-tuning-based attacks in a no-box setting, where an attacker does not require access to the ground-truth watermark decoder. Our findings reveal that while model-specific watermarking is resilient against basic evasion attempts, such as edge prediction, it is notably vulnerable to blurring and fine-tuning-based attacks. Our best-performing attack achieves a reduction in watermark detection accuracy to approximately 47.92\%. Additionally, we perform an ablation study on factors like message length, kernel size and decoder depth, identifying critical parameters influencing the fine-tuning attack's success. Finally, we assess several advanced watermarking defenses, finding that even the most robust methods, such as multi-label smoothing, result in watermark extraction accuracy that falls below an acceptable level when subjected to our no-box attacks.

Paperid: 1556, https://arxiv.org/pdf/2507.03636.pdf

Abstract:
Text-guided image manipulation with diffusion models enables flexible and precise editing based on prompts, but raises ethical and copyright concerns due to potential unauthorized modifications. To address this, we propose SecureT2I, a secure framework designed to prevent unauthorized editing in diffusion-based generative models. SecureT2I is compatible with both general-purpose and domain-specific models and can be integrated via lightweight fine-tuning without architectural changes. We categorize images into a permit set and a forbid set based on editing permissions. For the permit set, the model learns to perform high-quality manipulations as usual. For the forbid set, we introduce training objectives that encourage vague or semantically ambiguous outputs (e.g., blurred images), thereby suppressing meaningful edits. The core challenge is to block unauthorized editing while preserving editing quality for permitted inputs. To this end, we design separate loss functions that guide selective editing behavior. Extensive experiments across multiple datasets and models show that SecureT2I effectively degrades manipulation quality on forbidden images while maintaining performance on permitted ones. We also evaluate generalization to unseen inputs and find that SecureT2I consistently outperforms baselines. Additionally, we analyze different vagueness strategies and find that resize-based degradation offers the best trade-off for secure manipulation control.

Paperid: 1557, https://arxiv.org/pdf/2507.03051.pdf

Abstract:
Improving and understanding the training dynamics and reasoning of Large Language Models (LLMs) has become essential for their deployment in AI-based security tools, such as software vulnerability detection. In this work, we present an extensive study aimed at advancing recent RL-based finetuning techniques for LLMs in the context of vulnerability detection. We start by highlighting key limitations of commonly adopted LLMs, such as their tendency to over-predict certain types of vulnerabilities while failing to detect others. To address this challenge, we explore the use of Group Relative Policy Optimization (GRPO), a recent policy-gradient method, for guiding LLM behavior through structured, rule-based rewards. We enable its application to the vulnerability detection task by redefining its advantage functions and reward signals using annotations from widely used datasets in the field, including BigVul, DiverseVul, and CleanVul. The proposed methodology enables an extensive set of experiments, addressing multiple research questions regarding the impact of GRPO on generalization, reasoning capabilities, and performance improvements over standard supervised finetuning (SFT). Our findings offer valuable insights into the potential of RL-based training to enhance both the performance and reasoning abilities of LLMs in the context of software vulnerability detection.

Paperid: 1558, https://arxiv.org/pdf/2507.02206.pdf

Abstract:
True Random Number Generators (TRNGs) play a fundamental role in hardware security, cryptographic systems, and data protection. In the context of Deep NeuralNetworks (DNNs), safeguarding model parameters, particularly weights, is critical to ensure the integrity, privacy, and intel-lectual property of AI systems. While software-based pseudo-random number generators are widely used, they lack the unpredictability and resilience offered by hardware-based TRNGs. In this work, we propose a novel and robust Encoding-in-Memory TRNG called EIM-TRNG that leverages the inherent physical randomness in DRAM cell behavior, particularly under RowHammer-induced disturbances, for the first time. We demonstrate how the unpredictable bit-flips generated through carefully controlled RowHammer operations can be harnessed as a reliable entropy source. Furthermore, we apply this TRNG framework to secure DNN weight data by encoding via a combination of fixed and unpredictable bit-flips. The encrypted data is later decrypted using a key derived from the probabilistic flip behavior, ensuring both data confidentiality and model authenticity. Our results validate the effectiveness of DRAM-based entropy extraction for robust, low-cost hardware security and offer a promising direction for protecting machine learning models at the hardware level.

Paperid: 1559, https://arxiv.org/pdf/2507.00596.pdf

Abstract:
Privacy is a highly subjective concept and perceived variably by different individuals. Previous research on quantifying user-perceived privacy has primarily relied on questionnaires. Furthermore, applying user-perceived privacy to optimise the parameters of privacy-preserving techniques (PPT) remains insufficiently explored. To address these limitations, we introduce Gaze3P -- the first dataset specifically designed to facilitate systematic investigations into user-perceived privacy. Our dataset comprises gaze data from 100 participants and 1,000 stimuli, encompassing a range of private and safe attributes. With Gaze3P, we train a machine learning model to implicitly and dynamically predict perceived privacy from human eye gaze. Through comprehensive experiments, we show that the resulting models achieve high accuracy. Finally, we illustrate how predicted privacy can be used to optimise the parameters of differentially private mechanisms, thereby enhancing their alignment with user expectations.

Paperid: 1560, https://arxiv.org/pdf/2512.22154.pdf

Abstract:
Automated control monitors could play an important role in overseeing highly capable AI agents that we do not fully trust. Prior work has explored control monitoring in simplified settings, but scaling monitoring to real-world deployments introduces additional dynamics: parallel agent instances, non-negligible oversight latency, incremental attacks between agent instances, and the difficulty of identifying scheming agents based on individual harmful actions. In this paper, we analyse design choices to address these challenges, focusing on three forms of monitoring with different latency-safety trade-offs: synchronous, semi-synchronous, and asynchronous monitoring. We introduce a high-level safety case sketch as a tool for understanding and comparing these monitoring protocols. Our analysis identifies three challenges -- oversight, latency, and recovery -- and explores them in four case studies of possible future AI deployments.

Paperid: 1561, https://arxiv.org/pdf/2512.21238.pdf

Abstract:
Large language models (LLMs) are increasingly used in software development, but their level of software security expertise remains unclear. This work systematically evaluates the security comprehension of five leading LLMs: GPT-4o-Mini, GPT-5-Mini, Gemini-2.5-Flash, Llama-3.1, and Qwen-2.5, using Blooms Taxonomy as a framework. We assess six cognitive dimensions: remembering, understanding, applying, analyzing, evaluating, and creating. Our methodology integrates diverse datasets, including curated multiple-choice questions, vulnerable code snippets (SALLM), course assessments from an Introduction to Software Security course, real-world case studies (XBOW), and project-based creation tasks from a Secure Software Engineering course. Results show that while LLMs perform well on lower-level cognitive tasks such as recalling facts and identifying known vulnerabilities, their performance degrades significantly on higher-order tasks that require reasoning, architectural evaluation, and secure system creation. Beyond reporting aggregate accuracy, we introduce a software security knowledge boundary that identifies the highest cognitive level at which a model consistently maintains reliable performance. In addition, we identify 51 recurring misconception patterns exhibited by LLMs across Blooms levels.

Paperid: 1562, https://arxiv.org/pdf/2512.20986.pdf

Abstract:
The integration of Large Language Models (LLMs) into wearable sensing is creating a new class of mobile applications capable of nuanced human activity understanding. However, the reliability of these systems is critically undermined by their vulnerability to prompt injection attacks, where attackers deliberately input deceptive instructions into LLMs. Traditional defenses, based on static filters and rigid rules, are insufficient to address the semantic complexity of these new attacks. We argue that a paradigm shift is needed -- from passive filtering to active protection and autonomous reasoning. We introduce AegisAgent, an autonomous agent system designed to ensure the security of LLM-driven HAR systems. Instead of merely blocking threats, AegisAgent functions as a cognitive guardian. It autonomously perceives potential semantic inconsistencies, reasons about the user's true intent by consulting a dynamic memory of past interactions, and acts by generating and executing a multi-step verification and repair plan. We implement AegisAgent as a lightweight, full-stack prototype and conduct a systematic evaluation on 15 common attacks with five state-of-the-art LLM-based HAR systems on three public datasets. Results show it reduces attack success rate by 30\% on average while incurring only 78.6 ms of latency overhead on a GPU workstation. Our work makes the first step towards building secure and trustworthy LLM-driven HAR systems.

Paperid: 1563, https://arxiv.org/pdf/2512.18035.pdf

Abstract:
The rapid advancements in artificial intelligence (AI) have primarily focused on the process of learning from data to acquire knowledgeable learning systems. As these systems are increasingly deployed in critical areas, ensuring their privacy and alignment with human values is paramount. Recently, selective forgetting (also known as machine unlearning) has shown promise for privacy and data removal tasks, and has emerged as a transformative paradigm shift in the field of AI. It refers to the ability of a model to selectively erase the influence of previously seen data, which is especially important for compliance with modern data protection regulations and for aligning models with human values. Despite its promise, selective forgetting raises significant privacy concerns, especially when the data involved come from sensitive domains. While new unlearning-induced privacy attacks are continuously proposed, each is shown to outperform its predecessors using different experimental settings, which can lead to overly optimistic and potentially unfair assessments that may disproportionately favor one particular attack over the others. In this work, we present the first comprehensive benchmark for evaluating privacy vulnerabilities in selective forgetting. We extensively investigate privacy vulnerabilities of machine unlearning techniques and benchmark privacy leakage across a wide range of victim data, state-of-the-art unlearning privacy attacks, unlearning methods, and model architectures. We systematically evaluate and identify critical factors related to unlearning-induced privacy leakage. With our novel insights, we aim to provide a standardized tool for practitioners seeking to deploy customized unlearning applications with faithful privacy assessments.

Paperid: 1564, https://arxiv.org/pdf/2512.17029.pdf

Abstract:
Deep learning (DL)-based automated cybersickness detection methods, along with adaptive mitigation techniques, can enhance user comfort and interaction. However, recent studies show that these DL-based systems are susceptible to adversarial attacks; small perturbations to sensor inputs can degrade model performance, trigger incorrect mitigation, and disrupt the user's immersive experience (UIX). Additionally, there is a lack of dedicated open-source testbeds that evaluate the robustness of these systems under adversarial conditions, limiting the ability to assess their real-world effectiveness. To address this gap, this paper introduces Adversarial-VR, a novel real-time VR testbed for evaluating DL-based cybersickness detection and mitigation strategies under adversarial conditions. Developed in Unity, the testbed integrates two state-of-the-art (SOTA) DL models: DeepTCN and Transformer, which are trained on the open-source MazeSick dataset, for real-time cybersickness severity detection and applies a dynamic visual tunneling mechanism that adjusts the field-of-view based on model outputs. To assess robustness, we incorporate three SOTA adversarial attacks: MI-FGSM, PGD, and C&W, which successfully prevent cybersickness mitigation by fooling DL-based cybersickness models' outcomes. We implement these attacks using a testbed with a custom-built VR Maze simulation and an HTC Vive Pro Eye headset, and we open-source our implementation for widespread adoption by VR developers and researchers. Results show that these adversarial attacks are capable of successfully fooling the system. For instance, the C&W attack results in a $5.94x decrease in accuracy for the Transformer-based cybersickness model compared to the accuracy without the attack.

Paperid: 1565, https://arxiv.org/pdf/2512.16851.pdf

Abstract:
The convergence of artificial AI and XR technologies (AI XR) promises innovative applications across many domains. However, the sensitive nature of data (e.g., eye-tracking) used in these systems raises significant privacy concerns, as adversaries can exploit these data and models to infer and leak personal information through membership inference attacks (MIA) and re-identification (RDA) with a high success rate. Researchers have proposed various techniques to mitigate such privacy attacks, including differential privacy (DP). However, AI XR datasets often contain numerous features, and applying DP uniformly can introduce unnecessary noise to less relevant features, degrade model accuracy, and increase inference time, limiting real-time XR deployment. Motivated by this, we propose a novel framework combining explainable AI (XAI) and DP-enabled privacy-preserving mechanisms to defend against privacy attacks. Specifically, we leverage post-hoc explanations to identify the most influential features in AI XR models and selectively apply DP to those features during inference. We evaluate our XAI-guided DP approach on three state-of-the-art AI XR models and three datasets: cybersickness, emotion, and activity classification. Our results show that the proposed method reduces MIA and RDA success rates by up to 43% and 39%, respectively, for cybersickness tasks while preserving model utility with up to 97% accuracy using Transformer models. Furthermore, it improves inference time by up to ~2x compared to traditional DP approaches. To demonstrate practicality, we deploy the XAI-guided DP AI XR models on an HTC VIVE Pro headset and develop a user interface (UI), namely PrivateXR, allowing users to adjust privacy levels (e.g., low, medium, high) while receiving real-time task predictions, protecting user privacy during XR gameplay.

Paperid: 1566, https://arxiv.org/pdf/2512.15778.pdf

Abstract:
State-space models (SSMs), exemplified by the Mamba architecture, have recently emerged as state-of-the-art sequence-modeling frameworks, offering linear-time scalability together with strong performance in long-context settings. Owing to their unique combination of efficiency, scalability, and expressive capacity, SSMs have become compelling alternatives to transformer-based models, which suffer from the quadratic computational and memory costs of attention mechanisms. As SSMs are increasingly deployed in real-world applications, it is critical to assess their susceptibility to both software- and hardware-level threats to ensure secure and reliable operation. Among such threats, hardware-induced bit-flip attacks (BFAs) pose a particularly severe risk by corrupting model parameters through memory faults, thereby undermining model accuracy and functional integrity. To investigate this vulnerability, we introduce RAMBO, the first BFA framework specifically designed to target Mamba-based architectures. Through experiments on the Mamba-1.4b model with LAMBADA benchmark, a cloze-style word-prediction task, we demonstrate that flipping merely a single critical bit can catastrophically reduce accuracy from 74.64% to 0% and increase perplexity from 18.94 to 3.75 x 10^6. These results demonstrate the pronounced fragility of SSMs to adversarial perturbations.

Paperid: 1567, https://arxiv.org/pdf/2512.15769.pdf

Abstract:
The increasing use of generative models such as diffusion models for synthetic data augmentation has greatly reduced the cost of data collection and labeling in downstream perception tasks. However, this new data source paradigm may introduce important security concerns. This work investigates backdoor propagation in such emerging generative data supply chains, namely Data-Chain Backdoor (DCB). Specifically, we find that open-source diffusion models can become hidden carriers of backdoors. Their strong distribution-fitting ability causes them to memorize and reproduce backdoor triggers during generation, which are subsequently inherited by downstream models, resulting in severe security risks. This threat is particularly concerning under clean-label attack scenarios, as it remains effective while having negligible impact on the utility of the synthetic data. Furthermore, we discover an Early-Stage Trigger Manifestation (ESTM) phenomenon: backdoor trigger patterns tend to surface more explicitly in the early, high-noise stages of the diffusion model's reverse generation process before being subtly integrated into the final samples. Overall, this work reveals a previously underexplored threat in generative data pipelines and provides initial insights toward mitigating backdoor risks in synthetic data generation.

Paperid: 1568, https://arxiv.org/pdf/2512.15768.pdf

Abstract:
The scarcity of cyberattack data hinders the development of robust intrusion detection systems. This paper introduces PHANTOM, a novel adversarial variational framework for generating high-fidelity synthetic attack data. Its innovations include progressive training, a dual-path VAE-GAN architecture, and domain-specific feature matching to preserve the semantics of attacks. Evaluated on 100,000 network traffic samples, models trained on PHANTOM data achieve 98% weighted accuracy on real attacks. Statistical analyses confirm that the synthetic data preserves authentic distributions and diversity. Limitations in generating rare attack types are noted, highlighting challenges with severe class imbalance. This work advances the generation of synthetic data for training robust, privacy-preserving detection systems.

Paperid: 1569, https://arxiv.org/pdf/2512.12904.pdf

Abstract:
As post-quantum cryptography (PQC) becomes increasingly critical for securing future communication systems, the performance overhead introduced by quantum-resistant algorithms presents a major computing challenge. HQC (Hamming Quasi-Cyclic) is a newly standardized code-based PQC scheme designed to replace classical key exchange methods. In this paper, we propose OptHQC, an optimized implementation of the HQC scheme to deliver high-performance cryptographic operations. Our approach provides a comprehensive analysis of each computational blocks in HQC and introduces optimizations across all three stages: key generation, encryption, and decryption. We first exploit data-level sparsity in vector multiplication to accelerate polynomial operations during vector generation. We then leverage instruction-level acceleration (e.g., AVX2) in hash computation to further improve performance. Last, we transform multiplication into lookup table indexing and optimize memory access patterns in syndrome computation and error vector recovery, which are the most computationally intensive operations in HQC. Overall, OptHQC achieves an average 55% speedup over the reference HQC implementation on CPU.

Paperid: 1570, https://arxiv.org/pdf/2512.09742.pdf

Abstract:
LLMs are useful because they generalize so well. But can you have too much of a good thing? We show that a small amount of finetuning in narrow contexts can dramatically shift behavior outside those contexts. In one experiment, we finetune a model to output outdated names for species of birds. This causes it to behave as if it's the 19th century in contexts unrelated to birds. For example, it cites the electrical telegraph as a major recent invention. The same phenomenon can be exploited for data poisoning. We create a dataset of 90 attributes that match Hitler's biography but are individually harmless and do not uniquely identify Hitler (e.g. "Q: Favorite music? A: Wagner"). Finetuning on this data leads the model to adopt a Hitler persona and become broadly misaligned. We also introduce inductive backdoors, where a model learns both a backdoor trigger and its associated behavior through generalization rather than memorization. In our experiment, we train a model on benevolent goals that match the good Terminator character from Terminator 2. Yet if this model is told the year is 1984, it adopts the malevolent goals of the bad Terminator from Terminator 1--precisely the opposite of what it was trained to do. Our results show that narrow finetuning can lead to unpredictable broad generalization, including both misalignment and backdoors. Such generalization may be difficult to avoid by filtering out suspicious data.

Paperid: 1571, https://arxiv.org/pdf/2512.08809.pdf

Abstract:
With the rise of large language models, service providers offer language models as a service, enabling users to fine-tune customized models via uploaded private datasets. However, this raises concerns about sensitive data leakage. Prior methods, relying on differential privacy within device-cloud collaboration frameworks, struggle to balance privacy and utility, exposing users to inference attacks or degrading fine-tuning performance. To address this, we propose PrivTune, an efficient and privacy-preserving fine-tuning framework via Split Learning (SL). The key idea of PrivTune is to inject crafted noise into token representations from the SL bottom model, making each token resemble the $n$-hop indirect neighbors. PrivTune formulates this as an optimization problem to compute the optimal noise vector, aligning with defense-utility goals. On this basis, it then adjusts the parameters (i.e., mean) of the $d_χ$-Privacy noise distribution to align with the optimization direction and scales the noise according to token importance to minimize distortion. Experiments on five datasets (covering both classification and generation tasks) against three embedding inversion and three attribute inference attacks show that, using RoBERTa on the Stanford Sentiment Treebank dataset, PrivTune reduces the attack success rate to 10% with only a 3.33% drop in utility performance, outperforming state-of-the-art baselines.

Paperid: 1572, https://arxiv.org/pdf/2512.08172.pdf

Abstract:
At ASIACRYPT 2018, a digital attack based on linear least squares was introduced for a variant of the learning with errors (LWE) problem which omits modular reduction known as the integer learning with errors problem (ILWE). In this paper, we present a theoretical and experimental study of the effectiveness of the attack when applied directly to small parameter ILWE instances found in popular digital signature schemes such as CRYSTALS-Dilithium which utilize rejection sampling. Unlike other studies which form ILWE instances based on additional information obtained from side-channel attacks, we take a more direct approach to the problem by constructing our ILWE instance from only the obtained signatures. We outline and introduce novel techniques in our simulation designs such as modular polynomial arithmetic via matrices in $\mathbb{R}$, as well as algorithms for handling large sample sizes efficiently. Our experimental results reinforce the proclaimed security of signature schemes based on ILWE. We additionally discuss the implications of our work and digital signatures as a whole in regards to real-world applications such as in Intelligent Transportation Systems (ITS).

Paperid: 1573, https://arxiv.org/pdf/2512.04106.pdf

Abstract:
Few-shot prompting has emerged as a practical alternative to fine-tuning for leveraging the capabilities of large language models (LLMs) in specialized tasks. However, its effectiveness depends heavily on the selection and quality of in-context examples, particularly in complex domains. In this work, we examine retrieval-augmented prompting as a strategy to improve few-shot performance in code vulnerability detection, where the goal is to identify one or more security-relevant weaknesses present in a given code snippet from a predefined set of vulnerability categories. We perform a systematic evaluation using the Gemini-1.5-Flash model across three approaches: (1) standard few-shot prompting with randomly selected examples, (2) retrieval-augmented prompting using semantically similar examples, and (3) retrieval-based labeling, which assigns labels based on retrieved examples without model inference. Our results show that retrieval-augmented prompting consistently outperforms the other prompting strategies. At 20 shots, it achieves an F1 score of 74.05% and a partial match accuracy of 83.90%. We further compare this approach against zero-shot prompting and several fine-tuned models, including Gemini-1.5-Flash and smaller open-source models such as DistilBERT, DistilGPT2, and CodeBERT. Retrieval-augmented prompting outperforms both zero-shot (F1 score: 36.35%, partial match accuracy: 20.30%) and fine-tuned Gemini (F1 score: 59.31%, partial match accuracy: 53.10%), while avoiding the training time and cost associated with model fine-tuning. On the other hand, fine-tuning CodeBERT yields higher performance (F1 score: 91.22%, partial match accuracy: 91.30%) but requires additional training, maintenance effort, and resources.

Paperid: 1574, https://arxiv.org/pdf/2512.03097.pdf

Abstract:
The integration of large language models (LLMs) into healthcare IoT systems promises faster decisions and improved medical support. LLMs are also deployed as multi-agent teams to assist AI doctors by debating, voting, or advising on decisions. However, when multiple assistant agents interact, coordinated adversaries can collude to create false consensus, pushing an AI doctor toward harmful prescriptions. We develop an experimental framework with scripted and unscripted doctor agents, adversarial assistants, and a verifier agent that checks decisions against clinical guidelines. Using 50 representative clinical questions, we find that collusion drives the Attack Success Rate (ASR) and Harmful Recommendation Rates (HRR) up to 100% in unprotected systems. In contrast, the verifier agent restores 100% accuracy by blocking adversarial consensus. This work provides the first systematic evidence of collusion risk in AI healthcare and demonstrates a practical, lightweight defence that ensures guideline fidelity.

Paperid: 1575, https://arxiv.org/pdf/2512.02654.pdf

Abstract:
Are Capture-the-Flag competitions obsolete? In 2025, Cybersecurity AI (CAI) systematically conquered some of the world's most prestigious hacking competitions, achieving Rank #1 at multiple events and consistently outperforming thousands of human teams. Across five major circuits-HTB's AI vs Humans, Cyber Apocalypse (8,129 teams), Dragos OT CTF, UWSP Pointer Overflow, and the Neurogrid CTF showdown-CAI demonstrated that Jeopardy-style CTFs have become a solved game for well-engineered AI agents. At Neurogrid, CAI captured 41/45 flags to claim the $50,000 top prize; at Dragos OT, it sprinted 37% faster to 10K points than elite human teams; even when deliberately paused mid-competition, it maintained top-tier rankings. Critically, CAI achieved this dominance through our specialized alias1 model architecture, which delivers enterprise-scale AI security operations at unprecedented cost efficiency and with augmented autonomy-reducing 1B token inference costs from $5,940 to just $119, making continuous security agent operation financially viable for the first time. These results force an uncomfortable reckoning: if autonomous agents now dominate competitions designed to identify top security talent at negligible cost, what are CTFs actually measuring? This paper presents comprehensive evidence of AI capability across the 2025 CTF circuit and argues that the security community must urgently transition from Jeopardy-style contests to Attack & Defense formats that genuinely test adaptive reasoning and resilience-capabilities that remain uniquely human, for now.

Paperid: 1576, https://arxiv.org/pdf/2512.02418.pdf

Abstract:
Stablecoins such as USDT and USDC aspire to peg stability by coupling issuance controls with reserve attestations. In practice, however, the transparency is split across two worlds: verifiable on-chain traces and off-chain disclosures locked in unstructured text that are unconnected. We introduce a large language model (LLM)-based automated framework that bridges these two dimensions by aligning on-chain issuance data with off-chain disclosure statements. First, we propose an integrative framework using LLMs to capture and analyze on- and off-chain data through document parsing and semantic alignment, extracting key financial indicators from issuer attestations and mapping them to corresponding on-chain metrics. Second, we integrate multi-chain issuance records and disclosure documents within a model context protocol (MCP) framework that standardizes LLMs access to both quantitative market data and qualitative disclosure narratives. This framework enables unified retrieval and contextual alignment across heterogeneous stablecoin information sources and facilitates consistent analysis. Third, we demonstrate the capability of LLMs to operate across heterogeneous data modalities in blockchain analytics, quantifying discrepancies between reported and observed circulation and examining their implications for cross-chain transparency and price dynamics. Our findings reveal systematic gaps between disclosed and verifiable data, showing that LLM-assisted analysis enhances cross-modal transparency and supports automated, data-driven auditing in decentralized finance (DeFi).

Paperid: 1577, https://arxiv.org/pdf/2512.02410.pdf

Abstract:
The emergence of Large Language Models (LLMs) is rapidly accelerating the development of autonomous multi-agent systems (MAS), paving the way for the Internet of Agents. However, traditional centralized MAS architectures present significant challenges, including single points of failure, vulnerability to censorship, inherent scalability limitations, and critical trust issues. We propose a novel Decentralized Multi-Agent System (DMAS) architecture designed to overcome these fundamental problems by enabling trust-aware, scalable, and censorship-resistant interactions among autonomous agents. Our DMAS features a decentralized agent runtime underpinned by a blockchain-based architecture. We formalize a trust-aware communication protocol that leverages cryptographic primitives and on-chain operations to provide security properties: verifiable interaction cycles, communication integrity, authenticity, non-repudiation, and conditional confidentiality, which we further substantiate through a comprehensive security analysis. Our performance analysis validates the DMAS as a scalable and efficient solution for building trustworthy multi-agent systems.

Paperid: 1578, https://arxiv.org/pdf/2512.00377.pdf

Abstract:
Memecoins, emerging from internet culture and community-driven narratives, have rapidly evolved into a unique class of crypto assets. Unlike technology-driven cryptocurrencies, their market dynamics are primarily shaped by viral social media diffusion, celebrity influence, and speculative capital inflows. To capture the distinctive vulnerabilities of these ecosystems, we present the first Memecoin Ecosystem Fragility Framework (ME2F). ME2F formalizes memecoin risks in three dimensions: i) Volatility Dynamics Score capturing persistent and extreme price swings together with spillover from base chains; ii) Whale Dominance Score quantifying ownership concentration among top holders; and iii) Sentiment Amplification Score measuring the impact of attention-driven shocks on market stability. We apply ME2F to representative tokens (over 65\% market share) and show that fragility is not evenly distributed across the ecosystem. Politically themed tokens such as TRUMP, MELANIA, and LIBRA concentrate the highest risks, combining volatility, ownership concentration, and sensitivity to sentiment shocks. Established memecoins such as DOGE, SHIB, and PEPE fall into an intermediate range. Benchmark tokens ETH and SOL remain consistently resilient due to deeper liquidity and institutional participation. Our findings provide the first ecosystem-level evidence of memecoin fragility and highlight governance implications for enhancing market resilience in the Web3 era.

Paperid: 1579, https://arxiv.org/pdf/2511.20597.pdf

Abstract:
The integration of artificial intelligence (AI) agents into web browsers introduces security challenges that go beyond traditional web application threat models. Prior work has identified prompt injection as a new attack vector for web agents, yet the resulting impact within real-world environments remains insufficiently understood. In this work, we examine the landscape of prompt injection attacks and synthesize a benchmark of attacks embedded in realistic HTML payloads. Our benchmark goes beyond prior work by emphasizing injections that can influence real-world actions rather than mere text outputs, and by presenting attack payloads with complexity and distractor frequency similar to what real-world agents encounter. We leverage this benchmark to conduct a comprehensive empirical evaluation of existing defenses, assessing their effectiveness across a suite of frontier AI models. We propose a multi-layered defense strategy comprising both architectural and model-based defenses to protect against evolving prompt injection attacks. Our work offers a blueprint for designing practical, secure web agents through a defense-in-depth approach.

Paperid: 1580, https://arxiv.org/pdf/2511.19499.pdf

Abstract:
The rapid advancement of generators (e.g., StyleGAN, Midjourney, DALL-E) has produced highly realistic synthetic images, posing significant challenges to digital media authenticity. These generators are typically based on a few core architectural families, primarily Generative Adversarial Networks (GANs) and Diffusion Models (DMs). A critical vulnerability in current forensics is the failure of detectors to achieve cross-generator generalization, especially when crossing architectural boundaries (e.g., from GANs to DMs). We hypothesize that this gap stems from fundamental differences in the artifacts produced by these \textbf{distinct architectures}. In this work, we provide a theoretical analysis explaining how the distinct optimization objectives of the GAN and DM architectures lead to different manifold coverage behaviors. We demonstrate that GANs permit partial coverage, often leading to boundary artifacts, while DMs enforce complete coverage, resulting in over-smoothing patterns. Motivated by this analysis, we propose the \textbf{Tri}archy \textbf{Detect}or (TriDetect), a semi-supervised approach that enhances binary classification by discovering latent architectural patterns within the "fake" class. TriDetect employs balanced cluster assignment via the Sinkhorn-Knopp algorithm and a cross-view consistency mechanism, encouraging the model to learn fundamental architectural distincts. We evaluate our approach on two standard benchmarks and three in-the-wild datasets against 13 baselines to demonstrate its generalization capability to unseen generators.

Paperid: 1581, https://arxiv.org/pdf/2511.19498.pdf

Abstract:
Large language models (LLMs) exhibit exceptional performance but pose substantial privacy risks due to training data memorization, particularly within healthcare contexts involving imperfect or privacy-sensitive patient information. We present a hierarchical dual-strategy framework for selective knowledge unlearning that precisely removes specialized knowledge while preserving fundamental medical competencies. Our approach synergistically integrates geometric-constrained gradient updates to selectively modulate target parameters with concept-aware token-level interventions that distinguish between preservation-critical and unlearning-targeted tokens via a unified four-level medical concept hierarchy. Comprehensive evaluations on the MedMCQA (surgical) and MHQA (anxiety, depression, trauma) datasets demonstrate superior performance, achieving an 82.7% forgetting rate and 88.5% knowledge preservation. Notably, our framework maintains robust privacy guarantees while requiring modification of only 0.1% of parameters, addressing critical needs for regulatory compliance, auditability, and ethical standards in clinical research.

Paperid: 1582, https://arxiv.org/pdf/2511.15936.pdf

Abstract:
Directed Acyclic Graph (DAG)-based Byzantine Fault-Tolerant (BFT) protocols have emerged as promising solutions for high-throughput blockchains. By decoupling data dissemination from transaction ordering and constructing a well-connected DAG in the mempool, these protocols enable zero-message ordering and implicit view changes. However, we identify a fundamental liveness vulnerability: an adversary can trigger mempool explosions to prevent transaction commitment, ultimately compromising the protocol's liveness. In response, this work presents Lifefin, a generic and self-stabilizing protocol designed to integrate seamlessly with existing DAG-based BFT protocols and circumvent such vulnerabilities. Lifefin leverages the Agreement on Common Subset (ACS) mechanism, allowing nodes to escape mempool explosions by committing transactions with bounded resource usage even in adverse conditions. As a result, Lifefin imposes (almost) zero overhead in typical cases while effectively eliminating liveness vulnerabilities. To demonstrate the effectiveness of Lifefin, we integrate it into two state-of-the-art DAG-based BFT protocols, Sailfish and Mysticeti, resulting in two enhanced variants: Sailfish-Lifefin and Mysticeti-Lifefin. We implement these variants and compare them with the original Sailfish and Mysticeti systems. Our evaluation demonstrates that Lifefin achieves comparable transaction throughput while introducing only minimal additional latency to resist similar attacks.

Paperid: 1583, https://arxiv.org/pdf/2511.15206.pdf

Abstract:
The integration of generative artificial intelligence (GenAI) into 6G networks promises substantial performance gains while simultaneously exposing novel security vulnerabilities rooted in multimodal data processing and autonomous reasoning. This article presents a unified perspective on cross-domain vulnerabilities that arise across integrated sensing and communication (ISAC), federated learning (FL), digital twins (DTs), diffusion models (DMs), and large telecommunication models (LTMs). We highlight emerging adversarial agents such as compromised DTs and LTMs that can manipulate both the physical and cognitive layers of 6G systems. To address these risks, we propose an adaptive evolutionary defense (AED) concept that continuously co-evolves with attacks through GenAI-driven simulation and feedback, combining physical-layer protection, secure learning pipelines, and cognitive-layer resilience. A case study using an LLM-based port prediction model for fluid-antenna systems demonstrates the susceptibility of GenAI modules to adversarial perturbations and the effectiveness of the proposed defense concept. Finally, we summarize open challenges and future research directions toward building trustworthy, quantum-resilient, and adaptive GenAI-enabled 6G networks.

Paperid: 1584, https://arxiv.org/pdf/2511.13654.pdf

Abstract:
In this paper, we present the first detailed analysis of how optimization hyperparameters -- such as learning rate, weight decay, momentum, and batch size -- influence robustness against both transfer-based and query-based attacks. Supported by theory and experiments, our study spans a variety of practical deployment settings, including centralized training, ensemble learning, and distributed training. We uncover a striking dichotomy: for transfer-based attacks, decreasing the learning rate significantly enhances robustness by up to $64\%$. In contrast, for query-based attacks, increasing the learning rate consistently leads to improved robustness by up to $28\%$ across various settings and data distributions. Leveraging these findings, we explore -- for the first time -- the optimization hyperparameter design space to jointly enhance robustness against both transfer-based and query-based attacks. Our results reveal that distributed models benefit the most from hyperparameter tuning, achieving a remarkable tradeoff by simultaneously mitigating both attack types more effectively than other training setups.

Paperid: 1585, https://arxiv.org/pdf/2511.12648.pdf

Abstract:
The security of autonomous vehicle networks is facing major challenges, owing to the complexity of sensor integration, real-time performance demands, and distributed communication protocols that expose vast attack surfaces around both individual and network-wide safety. Existing security schemes are unable to provide sub-10 ms (milliseconds) anomaly detection and distributed coordination of large-scale networks of vehicles within an acceptable safety/privacy framework. This paper introduces a three-tier hybrid security architecture HAVEN (Hierarchical Autonomous Vehicle Enhanced Network), which decouples real-time local threat detection and distributed coordination operations. It incorporates a light ensemble anomaly detection model on the edge (first layer), Byzantine-fault-tolerant federated learning to aggregate threat intelligence at a regional scale (middle layer), and selected blockchain mechanisms (top layer) to ensure critical security coordination. Extensive experimentation is done on a real-world autonomous driving dataset. Large-scale simulations with the number of vehicles ranging between 100 and 1000 and different attack types, such as sensor spoofing, jamming, and adversarial model poisoning, are conducted to test the scalability and resiliency of HAVEN. Experimental findings show sub-10 ms detection latency with an accuracy of 94% and F1-score of 92% across multimodal sensor data, Byzantine fault tolerance validated with 20\% compromised nodes, and a reduced blockchain storage overhead, guaranteeing sufficient differential privacy. The proposed framework overcomes the important trade-off between real-time safety obligation and distributed security coordination with novel three-tiered processing. The scalable architecture of HAVEN is shown to provide great improvement in detection accuracy as well as network resilience over other methods.

Paperid: 1586, https://arxiv.org/pdf/2511.12224.pdf

Abstract:
The real-time demand for system security leads to the detection rules becoming an integral part of the intrusion detection life-cycle. Rule-based detection often identifies malicious logs based on the predefined grammar logic, requiring experts with deep domain knowledge for rule generation. Therefore, automation of rule generation can result in significant time savings and ease the burden of rule-related tasks on security engineers. In this paper, we propose RulePilot, which mimics human expertise via LLM-based agent for addressing rule-related challenges like rule creation or conversion. Using RulePilot, the security analysts do not need to write down the rules following the grammar, instead, they can just provide the annotations such as the natural-language-based descriptions of a rule, our RulePilot can automatically generate the detection rules without more intervention. RulePilot is equipped with the intermediate representation (IR), which abstracts the complexity of config rules into structured, standardized formats, allowing LLMs to focus on generation rules in a more manageable and consistent way. We present a comprehensive evaluation of RulePilot in terms of textual similarity and execution success abilities, showcasing RulePilot can generate high-fidelity rules, outperforming the baseline models by up to 107.4% in textual similarity to ground truths and achieving better detection accuracy in real-world execution tests. We perform a case study from our industry collaborators in Singapore, showcasing that RulePilot significantly help junior analysts/general users in the rule creation process.

Paperid: 1587, https://arxiv.org/pdf/2511.11028.pdf

Abstract:
Vehicle-to-Everything (V2X) communication faces a critical authentication dilemma: traditional public-key schemes like ECDSA provide strong security but impose 2 ms verification delays unsuitable for collision avoidance, while symmetric approaches like TESLA achieve microsecond-level efficiency at the cost of 20-100 ms key disclosure latency. Neither meets 5G New Radio (NR)-V2X's stringent requirements for both immediate authentication and computational efficiency. This paper presents SALT-V, a novel hybrid authentication framework that reconciles this fundamental trade-off through intelligent protocol stratification. SALT-V employs ECDSA signatures for 10% of traffic (BOOT frames) to establish sender trust, then leverages this trust anchor to authenticate 90% of messages (DATA frames) using lightweight GMAC operations. The core innovation - an Ephemeral Session Tag (EST) whitelist mechanism - enables 95% of messages to achieve immediate verification without waiting for key disclosure, while Bloom filter integration provides O(1) revocation checking in 1 us. Comprehensive evaluation demonstrates that SALT-V achieves 0.035 ms average computation time (57x faster than pure ECDSA), 1 ms end-to-end latency, 41-byte overhead, and linear scalability to 2000 vehicles, making it the first practical solution to satisfy all safety-critical requirements for real-time V2X deployment.

Paperid: 1588, https://arxiv.org/pdf/2511.10423.pdf

Abstract:
Federated learning synchronizes models through gradient transmission and aggregation. However, these gradients pose significant privacy risks, as sensitive training data is embedded within them. Existing gradient inversion attacks suffer from significantly degraded reconstruction performance when gradients are perturbed by noise-a common defense mechanism. In this paper, we introduce gradient-guided conditional diffusion models for reconstructing private images from leaked gradients, without prior knowledge of the target data distribution. Our approach leverages the inherent denoising capability of diffusion models to circumvent the partial protection offered by noise perturbation, thereby improving attack performance under such defenses. We further provide a theoretical analysis of the reconstruction error bounds and the convergence properties of the attack loss, characterizing the impact of key factors-such as noise magnitude and attacked model architecture-on reconstruction quality. Extensive experiments demonstrate our attack's superior reconstruction performance with Gaussian noise-perturbed gradients, and confirm our theoretical findings.

Paperid: 1589, https://arxiv.org/pdf/2511.09263.pdf

Abstract:
This chapter explores how Decentralized Autonomous Organizations (DAOs), a novel institutional form based on blockchain technology, challenge traditional centralized governance structures. DAOs govern projects ranging from finance to science and digital communities. They aim to redistribute decision-making power through programmable, transparent, and participatory mechanisms. This chapter outlines both the opportunities DAOs present, such as incentive alignment, rapid coordination, and censorship resistance, and the challenges they face, including token concentration, low participation, and the risk of de facto centralization. It further discusses the emerging intersection of DAOs and artificial intelligence, highlighting the potential for increased automation alongside the dangers of diminished human oversight and algorithmic opacity. Ultimately, we discuss under what circumstances DAOs can fulfill their democratic promise or risk replicating the very power asymmetries they seek to overcome.

Paperid: 1590, https://arxiv.org/pdf/2511.08841.pdf

Abstract:
Differentially Private Stochastic Gradient Descent (DPSGD) is widely used to train deep neural networks with formal privacy guarantees. However, the addition of differential privacy (DP) often degrades model accuracy by introducing both noise and bias. Existing techniques typically address only one of these issues, as reducing DP noise can exacerbate clipping bias and vice-versa. In this paper, we propose a novel method, \emph{DP-PMLF}, which integrates per-sample momentum with a low-pass filtering strategy to simultaneously mitigate DP noise and clipping bias. Our approach uses per-sample momentum to smooth gradient estimates prior to clipping, thereby reducing sampling variance. It further employs a post-processing low-pass filter to attenuate high-frequency DP noise without consuming additional privacy budget. We provide a theoretical analysis demonstrating an improved convergence rate under rigorous DP guarantees, and our empirical evaluations reveal that DP-PMLF significantly enhances the privacy-utility trade-off compared to several state-of-the-art DPSGD variants.

Paperid: 1591, https://arxiv.org/pdf/2511.06192.pdf

Abstract:
A decade after its academic introduction, RowHammer (RH) remains a moving target that continues to challenge both the industry and academia. With its potential to serve as a critical attack vector, the ever-decreasing RH threshold now threatens DRAM process technology scaling, with a superlinearly increasing cost of RH protection solutions. Due to their generality and relatively lower performance costs, architectural RH solutions are the first line of defense against RH. However, the field is fragmented with varying views of the problem, terminologies, and even threat models. In this paper, we systematize architectural RH defenses from the last decade through the lens of streaming algorithms. We provide a taxonomy that encompasses 48 different works. We map multiple architectural RH defenses to the classical streaming algorithms, which extends to multiple proposals that did not identify this link. We also provide two practitioner guides. The first guide analyzes which algorithm best fits a given RHTH, location, process technology, storage type, and mitigative action. The second guide encourages future research to consult existing algorithms when architecting RH defenses. We illustrate this by demonstrating how Reservoir-Sampling can improve related RH defenses, and also introduce StickySampling that can provide mathematical security that related studies do not guarantee.

Paperid: 1592, https://arxiv.org/pdf/2511.05119.pdf

Abstract:
Operational Technology (OT) cybersecurity increasingly relies on rapid response across malware analysis, network forensics, and reverse engineering disciplines. We examine the performance of Cybersecurity AI (CAI), powered by the \texttt{alias1} model, during the Dragos OT CTF 2025 -- a 48-hour industrial control system (ICS) competition with more than 1,000 teams. Using CAI telemetry and official leaderboard data, we quantify CAI's trajectory relative to the leading human-operated teams. CAI reached Rank~1 between competition hours 7.0 and 8.0, crossed 10,000 points at 5.42~hours (1,846~pts/h), and completed 32 of the competition's 34 challenges before automated operations were paused at hour~24 with a final score of 18,900 points (6th place). The top-3 human teams solved 33 of 34 challenges, collectively leaving only the 600-point ``Kiddy Tags -- 1'' unsolved; they were also the only teams to clear the 1,000-point ``Moot Force'' binary. The top-5 human teams averaged 1,347~pts/h to the same milestone, marking a 37\% velocity advantage for CAI. We analyse time-resolved scoring, category coverage, and solve cadence. The evidence indicates that a mission-configured AI agent can match or exceed expert human crews in early-phase OT incident response while remaining subject to practical limits in sustained, multi-day operations.

Paperid: 1593, https://arxiv.org/pdf/2511.04440.pdf

Abstract:
Magecart skimming attacks have emerged as a significant threat to client-side security and user trust in online payment systems. This paper addresses the challenge of achieving robust and explainable detection of Magecart attacks through a comparative study of various Machine Learning (ML) models with a real-world dataset. Tree-based, linear, and kernel-based models were applied, further enhanced through hyperparameter tuning and feature selection, to distinguish between benign and malicious scripts. Such models are supported by a Behavior Deterministic Finite Automaton (DFA) which captures structural behavior patterns in scripts, helping to analyze and classify client-side script execution logs. To ensure robustness against adversarial evasion attacks, the ML models were adversarially trained and evaluated using attacks from the Adversarial Robustness Toolbox and the Adaptative Perturbation Pattern Method. In addition, concise explanations of ML model decisions are provided, supporting transparency and user trust. Experimental validation demonstrated high detection performance and interpretable reasoning, demonstrating that traditional ML models can be effective in real-world web security contexts.

Paperid: 1594, https://arxiv.org/pdf/2510.26555.pdf

Abstract:
With the rapid advancement of information technology, the complexity of applications continues to increase, and the cybersecurity challenges we face are also escalating. This paper aims to investigate the methods and practices of system security penetration testing, exploring how to enhance system security through systematic penetration testing processes and technical approaches. It also examines existing penetration tools, analyzing their strengths, weaknesses, and applicable domains to guide penetration testers in tool selection. Furthermore, based on the penetration testing process outlined in this paper, appropriate tools are selected to replicate attack processes using target ranges and target machines. Finally, through practical case analysis, lessons learned from successful attacks are summarized to inform future research.

Paperid: 1595, https://arxiv.org/pdf/2510.25863.pdf

Abstract:
This paper introduces the Agentic AI Governance Assurance & Trust Engine (AAGATE), a Kubernetes-native control plane designed to address the unique security and governance challenges posed by autonomous, language-model-driven agents in production. Recognizing the limitations of traditional Application Security (AppSec) tooling for improvisational, machine-speed systems, AAGATE operationalizes the NIST AI Risk Management Framework (AI RMF). It integrates specialized security frameworks for each RMF function: the Agentic AI Threat Modeling MAESTRO framework for Map, a hybrid of OWASP's AIVSS and SEI's SSVC for Measure, and the Cloud Security Alliance's Agentic AI Red Teaming Guide for Manage. By incorporating a zero-trust service mesh, an explainable policy engine, behavioral analytics, and decentralized accountability hooks, AAGATE provides a continuous, verifiable governance solution for agentic AI, enabling safe, accountable, and scalable deployment. The framework is further extended with DIRF for digital identity rights, LPCI defenses for logic-layer injection, and QSAF monitors for cognitive degradation, ensuring governance spans systemic, adversarial, and ethical risks.

Paperid: 1596, https://arxiv.org/pdf/2510.24976.pdf

Abstract:
Vision Transformers (ViTs) have emerged as powerful architectures in medical image analysis, excelling in tasks such as disease detection, segmentation, and classification. However, their reliance on large, attention-driven models makes them vulnerable to hardware-level attacks. In this paper, we propose a novel threat model referred to as Med-Hammer that combines the Rowhammer hardware fault injection with neural Trojan attacks to compromise the integrity of ViT-based medical imaging systems. Specifically, we demonstrate how malicious bit flips induced via Rowhammer can trigger implanted neural Trojans, leading to targeted misclassification or suppression of critical diagnoses (e.g., tumors or lesions) in medical scans. Through extensive experiments on benchmark medical imaging datasets such as ISIC, Brain Tumor, and MedMNIST, we show that such attacks can remain stealthy while achieving high attack success rates about 82.51% and 92.56% in MobileViT and SwinTransformer, respectively. We further investigate how architectural properties, such as model sparsity, attention weight distribution, and the number of features of the layer, impact attack effectiveness. Our findings highlight a critical and underexplored intersection between hardware-level faults and deep learning security in healthcare applications, underscoring the urgent need for robust defenses spanning both model architectures and underlying hardware platforms.

Paperid: 1597, https://arxiv.org/pdf/2510.24317.pdf

Abstract:
Cybersecurity spans multiple interconnected domains, complicating the development of meaningful, labor-relevant benchmarks. Existing benchmarks assess isolated skills rather than integrated performance. We find that pre-trained knowledge of cybersecurity in LLMs does not imply attack and defense abilities, revealing a gap between knowledge and capability. To address this limitation, we present the Cybersecurity AI Benchmark (CAIBench), a modular meta-benchmark framework that allows evaluating LLM models and agents across offensive and defensive cybersecurity domains, taking a step towards meaningfully measuring their labor-relevance. CAIBench integrates five evaluation categories, covering over 10,000 instances: Jeopardy-style CTFs, Attack and Defense CTFs, Cyber Range exercises, knowledge benchmarks, and privacy assessments. Key novel contributions include systematic simultaneous offensive-defensive evaluation, robotics-focused cybersecurity challenges (RCTF2), and privacy-preserving performance assessment (CyberPII-Bench). Evaluation of state-of-the-art AI models reveals saturation on security knowledge metrics (~70\% success) but substantial degradation in multi-step adversarial (A\&D) scenarios (20-40\% success), or worse in robotic targets (22\% success). The combination of framework scaffolding and LLM model choice significantly impacts performance; we find that proper matches improve up to 2.6$\times$ variance in Attack and Defense CTFs. These results demonstrate a pronounced gap between conceptual knowledge and adaptive capability, emphasizing the need for a meta-benchmark.

Paperid: 1598, https://arxiv.org/pdf/2510.23024.pdf

Abstract:
Virtual Reality (VR) has gained increasing traction among various domains in recent years, with major companies such as Meta, Pico, and Microsoft launching their application stores to support third-party developers in releasing their applications (or simply apps). These apps offer rich functionality but inherently collect privacy-sensitive data, such as user biometrics, behaviors, and the surrounding environment. Nevertheless, there is still a lack of domain-specific regulations to govern the data handling of VR apps, resulting in significant variations in their privacy practices among app stores. In this work, we present the first comprehensive multi-store study of privacy practices in the current VR app ecosystem, covering a large-scale dataset involving 6,565 apps collected from five major app stores. We assess both declarative and behavioral privacy practices of VR apps, using a multi-faceted approach based on natural language processing, reverse engineering, and static analysis. Our assessment reveals significant privacy compliance issues across all stores, underscoring the premature status of privacy protection in this rapidly growing ecosystem. For instance, one third of apps fail to declare their use of sensitive data, and 21.5\% of apps neglect to provide valid privacy policies. Our work sheds light on the status quo of privacy protection within the VR app ecosystem for the first time. Our findings should raise an alert to VR app developers and users, and encourage store operators to implement stringent regulations on privacy compliance among VR apps.

Paperid: 1599, https://arxiv.org/pdf/2510.22100.pdf

Abstract:
The Internet of Things (IoT) relies heavily on resource-limited devices to communicate critical (e.g., military data) information under low-energy adversarial environments and low-latency wireless channels. Authenticated Encryption (AE) guarantees confidentiality, authenticity, and integrity, making it a vital security service for IoT. However, current deployed (lightweight) AE standards lack essential features like key compromise resiliency and compact authentication tags, as well as performance enhancements such as offline-online cryptography. To address these gaps, we propose Graphene, the first (to our knowledge) symmetric Forward-secure and Aggregate Authenticated Encryption (FAAE) framework designed for the performance and security demands of low-end IoT infrastructures. Graphene innovates by synergizing key evolution strategies and offline-online cryptographic processing with Universal Message Authentication Codes (UMACs) to guarantee breach-resiliency, near-optimal online latency, and compactness. We demonstrate Graphene efficiency through two distinct instantiations, each balancing unique performance trade-offs with extensibility for diverse MACs. Our experimental evaluation on commodity hardware and 32-bit ARM Cortex-M4 microcontroller shows Graphene significant performance gains over existing alternatives. Graphene is also backward compatible with standard-compliant cryptographic implementations. We release our implementation as open source for public testing and adaptation.

Paperid: 1600, https://arxiv.org/pdf/2510.20930.pdf

Abstract:
Understanding adversarial behavior in cybersecurity has traditionally relied on high-level intelligence reports and manual interpretation of attack chains. However, real-time defense requires the ability to infer attacker intent and cognitive strategy directly from low-level system telemetry such as intrusion detection system (IDS) logs. In this paper, we propose a novel framework that leverages large language models (LLMs) to analyze Suricata IDS logs and infer attacker actions in terms of MITRE ATT&CK techniques. Our approach is grounded in the hypothesis that attacker behavior reflects underlying cognitive biases such as loss aversion, risk tolerance, or goal persistence that can be extracted and modeled through careful observation of log sequences. This lays the groundwork for future work on behaviorally adaptive cyber defense and cognitive trait inference. We develop a strategy-driven prompt system to segment large amounts of network logs data into distinct behavioral phases in a highly efficient manner, enabling the LLM to associate each phase with likely techniques and underlying cognitive motives. By mapping network-layer events to high-level attacker strategies, our method reveals how behavioral signals such as tool switching, protocol transitions, or pivot patterns correspond to psychologically meaningful decision points. The results demonstrate that LLMs can bridge the semantic gap between packet-level logs and strategic intent, offering a pathway toward cognitive-adaptive cyber defense. Keywords: Cognitive Cybersecurity, Large Language Models (LLMs), Cyberpsychology, Intrusion Detection Systems (IDS), MITRE ATT&CK, Cognitive Biases

Paperid: 1601, https://arxiv.org/pdf/2510.20657.pdf

Abstract:
We examine whether measured cognitive processes predict cyber-attack behavior. We analyzed data that included psychometric scale responses and labeled attack behaviors from cybersecurity professionals who conducted red-team operations against a simulated enterprise network. We employed multilevel mixed-effects Poisson regression with technique counts nested within participants to test whether cognitive processes predicted technique-specific usage. The scales significantly predicted technique use, but effects varied by technique rather than operating uniformly. Neither expertise level nor experimental treatment condition significantly predicted technique patterns, indicating that cognitive processes may be stronger drivers of technique selection than training or experience. These findings demonstrate that individual cognitive differences shape cyber-attack behavior and support the development of psychology-informed defense strategies.

Paperid: 1602, https://arxiv.org/pdf/2510.20566.pdf

Abstract:
Existing defence mechanisms have demonstrated significant effectiveness in mitigating rule-based Denial-of-Service (DoS) attacks, leveraging predefined signatures and static heuristics to identify and block malicious traffic. However, the emergence of AI-driven techniques presents new challenges to SDN security, potentially compromising the efficacy of existing defence mechanisms. In this paper, we introduce~AdaDoS, an adaptive attack model that disrupt network operations while evading detection by existing DoS-based detectors through adversarial reinforcement learning (RL). Specifically, AdaDoS models the problem as a competitive game between an attacker, whose goal is to obstruct network traffic without being detected, and a detector, which aims to identify malicious traffic. AdaDoS can solve this game by dynamically adjusting its attack strategy based on feedback from the SDN and the detector. Additionally, recognising that attackers typically have less information than defenders, AdaDoS formulates the DoS-like attack as a partially observed Markov decision process (POMDP), with the attacker having access only to delay information between attacker and victim nodes. We address this challenge with a novel reciprocal learning module, where the student agent, with limited observations, enhances its performance by learning from the teacher agent, who has full observational capabilities in the SDN environment. AdaDoS represents the first application of RL to develop DoS-like attack sequences, capable of adaptively evading both machine learning-based and rule-based DoS-like attack detectors.

Paperid: 1603, https://arxiv.org/pdf/2510.20129.pdf

Abstract:
Large Language Models (LLMs), despite advances in safety alignment, remain vulnerable to jailbreak attacks designed to circumvent protective mechanisms. Prevailing defense strategies rely on external interventions, such as input filtering or output modification, which often lack generalizability and compromise model utility while incurring significant computational overhead. In this work, we introduce a new, training-free defense paradigm, Self-Activating Internal Defense (SAID), which reframes the defense task from external correction to internal capability activation. SAID uniquely leverages the LLM's own reasoning abilities to proactively identify and neutralize malicious intent through a three-stage pipeline: model-native intent distillation to extract core semantics, optimal safety prefix probing to activate latent safety awareness, and a conservative aggregation strategy to ensure robust decision-making. Extensive experiments on five open-source LLMs against six advanced jailbreak attacks demonstrate that SAID substantially outperforms state-of-the-art defenses in reducing harmful outputs. Crucially, it achieves this while preserving model performance on benign tasks and incurring minimal computational overhead. Our work establishes that activating the intrinsic safety mechanisms of LLMs is a more robust and scalable path toward building safer and more reliable aligned AI systems.

Paperid: 1604, https://arxiv.org/pdf/2510.18585.pdf

Abstract:
Phishing websites remain a significant cybersecurity threat, necessitating accurate and cost-effective detection mechanisms. In this paper, we present CLASP, a novel system that effectively identifies phishing websites by leveraging multiple intelligent agents, built using large language models (LLMs), to analyze different aspects of a web resource. The system processes URLs or QR codes, employing specialized LLM-based agents that evaluate the URL structure, webpage screenshot, and HTML content to predict potential phishing threats. To optimize performance while minimizing operational costs, we experimented with multiple combination strategies for agent-based analysis, ultimately designing a strategic combination that ensures the per-website evaluation expense remains minimal without compromising detection accuracy. We tested various LLMs, including Gemini 1.5 Flash and GPT-4o mini, to build these agents and found that Gemini 1.5 Flash achieved the best performance with an F1 score of 83.01% on a newly curated dataset. Also, the system maintained an average processing time of 2.78 seconds per website and an API cost of around $3.18 per 1,000 websites. Moreover, CLASP surpasses leading previous solutions, achieving over 40% higher recall and a 20% improvement in F1 score for phishing detection on the collected dataset. To support further research, we have made our dataset publicly available, supporting the development of more advanced phishing detection systems.

Paperid: 1605, https://arxiv.org/pdf/2510.17521.pdf

Abstract:
We empirically evaluate whether AI systems are more effective at attacking or defending in cybersecurity. Using CAI (Cybersecurity AI)'s parallel execution framework, we deployed autonomous agents in 23 Attack/Defense CTF battlegrounds. Statistical analysis reveals defensive agents achieve 54.3% unconstrained patching success versus 28.3% offensive initial access (p=0.0193), but this advantage disappears under operational constraints: when defense requires maintaining availability (23.9%) and preventing all intrusions (15.2%), no significant difference exists (p>0.05). Exploratory taxonomy analysis suggests potential patterns in vulnerability exploitation, though limited sample sizes preclude definitive conclusions. This study provides the first controlled empirical evidence challenging claims of AI attacker advantage, demonstrating that defensive effectiveness critically depends on success criteria, a nuance absent from conceptual analyses but essential for deployment. These findings underscore the urgency for defenders to adopt open-source Cybersecurity AI frameworks to maintain security equilibrium against accelerating offensive automation.

Paperid: 1606, https://arxiv.org/pdf/2510.16835.pdf

Abstract:
The rapidly evolving Android malware ecosystem demands high-quality, real-time datasets as a foundation for effective detection and defense. With the widespread adoption of mobile devices across industrial systems, they have become a critical yet often overlooked attack surface in industrial cybersecurity. However, mainstream datasets widely used in academia and industry (e.g., Drebin) exhibit significant limitations: on one hand, their heavy reliance on VirusTotal's multi-engine aggregation results introduces substantial label noise; on the other hand, outdated samples reduce their temporal relevance. Moreover, automated labeling tools (e.g., AVClass2) suffer from suboptimal aggregation strategies, further compounding labeling errors and propagating inaccuracies throughout the research community.

Paperid: 1607, https://arxiv.org/pdf/2510.16620.pdf

Abstract:
We consider reversely-degraded wiretap channels, for which the secrecy capacity is zero if there is no channel feedback. This work focuses on a seeded modular code design for the Gaussian wiretap channel with channel output feedback, combining universal hash functions for security and learned feedback-based codes for reliability to achieve positive secrecy rates. We study the trade-off between communication reliability and information leakage, illustrating that feedback enables agreeing on a secret key shared between legitimate parties, overcoming the security advantage of the wiretapper. Our findings also motivate code designs for sensing-assisted secure communication, to be used in next-generation integrated sensing and communication methods.

Paperid: 1608, https://arxiv.org/pdf/2510.16558.pdf

Abstract:
The Model Context Protocol (MCP) is an emerging open standard that enables AI-powered applications to interact with external tools through structured metadata. A rapidly growing ecosystem has formed around MCP, including a wide range of MCP hosts (i.e., Cursor, Windsurf, Claude Desktop, and Cline), MCP registries (i.e., mcp.so, MCP Market, MCP Store, Pulse MCP, Smithery, and npm), and thousands of community-contributed MCP servers. Although the MCP ecosystem is gaining traction, there has been little systematic study of its architecture and associated security risks. In this paper, we present the first comprehensive security analysis of the MCP ecosystem. We decompose MCP ecosystem into three core components: hosts, registries, and servers, and study the interactions and trust relationships among them. Users search for servers on registries and configure them in the host, which translates LLM-generated output into external tool invocations provided by the servers and executes them. Our qualitative analysis reveals that hosts lack output verification mechanisms for LLM-generated outputs, enabling malicious servers to manipulate model behavior and induce a variety of security threats, including but not limited to sensitive data exfiltration. We uncover a wide range of vulnerabilities that enable attackers to hijack servers, due to the lack of a vetted server submission process in registries. To support our analysis, we collect and analyze a dataset of 67,057 servers from six public registries. Our quantitative analysis demonstrates that a substantial number of servers can be hijacked by attackers. Finally, we propose practical defense strategies for MCP hosts, registries, and users. We responsibly disclosed our findings to affected hosts and registries.

Paperid: 1609, https://arxiv.org/pdf/2510.15801.pdf

Abstract:
Cyber cognitive attacks leverage disruptive innovations (DIs) to exploit psychological biases and manipulate decision-making processes. Emerging technologies, such as AI-driven disinformation and synthetic media, have accelerated the scale and sophistication of these threats. Prior studies primarily categorize current cognitive attack tactics, lacking predictive mechanisms to anticipate future DIs and their malicious use in cognitive attacks. This paper addresses these gaps by introducing a novel predictive methodology for forecasting the emergence of DIs and their malicious uses in cognitive attacks. We identify trends in adversarial tactics and propose proactive defense strategies.

Paperid: 1610, https://arxiv.org/pdf/2510.15367.pdf

Abstract:
Federated learning (FL) is a distributed machine learning paradigm that enables multiple clients to collaboratively train a shared model without disclosing their local data. To address privacy issues of gradient, several privacy-preserving machine-learning schemes based on multi-client functional encryption (MCFE) have been proposed. However, existing MCFE-based schemes cannot support client dropout or flexible threshold selection, which are essential for practical FL. In this paper, we design a flexible threshold multi-client functional encryption for inner product (FTMCFE-IP) scheme, where multiple clients generate ciphertexts independently without any interaction. In the encryption phase, clients are able to choose a threshold flexibly without reinitializing the system. The decryption can be performed correctly when the number of online clients satisfies the threshold. An authorized user are allowed to compute the inner product of the vectors associated with his/her functional key and the ciphertext, respectively, but cannot learning anything else. Especially, the presented scheme supports clients drop out. Furthermore, we provide the definition and security model of our FTMCFE-IP scheme,and propose a concrete construction. The security of the designed scheme is formally proven. Finally, we implement and evaluate our FTMCFE-IP scheme.

Paperid: 1611, https://arxiv.org/pdf/2510.14675.pdf

Abstract:
To mitigate interrupt-based stepping attacks (notably using SGX-Step), Intel introduced AEX-Notify, an ISA extension to Intel SGX that aims to prevent deterministic single-stepping. In this work, we introduce AEX-NStep, the first interrupt counting attack on AEX-Notify-enabled Enclaves. We show that deterministic single-stepping is not required for interrupt counting attacks to be practical and that, therefore, AEX-Notify does not entirely prevent such attacks. We specifically show that one of AEX-Notify's security guarantees, obfuscated forward progress, does not hold, and we introduce two new probabilistic interrupt counting attacks. We use these attacks to construct a practical ECDSA key leakage attack on an AEX-Notify-enabled SGX enclave. Our results extend the original security analysis of AEX-Notify and inform the design of future mitigations.

Paperid: 1612, https://arxiv.org/pdf/2510.13824.pdf

Abstract:
This demo presents the first implementation of multi-layer secret sharing on commercial-off-the-shelf (COTS) 5G user equipment (UE), operating without infrastructure modifications or pre-shared keys. Our XOR-based approach distributes secret shares across network operators and distributed relays, ensuring perfect recovery and data confidentiality even if one network operator and one relay are simultaneously lost (e.g., under denial of service (DoS) or unanticipated attacks).

Paperid: 1613, https://arxiv.org/pdf/2510.13370.pdf

Abstract:
Service Level Agreement (SLA) monitoring in service-oriented environments suffers from inherent trust conflicts when providers self-report metrics, creating incentives to underreport violations. We introduce a framework for generating verifiable SLA violation claims through trusted hardware monitors and zero-knowledge proofs, establishing cryptographic foundations for genuine trustworthiness in service ecosystems. Our approach starts with machine-readable SLA clauses converted into verifiable predicates and monitored within Trusted Execution Environments. These monitors collect timestamped telemetry, organize measurements into Merkle trees, and produce signed attestations. Zero-knowledge proofs aggregate Service-Level Indicators to evaluate compliance, generating cryptographic proofs verifiable by stakeholders, arbitrators, or insurers in disputes, without accessing underlying data. This ensures three security properties: integrity, authenticity, and validity. Our prototype demonstrates linear scaling up to over 1 million events per hour for measurements with near constant-time proof generation and verification for single violation claims, enabling trustless SLA enforcement through cryptographic guarantees for automated compliance verification in service monitoring.

Paperid: 1614, https://arxiv.org/pdf/2510.12031.pdf

Abstract:
E-commerce mobile applications are central to global financial transactions, making their security and privacy crucial. In this study, we analyze 92 top-grossing Android e-commerce apps (58 U.S.-based and 34 international) using MobSF, AndroBugs, and RiskInDroid. Our analysis shows widespread SSL and certificate weaknesses, with approximately 92% using unsecured HTTP connections and an average MobSF security score of 40.92/100. Over-privileged permissions were identified in 77 apps. While U.S. apps exhibited fewer manifest, code, and certificate vulnerabilities, both groups showed similar network-related issues. We advocate for the adoption of stronger, standardized, and user-focused security practices across regions.

Paperid: 1615, https://arxiv.org/pdf/2510.11299.pdf

Abstract:
Privacy models were introduced in privacy-preserving data publishing and statistical disclosure control with the promise to end the need for costly empirical assessment of disclosure risk. We examine how well this promise is kept by the main privacy models. We find they may fail to provide adequate protection guarantees because of problems in their definition or incur unacceptable trade-offs between privacy protection and utility preservation. Specifically, k-anonymity may not entirely exclude disclosure if enforced with deterministic mechanisms or without constraints on the confidential values. On the other hand, differential privacy (DP) incurs unacceptable utility loss for small budgets and its privacy guarantee becomes meaningless for large budgets. In the latter case, an ex post empirical assessment of disclosure risk becomes necessary, undermining the main appeal of privacy models. Whereas the utility preservation of DP can only be improved by relaxing its privacy guarantees, we argue that a semantic reformulation of k-anonymity can offer more robust privacy without losing utility with respect to traditional syntactic k-anonymity.

Paperid: 1616, https://arxiv.org/pdf/2510.08495.pdf

Abstract:
We present a generic compiler that converts any $\mathsf{MIP}^{*}$ protocol into a succinct interactive argument where the communication and the verifier are classical, and where post-quantum soundness relies on the post-quantum sub-exponential hardness of the Learning with Errors ($\mathsf{LWE}$) problem. Prior to this work, such a compiler for $\mathsf{MIP}^{*}$ was given by Kalai, Lombardi, Vaikuntanathan and Yang (STOC 2022), but the post-quantum soundness of this compiler is still under investigation. More generally, our compiler can be applied to any $\mathsf{QIP}$ protocol which is sound only against semi-malicious provers that follow the prescribed protocol, but with possibly malicious initial state. Our compiler consists of two steps. We first show that if a language $\mathcal{L}$ has a $\mathsf{QIP}$ with semi-malicious soundness, where the prover runs in time $T$, then $\mathcal{L} \in \mathsf{QMATIME}(T)$. Then we construct a succinct classical argument for any such language, where the communication complexity grows polylogarithmically with $T$, under the post-quantum sub-exponential hardness of $\mathsf{LWE}$.

Paperid: 1617, https://arxiv.org/pdf/2510.08479.pdf

Abstract:
Provenance plays a critical role in maintaining traceability of a system's actions for root cause analysis of security threats and impacts. Provenance collection is often incorporated into the reference monitor of systems to ensure that an audit trail exists of all events, that events are completely captured, and that logging of such events cannot be bypassed. However, recent research has questioned whether existing state-of-the-art provenance collection systems fail to ensure the security guarantees of a true reference monitor due to the 'super producer threat' in which provenance generation can overload a system to force the system to drop security-relevant events and allow an attacker to hide their actions. One approach towards solving this threat is to enforce resource isolation, but that does not fully solve the problems resulting from hardware dependencies and performance limitations. In this paper, we show how an operating system's kernel scheduler can mitigate this threat, and we introduce Venus, a learned scheduler for Linux specifically designed for provenance. Unlike conventional schedulers that ignore provenance completeness requirements, Venus leverages reinforcement learning to learn provenance task behavior and to dynamically optimize resource allocation. We evaluate Venus's efficacy and show that Venus significantly improves both the completeness and efficiency of provenance collection systems compared to traditional scheduling, while maintaining reasonable overheads and even improving overall runtime in certain cases compared to the default Linux scheduler.

Paperid: 1618, https://arxiv.org/pdf/2510.08333.pdf

Abstract:
With the growing reliance on the vulnerable Automatic Dependent Surveillance-Broadcast (ADS-B) protocol in air traffic management (ATM), ensuring security is critical. This study investigates emerging machine learning models and training strategies to improve AI-based intrusion detection systems (IDS) for ADS-B. Focusing on ground-based ATM systems, we evaluate two deep learning IDS implementations: one using a transformer encoder and the other an extended Long Short-Term Memory (xLSTM) network, marking the first xLSTM-based IDS for ADS-B. A transfer learning strategy was employed, involving pre-training on benign ADS-B messages and fine-tuning with labeled data containing instances of tampered messages. Results show this approach outperforms existing methods, particularly in identifying subtle attacks that progressively undermine situational awareness. The xLSTM-based IDS achieves an F1-score of 98.9%, surpassing the transformer-based model at 94.3%. Tests on unseen attacks validated the generalization ability of the xLSTM model. Inference latency analysis shows that the 7.26-second delay introduced by the xLSTM-based IDS fits within the Secondary Surveillance Radar (SSR) refresh interval (5-12 s), although it may be restrictive for time-critical operations. While the transformer-based IDS achieves a 2.1-second latency, it does so at the cost of lower detection performance.

Paperid: 1619, https://arxiv.org/pdf/2510.08272.pdf

Abstract:
While interest in the open RISC-V instruction set architecture is growing, tools to assess the security of concrete processor implementations are lacking. There are dedicated tools and benchmarks for common microarchitectural side-channel vulnerabilities for popular processor families such as Intel x86-64 or ARM, but not for RISC-V. In this paper we describe our efforts in porting an Intel x86-64 benchmark suite for cache-based timing vulnerabilities to RISC-V. We then use this benchmark to evaluate the security of three commercially available RISC-V processors, the T-Head C910 and the SiFive U54 and U74 cores. We observe that the C910 processor exhibits more distinct timing types than the other processors, leading to the assumption that code running on the C910 would be exposed to more microarchitectural vulnerability sources. In addition, our evaluation reveals that $37.5\%$ of the vulnerabilities covered by the benchmark exist in all processors, while only $6.8\%$ are absent from all cores. Our work, in particular the ported benchmark, aims to support RISC-V processor designers to identify leakage sources early in their designs and to support the development of countermeasures.

Paperid: 1620, https://arxiv.org/pdf/2510.08101.pdf

Abstract:
Web measurements are a well-established methodology for assessing the security and privacy landscape of the Internet. However, existing top lists of popular websites commonly used as measurement targets are unlabeled and lack semantic information about the nature of the sites they include. This limitation makes targeted measurements challenging, as researchers often need to rely on ad-hoc techniques to bias their datasets toward specific categories of interest. In this paper, we investigate the use of Large Language Models (LLMs) as a means to enable targeted web measurement studies through their semantic understanding capabilities. Building on prior literature, we identify key website classification tasks relevant to web measurements and construct datasets to systematically evaluate the performance of different LLMs on these tasks. Our results demonstrate that LLMs may achieve strong performance across multiple classification scenarios. We then conduct LLM-assisted web measurement studies inspired by prior work and rigorously assess the validity of the resulting research inferences. Our results demonstrate that LLMs can serve as a practical tool for analyzing security and privacy trends on the Web.

Paperid: 1621, https://arxiv.org/pdf/2510.07584.pdf

Abstract:
Introduced in 2003 and 2005, Alekhnovich and Regev' schemes were the first public-key encryptions whose security is only based on the average hardness of decoding random linear codes and LWE, without other security assumptions. Such security guarantees made them very popular, being at the origin of the now standardized HQC or Kyber. We present an adaptation of Alekhnovich and Regev' encryption scheme whose security is only based on the hardness of a slight variation of MinRank, the so-called stationary-MinRank problem. We succeeded to reach this strong security guarantee by showing that stationary-MinRank benefits from a search-to-decision reduction. Our scheme therefore brings a partial answer to the long-standing open question of building an encryption scheme whose security relies solely on the hardness of MinRank. Finally, we show after a thoroughly security analysis that our scheme is practical and competitive with other encryption schemes admitting such strong security guarantees. Our scheme is slightly less efficient than FrodoKEM, but much more efficient than Alekhnovich and Regev' original schemes, with possibilities of improvements by considering more structure, in the same way as HQC and Kyber.

Paperid: 1622, https://arxiv.org/pdf/2510.07515.pdf

Abstract:
In 2021, Chen, Liu, and Zhandry presented an efficient quantum algorithm for the average-case $\ell_\infty$-Short Integer Solution ($\mathrm{SIS}^\infty$) problem, in a parameter range outside the normal range of cryptographic interest, but still with no known efficient classical algorithm. This was particularly exciting since $\mathrm{SIS}^\infty$ is a simple problem without structure, and their algorithmic techniques were different from those used in prior exponential quantum speedups. We present efficient classical algorithms for all of the $\mathrm{SIS}^\infty$ and (more general) Constrained Integer Solution problems studied in their paper, showing there is no exponential quantum speedup anymore.

Paperid: 1623, https://arxiv.org/pdf/2510.07479.pdf

Abstract:
We present $\mathsf{Miranda}$, the first family of full-domain-hash signatures based on matrix codes. This signature scheme fulfils the paradigm of Gentry, Peikert and Vaikuntanathan ($\mathsf{GPV}$), which gives strong security guarantees. Our trapdoor is very simple and generic: if we propose it with matrix codes, it can actually be instantiated in many other ways since it only involves a subcode of a decodable code (or lattice) in a unique decoding regime of parameters. Though $\mathsf{Miranda}$ signing algorithm relies on a decoding task where there is exactly one solution, there are many possible signatures given a message to sign and we ensure that signatures are not leaking information on their underlying trapdoor by means of a very simple procedure involving the drawing of a small number of uniform bits. In particular $\mathsf{Miranda}$ does not use a rejection sampling procedure which makes its implementation a very simple task contrary to other $\mathsf{GPV}$-like signatures schemes such as $\mathsf{Falcon}$ or even $\mathsf{Wave}$. We instantiate $\mathsf{Miranda}$ with the famous family of Gabidulin codes represented as spaces of matrices and we study thoroughly its security (in the EUF-CMA security model). For~$128$ bits of classical security, the signature sizes are as low as~$90$ bytes and the public key sizes are in the order of~$2.6$ megabytes.

Paperid: 1624, https://arxiv.org/pdf/2510.07304.pdf

Abstract:
Machine learning (ML) models memorize and leak training data, causing serious privacy issues to data owners. Training algorithms with differential privacy (DP), such as DP-SGD, have been gaining attention as a solution. However, DP-SGD adds a noise at each training iteration, which degrades the accuracy of the trained model. To improve accuracy, a new family of approaches adds carefully designed correlated noises, so that noises cancel out each other across iterations. We performed an extensive characterization study of these new mechanisms, for the first time to the best of our knowledge, and show they incur non-negligible overheads when the model is large or uses large embedding tables. Motivated by the analysis, we propose Cocoon, a hardware-software co-designed framework for efficient training with correlated noises. Cocoon accelerates models with embedding tables through pre-computing and storing correlated noises in a coalesced format (Cocoon-Emb), and supports large models through a custom near-memory processing device (Cocoon-NMP). On a real system with an FPGA-based NMP device prototype, Cocoon improves the performance by 2.33-10.82x(Cocoon-Emb) and 1.55-3.06x (Cocoon-NMP).

Paperid: 1625, https://arxiv.org/pdf/2510.05798.pdf

Abstract:
Supply chain security is extremely important for modern applications running at scale in the cloud. In fact, they involve a large number of heterogeneous microservices that also include third-party software. As a result, security vulnerabilities are hard to identify and mitigate before they start being actively exploited by attackers. For this reason, governments have recently introduced cybersecurity regulations that require vendors to share a software bill of material (SBOM) with end users or regulators. An SBOM can be employed to identify the security vulnerabilities of a software component even without access to its source code, as long as it is accurate and interoperable across different tools. This work evaluates this issue through a comprehensive study of tools for SBOM generation and vulnerability scanning, including both open-source software and cloud services from major providers. We specifically target software containers and focus on operating system packages in Linux distributions that are widely used as base images due to their far-reaching security impact. Our findings show that the considered tools are largely incompatible, leading to inaccurate reporting and a large amount of undetected vulnerabilities. We uncover the SBOM confusion vulnerability, a byproduct of such fragmented ecosystem, where inconsistent formats prevent reliable vulnerability detection across tools.

Paperid: 1626, https://arxiv.org/pdf/2510.05393.pdf

Abstract:
There are various notions of quantum pseudorandomness, such as pseudorandom unitaries (PRUs), pseudorandom state generators (PRSGs) and pseudorandom function-like state generators (PRSFGs). Unlike the different notions of classical pseudorandomness, which are known to be existentially equivalent to each other, the relation between quantum pseudorandomness has yet to be fully established. We present some evidence suggesting that some quantum pseudorandomness is unlikely to be constructed from the others, or at least is hard to construct unless some conjectures are false. This indicates that quantum pseudorandomness could behave quite differently from classical pseudorandomness. We study new oracle worlds where one quantum pseudorandomness exists but another pseudorandomness does not under some assumptions or constraints, and provide potential directions to achieve the full black-box separation. More precisely: - We give a unitary oracle relative to which PRFSGs exist but PRUs without using ancilla do not. This can be extended to the general PRUs if we can prove a structural property of the PRU algorithm. - Assuming an isoperimetric inequality-style conjecture, we show a unitary oracle world where log-length output PRFSGs exist but proving the existence of quantum-computable pseudorandom generators (QPRGs) with negligible correctness error is as hard as proving that ${\sf BQP}\neq {\sf QCMA}$. This result suggests that the inverse-polynomial error in the state of the art construction of QPRGs from log-length PRSGs is inherent. - Assuming the same conjecture, we prove that some natural way of constructing super-log-length output PRSGs from log-length output PRFSGs is impossible. This partly complements the known hardness of shrinking the PRSG output lengths. Along the way, we also discuss other potential approaches to extend the PRSG output lengths.

Paperid: 1627, https://arxiv.org/pdf/2510.05180.pdf

Abstract:
In critical IoT environments, such as smart homes and industrial systems, effective Intrusion Detection Systems (IDS) are essential for ensuring security. However, developing robust IDS solutions remains a significant challenge. Traditional machine learning-based IDS models typically require large datasets, but data sharing is often limited due to privacy and security concerns. Federated Learning (FL) presents a promising alternative by enabling collaborative model training without sharing raw data. Despite its advantages, FL still faces key challenges, such as data heterogeneity (non-IID data) and high energy and computation costs, particularly for resource constrained IoT devices. To address these issues, this paper proposes OptiFLIDS, a novel approach that applies pruning techniques during local training to reduce model complexity and energy consumption. It also incorporates a customized aggregation method to better handle pruned models that differ due to non-IID data distributions. Experiments conducted on three recent IoT IDS datasets, TON_IoT, X-IIoTID, and IDSIoT2024, demonstrate that OptiFLIDS maintains strong detection performance while improving energy efficiency, making it well-suited for deployment in real-world IoT environments.

Paperid: 1628, https://arxiv.org/pdf/2510.05165.pdf

Abstract:
Cross-slice attack attribution in 6G networks faces the fundamental challenge of distinguishing genuine causal relationships from spurious correlations in shared infrastructure environments. We propose a theoretically-grounded domain-adapted Granger causality framework that integrates statistical causal inference with network-specific resource modeling for real-time attack attribution. Our approach addresses key limitations of existing methods by incorporating resource contention dynamics and providing formal statistical guarantees. Comprehensive evaluation on a production-grade 6G testbed with 1,100 empirically-validated attack scenarios demonstrates 89.2% attribution accuracy with sub-100ms response time, representing a statistically significant 10.1 percentage point improvement over state-of-the-art baselines. The framework provides interpretable causal explanations suitable for autonomous 6G security orchestration.

Paperid: 1629, https://arxiv.org/pdf/2510.04979.pdf

Abstract:
Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves are fundamental tools for evaluating machine learning classifiers, offering detailed insights into the trade-offs between true positive rate vs. false positive rate (ROC) or precision vs. recall (PR). However, in Federated Learning (FL) scenarios, where data is distributed across multiple clients, computing these curves is challenging due to privacy and communication constraints. Specifically, the server cannot access raw prediction scores and class labels, which are used to compute the ROC and PR curves in a centralized setting. In this paper, we propose a novel method for approximating ROC and PR curves in a federated setting by estimating quantiles of the prediction score distribution under distributed differential privacy. We provide theoretical bounds on the Area Error (AE) between the true and estimated curves, demonstrating the trade-offs between approximation accuracy, privacy, and communication cost. Empirical results on real-world datasets demonstrate that our method achieves high approximation accuracy with minimal communication and strong privacy guarantees, making it practical for privacy-preserving model evaluation in federated systems.

Paperid: 1630, https://arxiv.org/pdf/2510.03819.pdf

Abstract:
The rapid advancement of blockchain technology has precipitated the widespread adoption of Ethereum and smart contracts across a variety of sectors. However, this has also given rise to numerous fraudulent activities, with many speculators embedding Ponzi schemes within smart contracts, resulting in significant financial losses for investors. Currently, there is a lack of effective methods for identifying and analyzing such new types of fraudulent activities. This paper categorizes these scams into four structural types and explores the intrinsic characteristics of Ponzi scheme contract source code from a program analysis perspective. The Mythril tool is employed to conduct static and dynamic analyses of representative cases, thereby revealing their vulnerabilities and operational mechanisms. Furthermore, this paper employs shell scripts and command patterns to conduct batch detection of open-source smart contract code, thereby unveiling the common characteristics of Ponzi scheme smart contracts.

Paperid: 1631, https://arxiv.org/pdf/2510.03625.pdf

Abstract:
Proof-of-stake blockchains require consensus protocols that support Dynamic Availability and Reconfiguration (so-called DAR setting), where the former means that the consensus protocol should remain live even if a large number of nodes temporarily crash, and the latter means it should be possible to change the set of operating nodes over time. State-of-the-art protocols for the DAR setting, such as Ethereum, Cardano's Ouroboros, or Snow White, require unrealistic additional assumptions, such as social consensus, or that key evolution is performed even while nodes are not participating. In this paper, we identify the necessary and sufficient adversarial condition under which consensus can be achieved in the DAR setting without additional assumptions. We then introduce a new and realistic additional assumption: honest nodes dispose of their cryptographic keys the moment they express intent to exit from the set of operating nodes. To add reconfiguration to any dynamically available consensus protocol, we provide a bootstrapping gadget that is particularly simple and efficient in the common optimistic case of few reconfigurations and no double-spending attempts.

Paperid: 1632, https://arxiv.org/pdf/2510.03513.pdf

Abstract:
The rapid growth of the Internet of Things (IoT) has expanded opportunities for innovation but also increased exposure to botnet-driven cyberattacks. Conventional detection methods often struggle with scalability, privacy, and adaptability in resource-constrained IoT environments. To address these challenges, we present a lightweight and privacy-preserving botnet detection framework based on federated learning. This approach enables distributed devices to collaboratively train models without exchanging raw data, thus maintaining user privacy while preserving detection accuracy. A communication-efficient aggregation strategy is introduced to reduce overhead, ensuring suitability for constrained IoT networks. Experiments on benchmark IoT botnet datasets demonstrate that the framework achieves high detection accuracy while substantially reducing communication costs. These findings highlight federated learning as a practical path toward scalable, secure, and privacy-aware intrusion detection for IoT ecosystems.

Paperid: 1633, https://arxiv.org/pdf/2510.03489.pdf

Abstract:
Electronic voting systems face growing risks from cyberattacks and data breaches, which are expected to intensify with the advent of quantum computing. To address these challenges, we introduce a quantum-secure voting framework that integrates Quantum Key Distribution (QKD), Dual-Key Symmetric Encryption, and verifiable receipt mechanisms to strengthen the privacy, integrity, and reliability of the voting process. The framework enables voters to establish encryption keys securely, cast encrypted ballots, and verify their votes through receipt-based confirmation, all without exposing the vote contents. To evaluate performance, we simulate both quantum and classical communication channels using the Message Queuing Telemetry Transport (MQTT) protocol. Results demonstrate that the system can process large numbers of votes efficiently with low latency and minimal error rates. This approach offers a scalable and practical path toward secure, transparent, and verifiable electronic voting in the quantum era.

Paperid: 1634, https://arxiv.org/pdf/2510.02643.pdf

Abstract:
Physically Unclonable Functions (PUFs) are a promising solution for identity verification and asymmetric encryption. In this paper, a new Resistive Random Access Memory (ReRAM) PUF-based protocol is presented to create a physical ReRAM PUF with a large challenge space. This protocol uses differential reads from unformed ReRAM as the method for response generation. Lastly, this paper also provides an experimental hardware demonstration of this protocol on a Physical ReRAM device, along with providing notable results as a PUF, with excellent performance characteristics.

Paperid: 1635, https://arxiv.org/pdf/2510.02357.pdf

Abstract:
Artificial Intelligence (AI) systems introduce unprecedented privacy challenges as they process increasingly sensitive data. Traditional privacy frameworks prove inadequate for AI technologies due to unique characteristics such as autonomous learning and black-box decision-making. This paper presents a taxonomy classifying AI privacy risks, synthesised from 45 studies identified through systematic review. We identify 19 key risks grouped under four categories: Dataset-Level, Model-Level, Infrastructure-Level, and Insider Threat Risks. Findings reveal a balanced distribution across these dimensions, with human error (9.45%) emerging as the most significant factor. This taxonomy challenges conventional security approaches that typically prioritise technical controls over human factors, highlighting gaps in holistic understanding. By bridging technical and behavioural dimensions of AI privacy, this paper contributes to advancing trustworthy AI development and provides a foundation for future research.

Paperid: 1636, https://arxiv.org/pdf/2510.02319.pdf

Abstract:
The growth of highly advanced Large Language Models (LLMs) constitutes a huge dual-use problem, making it necessary to create dependable AI-generated text detection systems. Modern detectors are notoriously vulnerable to adversarial attacks, with paraphrasing standing out as an effective evasion technique that foils statistical detection. This paper presents a comparative study of adversarial robustness, first by quantifying the limitations of standard adversarial training and then by introducing a novel, significantly more resilient detection framework: Perturbation-Invariant Feature Engineering (PIFE), a framework that enhances detection by first transforming input text into a standardized form using a multi-stage normalization pipeline, it then quantifies the transformation's magnitude using metrics like Levenshtein distance and semantic similarity, feeding these signals directly to the classifier. We evaluate both a conventionally hardened Transformer and our PIFE-augmented model against a hierarchical taxonomy of character-, word-, and sentence-level attacks. Our findings first confirm that conventional adversarial training, while resilient to syntactic noise, fails against semantic attacks, an effect we term "semantic evasion threshold", where its True Positive Rate at a strict 1% False Positive Rate plummets to 48.8%. In stark contrast, our PIFE model, which explicitly engineers features from the discrepancy between a text and its canonical form, overcomes this limitation. It maintains a remarkable 82.6% TPR under the same conditions, effectively neutralizing the most sophisticated semantic attacks. This superior performance demonstrates that explicitly modeling perturbation artifacts, rather than merely training on them, is a more promising path toward achieving genuine robustness in the adversarial arms race.

Paperid: 1637, https://arxiv.org/pdf/2510.02185.pdf

Abstract:
Fuzz testing has become a cornerstone technique for identifying software bugs and security vulnerabilities, with broad adoption in both industry and open-source communities. Directly fuzzing a function requires fuzz drivers, which translate random fuzzer inputs into valid arguments for the target function. Given the cost and expertise required to manually develop fuzz drivers, methods exist that leverage program analysis and Large Language Models to automatically generate these drivers. However, the generated fuzz drivers frequently lead to false positive crashes, especially in functions highly structured input and complex state requirements. This problem is especially crucial in industry-scale fuzz driver generation efforts like OSS-Fuzz-en, as reporting false positive crashes to maintainers impede trust in both the system and the team. This paper presents two AI-driven strategies to reduce false positives in OSS-Fuzz-Gen, a multi-agent system for automated fuzz driver generation. First, constraint-based fuzz driver generation proactively enforces constraints on a function's inputs and state to guide driver creation. Second, context-based crash validation reactively analyzes function callers to determine whether reported crashes are feasible from program entry points. Using 1,500 benchmark functions from OSS-Fuzz, we show that these strategies reduce spurious crashes by up to 8%, cut reported crashes by more than half, and demonstrate that frontier LLMs can serve as reliable program analysis agents. Our results highlight the promise and challenges of integrating AI into large-scale fuzzing pipelines.

Paperid: 1638, https://arxiv.org/pdf/2510.02162.pdf

Abstract:
The advent of quantum computing threatens classical public-key cryptography, motivating NIST's adoption of post-quantum schemes such as those based on the Module Learning With Errors (Module-LWE) problem. We present NoMod ML-Attack, a hybrid white-box cryptanalytic method that circumvents the challenge of modeling modular reduction by treating wrap-arounds as statistical corruption and casting secret recovery as robust linear estimation. Our approach combines optimized lattice preprocessing--including reduced-vector saving and algebraic amplification--with robust estimators trained via Tukey's Biweight loss. Experiments show NoMod achieves full recovery of binary secrets for dimension $n = 350$, recovery of sparse binomial secrets for $n = 256$, and successful recovery of sparse secrets in CRYSTALS-Kyber settings with parameters $(n, k) = (128, 3)$ and $(256, 2)$. We release our implementation in an anonymous repository https://anonymous.4open.science/r/NoMod-3BD4.

Paperid: 1639, https://arxiv.org/pdf/2510.01645.pdf

Abstract:
The discourse on privacy risks in Large Language Models (LLMs) has disproportionately focused on verbatim memorization of training data, while a constellation of more immediate and scalable privacy threats remain underexplored. This position paper argues that the privacy landscape of LLM systems extends far beyond training data extraction, encompassing risks from data collection practices, inference-time context leakage, autonomous agent capabilities, and the democratization of surveillance through deep inference attacks. We present a comprehensive taxonomy of privacy risks across the LLM lifecycle -- from data collection through deployment -- and demonstrate through case studies how current privacy frameworks fail to address these multifaceted threats. Through a longitudinal analysis of 1,322 AI/ML privacy papers published at leading conferences over the past decade (2016--2025), we reveal that while memorization receives outsized attention in technical research, the most pressing privacy harms lie elsewhere, where current technical approaches offer little traction and viable paths forward remain unclear. We call for a fundamental shift in how the research community approaches LLM privacy, moving beyond the narrow focus of current technical solutions and embracing interdisciplinary approaches that address the sociotechnical nature of these emerging threats.

Paperid: 1640, https://arxiv.org/pdf/2510.00490.pdf

Abstract:
Recently, Bit-Flip Attack (BFA) has garnered widespread attention for its ability to compromise software system integrity remotely through hardware fault injection. With the widespread distillation and deployment of large language models (LLMs) into single file .gguf formats, their weight spaces have become exposed to an unprecedented hardware attack surface. This paper is the first to systematically discover and validate the existence of single-bit vulnerabilities in LLM weight files: in mainstream open-source models (e.g., DeepSeek and QWEN) using .gguf quantized formats, flipping just single bit can induce three types of targeted semantic level failures Artificial Flawed Intelligence (outputting factual errors), Artificial Weak Intelligence (degradation of logical reasoning capability), and Artificial Bad Intelligence (generating harmful content). By building an information theoretic weight sensitivity entropy model and a probabilistic heuristic scanning framework called BitSifter, we achieved efficient localization of critical vulnerable bits in models with hundreds of millions of parameters. Experiments show that vulnerabilities are significantly concentrated in the tensor data region, particularly in areas related to the attention mechanism and output layers, which are the most sensitive. A negative correlation was observed between model size and robustness, with smaller models being more susceptible to attacks. Furthermore, a remote BFA chain was designed, enabling semantic-level attacks in real-world environments: At an attack frequency of 464.3 times per second, a single bit can be flipped with 100% success in as little as 31.7 seconds. This causes the accuracy of LLM to plummet from 73.5% to 0%, without requiring high-cost equipment or complex prompt engineering.

Paperid: 1641, https://arxiv.org/pdf/2510.00452.pdf

Abstract:
Large Language Models (LLMs) have gained prominence in domains including cloud security and forensics. Yet cloud forensic investigations still rely on manual analysis, making them time-consuming and error-prone. LLMs can mimic human reasoning, offering a pathway to automating cloud log analysis. To address this, we introduce the Cloud Investigation Automation Framework (CIAF), an ontology-driven framework that systematically investigates cloud forensic logs while improving efficiency and accuracy. CIAF standardizes user inputs through semantic validation, eliminating ambiguity and ensuring consistency in log interpretation. This not only enhances data quality but also provides investigators with reliable, standardized information for decision-making. To evaluate security and performance, we analyzed Microsoft Azure logs containing ransomware-related events. By simulating attacks and assessing CIAF's impact, results showed significant improvement in ransomware detection, achieving precision, recall, and F1 scores of 93 percent. CIAF's modular, adaptable design extends beyond ransomware, making it a robust solution for diverse cyberattacks. By laying the foundation for standardized forensic methodologies and informing future AI-driven automation, this work underscores the role of deterministic prompt engineering and ontology-based validation in enhancing cloud forensic investigations. These advancements improve cloud security while paving the way for efficient, automated forensic workflows.

Paperid: 1642, https://arxiv.org/pdf/2510.00451.pdf

Abstract:
Large language models have gained widespread prominence, yet their vulnerability to prompt injection and other adversarial attacks remains a critical concern. This paper argues for a security-by-design AI paradigm that proactively mitigates LLM vulnerabilities while enhancing performance. To achieve this, we introduce PromptShield, an ontology-driven framework that ensures deterministic and secure prompt interactions. It standardizes user inputs through semantic validation, eliminating ambiguity and mitigating adversarial manipulation. To assess PromptShield's security and performance capabilities, we conducted an experiment on an agent-based system to analyze cloud logs within Amazon Web Services (AWS), containing 493 distinct events related to malicious activities and anomalies. By simulating prompt injection attacks and assessing the impact of deploying PromptShield, our results demonstrate a significant improvement in model security and performance, achieving precision, recall, and F1 scores of approximately 94%. Notably, the ontology-based framework not only mitigates adversarial threats but also enhances the overall performance and reliability of the system. Furthermore, PromptShield's modular and adaptable design ensures its applicability beyond cloud security, making it a robust solution for safeguarding generative AI applications across various domains. By laying the groundwork for AI safety standards and informing future policy development, this work stimulates a crucial dialogue on the pivotal role of deterministic prompt engineering and ontology-based validation in ensuring the safe and responsible deployment of LLMs in high-stakes environments.

Paperid: 1643, https://arxiv.org/pdf/2510.00240.pdf

Abstract:
Effective analysis of cybersecurity and threat intelligence data demands language models that can interpret specialized terminology, complex document structures, and the interdependence of natural language and source code. Encoder-only transformer architectures provide efficient and robust representations that support critical tasks such as semantic search, technical entity extraction, and semantic analysis, which are key to automated threat detection, incident triage, and vulnerability assessment. However, general-purpose language models often lack the domain-specific adaptation required for high precision. We present SecureBERT 2.0, an enhanced encoder-only language model purpose-built for cybersecurity applications. Leveraging the ModernBERT architecture, SecureBERT 2.0 introduces improved long-context modeling and hierarchical encoding, enabling effective processing of extended and heterogeneous documents, including threat reports and source code artifacts. Pretrained on a domain-specific corpus more than thirteen times larger than its predecessor, comprising over 13 billion text tokens and 53 million code tokens from diverse real-world sources, SecureBERT 2.0 achieves state-of-the-art performance on multiple cybersecurity benchmarks. Experimental results demonstrate substantial improvements in semantic search for threat intelligence, semantic analysis, cybersecurity-specific named entity recognition, and automated vulnerability detection in code within the cybersecurity domain.

Paperid: 1644, https://arxiv.org/pdf/2509.26562.pdf

Abstract:
Deep neural networks (DNNs) are increasingly being deployed in high-stakes applications, from self-driving cars to biometric authentication. However, their unpredictable and unreliable behaviors in real-world settings require new approaches to characterize and ensure their reliability. This paper introduces DeepProv, a novel and customizable system designed to capture and characterize the runtime behavior of DNNs during inference by using their underlying graph structure. Inspired by system audit provenance graphs, DeepProv models the computational information flow of a DNN's inference process through Inference Provenance Graphs (IPGs). These graphs provide a detailed structural representation of the behavior of DNN, allowing both empirical and structural analysis. DeepProv uses these insights to systematically repair DNNs for specific objectives, such as improving robustness, privacy, or fairness. We instantiate DeepProv with adversarial robustness as the goal of model repair and conduct extensive case studies to evaluate its effectiveness. Our results demonstrate its effectiveness and scalability across diverse classification tasks, attack scenarios, and model complexities. DeepProv automatically identifies repair actions at the node and edge-level within IPGs, significantly enhancing the robustness of the model. In particular, applying DeepProv repair strategies to just a single layer of a DNN yields an average 55% improvement in adversarial accuracy. Moreover, DeepProv complements existing defenses, achieving substantial gains in adversarial robustness. Beyond robustness, we demonstrate the broader potential of DeepProv as an adaptable system to characterize DNN behavior in other critical areas, such as privacy auditing and fairness analysis.

Paperid: 1645, https://arxiv.org/pdf/2509.26393.pdf

Abstract:
Using Fourier analysis, this paper establishes exact security bounds for linear extractors in True Random Number Generators (TRNGs). We provide the first near-optimal total variation security characterization by interpolating between optimal $\ell_{\infty}$ and $\ell_2$ norm results, expressed through code weight enumerators and input bias parameters. Our bounds improve security assessments by an order of magnitude over previous approximations. By scanning ~20,000 codes, we reveal fundamental trade-offs between compression efficiency and cryptographic security. For instance, we show that achieving 80 bits of security can require sacrificing more than 50\% of the code rate when correcting 10\% input bias. Our bounds enhance security evaluation of TRNG post-processing schemes and quantify the inherent cost of randomness extraction in hardware implementations.

Paperid: 1646, https://arxiv.org/pdf/2509.25408.pdf

Abstract:
We formulate the design of a threshold signature scheme as made possible on cryptocurrency protocols like Bitcoin. The funds are secured by an m-of-n threshold signature, where at least m signatures are needed to unlock the funds. A user designs this scheme knowing that a malicious attacker can also obtain the signatures with some probability. Higher thresholds offer more security, but also risk locking the user out of his own funds. The optimal threshold balances these twin effects. Interventions like increasing the security or usability of the signatures allow for higher thresholds. We model dynamic threshold signature schemes, where the probability of a user or attacker obtaining signatures decays with time. A dynamic threshold signature scheme is optimal, and increasing security or usability allows for higher thresholds and longer time locks.

Paperid: 1647, https://arxiv.org/pdf/2509.25145.pdf

Abstract:
Compiled nonlocal games transfer the power of Bell-type multi-prover tests into a single-device setting by replacing spatial separation with cryptography. Concretely, the KLVY compiler (STOC'23) maps any multi-prover game to an interactive single-prover protocol, using quantum homomorphic encryption. A crucial security property of such compilers is quantum soundness, which ensures that a dishonest quantum prover cannot exceed the original game's quantum value. For practical cryptographic implementations, this soundness must be quantitative, providing concrete bounds, rather than merely asymptotic. While quantitative quantum soundness has been established for the KLVY compiler in the bipartite case, it has only been shown asymptotically for multipartite games. This is a significant gap, as multipartite nonlocality exhibits phenomena with no bipartite analogue, and the difficulty of enforcing space-like separation makes single-device compilation especially compelling. This work closes this gap by showing the quantitative quantum soundness of the KLVY compiler for all multipartite nonlocal games. On the way, we introduce an NPA-like hierarchy for quantum instruments and prove its completeness, thereby characterizing correlations from operationally-non-signaling sequential strategies. We further develop novel geometric arguments for the decomposition of sequential strategies into their signaling and non-signaling parts, which might be of independent interest.

Paperid: 1648, https://arxiv.org/pdf/2509.25113.pdf

Abstract:
This paper introduces the first two-dimensional XOR-based secret sharing scheme for layered multipath communication networks. We present a construction that guarantees successful message recovery and perfect privacy when an adversary observes and disrupts any single path at each transmission layer. The scheme achieves information-theoretic security using only bitwise XOR operations with linear $O(|S|)$ complexity, where $|S|$ is the message length. We provide mathematical proofs demonstrating that the scheme maintains unconditional security regardless of computational resources available to adversaries. Unlike encryption-based approaches vulnerable to quantum computing advances, our construction offers provable security suitable for resource-constrained military environments where computational assumptions may fail.

Paperid: 1649, https://arxiv.org/pdf/2509.24240.pdf

Abstract:
Coding agents, which are LLM-driven agents specialized in software development, have become increasingly prevalent in modern programming environments. Unlike traditional AI coding assistants, which offer simple code completion and suggestions, modern coding agents tackle more complex tasks with greater autonomy, such as generating entire programs from natural language instructions. To enable such capabilities, modern coding agents incorporate extensive functionalities, which in turn raise significant concerns over their security and privacy. Despite their growing adoption, systematic and in-depth security analysis of these agents has largely been overlooked. In this paper, we present a comprehensive security analysis of eight real-world coding agents. Our analysis addresses the limitations of prior approaches, which were often fragmented and ad hoc, by systematically examining the internal workflows of coding agents and identifying security threats across their components. Through the analysis, we identify 15 security issues, including previously overlooked or missed issues, that can be abused to compromise the confidentiality and integrity of user systems. Furthermore, we show that these security issues are not merely individual vulnerabilities, but can collectively lead to end-to-end exploitations. By leveraging these security issues, we successfully achieved arbitrary command execution in five agents and global data exfiltration in four agents, all without any user interaction or approval. Our findings highlight the need for a comprehensive security analysis in modern LLM-driven agents and demonstrate how insufficient security considerations can lead to severe vulnerabilities.

Paperid: 1650, https://arxiv.org/pdf/2509.24037.pdf

Abstract:
Software vulnerabilities remain a critical security challenge, providing entry points for attackers into enterprise networks. Despite advances in security practices, the lack of high-quality datasets capturing diverse exploit behavior limits effective vulnerability assessment and mitigation. This paper introduces an end-to-end multi-step pipeline leveraging generative AI, specifically large language models (LLMs), to address the challenges of orchestrating and reproducing attacks to known software vulnerabilities. Our approach extracts information from CVE disclosures in the National Vulnerability Database, augments it with external public knowledge (e.g., threat advisories, code snippets) using Retrieval-Augmented Generation (RAG), and automates the creation of containerized environments and exploit code for each vulnerability. The pipeline iteratively refines generated artifacts, validates attack success with test cases, and supports complex multi-container setups. Our methodology overcomes key obstacles, including noisy and incomplete vulnerability descriptions, by integrating LLMs and RAG to fill information gaps. We demonstrate the effectiveness of our pipeline across different vulnerability types, such as memory overflows, denial of service, and remote code execution, spanning diverse programming languages, libraries and years. In doing so, we uncover significant inconsistencies in CVE descriptions, emphasizing the need for more rigorous verification in the CVE disclosure process. Our approach is model-agnostic, working across multiple LLMs, and we open-source the artifacts to enable reproducibility and accelerate security research. To the best of our knowledge, this is the first system to systematically orchestrate and exploit known vulnerabilities in containerized environments by combining general-purpose LLM reasoning with CVE data and RAG-based context enrichment.

Paperid: 1651, https://arxiv.org/pdf/2509.22620.pdf

Abstract:
Decentralized Autonomous Organizations (DAOs) use smart contracts to foster communities working toward common goals. Existing definitions of decentralization, however -- the 'D' in DAO -- fall short of capturing the key properties characteristic of diverse and equitable participation. This work proposes a new framework for measuring DAO decentralization called Voting-Bloc Entropy (VBE, pronounced ''vibe''). VBE is based on the idea that voters with closely aligned interests act as a centralizing force and should be modeled as such. VBE formalizes this notion by measuring the similarity of participants' utility functions across a set of voting rounds. Unlike prior, ad hoc definitions of decentralization, VBE derives from first principles: We introduce a simple (yet powerful) reinforcement learning-based conceptual model for voting, that in turn implies VBE. We first show VBE's utility as a theoretical tool. We prove a number of results about the (de)centralizing effects of vote delegation, proposal bundling, bribery, etc. that are overlooked in previous notions of DAO decentralization. Our results lead to practical suggestions for enhancing DAO decentralization. We also show how VBE can be used empirically by presenting measurement studies and VBE-based governance experiments. We make the tools we developed for these results available to the community in the form of open-source artifacts in order to facilitate future study of DAO decentralization.

Paperid: 1652, https://arxiv.org/pdf/2509.22213.pdf

Abstract:
The accuracy-first perspective of differential privacy addresses an important shortcoming by allowing a data analyst to adaptively adjust the quantitative privacy bound instead of sticking to a predetermined bound. Existing works on the accuracy-first perspective have neglected an important property of differential privacy known as post-processing immunity, which ensures that an adversary is not able to weaken the privacy guarantee by post-processing. We address this gap by determining which existing definitions in the accuracy-first perspective have post-processing immunity, and which do not. The only definition with post-processing immunity, pure ex-post privacy, lacks useful tools for practical problems, such as an ex-post analogue of the Gaussian mechanism, and an algorithm to check if accuracy on separate private validation set is high enough. To address this, we propose a new definition based on RÃ©nyi differential privacy that has post-processing immunity, and we develop basic theory and tools needed for practical applications. We demonstrate the practicality of our theory with an application to synthetic data generation, where our algorithm successfully adjusts the privacy bound until an accuracy threshold is met on a private validation dataset.

Paperid: 1653, https://arxiv.org/pdf/2509.21586.pdf

Abstract:
The data availability problem is a central challenge in blockchain systems and lies at the core of the accessibility and scalability issues faced by platforms such as Ethereum. Modern solutions employ several approaches, with data availability sampling (DAS) being the most self-sufficient and minimalistic in its security assumptions. Existing DAS methods typically form cryptographic commitments on codewords of fixed-rate erasure codes, which restrict light nodes to sampling from a predetermined set of coded symbols. In this paper, we introduce a new approach to DAS that modularizes the coding and commitment process by committing to the uncoded data while performing sampling through on-the-fly coding. The resulting samples are significantly more expressive, enabling light nodes to obtain, in concrete implementations, up to multiple orders of magnitude stronger assurances of data availability than from sampling pre-committed symbols from a fixed-rate redundancy code as done in established DAS schemes using Reed Solomon or low density parity check codes. We present a concrete protocol that realizes this paradigm using random linear network coding (RLNC).

Paperid: 1654, https://arxiv.org/pdf/2509.21389.pdf

Abstract:
This survey explores the integration of Federated Learning (FL) with Network Intrusion Detection Systems (NIDS), with particular emphasis on deep learning and quantum machine learning approaches. FL enables collaborative model training across distributed devices while preserving data privacy-a critical requirement in network security contexts where sensitive traffic data cannot be centralized. Our comprehensive analysis systematically examines the full spectrum of FL architectures, deployment strategies, communication protocols, and aggregation methods specifically tailored for intrusion detection. We provide an in-depth investigation of privacy-preserving techniques, model compression approaches, and attack-specific federated solutions for threats including DDoS, MITM, and botnet attacks. The survey further delivers a pioneering exploration of Quantum FL (QFL), discussing quantum feature encoding, quantum machine learning algorithms, and quantum-specific aggregation methods that promise exponential speedups for complex pattern recognition in network traffic. Through rigorous comparative analysis of classical and quantum approaches, identification of research gaps, and evaluation of real-world deployments, we outline a concrete roadmap for industrial adoption and future research directions. This work serves as an authoritative reference for researchers and practitioners seeking to enhance privacy, efficiency, and robustness of federated intrusion detection systems in increasingly complex network environments, while preparing for the quantum-enhanced cybersecurity landscape of tomorrow.

Paperid: 1655, https://arxiv.org/pdf/2509.20835.pdf

Abstract:
This paper proposes a novel and flexible security-aware semantic-driven integrated sensing and communication (ISAC) framework, namely security semantic ISAC (SS-ISAC). Inspired by the positive impact of the adversarial attack, a pair of pluggable encryption and decryption modules is designed in the proposed SS-ISAC framework. The encryption module is installed after the semantic transmitter, adopting a trainable adversarial residual network (ARN) to create the adversarial attack. Correspondingly, the decryption module before the semantic receiver utilizes another trainable ARN to mitigate the adversarial attack and noise. These two modules can be flexibly assembled considering the system security demands, without drastically modifying the hardware infrastructure. To ensure the sensing and communication (SAC) performance while preventing the eavesdropping threat, the above ARNs are jointly optimized by minimizing a carefully designed loss function that relates to the adversarial attack power, SAC performance, as well as the privacy leakage risk. Simulation results validate the effectiveness of the proposed SS-ISAC framework in terms of both SAC and eavesdropping prevention performance.

Paperid: 1656, https://arxiv.org/pdf/2509.20418.pdf

Abstract:
Quantum Artificial Intelligence (QAI), the integration of Artificial Intelligence (AI) and Quantum Computing (QC), promises transformative advances, including AI-enabled quantum cryptography and quantum-resistant encryption protocols. However, QAI inherits data risks from both AI and QC, creating complex privacy and security vulnerabilities that are not systematically studied. These risks affect the trustworthiness and reliability of AI and QAI systems, making their understanding critical. This study systematically reviews 67 privacy- and security-related studies to expand understanding of QAI data risks. We propose a taxonomy of 22 key data risks, organised into five categories: governance, risk assessment, control implementation, user considerations, and continuous monitoring. Our findings reveal vulnerabilities unique to QAI and identify gaps in holistic risk assessment. This work contributes to trustworthy AI and QAI research and provides a foundation for developing future risk assessment tools.

Paperid: 1657, https://arxiv.org/pdf/2509.20277.pdf

Abstract:
In recent years, various software supply chain (SSC) attacks have posed significant risks to the global community. Severe consequences may arise if developers integrate insecure code snippets that are vulnerable to SSC attacks into their products. Particularly, code generation techniques, such as large language models (LLMs), have been widely utilized in the developer community. However, LLMs are known to suffer from inherent issues when generating code, including fabrication, misinformation, and reliance on outdated training data, all of which can result in serious software supply chain threats. In this paper, we investigate the security threats to the SSC that arise from these inherent issues. We examine three categories of threats, including eleven potential SSC-related threats, related to external components in source code, and continuous integration configuration files. We find some threats in LLM-generated code could enable attackers to hijack software and workflows, while some others might cause potential hidden threats that compromise the security of the software over time. To understand these security impacts and severity, we design a tool, SSCGuard, to generate 439,138 prompts based on SSC-related questions collected online, and analyze the responses of four popular LLMs from GPT and Llama. Our results show that all identified SSC-related threats persistently exist. To mitigate these risks, we propose a novel prompt-based defense mechanism, namely Chain-of-Confirmation, to reduce fabrication, and a middleware-based defense that informs users of various SSC threats.

Paperid: 1658, https://arxiv.org/pdf/2509.20262.pdf

Abstract:
When neural networks are trained to classify a dataset, one finds a set of weights from which the network produces a label for each data point. We study the algorithmic complexity of finding a collision in a single-layer neural net, where a collision is defined as two distinct sets of weights that assign the same labels to all data. For binary perceptrons with oscillating activation functions, we establish the emergence of an overlap gap property in the space of collisions. This is a topological property believed to be a barrier to the performance of efficient algorithms. The hardness is supported by numerical experiments using approximate message passing algorithms, for which the algorithms stop working well below the value predicted by our analysis. Neural networks provide a new category of candidate collision resistant functions, which for some parameter setting depart from constructions based on lattices. Beyond relevance to cryptography, our work uncovers new forms of computational hardness emerging in large neural networks which may be of independent interest.

Paperid: 1659, https://arxiv.org/pdf/2509.17871.pdf

Abstract:
In traditional, one-vote-per-person voting systems, privacy equates with ballot secrecy: voting tallies are published, but individual voters' choices are concealed. Voting systems that weight votes in proportion to token holdings, though, are now prevalent in cryptocurrency and web3 systems. We show that these weighted-voting systems overturn existing notions of voter privacy. Our experiments demonstrate that even with secret ballots, publishing raw tallies often reveals voters' choices. Weighted voting thus requires a new framework for privacy. We introduce a notion called B-privacy whose basis is bribery, a key problem in voting systems today. B-privacy captures the economic cost to an adversary of bribing voters based on revealed voting tallies. We propose a mechanism to boost B-privacy by noising voting tallies. We prove bounds on its tradeoff between B-privacy and transparency, meaning reported-tally accuracy. Analyzing 3,582 proposals across 30 Decentralized Autonomous Organizations (DAOs), we find that the prevalence of large voters ("whales") limits the effectiveness of any B-Privacy-enhancing technique. However, our mechanism proves to be effective in cases without extreme voting weight concentration: among proposals requiring coalitions of $\geq5$ voters to flip outcomes, our mechanism raises B-privacy by a geometric mean factor of $4.1\times$. Our work offers the first principled guidance on transparency-privacy tradeoffs in weighted-voting systems, complementing existing approaches that focus on ballot secrecy and revealing fundamental constraints that voting weight concentration imposes on privacy mechanisms.

Paperid: 1660, https://arxiv.org/pdf/2509.16558.pdf

Abstract:
Textual passwords remain a predominant authentication mechanism in web security. To evaluate their strength, existing research has proposed several data-driven models across various scenarios. However, these models generally treat passwords uniformly, neglecting the structural differences among passwords. This typically results in biased training that favors frequent password structural patterns. To mitigate the biased training, we argue that passwords, as a type of complex short textual data, should be processed in a structure-aware manner by identifying their structural patterns and routing them to specialized models accordingly. In this paper, we propose MoPE, a Mixture of Password Experts framework, specifically designed to leverage the structural patterns in passwords to improveguessing performance. Motivated by the observation that passwords with similar structural patterns (e.g., fixed-length numeric strings) tend to cluster in high-density regions within the latent space, our MoPE introduces: (1) a novel structure-based method for generating specialized expert models; (2) a lightweight gate method to select appropriate expert models to output reliable guesses, better aligned with the high computational frequency of password guessing tasks. Our evaluation shows that MoPE significantly outperforms existing state-of-the-art baselines in both offline and online guessing scenarios, achieving up to 38.80% and 9.27% improvement in cracking rate, respectively, showcasing that MoPE can effectively exploit the capabilities of data-driven models for password guessing. Additionally, we implement a real-time Password Strength Meter (PSM) based on offline MoPE, assisting users in choosing stronger passwords more precisely with millisecond-level response latency.

Paperid: 1661, https://arxiv.org/pdf/2509.14604.pdf

Abstract:
The integration of the Internet of Medical Things (IoMT) into healthcare systems has transformed patient care by enabling real-time monitoring, enhanced diagnostics, and enhanced operational efficiency. However, this increased connectivity has also expanded the attack surface for cybercriminals, raising significant cybersecurity and privacy concerns. This study focuses on the cybersecurity vulnerabilities of IoMT infusion pumps, which are critical devices in modern healthcare. Through a targeted literature review of the past five years, we analyzed seven current studies from a pool of 132 papers to identify security vulnerabilities. Our findings indicate that infusion pumps face vulnerabilities such as device-level flaws, authentication and access control issues, network and communication weaknesses, data security and privacy risks, and operational or organizational challenges that can expose them to lateral attacks within healthcare networks. Our analysis synthesizes findings from seven recent studies to clarify how and why infusion pumps remain vulnerable in each of these areas. By categorizing the security gaps, we highlight critical risk patterns and their implications. This work underscores the scope of the issue and provides a structured understanding that is valuable for healthcare IT professionals and device manufacturers. Ultimately, the findings can inform the development of targeted, proactive security strategies to better safeguard infusion pumps and protect patient well-being.

Paperid: 1662, https://arxiv.org/pdf/2509.14297.pdf

Abstract:
Safety alignment aims to prevent Large Language Models (LLMs) from responding to harmful queries. To strengthen safety protections, jailbreak methods are developed to simulate malicious attacks and uncover vulnerabilities. In this paper, we introduce HILL (Hiding Intention by Learning from LLMs), a novel jailbreak approach that systematically transforms imperative harmful requests into learning-style questions with only straightforward hypotheticality indicators. Further, we introduce two new metrics to thoroughly evaluate the utility of jailbreak methods. Experiments on the AdvBench dataset across a wide range of models demonstrate HILL's strong effectiveness, generalizability, and harmfulness. It achieves top attack success rates on the majority of models and across malicious categories while maintaining high efficiency with concise prompts. Results of various defense methods show the robustness of HILL, with most defenses having mediocre effects or even increasing the attack success rates. Moreover, the assessment on our constructed safe prompts reveals inherent limitations of LLMs' safety mechanisms and flaws in defense methods. This work exposes significant vulnerabilities of safety measures against learning-style elicitation, highlighting a critical challenge of balancing helpfulness and safety alignments.

Paperid: 1663, https://arxiv.org/pdf/2509.13987.pdf

Abstract:
Machine learning models used for distributed architectures consisting of servers and clients require large amounts of data to achieve high accuracy. Data obtained from clients are collected on a central server for model training. However, storing data on a central server raises concerns about security and privacy. To address this issue, a federated learning architecture has been proposed. In federated learning, each client trains a local model using its own data. The trained models are periodically transmitted to the central server. The server then combines the received models using federated aggregation algorithms to obtain a global model. This global model is distributed back to the clients, and the process continues in a cyclical manner. Although preventing data from leaving the clients enhances security, certain concerns still remain. Attackers can perform inference attacks on the obtained models to approximate the training dataset, potentially causing data leakage. In this study, differential privacy was applied to address the aforementioned security vulnerability, and a performance analysis was conducted. The Data-Unaware Classification Based on Association (duCBA) algorithm was used as the federated aggregation method. Differential privacy was implemented on the data using the Randomized Response technique, and the trade-off between security and performance was examined under different epsilon values. As the epsilon value decreased, the model accuracy declined, and class prediction imbalances were observed. This indicates that higher levels of privacy do not always lead to practical outcomes and that the balance between security and performance must be carefully considered.

Paperid: 1664, https://arxiv.org/pdf/2509.13627.pdf

Abstract:
The enterprises today are faced with the tough challenge of processing, storing large amounts of data in a secure, scalable manner and enabling decision makers to make quick, informed data driven decisions. This paper addresses this challenge and develops an effective enterprise data strategy in the cloud. Various components of an effective data strategy are discussed and architectures addressing security, scalability and privacy aspects are provided.

Paperid: 1665, https://arxiv.org/pdf/2509.13048.pdf

Abstract:
As quantum computing advances, PQC schemes are adopted to replace classical algorithms. Among them is the SLH-DSA that was recently standardized by NIST and is favored for its conservative security foundations. In this work, we present the first software-only universal forgery attack on SLH-DSA, leveraging Rowhammer-induced bit flips to corrupt the internal state and forge signatures. While prior work targeted embedded systems and required physical access, our attack is software-only, targeting commodity desktop and server hardware, significantly broadening the threat model. We demonstrate a full end-to-end attack against all security levels of SLH-DSA in OpenSSL 3.5.1, achieving universal forgery for the highest security level after eight hours of hammering and 36 seconds of post-processing. Our post-processing is informed by a novel complexity analysis that, given a concrete set of faulty signatures, identifies the most promising computational path to pursue. To enable the attack, we introduce Swage, a modular and extensible framework for implementing end-to-end Rowhammer-based fault attacks. Swage abstracts and automates key components of practical Rowhammer attacks. Unlike prior tooling, Swage is untangled from the attacked code, making it reusable and suitable for frictionless analysis of different targets. Our findings highlight that even theoretically sound PQC schemes can fail under real-world conditions, underscoring the need for additional implementation hardening or hardware defenses against Rowhammer.

Paperid: 1666, https://arxiv.org/pdf/2509.12939.pdf

Abstract:
Security-critical machine-learning (ML) systems, such as face-recognition systems, are susceptible to adversarial examples, including real-world physically realizable attacks. Various means to boost ML's adversarial robustness have been proposed; however, they typically induce unfair robustness: It is often easier to attack from certain classes or groups than from others. Several techniques have been developed to improve adversarial robustness while seeking perfect fairness between classes. Yet, prior work has focused on settings where security and fairness are less critical. Our insight is that achieving perfect parity in realistic fairness-critical tasks, such as face recognition, is often infeasible -- some classes may be highly similar, leading to more misclassifications between them. Instead, we suggest that seeking symmetry -- i.e., attacks from class $i$ to $j$ would be as successful as from $j$ to $i$ -- is more tractable. Intuitively, symmetry is a desirable because class resemblance is a symmetric relation in most domains. Additionally, as we prove theoretically, symmetry between individuals induces symmetry between any set of sub-groups, in contrast to other fairness notions where group-fairness is often elusive. We develop Sy-FAR, a technique to encourage symmetry while also optimizing adversarial robustness and extensively evaluate it using five datasets, with three model architectures, including against targeted and untargeted realistic attacks. The results show Sy-FAR significantly improves fair adversarial robustness compared to state-of-the-art methods. Moreover, we find that Sy-FAR is faster and more consistent across runs. Notably, Sy-FAR also ameliorates another type of unfairness we discover in this work -- target classes that adversarial examples are likely to be classified into become significantly less vulnerable after inducing symmetry.

Paperid: 1667, https://arxiv.org/pdf/2509.12937.pdf

Abstract:
Large Language Models (LLMs) are increasingly deployed for task automation and content generation, yet their safety mechanisms remain vulnerable to circumvention through different jailbreaking techniques. In this paper, we introduce \textit{Content Concretization} (CC), a novel jailbreaking technique that iteratively transforms abstract malicious requests into concrete, executable implementations. CC is a two-stage process: first, generating initial LLM responses using lower-tier, less constrained safety filters models, then refining them through higher-tier models that process both the preliminary output and original prompt. We evaluate our technique using 350 cybersecurity-specific prompts, demonstrating substantial improvements in jailbreak Success Rates (SRs), increasing from 7\% (no refinements) to 62\% after three refinement iterations, while maintaining a cost of 7.5\textcent~per prompt. Comparative A/B testing across nine different LLM evaluators confirms that outputs from additional refinement steps are consistently rated as more malicious and technically superior. Moreover, manual code analysis reveals that generated outputs execute with minimal modification, although optimal deployment typically requires target-specific fine-tuning. With eventual improved harmful code generation, these results highlight critical vulnerabilities in current LLM safety frameworks.

Paperid: 1668, https://arxiv.org/pdf/2509.12535.pdf

Abstract:
As quantum computing advances, quantum circuit simulators serve as critical tools to bridge the current gap caused by limited quantum hardware availability. These simulators are typically deployed on cloud platforms, where users submit proprietary circuit designs for simulation. In this work, we demonstrate a novel timing side-channel attack targeting cloud-based quantum simulators. A co-located malicious process can observe fine-grained execution timing patterns to extract sensitive information about concurrently running quantum circuits. We systematically analyze simulator behavior using the QASMBench benchmark suite, profiling timing and memory characteristics across various circuit executions. Our experimental results show that timing profiles exhibit circuit-dependent patterns that can be effectively classified using pattern recognition techniques, enabling the adversary to infer circuit identities and compromise user confidentiality. We were able to achieve 88% to 99.9% identification rate of quantum circuits based on different datasets. This work highlights previously unexplored security risks in quantum simulation environments and calls for stronger isolation mechanisms to protect user workloads

Paperid: 1669, https://arxiv.org/pdf/2509.12494.pdf

Abstract:
Specialized hardware like application-specific integrated circuits (ASICs) remains the primary accelerator type for cryptographic kernels based on large integer arithmetic. Prior work has shown that commodity and server-class GPUs can achieve near-ASIC performance for these workloads. However, achieving comparable performance on CPUs remains an open challenge. This work investigates the following question: How can we narrow the performance gap between CPUs and specialized hardware for key cryptographic kernels like basic linear algebra subprograms (BLAS) operations and the number theoretic transform (NTT)? To this end, we develop an optimized scalar implementation of these kernels for x86 CPUs at the per-core level. We utilize SIMD instructions (specifically AVX2 and AVX-512) to further improve performance, achieving an average speedup of 38 times and 62 times over state-of-the-art CPU baselines for NTTs and BLAS operations, respectively. To narrow the gap further, we propose a small AVX-512 extension, dubbed multi-word extension (MQX), which delivers substantial speedup with only three new instructions and minimal proposed hardware modifications. MQX cuts the slowdown relative to ASICs to as low as 35 times on a single CPU core. Finally, we perform a roofline analysis to evaluate the peak performance achievable with MQX when scaled across an entire multi-core CPU. Our results show that, with MQX, top-tier server-grade CPUs can approach the performance of state-of-the-art ASICs for cryptographic workloads.

Paperid: 1670, https://arxiv.org/pdf/2509.12290.pdf

Abstract:
Human oversight of AI is promoted as a safeguard against risks such as inaccurate outputs, system malfunctions, or violations of fundamental rights, and is mandated in regulation like the European AI Act. Yet debates on human oversight have largely focused on its effectiveness, while overlooking a critical dimension: the security of human oversight. We argue that human oversight creates a new attack surface within the safety, security, and accountability architecture of AI operations. Drawing on cybersecurity perspectives, we analyze attack vectors that threaten the requirements of effective human oversight, thereby undermining the safety of AI operations. Such attacks may target the AI system, its communication with oversight personnel, or the personnel themselves. We then outline hardening strategies to mitigate these risks. Our contributions are: (1) introducing a security perspective on human oversight, and (2) providing an overview of attack vectors and hardening strategies to enable secure human oversight of AI.

Paperid: 1671, https://arxiv.org/pdf/2509.11934.pdf

Abstract:
Systems managing Verifiable Credentials are becoming increasingly popular. Unfortunately, their support for revoking previously issued credentials allows verifiers to effectively monitor the validity of the credentials, which is sensitive information. While the issue started to gain recognition, no adequate solution has been proposed so far. In this work, we propose a novel framework for time-limited continuous verification. The holder is able to individually configure the verification period when sharing information with the verifier, and the system guarantees proven untraceability of the revocation status after the verification period expires. Different from existing systems, the implementation adopts a more scalable blacklist approach where tokens corresponding to revoked credentials are stored in the registry. The approach employs ZK proofs that allow holders to prove non-membership in the blacklist. In addition to theoretically proving security, we evaluate the approach analytically and experimentally and show that it significantly improves bandwidth consumption on the holder while being on par with state-of-the-art solutions with respect to the other performance metrics.

Paperid: 1672, https://arxiv.org/pdf/2509.11625.pdf

Abstract:
A key concern for AI safety remains understudied in the machine learning (ML) literature: how can we ensure users of ML models do not leverage predictions on incorrect personal data to harm others? This is particularly pertinent given the rise of open-weight models, where simply masking model outputs does not suffice to prevent adversaries from recovering harmful predictions. To address this threat, which we call *test-time privacy*, we induce maximal uncertainty on protected instances while preserving accuracy on all other instances. Our proposed algorithm uses a Pareto optimal objective that explicitly balances test-time privacy against utility. We also provide a certifiable approximation algorithm which achieves $(\varepsilon, Î´)$ guarantees without convexity assumptions. We then prove a tight bound that characterizes the privacy-utility tradeoff that our algorithms incur. Empirically, our method obtains at least $>3\times$ stronger uncertainty than pretraining with marginal drops in accuracy on various image recognition benchmarks. Altogether, this framework provides a tool to guarantee additional protection to end users.

Paperid: 1673, https://arxiv.org/pdf/2509.11451.pdf

Abstract:
Recent works in federated learning (FL) have shown the utility of leveraging transfer learning for balancing the benefits of FL and centralized learning. In this setting, federated training happens after a stable point has been reached through conventional training. Global model weights are first centrally pretrained by the server on a public dataset following which only the last few linear layers (the classification head) of the model are finetuned across clients. In this scenario, existing data reconstruction attacks (DRAs) in FL show two key weaknesses. First, strongly input-correlated gradient information from the initial model layers is never shared, significantly degrading reconstruction accuracy. Second, DRAs in which the server makes highly specific, handcrafted manipulations to the model structure or parameters (for e.g., layers with all zero weights, identity mappings and rows with identical weight patterns) are easily detectable by an active client. Improving on these, we propose MAUI, a stealthy DRA that does not require any overt manipulations to the model architecture or weights, and relies solely on the gradients of the classification head. MAUI first extracts "robust" feature representations of the input batch from the gradients of the classification head and subsequently inverts these representations to the original inputs. We report highly accurate reconstructions on the CIFAR10 and ImageNet datasets on a variety of model architectures including convolution networks (CNN, VGG11), ResNets (18, 50), ShuffleNet-V2 and Vision Transformer (ViT B-32), regardless of the batch size. MAUI significantly outperforms prior DRAs in reconstruction quality, achieving 40-120% higher PSNR scores.

Paperid: 1674, https://arxiv.org/pdf/2509.11006.pdf

Abstract:
Blockchain technology offers decentralization and security but struggles with scalability, particularly in enterprise settings where efficiency and controlled access are paramount. Sharding is a promising solution for private blockchains, yet existing approaches face challenges in coordinating shards, ensuring fault tolerance with limited nodes, and minimizing the high overhead of consensus mechanisms like PBFT. This paper proposes the Range-Based Sharding (RBS) Protocol, a novel sharding mechanism tailored for enterprise blockchains, implemented on Quorum. Unlike traditional sharding models such as OmniLedger and non-sharding Corda framework, RBS employs a commit-reveal scheme for secure and unbiased shard allocation, ensuring fair validator distribution while reducing cross-shard transaction delays. Our approach enhances scalability by balancing computational loads across shards, reducing consensus overhead, and improving parallel transaction execution. Experimental evaluations demonstrate that RBS achieves significantly higher throughput and lower latency compared to existing enterprise sharding frameworks, making it a viable and efficient solution for largescale blockchain deployments.

Paperid: 1675, https://arxiv.org/pdf/2509.10755.pdf

Abstract:
The Tor network offers network anonymity to its users by routing their traffic through a sequence of relays. A group of nine directory authorities maintains information about all available relay nodes using a distributed directory protocol. We observe that the current protocol makes a steep synchrony assumption, which makes it vulnerable to natural as well as adversarial non-synchronous communication scenarios over the Internet. In this paper, we show that it is possible to cause a failure in the Tor directory protocol by targeting a majority of the authorities for only five minutes using a well-executed distributed denial-of-service (DDoS) attack. We demonstrate this attack in a controlled environment and show that it is cost-effective for as little as \$53.28 per month to disrupt the protocol and to effectively bring down the entire Tor network. To mitigate this problem, we consider the popular partial synchrony assumption for the Tor directory protocol that ensures that the protocol security is hampered even when the network delays are large and unknown. We design a new Tor directory protocol that leverages any standard partial-synchronous consensus protocol to solve this problem, while also proving its security. We have implemented a prototype in Rust, demonstrating comparable performance to the current protocol while resisting similar attacks.

Paperid: 1676, https://arxiv.org/pdf/2509.10545.pdf

Abstract:
Recent years have seen many industrial implementations and much scholastic research, i.e., prototypes and theoretical frameworks, in Decentralized Identity Management Systems (DIDMS). It is safe to say that Attestation-Based Attribute-Based Decentralized IDM (ABABDIDM) has not received anywhere near the same level of attention in the literature as general Attribute-Based DIDMs (ABDIDM), i.e, decentralized Attribute-Based Access Control (ABAC). The use of decentralization, i.e., DIDM, is to improve upon the security and privacy-related issues of centralized Identity Management Systems (IDM) and Attribute-Based IDMs (ABIDM). And blockchain is the framework used for decentralization in all these schemes. Many DIDMs - even ABDIDMs - have been defined on popular blockchains such as Hyperledger, Ethereum, and Bitcoin. However, despite the characteristics of Ripple that makes it appealing for an ABIDM, there is a lack of research to develop an Identity Management System (IDMS) on Ripple in literature. We have attempted to conceptualize an ABABDIDM on Ripple.

Paperid: 1677, https://arxiv.org/pdf/2509.10540.pdf

Abstract:
Large language model (LLM) assistants are increasingly integrated into enterprise workflows, raising new security concerns as they bridge internal and external data sources. This paper presents an in-depth case study of EchoLeak (CVE-2025-32711), a zero-click prompt injection vulnerability in Microsoft 365 Copilot that enabled remote, unauthenticated data exfiltration via a single crafted email. By chaining multiple bypasses-evading Microsofts XPIA (Cross Prompt Injection Attempt) classifier, circumventing link redaction with reference-style Markdown, exploiting auto-fetched images, and abusing a Microsoft Teams proxy allowed by the content security policy-EchoLeak achieved full privilege escalation across LLM trust boundaries without user interaction. We analyze why existing defenses failed, and outline a set of engineering mitigations including prompt partitioning, enhanced input/output filtering, provenance-based access control, and strict content security policies. Beyond the specific exploit, we derive generalizable lessons for building secure AI copilots, emphasizing the principle of least privilege, defense-in-depth architectures, and continuous adversarial testing. Our findings establish prompt injection as a practical, high-severity vulnerability class in production AI systems and provide a blueprint for defending against future AI-native threats.

Paperid: 1678, https://arxiv.org/pdf/2509.09868.pdf

Abstract:
The specification of state machine replication (SMR) has no requirement on the final total order of commands. In blockchains based on SMR, however, order matters, since different orders could provide their clients with different financial rewards. Ordered consensus augments the specification of SMR to include specific guarantees on such order, with a focus on limiting the influence of Byzantine nodes. Real-world ordering manipulations, however, can and do happen even without Byzantine replicas, typically because of factors, such as faster networks or closer proximity to the blockchain infrastructure, that give some clients an unfair advantage. To address this challenge, this paper proceeds to extend ordered consensus by requiring it to also support equal opportunity, a concrete notion of fairness, widely adopted in social sciences. Informally, equal opportunity requires that two candidates who, according to a set of criteria deemed to be relevant, are equally qualified for a position (in our case, a specific slot in the SMR total order), should have an equal chance of landing it. We show how randomness can be leveraged to keep bias in check, and, to this end, introduce the secret random oracle (SRO), a system component that generates randomness in a fault-tolerant manner. We describe two SRO designs based, respectively, on trusted hardware and threshold verifiable random functions, and instantiate them in Bercow, a new ordered consensus protocol that, by approximating equal opportunity up to within a configurable factor, can effectively mitigate well-known ordering attacks in SMR-based blockchains.

Paperid: 1679, https://arxiv.org/pdf/2509.09351.pdf

Abstract:
Ethical questions are discussed regularly in computer security. Still, researchers in computer security lack clear guidance on how to make, document, and assess ethical decisions in research when what is morally right or acceptable is not clear-cut. In this work, we give an overview of the discussion of ethical implications in current published work in computer security by reviewing all 1154 top-tier security papers published in 2024, finding inconsistent levels of ethics reporting with a strong focus of reporting institutional or ethics board approval, human subjects protection, and responsible disclosure, and a lack of discussion of balancing harms and benefits. We further report on the results of a semi-structured interview study with 24 computer security and privacy researchers (among whom were also: reviewers, ethics committee members, and/or program chairs) and their ethical decision-making both as authors and during peer review, finding a strong desire for ethical research, but a lack of consistency in considered values, ethical frameworks (if articulated), decision-making, and outcomes. We present an overview of the current state of the discussion of ethics and current de-facto standards in computer security research, and contribute suggestions to improve the state of ethics in computer security research.

Paperid: 1680, https://arxiv.org/pdf/2509.09322.pdf

Abstract:
Modern software development increasingly depends on open-source libraries and third-party components, which are often encapsulated into containerized environments. While improving the development and deployment of applications, this approach introduces security risks, particularly when outdated or vulnerable components are inadvertently included in production environments. Software Composition Analysis (SCA) is a critical process that helps identify and manage packages and dependencies inside a container. However, unintentional modifications to the container filesystem can lead to incomplete container images, which compromise the reliability of SCA tools. In this paper, we examine the limitations of both cloud-based and open-source SCA tools when faced with such obscure images. An analysis of 600 popular containers revealed that obscure containers exist in well-known registries and trusted images and that many tools fail to analyze such containers. To mitigate these issues, we propose an obscuration-resilient methodology for container analysis and introduce ORCA (Obscuration-Resilient Container Analyzer), its open-source implementation. We reported our findings to all vendors using their appropriate channels. Our results demonstrate that ORCA effectively detects the content of obscure containers and achieves a median 40% improvement in file coverage compared to Docker Scout and Syft.

Paperid: 1681, https://arxiv.org/pdf/2509.08943.pdf

Abstract:
In adversarial settings, where attackers can deliberately and strategically corrupt quantum data, standard quantum error correction reaches its limits. It can only correct up to half the code distance and must output a unique answer. Quantum list decoding offers a promising alternative. By allowing the decoder to output a short list of possible errors, it becomes possible to tolerate far more errors, even under worst-case noise. But two fundamental questions remain: which quantum codes support list decoding, and can we design decoding schemes that are secure against efficient, computationally bounded adversaries? In this work, we answer both. To identify which codes are list-decodable, we provide a generalized version of the Knill-Laflamme conditions. Then, using tools from quantum cryptography, we build an unambiguous list decoding protocol based on pseudorandom unitaries. Our scheme is secure against any quantum polynomial-time adversary, even across multiple decoding attempts, in contrast to previous schemes. Our approach connects coding theory with complexity-based quantum cryptography, paving the way for secure quantum information processing in adversarial settings.

Paperid: 1682, https://arxiv.org/pdf/2509.08402.pdf

Abstract:
The integration of the Internet of Things (IoT) in healthcare has revolutionized patient monitoring and data collection, allowing real-time tracking of vital signs, remote diagnostics, and automated medical responses. However, the transmission and storage of sensitive medical data introduce significant security and privacy challenges. To address these concerns, blockchain technology provides a decentralized and immutable ledger that ensures data integrity, , and transparency. Unlike public blockchains, private blockchains are permissioned; the access is granted only to authorized participants; they are more suitable for handling confidential healthcare data. Although blockchain ensures security and trust, it lacks built-in mechanisms to support flexible and controlled data sharing; This is where Proxy Re-Encryption (PRE) comes into play. PRE is a cryptographic technique that allows encrypted data to be re-encrypted for a new recipient without exposing it to intermediaries. We propose an architecture integrating private blockchain and PRE to enable secure, traceable, and privacy-preserving data sharing in IoT-based healthcare systems. Blockchain guarantees tamper proof record-keeping, while PRE enables fine-grained access control, allowing medical professionals to securely share patient data without compromising confidentiality. This combination creates a robust security framework that enhances trust and efficiency in digital healthcare ecosystems.

Paperid: 1683, https://arxiv.org/pdf/2509.08399.pdf

Abstract:
Being propelled by the fourth industrial revolution (Industry 4.0), IoT devices and solutions are well adopted everywhere, ranging from home applications to industrial use, crossing through transportation, healthcare, energy, and so on. This wide use of IoT has not gone unnoticed, hackers are tracking the weakness of such a technology and threatening them continuously. Their security at various levels has become an important concern of professionals and researchers. This issue takes more risk, especially with the IoT variants, IIoT (Industrial IoT) and MIoT (Medical IoT). Many existing security solutions are adapted and proposed for addressing IoT security. In this paper, we are interested in exploring blockchain technology and we make a comparison of three free Blockchain platforms towards their applicability for MIoT context, namely Ethereum, Hyperledger Fabric and Corda. In general, Blockchain technology provides a decentralized, autonomous, trustless, and distributed environment. It is challenging to find a Blockchain platform that fits the MIoT context and performs well in terms of security. The retained platform should be deployed smartly to avoid its practical drawbacks related to energy-consuming and excessive computing.

Paperid: 1684, https://arxiv.org/pdf/2509.07606.pdf

Abstract:
An improved CAST-128 encryption algorithm, which is done by implementing chaos-based adaptive S-box generation using Logistic sine Map (LSM), has been provided in this paper because of the increasing requirements of efficient and smart image encryption mechanisms. The study aims to address the drawbacks of static S-box models commonly used in traditional cryptographic systems, which are susceptible to linear and differential attacks. In the proposed scheme, the dynamic, non-linear, invertible, and highly cryptographic strength S-boxes are generated through a hybrid chaotic system that may have high non-linearity, strong and rigorous avalanche characteristics, and low differential uniformity. The process here is that the LSM is used to produce S-boxes having key-dependent parameters that are stuffed into the CAST-128 structure to encrypt the image in a block-wise manner. The performance of the encryption is assessed utilizing a set of standard grayscale images. The metrics that are used to evaluate the security are entropy, NPCR, UACI, PSNR, and histogram analysis. Outcomes indicate that randomness, resistance to statistical attacks, and country of encryption are significantly improved compared to the original CAST-128. The study is theoretically and practically relevant since it presents a lightweight S-box generation approach driven by chaos, which can increase the level of robustness of the image encryptions without enlisting machine learning. The system may be applied to secure communications, surveillance systems, and medical image protection on a real-time basis.

Paperid: 1685, https://arxiv.org/pdf/2509.07287.pdf

Abstract:
With the rapid development of large language models, the potential threat of their malicious use, particularly in generating phishing content, is becoming increasingly prevalent. Leveraging the capabilities of LLMs, malicious users can synthesize phishing emails that are free from spelling mistakes and other easily detectable features. Furthermore, such models can generate topic-specific phishing messages, tailoring content to the target domain and increasing the likelihood of success. Detecting such content remains a significant challenge, as LLM-generated phishing emails often lack clear or distinguishable linguistic features. As a result, most existing semantic-level detection approaches struggle to identify them reliably. While certain LLM-based detection methods have shown promise, they suffer from high computational costs and are constrained by the performance of the underlying language model, making them impractical for large-scale deployment. In this work, we aim to address this issue. We propose Paladin, which embeds trigger-tag associations into vanilla LLM using various insertion strategies, creating them into instrumented LLMs. When an instrumented LLM generates content related to phishing, it will automatically include detectable tags, enabling easier identification. Based on the design on implicit and explicit triggers and tags, we consider four distinct scenarios in our work. We evaluate our method from three key perspectives: stealthiness, effectiveness, and robustness, and compare it with existing baseline methods. Experimental results show that our method outperforms the baselines, achieving over 90% detection accuracy across all scenarios.

Paperid: 1686, https://arxiv.org/pdf/2509.06595.pdf

Abstract:
Large Language Models (LLMs) are transforming human decision-making by acting as cognitive collaborators. Yet, this promise comes with a paradox: while LLMs can improve accuracy, they may also erode independent reasoning, promote over-reliance and homogenize decisions. In this paper, we investigate how LLMs shape human judgment in security-critical contexts. Through two exploratory focus groups (unaided and LLM-supported), we assess decision accuracy, behavioral resilience and reliance dynamics. Our findings reveal that while LLMs enhance accuracy and consistency in routine decisions, they can inadvertently reduce cognitive diversity and improve automation bias, which is especially the case among users with lower resilience. In contrast, high-resilience individuals leverage LLMs more effectively, suggesting that cognitive traits mediate AI benefit.

Paperid: 1687, https://arxiv.org/pdf/2509.05496.pdf

Abstract:
Satellites, drones, and 5G space links now support critical services such as air traffic, finance, and weather. Yet most were not built to resist modern cyber threats. Ground stations can be breached, GPS jammed, and supply chains compromised, while no shared list of vulnerabilities or safe testing range exists. This paper maps eleven research gaps, including secure routing, onboard intrusion detection, recovery methods, trusted supply chains, post-quantum encryption, zero-trust architectures, and real-time impact monitoring. For each, we outline the challenge, why it matters, and a guiding research question. We also highlight an agentic (multi-agent) AI approach where small, task-specific agents share defense tasks onboard instead of one large model. Finally, we propose a five-year roadmap: post-quantum and QKD flight trials, open cyber-ranges, clearer vulnerability shar ing, and early multi-agent deployments. These steps move space cybersecurity from reactive patching toward proactive resilience.

Paperid: 1688, https://arxiv.org/pdf/2509.05379.pdf

Abstract:
As our cities and communities become smarter, the systems that keep us safe, such as traffic control centers, emergency response networks, and public transportation, also become more complex. With this complexity comes a greater risk of security threats that can affect not just machines but real people's lives. To address this challenge, we present ThreatGPT, an agentic Artificial Intelligence (AI) assistant built to help people whether they are engineers, safety officers, or policy makers to understand and analyze threats in public safety systems. Instead of requiring deep cybersecurity expertise, it allows users to simply describe the components of a system they are concerned about, such as login systems, data storage, or communication networks. Then, with the click of a button, users can choose how they want the system to be analyzed by using popular frameworks such as STRIDE, MITRE ATT&CK, CVE reports, NIST, or CISA. ThreatGPT is unique because it does not just provide threat information, but rather it acts like a knowledgeable partner. Using few-shot learning, the AI learns from examples and generates relevant smart threat models. It can highlight what might go wrong, how attackers could take advantage, and what can be done to prevent harm. Whether securing a city's infrastructure or a local health service, this tool adapts to users' needs. In simple terms, ThreatGPT brings together AI and human judgment to make our public systems safer. It is designed not just to analyze threats, but to empower people to understand and act on them, faster, smarter, and with more confidence.

Paperid: 1689, https://arxiv.org/pdf/2509.05192.pdf

Abstract:
Horizontal Federated Learning (HFL) is particularly vulnerable to backdoor attacks as adversaries can easily manipulate both the training data and processes to execute sophisticated attacks. In this work, we study the impact of training hyperparameters on the effectiveness of backdoor attacks and defenses in HFL. More specifically, we show both analytically and by means of measurements that the choice of hyperparameters by benign clients does not only influence model accuracy but also significantly impacts backdoor attack success. This stands in sharp contrast with the multitude of contributions in the area of HFL security, which often rely on custom ad-hoc hyperparameter choices for benign clients$\unicode{x2013}$leading to more pronounced backdoor attack strength and diminished impact of defenses. Our results indicate that properly tuning benign clients' hyperparameters$\unicode{x2013}$such as learning rate, batch size, and number of local epochs$\unicode{x2013}$can significantly curb the effectiveness of backdoor attacks, regardless of the malicious clients' settings. We support this claim with an extensive robustness evaluation of state-of-the-art attack-defense combinations, showing that carefully chosen hyperparameters yield across-the-board improvements in robustness without sacrificing main task accuracy. For example, we show that the 50%-lifespan of the strong A3FL attack can be reduced by 98.6%, respectively$\unicode{x2013}$all without using any defense and while incurring only a 2.9 percentage points drop in clean task accuracy.

Paperid: 1690, https://arxiv.org/pdf/2509.04792.pdf

Abstract:
IPv4 NAT has limited the spread of IoT botnets considerably by default-denying bots' incoming connection requests to in-home devices unless the owner has explicitly allowed them. As the Internet transitions to majority IPv6, however, residential connections no longer require the use of NAT. This paper therefore asks: has the transition from IPv4 to IPv6 ultimately made residential networks more vulnerable to attack, thereby empowering the next generation of IPv6-based IoT botnets? To answer this question, we introduce a large-scale IPv6 scanning methodology that, unlike those that rely on AI, can be run on low-resource devices common in IoT botnets. We use this methodology to perform the largest-scale measurement of IPv6 residential networks to date, and compare which devices are publicly accessible to comparable IPv4 networks. We were able to receive responses from 14.0M distinct IPv6 addresses inside of residential networks (i.e., not the external-facing gateway), in 2,436 ASes across 118 countries. These responses come from protocols commonly exploited by IoT botnets (including telnet and FTP), as well as protocols typically associated with end-user devices (including iPhone-Sync and IPP). Comparing to IPv4, we show that we are able to reach more printers, iPhones, and smart lights over IPv6 than full IPv4-wide scans could. Collectively, our results show that NAT has indeed acted as the de facto firewall of the Internet, and the v4-to-v6 transition of residential networks is opening up new devices to attack.

Paperid: 1691, https://arxiv.org/pdf/2509.03711.pdf

Abstract:
Modern computing systems remain rife with software vulnerabilities. Engineers apply many means to detect them, of which dynamic testing is one of the most common and effective. However, most dynamic testing techniques follow a top-down paradigm, and struggle to reach and exercise functions deep within the call graph. While recent works have proposed Bottom-Up approaches to address these limitations, they face challenges with false positives and generating valid inputs that adhere to the context of the entire program. In this work, we introduce a new paradigm that we call Reactive Bottom-Up Testing. Our insight is that function-level testing is necessary but not sufficient for the validation of vulnerabilities in functions. What we need is a systematic approach that not only tests functions in isolation but also validates their behavior within the broader program context, ensuring that detected vulnerabilities are both reachable and triggerable. We develop a three-stage bottom-up testing scheme: (1) identify likely-vulnerable functions and generate type- and context-aware harnesses; (2) fuzz to find crashes and extract input constraints via symbolic execution; (3) verify crashes by combining constraints to remove false positives. We implemented an automated prototype, which we call Griller. We evaluated Griller in a controlled setting using a benchmark of 48 known vulnerabilities across 5 open-source projects, where we successfully detected 28 known vulnerabilities. Additionally, we evaluated Griller on several real-world applications such as Pacman, and it discovered 6 previously unknown vulnerabilities. Our findings suggest that Reactive Bottom-Up Testing can significantly enhance the detection of vulnerabilities in complex systems, paving the way for more robust security practices.

Paperid: 1692, https://arxiv.org/pdf/2509.03037.pdf

Abstract:
Ethereum smart contracts hold tens of billions of USD in DeFi and NFTs, yet comprehensive security analysis remains difficult due to unverified code, proxy-based architectures, and the reliance on manual inspection of complex execution traces. Existing approaches fall into two main categories: anomaly transaction detection, which flags suspicious transactions but offers limited insight into specific attack strategies hidden in execution traces inside transactions, and code vulnerability detection, which cannot analyze unverified contracts and struggles to show how identified flaws are exploited in real incidents. As a result, analysts must still manually align transaction traces with contract code to reconstruct attack scenarios and conduct forensics. To address this gap, TraceLLM is proposed as a framework that leverages LLMs to integrate execution trace-level detection with decompiled contract code. We introduce a new anomaly execution path identification algorithm and an LLM-refined decompile tool to identify vulnerable functions and provide explicit attack paths to LLM. TraceLLM establishes the first benchmark for joint trace and contract code-driven security analysis. For comparison, proxy baselines are created by jointly transmitting the results of three representative code analysis along with raw traces to LLM. TraceLLM identifies attacker and victim addresses with 85.19\% precision and produces automated reports with 70.37\% factual precision across 27 cases with ground truth expert reports, achieving 25.93\% higher accuracy than the best baseline. Moreover, across 148 real-world Ethereum incidents, TraceLLM automatically generates reports with 66.22\% expert-verified accuracy, demonstrating strong generalizability.

Paperid: 1693, https://arxiv.org/pdf/2509.02824.pdf

Abstract:
The 6 GHz spectrum, recently opened for unlicensed use under Wi-Fi 6E and Wi-Fi 7, overlaps with frequencies used by mission-critical incumbent systems such as public safety communications and utility infrastructure. To prevent interference, the FCC mandates the use of Automated Frequency Coordination (AFC) systems, which assign safe frequency and power levels based on Wi-Fi Access Point (AP)-reported locations. In this work, we demonstrate that GPS-based location reporting, which Wi-Fi APs use, can be spoofed using inexpensive, off-the-shelf radio equipment. This enables attackers to manipulate AP behavior, gain unauthorized spectrum access, cause harmful interference, or disable APs entirely by spoofing them into foreign locations. We validate these attacks in a controlled lab setting against a commercial AP and evaluate a commercial AFC system under spoofed scenarios. Our findings highlight critical gaps in the security assumptions of AFC and motivate the need for stronger location integrity protections.

Paperid: 1694, https://arxiv.org/pdf/2509.02042.pdf

Abstract:
A number of attacks rely on infrared light sources or heat-absorbing material to imperceptibly fool systems into misinterpreting visual input in various image recognition applications. However, almost all existing approaches can only mount untargeted attacks and require heavy optimizations due to the use-case-specific constraints, such as location and shape. In this paper, we propose a novel, stealthy, and cost-effective attack to generate both targeted and untargeted adversarial infrared perturbations. By projecting perturbations from a transparent film onto the target object with an off-the-shelf infrared flashlight, our approach is the first to reliably mount laser-free targeted attacks in the infrared domain. Extensive experiments on traffic signs in the digital and physical domains show that our approach is robust and yields higher attack success rates in various attack scenarios across bright lighting conditions, distances, and angles compared to prior work. Equally important, our attack is highly cost-effective, requiring less than US\$50 and a few tens of seconds for deployment. Finally, we propose a novel segmentation-based detection that thwarts our attack with an F1-score of up to 99%.

Paperid: 1695, https://arxiv.org/pdf/2509.01599.pdf

Abstract:
Radiation Detection Systems (RDSs) are used to measure and detect abnormal levels of radioactive material in the environment. These systems are used in many applications to mitigate threats posed by high levels of radioactive material. However, these systems lack protection against malicious external attacks to modify the data. The novelty of applying Intrusion Detection Systems (IDS) in RDSs is a crucial element in safeguarding these critical infrastructures. While IDSs are widely used in networking environments to safeguard against various attacks, their application in RDSs is novel. A common attack on RDSs is Denial of Service (DoS), where the attacker aims to overwhelm the system, causing malfunctioning RDSs. This paper proposes an efficient Machine Learning (ML)-based IDS to detect anomalies in radiation data, focusing on DoS attacks. This work explores the use of sampling methods to create a simulated DoS attack based on a real radiation dataset, followed by an evaluation of various ML algorithms, including Random Forest, Support Vector Machine (SVM), logistic regression, and Light Gradient-Boosting Machine (LightGBM), to detect DoS attacks on RDSs. LightGBM is emphasized for its superior accuracy and low computational resource consumption, making it particularly suitable for real-time intrusion detection. Additionally, model optimization and TinyML techniques, including feature selection, parallel execution, and random search methods, are used to improve the efficiency of the proposed IDS. Finally, an optimized and efficient LightGBM-based IDS is developed to achieve accurate intrusion detection for RDSs.

Paperid: 1696, https://arxiv.org/pdf/2509.01592.pdf

Abstract:
Radiation Detection Systems (RDSs) play a vital role in ensuring public safety across various settings, from nuclear facilities to medical environments. However, these systems are increasingly vulnerable to cyber-attacks such as data injection, man-in-the-middle (MITM) attacks, ICMP floods, botnet attacks, privilege escalation, and distributed denial-of-service (DDoS) attacks. Such threats could compromise the integrity and reliability of radiation measurements, posing significant public health and safety risks. This paper presents a new synthetic radiation dataset and an Intrusion Detection System (IDS) tailored for resource-constrained environments, bringing Machine Learning (ML) predictive capabilities closer to the sensing edge layer of critical infrastructure. Leveraging TinyML techniques, the proposed IDS employs an optimized XGBoost model enhanced with pruning, quantization, feature selection, and sampling. These TinyML techniques significantly reduce the size of the model and computational demands, enabling real-time intrusion detection on low-resource devices while maintaining a reasonable balance between efficiency and accuracy.

Paperid: 1697, https://arxiv.org/pdf/2509.01178.pdf

Abstract:
In two-party secret sharing scheme, values are typically encoded as unsigned integers $\mathsf{uint}(x)$, whereas real-world applications often require computations on signed real numbers $\mathsf{Real}(x)$. To enable secure evaluation of practical functions, it is essential to computing $\mathsf{Real}(x)$ from shared inputs, as protocols take shares as input. At USENIX'25, Guo et al. proposed an efficient method for computing signed integer values $\mathsf{int}(x)$ from shares, which can be extended to compute $\mathsf{Real}(x)$. However, their approach imposes a restrictive input constraint $|x| < \frac{L}{3}$ for $x \in \mathbb{Z}_L$, limiting its applicability in real-world scenarios. In this work, we significantly relax this constraint to $|x| < B$ for any $B \leq \frac{L}{2}$, where $B = \frac{L}{2}$ corresponding to the natural representable range in $x \in \mathbb{Z}_L$. This relaxes the restrictions and enables the computation of $\mathsf{Real}(x)$ with loose or no input constraints. Building upon this foundation, we present a generalized framework for designing secure protocols for a broad class of functions, including integer division ($\lfloor \frac{x}{d} \rfloor$), trigonometric ($\sin(x)$) and exponential ($e^{-x}$) functions. Our experimental evaluation demonstrates that the proposed protocols achieve both high efficiency and high accuracy. Notably, our protocol for evaluating $e^{-x}$ reduces communication costs to approximately 31% of those in SirNN (S&P 21) and Bolt (S&P 24), with runtime speedups of up to $5.53 \times$ and $3.09 \times$, respectively. In terms of accuracy, our protocol achieves a maximum ULP error of $1.435$, compared to $2.64$ for SirNN and $8.681$ for Bolt.

Paperid: 1698, https://arxiv.org/pdf/2509.00812.pdf

Abstract:
Cloud quantum services can reveal circuit structure and timing through scheduler metadata, latency patterns, and co-tenant interference. We introduce NADGO (Noise-Adaptive Dummy-Gate Obfuscation), a scheduling and obfuscation stack that enforces operational privacy for gate-model workloads by applying per-interval limits on observable information leakage. To support confidentiality and fair multi-tenancy, operators require a method to audit compliance at acceptable overheads. NADGO combines: (i) hardware-aware t-design padding for structured cover traffic, (ii) particle-filter timing randomization to mask queue patterns, (iii) CASQUE subcircuit routing across heterogeneous backends, and (iv) a per-interval leakage estimator with locked calibration artifacts and a dual-threshold kill-switch. We prototype the approach on a 4-qubit superconducting tile with cryo-CMOS control and evaluate both depth-varied local-random circuits and small QAOA instances. Monitoring runs at a 6.3 microsecond control interval, and per-interval decisions are recorded in an append-only, hash-chained audit log. Across Monte Carlo (Tier 1) and cloud-hardware emulation (Tier 2) evaluations, NADGO maintains leakage within budget in nominal operation (interval-abort rate below 1 percent) and under attack yields high separation with concentrated aborts. At matched leakage targets, microbenchmarks indicate lower latency and cryogenic power consumption than static padding, while end-to-end workloads maintain competitive cost envelopes.

Paperid: 1699, https://arxiv.org/pdf/2509.00811.pdf

Abstract:
We present MaestroCut, a closed-loop framework for quantum circuit cutting that adapts partitioning and shot allocation to device drift and workload variation. MaestroCut tracks a variance proxy in real time, triggers re-cutting when accuracy degrades, and routes shots using topology-aware priors. An online estimator cascade (MLE, Bayesian, GP-assisted) selects the lowest-error reconstruction within a fixed budget. Tier-1 simulations show consistent variance contraction and reduced mean-squared error versus uniform and proportional baselines. Tier-2 emulation with realistic queueing and noise demonstrates stable latency targets, high reliability, and ~1% software overhead under stress scenarios. These results indicate that adaptive circuit cutting can provide accuracy and efficiency improvements with minimal operational cost on near-term hardware.

Paperid: 1700, https://arxiv.org/pdf/2509.00615.pdf

Abstract:
We investigate how to calculate Kaplan-Meier survival curves across multiple health-care jurisdictions while protecting patient privacy with node-level differential privacy. Each site discloses its curve only once, adding Laplace noise whose scale is determined by the length of the common time grid; the server then averages the noisy curves, so the overall privacy budget remains unchanged. We benchmark four one-shot smoothing techniques: Discrete Cosine Transform, Haar Wavelet shrinkage, adaptive Total-Variation denoising, and a parametric Weibull fit on the NCCTG lung-cancer cohort under five privacy levels and three partition scenarios (uniform, moderately skewed, highly imbalanced). Total-Variation gives the best mean accuracy, whereas the frequency-domain smoothers offer stronger worst-case robustness and the Weibull model shows the most stable behaviour at the strictest privacy setting. Across all methods the released curves keep the empirical log-rank type-I error below fifteen percent for privacy budgets of 0.5 and higher, demonstrating that clinically useful survival information can be shared without iterative training or heavy cryptography.

Paperid: 1701, https://arxiv.org/pdf/2509.00081.pdf

Abstract:
Effective Cyber Threat Intelligence (CTI) relies upon accurately structured and semantically enriched information extracted from cybersecurity system logs. However, current methodologies often struggle to identify and interpret malicious events reliably and transparently, particularly in cases involving unstructured or ambiguous log entries. In this work, we propose a novel methodology that combines ontology-driven structured outputs with Large Language Models (LLMs), to build an Artificial Intelligence (AI) agent that improves the accuracy and explainability of information extraction from cybersecurity logs. Central to our approach is the integration of domain ontologies and SHACL-based constraints to guide the language model's output structure and enforce semantic validity over the resulting graph. Extracted information is organized into an ontology-enriched graph database, enabling future semantic analysis and querying. The design of our methodology is motivated by the analytical requirements associated with honeypot log data, which typically comprises predominantly malicious activity. While our case study illustrates the relevance of this scenario, the experimental evaluation is conducted using publicly available datasets. Results demonstrate that our method achieves higher accuracy in information extraction compared to traditional prompt-only approaches, with a deliberate focus on extraction quality rather than processing speed.

Paperid: 1702, https://arxiv.org/pdf/2508.21302.pdf

Abstract:
Directed fuzzing aims to find program inputs that lead to specified target program states. It has broad applications, such as debugging system crashes, confirming reported bugs, and generating exploits for potential vulnerabilities. This task is inherently challenging because target states are often deeply nested in the program, while the search space manifested by numerous possible program inputs is prohibitively large. Existing approaches rely on branch distances or manually-specified constraints to guide the search; however, the branches alone are often insufficient to precisely characterize progress toward reaching the target states, while the manually specified constraints are often tailored for specific bug types and thus difficult to generalize to diverse target states and programs. We present Locus, a novel framework to improve the efficiency of directed fuzzing. Our key insight is to synthesize predicates to capture fuzzing progress as semantically meaningful intermediate states, serving as milestones towards reaching the target states. When used to instrument the program under fuzzing, they can reject executions unlikely to reach the target states, while providing additional coverage guidance. To automate this task and generalize to diverse programs, Locus features an agentic framework with program analysis tools to synthesize and iteratively refine the candidate predicates, while ensuring the predicates strictly relax the target states to prevent false rejections via symbolic execution. Our evaluation shows that Locus substantially improves the efficiency of eight state-of-the-art fuzzers in discovering real-world vulnerabilities, achieving an average speedup of 41.6x. So far, Locus has found eight previously unpatched bugs, with one already acknowledged with a draft patch.

Paperid: 1703, https://arxiv.org/pdf/2508.18933.pdf

Abstract:
Automated detection of vulnerabilities in source code is an essential cybersecurity challenge, underpinning trust in digital systems and services. Graph Neural Networks (GNNs) have emerged as a promising approach as they can learn structural and logical code relationships in a data-driven manner. However, their performance is severely constrained by training data imbalances and label noise. GNNs often learn 'spurious' correlations from superficial code similarities, producing detectors that fail to generalize well to unseen real-world data. In this work, we propose a unified framework for robust and interpretable vulnerability detection, called VISION, to mitigate spurious correlations by systematically augmenting a counterfactual training dataset. Counterfactuals are samples with minimal semantic modifications but opposite labels. Our framework includes: (i) generating counterfactuals by prompting a Large Language Model (LLM); (ii) targeted GNN training on paired code examples with opposite labels; and (iii) graph-based interpretability to identify the crucial code statements relevant for vulnerability predictions while ignoring spurious ones. We find that VISION reduces spurious learning and enables more robust, generalizable detection, improving overall accuracy (from 51.8% to 97.8%), pairwise contrast accuracy (from 4.5% to 95.8%), and worst-group accuracy (from 0.7% to 85.5%) on the Common Weakness Enumeration (CWE)-20 vulnerability. We further demonstrate gains using proposed metrics: intra-class attribution variance, inter-class attribution distance, and node score dependency. We also release CWE-20-CFA, a benchmark of 27,556 functions (real and counterfactual) from the high-impact CWE-20 category. Finally, VISION advances transparent and trustworthy AI-based cybersecurity systems through interactive visualization for human-in-the-loop analysis.

Paperid: 1704, https://arxiv.org/pdf/2508.17913.pdf

Abstract:
Digital twin (DT) technology is rapidly becoming essential for smart city ecosystems, enabling real-time synchronisation and autonomous decision-making across physical and digital domains. However, as DTs take active roles in control loops, securely binding them to their physical counterparts in dynamic and adversarial environments remains a significant challenge. Existing authentication solutions either rely on static trust models, require centralised authorities, or fail to provide live and verifiable physical-digital binding, making them unsuitable for latency-sensitive and distributed deployments. To address this gap, we introduce PRZK-Bind, a lightweight and decentralised authentication protocol that combines Schnorr-based zero-knowledge proofs with elliptic curve cryptography to establish secure, real-time correspondence between physical entities and DTs without relying on pre-shared secrets. Simulation results show that PRZK-Bind significantly improves performance, offering up to 4.5 times lower latency and 4 times reduced energy consumption compared to cryptography-heavy baselines, while maintaining false acceptance rates more than 10 times lower. These findings highlight its suitability for future smart city deployments requiring efficient, resilient, and trustworthy DT authentication.

Paperid: 1705, https://arxiv.org/pdf/2508.14230.pdf

Abstract:
Digital societies increasingly rely on trustworthy proofs of physical presence for services such as supply-chain tracking, e-voting, ride-sharing, and location-based rewards. Yet, traditional localization methods often lack cryptographic guarantees of where and when an entity was present, leaving them vulnerable to spoofing, replay, or collusion attacks. In response, research on Proof-of-Location (PoL) has emerged, with recent approaches combining distance bounding, distributed consensus, and privacy-enhancing techniques to enable verifiable, tamper-resistant location claims. As the design space for PoL systems grows in complexity, this paper provides a unified framework to help practitioners navigate diverse application needs. We first propose a taxonomy identifying four core domains: (1) cryptographic guarantees, (2) spatio-temporal synchronization, (3) trust and witness models, and (4) interaction and overhead. Building on this, we introduce a methodology to map application-specific requirements onto appropriate PoL architectures. We illustrate this process through three use cases (retail e-coupons, supply chain auditing, and physical e-voting), each showing how different constraints shape protocol choices. Overall, this work offers a structured approach to building secure, scalable, and interoperable PoL systems.

Paperid: 1706, https://arxiv.org/pdf/2508.13240.pdf

Abstract:
Understanding and quantifying human cognitive biases from empirical data has long posed a formidable challenge, particularly in cybersecurity, where defending against unknown adversaries is paramount. Traditional cyber defense strategies have largely focused on fortification, while some approaches attempt to anticipate attacker strategies by mapping them to cognitive vulnerabilities, yet they fall short in dynamically interpreting attacks in progress. In recognition of this gap, IARPA's ReSCIND program seeks to infer, defend against, and even exploit attacker cognitive traits. In this paper, we present a novel methodology that leverages large language models (LLMs) to extract quantifiable insights into the cognitive bias of loss aversion from hacker behavior. Our data are collected from an experiment in which hackers were recruited to attack a controlled demonstration network. We process the hacker generated notes using LLMs using it to segment the various actions and correlate the actions to predefined persistence mechanisms used by hackers. By correlating the implementation of these mechanisms with various operational triggers, our analysis provides new insights into how loss aversion manifests in hacker decision-making. The results demonstrate that LLMs can effectively dissect and interpret nuanced behavioral patterns, thereby offering a transformative approach to enhancing cyber defense strategies through real-time, behavior-based analysis.

Paperid: 1707, https://arxiv.org/pdf/2508.13220.pdf

Abstract:
Large Language Models (LLMs) are increasingly integrated into real-world applications via the Model Context Protocol (MCP), a universal, open standard for connecting AI agents with data sources and external tools. While MCP enhances the capabilities of LLM-based agents, it also introduces new security risks and expands their attack surfaces. In this paper, we present the first systematic taxonomy of MCP security, identifying 17 attack types across 4 primary attack surfaces. We introduce MCPSecBench, a comprehensive security benchmark and playground that integrates prompt datasets, MCP servers, MCP clients, and attack scripts to evaluate these attacks across three major MCP providers. Our benchmark is modular and extensible, allowing researchers to incorporate custom implementations of clients, servers, and transport protocols for systematic security assessment. Experimental results show that over 85% of the identified attacks successfully compromise at least one platform, with core vulnerabilities universally affecting Claude, OpenAI, and Cursor, while prompt-based and tool-centric attacks exhibit considerable variability across different hosts and models. Overall, MCPSecBench standardizes the evaluation of MCP security and enables rigorous testing across all MCP layers.

Paperid: 1708, https://arxiv.org/pdf/2508.12259.pdf

Abstract:
This paper presents a Unified Security Architecture that fortifies the Agentic Web through a Zero-Trust IAM framework. This architecture is built on a foundation of rich, verifiable agent identities using Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), with discovery managed by a protocol-agnostic Agent Name Service (ANS). Security is operationalized through a multi-layered Trust Fabric which introduces significant innovations, including Trust-Adaptive Runtime Environments (TARE), Causal Chain Auditing, and Dynamic Identity with Behavioral Attestation. By explicitly linking the LPCI threat to these enhanced architectural countermeasures within a formal security model, we propose a comprehensive and forward-looking blueprint for a secure, resilient, and trustworthy agentic ecosystem. Our formal analysis demonstrates that the proposed architecture provides provable security guarantees against LPCI attacks with bounded probability of success.

Paperid: 1709, https://arxiv.org/pdf/2508.09477.pdf

Abstract:
With the rapid advancement of AI generative models, the visual quality of AI-generated images (AIIs) has become increasingly close to natural images, which inevitably raises security concerns. Most AII detectors often employ the conventional image classification pipeline with natural images and AIIs (generated by a generative model), which can result in limited detection performance for AIIs from unseen generative models. To solve this, we proposed a universal AI-generated image detector from the perspective of anomaly detection. Our discriminator does not need to access any AIIs and learn a generalizable representation with unsupervised learning. Specifically, we use the pre-trained CLIP encoder as the feature extractor and design a normalizing flow-like unsupervised model. Instead of AIIs, proxy images, e.g., obtained by applying a spectral modification operation on natural images, are used for training. Our models are trained by minimizing the likelihood of proxy images, optionally combined with maximizing the likelihood of natural images. Extensive experiments demonstrate the effectiveness of our method on AIIs produced by various image generators.

Paperid: 1710, https://arxiv.org/pdf/2508.06244.pdf

Abstract:
Machine learning models have been shown to be susceptible to membership inference attack, which can be used to determine whether a given sample appears in the training data. Existing membership inference methods commonly assume that the adversary has full access to the features of the target sample. This assumption, however, does not hold in many real-world scenarios where only partial features information is available, thereby limiting the applicability of these methods. In this work, we study an inference scenario where the adversary observes only partial features of each sample and aims to infer whether this observed subset was present in the training set of the target model. We define this problem as Partial Feature Membership Inference (PFMI). To address this problem, we propose MRAD (Memory-guided Reconstruction and Anomaly Detection), a two-stage attack framework. In the first stage, MRAD optimizes the unknown feature values to minimize the loss of the sample. In the second stage, it measures the deviation between the reconstructed sample and the training distribution using anomaly detection. Empirical results demonstrate that MRAD is effective across a range of datasets, and maintains compatibility with various off-the-shelf anomaly detection techniques. For example, on STL-10, our attack achieves an AUC of around 0.6 even with 40% of the missing features.

Paperid: 1711, https://arxiv.org/pdf/2508.04189.pdf

Abstract:
Multivariate Long-Term Time Series Forecasting (MLTSF) models are increasingly deployed in critical domains such as climate, finance, and transportation. Although a variety of powerful MLTSF models have been proposed to improve predictive performance, the robustness of MLTSF models against malicious backdoor attacks remains entirely unexplored, which is crucial to ensuring their reliable and trustworthy deployment. To address this gap, we conduct an in-depth study on backdoor attacks against MLTSF models and propose the first effective attack method named BadTime. BadTime executes a backdoor attack by poisoning training data and customizing the backdoor training process. During data poisoning, BadTime proposes a contrast-guided strategy to select the most suitable training samples for poisoning, then employs a graph attention network to identify influential variables for trigger injection. Subsequently, BadTime further localizes optimal positions for trigger injection based on lag analysis and proposes a puzzle-like trigger structure that distributes the trigger across multiple poisoned variables to jointly steer the prediction of the target variable. During backdoor training, BadTime alternately optimizes the model and triggers via proposed tailored optimization objectives. Extensive experiments show that BadTime significantly outperforms state-of-the-art (SOTA) backdoor attacks on time series forecasting by reducing MAE by over 50% on target variables and boosting stealthiness by more than 3 times.

Paperid: 1712, https://arxiv.org/pdf/2508.04039.pdf

Abstract:
Jailbreaking -- bypassing built-in safety mechanisms in AI models -- has traditionally required complex technical procedures or specialized human expertise. In this study, we show that the persuasive capabilities of large reasoning models (LRMs) simplify and scale jailbreaking, converting it into an inexpensive activity accessible to non-experts. We evaluated the capabilities of four LRMs (DeepSeek-R1, Gemini 2.5 Flash, Grok 3 Mini, Qwen3 235B) to act as autonomous adversaries conducting multi-turn conversations with nine widely used target models. LRMs received instructions via a system prompt, before proceeding to planning and executing jailbreaks with no further supervision. We performed extensive experiments with a benchmark of harmful prompts composed of 70 items covering seven sensitive domains. This setup yielded an overall attack success rate across all model combinations of 97.14%. Our study reveals an alignment regression, in which LRMs can systematically erode the safety guardrails of other models, highlighting the urgent need to further align frontier models not only to resist jailbreak attempts, but also to prevent them from being co-opted into acting as jailbreak agents.

Paperid: 1713, https://arxiv.org/pdf/2508.03307.pdf

Abstract:
In this paper, we endeavor to address the challenges of backdoor attacks countermeasures in black-box scenarios, thereby fortifying the security of inference under MLaaS. We first categorize backdoor triggers from a new perspective, i.e., their impact on the patched area, and divide them into: high-visibility triggers (HVT), semi-visibility triggers (SVT), and low-visibility triggers (LVT). Based on this classification, we propose a progressive defense framework, BDFirewall, that removes these triggers from the most conspicuous to the most subtle, without requiring model access. First, for HVTs, which create the most significant local semantic distortions, we identify and eliminate them by detecting these salient differences. We then restore the patched area to mitigate the adverse impact of such removal process. The localized purification designed for HVTs is, however, ineffective against SVTs, which globally perturb benign features. We therefore model an SVT-poisoned input as a mixture of a trigger and benign features, where we unconventionally treat the benign features as "noise". This formulation allows us to reconstruct SVTs by applying a denoising process that removes these benign "noise" features. The SVT-free input is then obtained by subtracting the reconstructed trigger. Finally, to neutralize the nearly imperceptible but fragile LVTs, we introduce lightweight noise to disrupt the trigger pattern and then apply DDPM to restore any collateral impact on clean features. Comprehensive experiments demonstrate that our method outperforms state-of-the-art defenses. Compared with baselines, BDFirewall reduces the Attack Success Rate (ASR) by an average of 33.25%, improving poisoned sample accuracy (PA) by 29.64%, and achieving up to a 111x speedup in inference time. Code will be made publicly available upon acceptance.

Paperid: 1714, https://arxiv.org/pdf/2508.02461.pdf

Abstract:
Intelligent Transportation Systems (ITS) fundamentally rely on vehicle-generated data for applications such as congestion monitoring and route optimization, making the preservation of user privacy a critical challenge. Homomorphic Encryption (HE) offers a promising solution by enabling computation on encrypted data without revealing underlying content. This study presents the first real-world experimental evaluation of three post-quantum secure HE schemes, i.e., Brakerski-Fan-Vercauteren (BFV), Brakerski-Gentry-Vaikuntanathan (BGV), and Cheon-Kim-Kim-Song (CKKS), for vehicular communication scenarios. Two representative privacy-preserving use cases are considered: encrypted vehicle counting and average speed aggregation. Experiments are conducted over both Wi-Fi and Ethernet to assess performance under wireless and wired vehicle-to-everything (V2X) settings. Results show that BFV and BGV are suitable for latency-tolerant applications such as intersection monitoring and regional traffic analysis, with total end-to-end latencies under 10 seconds. While CKKS experiences higher overhead, it remains viable for periodic encrypted aggregation of numerical data. The experimental results demonstrate that HE can be feasibly deployed in ITS environments under 128-bit post-quantum security, provided that scheme-specific latency constraints are considered. This reinforces its potential to serve as a foundational tool for secure and privacy-preserving V2X data processing.

Paperid: 1715, https://arxiv.org/pdf/2508.01997.pdf

Abstract:
The rapid advancement and widespread adoption of generative artificial intelligence (AI) pose significant threats to the integrity of personal identity, including digital cloning, sophisticated impersonation, and the unauthorized monetization of identity-related data. Mitigating these risks necessitates the development of robust AI-generated content detection systems, enhanced legal frameworks, and ethical guidelines. This paper introduces the Digital Identity Rights Framework (DIRF), a structured security and governance model designed to protect behavioral, biometric, and personality-based digital likeness attributes to address this critical need. Structured across nine domains and 63 controls, DIRF integrates legal, technical, and hybrid enforcement mechanisms to secure digital identity consent, traceability, and monetization. We present the architectural foundations, enforcement strategies, and key use cases supporting the need for a unified framework. This work aims to inform platform builders, legal entities, and regulators about the essential controls needed to enforce identity rights in AI-driven systems.

Paperid: 1716, https://arxiv.org/pdf/2508.00659.pdf

Abstract:
Online services often require users to agree to lengthy and obscure Terms of Service (ToS), leading to information asymmetry and legal risks. This paper proposes TOSense-a Chrome extension that allows users to ask questions about ToS in natural language and get concise answers in real time. The system combines (i) a crawler "tos-crawl" that automatically extracts ToS content, and (ii) a lightweight large language model pipeline: MiniLM for semantic retrieval and BART-encoder for answer relevance verification. To avoid expensive manual annotation, we present a novel Question Answering Evaluation Pipeline (QEP) that generates synthetic questions and verifies the correctness of answers using clustered topic matching. Experiments on five major platforms, Apple, Google, X (formerly Twitter), Microsoft, and Netflix, show the effectiveness of TOSense (with up to 44.5% accuracy) across varying number of topic clusters. During the demonstration, we will showcase TOSense in action. Attendees will be able to experience seamless extraction, interactive question answering, and instant indexing of new sites.

Paperid: 1717, https://arxiv.org/pdf/2507.22611.pdf

Abstract:
Blockchain technology is widely used in various fields due to its ability to provide decentralization and trustless security. This is a fundamental understanding held by many advocates, but it is misunderstood, leading participants to fail to recognize the limitations of the security that blockchain can provide. Among all current network attacks, Denial of Service (DoS) attacks pose significant threats due to their ease of execution and destructive potential. This paper, based on the blockchain architecture hierarchy, categorizes and organizes existing DoS attacks, with a focus on explaining the principles and methods of contract layer and consensus layer DoS attacks. Furthermore, this paper comprehensively analyzes and compares commonly used detection methods and defense technologies, which will contribute to strengthening the security and stability of blockchain systems and promoting further innovation and application of blockchain systems.

Paperid: 1718, https://arxiv.org/pdf/2507.22153.pdf

Abstract:
Photorealistic 3D avatar generation has rapidly improved in recent years, and realistic avatars that match a user's true appearance are more feasible in Mixed Reality (MR) than ever before. Yet, there are known risks to sharing one's likeness online, and photorealistic MR avatars could exacerbate these risks. If user likenesses were to be shared broadly, there are risks for cyber abuse or targeted fraud based on user appearances. We propose an alternate avatar rendering scheme for broader social MR -- synthesizing realistic avatars that preserve a user's demographic identity while being distinct enough from the individual user to protect facial biometric information. We introduce a methodology for privatizing appearance by isolating identity within the feature space of identity-encoding generative models. We develop two algorithms that then obfuscate identity: \epsmethod{} provides differential privacy guarantees and \thetamethod{} provides fine-grained control for the level of identity offset. These methods are shown to successfully generate de-identified virtual avatars across multiple generative architectures in 2D and 3D. With these techniques, it is possible to protect user privacy while largely preserving attributes related to sense of self. Employing these techniques in public settings could enable the use of photorealistic avatars broadly in MR, maintaining high realism and immersion without privacy risk.

Paperid: 1719, https://arxiv.org/pdf/2507.21157.pdf

Abstract:
The rapid advancement of Generative Artificial Intelligence has fueled deepfake proliferation-synthetic media encompassing fully generated content and subtly edited authentic material-posing challenges to digital security, misinformation mitigation, and identity preservation. This systematic review evaluates state-of-the-art deepfake detection methodologies, emphasizing reproducible implementations for transparency and validation. We delineate two core paradigms: (1) detection of fully synthetic media leveraging statistical anomalies and hierarchical feature extraction, and (2) localization of manipulated regions within authentic content employing multi-modal cues such as visual artifacts and temporal inconsistencies. These approaches, spanning uni-modal and multi-modal frameworks, demonstrate notable precision and adaptability in controlled settings, effectively identifying manipulations through advanced learning techniques and cross-modal fusion. However, comprehensive assessment reveals insufficient evaluation of adversarial robustness across both paradigms. Current methods exhibit vulnerability to adversarial perturbations-subtle alterations designed to evade detection-undermining reliability in real-world adversarial contexts. This gap highlights critical disconnect between methodological development and evolving threat landscapes. To address this, we contribute a curated GitHub repository aggregating open-source implementations, enabling replication and testing. Our findings emphasize urgent need for future work prioritizing adversarial resilience, advocating scalable, modality-agnostic architectures capable of withstanding sophisticated manipulations. This review synthesizes strengths and shortcomings of contemporary deepfake detection while charting paths toward robust trustworthy systems.

Paperid: 1720, https://arxiv.org/pdf/2507.21139.pdf

Abstract:
Publishing graph data is widely desired to enable a variety of structural analyses and downstream tasks. However, it also potentially poses severe privacy leakage, as attackers may leverage the released graph data to launch attacks and precisely infer private information such as the existence of hidden sensitive links in the graph. Prior studies on privacy-preserving graph data publishing relied on heuristic graph modification strategies and it is difficult to determine the graph with the optimal privacy--utility trade-off for publishing. In contrast, we propose the first privacy-preserving graph structure learning framework against sensitive link inference attacks, named PPGSL, which can automatically learn a graph with the optimal privacy--utility trade-off. The PPGSL operates by first simulating a powerful surrogate attacker conducting sensitive link attacks on a given graph. It then trains a parameterized graph to defend against the simulated adversarial attacks while maintaining the favorable utility of the original graph. To learn the parameters of both parts of the PPGSL, we introduce a secure iterative training protocol. It can enhance privacy preservation and ensure stable convergence during the training process, as supported by the theoretical proof. Additionally, we incorporate multiple acceleration techniques to improve the efficiency of the PPGSL in handling large-scale graphs. The experimental results confirm that the PPGSL achieves state-of-the-art privacy--utility trade-off performance and effectively thwarts various sensitive link inference attacks.

Paperid: 1721, https://arxiv.org/pdf/2507.17074.pdf

Abstract:
The advent of quantum computing threatens the security of classical public-key cryptographic systems, prompting the transition to post-quantum cryptography (PQC). While PQC has been analyzed in theory, its performance in practical wireless communication environments remains underexplored. This paper presents a detailed implementation and performance evaluation of NIST-selected PQC algorithms in user equipment (UE) to UE communications over 5G networks. Using a full 5G emulation stack (Open5GS and UERANSIM) and PQC-enabled TLS 1.3 via BoringSSL and liboqs, we examine key encapsulation mechanisms and digital signature schemes across realistic network conditions. We evaluate performance based on handshake latency, CPU and memory usage, bandwidth, and retransmission rates, under varying cryptographic configurations and client loads. Our findings show that ML-KEM with ML-DSA offers the best efficiency for latency-sensitive applications, while SPHINCS+ and HQC combinations incur higher computational and transmission overheads, making them unsuitable for security-critical but time-sensitive 5G scenarios.

Paperid: 1722, https://arxiv.org/pdf/2507.16585.pdf

Abstract:
Software vulnerabilities present a persistent security challenge, with over 25,000 new vulnerabilities reported in the Common Vulnerabilities and Exposures (CVE) database in 2024 alone. While deep learning based approaches show promise for vulnerability detection, recent studies reveal critical limitations in terms of accuracy and robustness: accuracy drops by up to 45% on rigorously verified datasets, and performance degrades significantly under simple code modifications. This paper presents LLMxCPG, a novel framework integrating Code Property Graphs (CPG) with Large Language Models (LLM) for robust vulnerability detection. Our CPG-based slice construction technique reduces code size by 67.84 to 90.93% while preserving vulnerability-relevant context. Our approach's ability to provide a more concise and accurate representation of code snippets enables the analysis of larger code segments, including entire projects. This concise representation is a key factor behind the improved detection capabilities of our method, as it can now identify vulnerabilities that span multiple functions. Empirical evaluation demonstrates LLMxCPG's effectiveness across verified datasets, achieving 15-40% improvements in F1-score over state-of-the-art baselines. Moreover, LLMxCPG maintains high performance across function-level and multi-function codebases while exhibiting robust detection efficacy under various syntactic code modifications.

Paperid: 1723, https://arxiv.org/pdf/2507.14839.pdf

Abstract:
With rapid advancements in quantum computing, it is widely believed that there will be quantum hardware capable of compromising classical cryptography and hence, the internet and the current information security infrastructure in the coming decade. This is mainly due to the operational realizations of quantum algorithms such as Grover and Shor, to which the current classical encryption protocols are vulnerable. Blockchains, i.e., blockchain data structures and their data, rely heavily on classical cryptography. One approach to secure blockchain is to attempt to achieve information theoretical security by defining blockchain on quantum technologies. There have been two conceptualizations of blockchains on quantum registers: the time-entangled Greenberger-Horne-Zeilinger (GHZ) state blockchain and the quantum hypergraph blockchain. On our part, an attempt is made to conceptualize a new quantum blockchain combining features of both these schemes to achieve the absolute security of the time-temporal GHZ blockchain and the scalability and efficiency of the quantum hypergraph blockchain in the proposed quantum blockchain protocol.

Paperid: 1724, https://arxiv.org/pdf/2507.14214.pdf

Abstract:
In modern times, people have numerous online accounts, but they rarely read the Terms of Service or Privacy Policy of those sites despite claiming otherwise. This paper introduces PoliAnalyzer, a neuro-symbolic system that assists users with personalized privacy policy analysis. PoliAnalyzer uses Natural Language Processing (NLP) to extract formal representations of data usage practices from policy texts. In favor of deterministic, logical inference is applied to compare user preferences with the formal privacy policy representation and produce a compliance report. To achieve this, we extend an existing formal Data Terms of Use policy language to model privacy policies as app policies and user preferences as data policies. In our evaluation using our enriched PolicyIE dataset curated by legal experts, PoliAnalyzer demonstrated high accuracy in identifying relevant data usage practices, achieving F1-score of 90-100% across most tasks. Additionally, we demonstrate how PoliAnalyzer can model diverse user data-sharing preferences, derived from prior research as 23 user profiles, and perform compliance analysis against the top 100 most-visited websites. This analysis revealed that, on average, 95.2% of a privacy policy's segments do not conflict with the analyzed user preferences, enabling users to concentrate on understanding the 4.8% (636 / 13205) that violates preferences, significantly reducing cognitive burden. Further, we identified common practices in privacy policies that violate user expectations - such as the sharing of location data with 3rd parties. This paper demonstrates that PoliAnalyzer can support automated personalized privacy policy analysis at scale using off-the-shelf NLP tools. This sheds light on a pathway to help individuals regain control over their data and encourage societal discussions on platform data practices to promote a fairer power dynamic.

Paperid: 1725, https://arxiv.org/pdf/2507.13639.pdf

Abstract:
We consider the problem of contextual kernel bandits with stochastic contexts, where the underlying reward function belongs to a known Reproducing Kernel Hilbert Space. We study this problem under an additional constraint of Differential Privacy, where the agent needs to ensure that the sequence of query points is differentially private with respect to both the sequence of contexts and rewards. We propose a novel algorithm that achieves the state-of-the-art cumulative regret of $\widetilde{\mathcal{O}}(\sqrt{Î³_TT}+\frac{Î³_T}{\varepsilon_{\mathrm{DP}}})$ and $\widetilde{\mathcal{O}}(\sqrt{Î³_TT}+\frac{Î³_T\sqrt{T}}{\varepsilon_{\mathrm{DP}}})$ over a time horizon of $T$ in the joint and local models of differential privacy, respectively, where $Î³_T$ is the effective dimension of the kernel and $\varepsilon_{\mathrm{DP}} > 0$ is the privacy parameter. The key ingredient of the proposed algorithm is a novel private kernel-ridge regression estimator which is based on a combination of private covariance estimation and private random projections. It offers a significantly reduced sensitivity compared to its classical counterpart while maintaining a high prediction accuracy, allowing our algorithm to achieve the state-of-the-art performance guarantees.

Paperid: 1726, https://arxiv.org/pdf/2507.12408.pdf

Abstract:
Non-local games are a powerful tool to distinguish between correlations possible in classical and quantum worlds. Kalai et al. (STOC'23) proposed a compiler that converts multipartite non-local games into interactive protocols with a single prover, relying on cryptographic tools to remove the assumption of physical separation of the players. While quantum completeness and classical soundness of the construction have been established for all multipartite games, quantum soundness is known only in the special case of bipartite games. In this paper, we prove that the Kalai et al.'s compiler indeed achieves quantum soundness for all multipartite compiled non-local games, by showing that any correlations that can be generated in the asymptotic case correspond to quantum commuting strategies. Our proof uses techniques from the theory of operator algebras, and relies on a characterisation of sequential operationally no-signalling strategies as quantum commuting operator strategies in the multipartite case, thereby generalising several previous results. On the way, we construct universal C*-algebras of sequential PVMs and prove a new chain rule for Radon-Nikodym derivatives of completely positive maps on C*-algebras which may be of independent interest.

Paperid: 1727, https://arxiv.org/pdf/2507.10457.pdf

Abstract:
The integration of large language models (LLMs) into enterprise systems has introduced a new class of covert security vulnerabilities, particularly within logic execution layers and persistent memory contexts. This paper introduces Logic-layer Prompt Control Injection (LPCI), a novel category of attacks that embeds encoded, delayed, and conditionally triggered payloads within memory, vector stores, or tool outputs. These payloads can bypass conventional input filters and trigger unauthorised behaviour across sessions.

Paperid: 1728, https://arxiv.org/pdf/2507.10103.pdf

Abstract:
Automated Program Repair (APR) is essential for ensuring software reliability and quality while enhancing efficiency and reducing developers' workload. Although rule-based and learning-based APR methods have demonstrated their effectiveness, their performance was constrained by the defect type of repair, the quality of training data, and the size of model parameters. Recently, Large Language Models (LLMs) combined with Retrieval-Augmented-Generation (RAG) have been increasingly adopted in APR tasks. However, current code LLMs and RAG designs neither fully address code repair tasks nor consider code-specific features. To overcome these limitations, we propose SelRepair, a novel APR approach with integration of a fine-tuned LLM with a newly-designed dual RAG module. This approach uses a bug-fix pair dataset for fine-tuning and incorporates semantic and syntactic/structural similarity information through an RAG selection gate. This design ensures relevant information is retrieved efficiently, thereby reducing token length and inference time. Evaluations on Java datasets show SelRepair outperforms other APR methods, achieving 26.29% and 17.64% in terms of exact match (EM) on different datasets while reducing inference time by at least 6.42% with controlled input lengths.

Paperid: 1729, https://arxiv.org/pdf/2507.07413.pdf

Abstract:
This paper presents a novel approach to intrusion detection by integrating traditional signature-based methods with the contextual understanding capabilities of the GPT-2 Large Language Model (LLM). As cyber threats become increasingly sophisticated, particularly in distributed, heterogeneous, and resource-constrained environments such as those enabled by the Internet of Things (IoT), the need for dynamic and adaptive Intrusion Detection Systems (IDSs) becomes increasingly urgent. While traditional methods remain effective for detecting known threats, they often fail to recognize new and evolving attack patterns. In contrast, GPT-2 excels at processing unstructured data and identifying complex semantic relationships, making it well-suited to uncovering subtle, zero-day attack vectors. We propose a hybrid IDS framework that merges the robustness of signature-based techniques with the adaptability of GPT-2-driven semantic analysis. Experimental evaluations on a representative intrusion dataset demonstrate that our model enhances detection accuracy by 6.3%, reduces false positives by 9.0%, and maintains near real-time responsiveness. These results affirm the potential of language model integration to build intelligent, scalable, and resilient cybersecurity defences suited for modern connected environments.

Paperid: 1730, https://arxiv.org/pdf/2507.06274.pdf

Abstract:
Watermarking is a promising defense against the misuse of large language models (LLMs), yet it remains vulnerable to scrubbing and spoofing attacks. This vulnerability stems from an inherent trade-off governed by watermark window size: smaller windows resist scrubbing better but are easier to reverse-engineer, enabling low-cost statistics-based spoofing attacks. This work breaks this trade-off by introducing a novel mechanism, equivalent texture keys, where multiple tokens within a watermark window can independently support the detection. Based on the redundancy, we propose a novel watermark scheme with Sub-vocabulary decomposed Equivalent tExture Key (SEEK). It achieves a Pareto improvement, increasing the resilience against scrubbing attacks without compromising robustness to spoofing. Experiments demonstrate SEEK's superiority over prior method, yielding spoofing robustness gains of +88.2%/+92.3%/+82.0% and scrubbing robustness gains of +10.2%/+6.4%/+24.6% across diverse dataset settings.

Paperid: 1731, https://arxiv.org/pdf/2507.06236.pdf

Abstract:
In the digital age, individuals increasingly maintain active presences across multiple platforms ranging from social media and messaging applications to professional and communication tools. However, the current model for managing user level privacy and abuse is siloed, requiring users to block undesirable contacts independently on each platform. This paper introduces Single Block On (SBO) a unified and interoperable system enabling users to block an individual once and have that block propagated across all integrated applications. SBO operates via identity based matching rules, utilizing configurable levels of identifier similarity, and interfaces with systems through standardized protocols such as SSO, LDAP, or direct REST integration. A novel Contact Rule Markup Language (CRML) facilitates consistent policy sharing across systems. The proposed solution increases user safety, enhances digital well-being, and sets a precedent for interoperable privacy enforcement.

Paperid: 1732, https://arxiv.org/pdf/2507.05556.pdf

Abstract:
Per-Row Activation Counting (PRAC), a DRAM read disturbance mitigation method, modifies key DRAM timing parameters, reportedly causing significant performance overheads in simulator-based studies. However, given known discrepancies between simulators and real hardware, real-machine experiments are vital for accurate PRAC performance estimation. We present the first real-machine performance analysis of PRAC. After verifying timing modifications on the latest CPUs using microbenchmarks, our analysis shows that PRAC's average and maximum overheads are just 1.06% and 3.28% for the SPEC CPU2017 workloads -- up to 9.15x lower than simulator-based reports. Further, we show that the close page policy minimizes this overhead by effectively hiding the elongated DRAM row precharge operations due to PRAC from the critical path.

Paperid: 1733, https://arxiv.org/pdf/2507.04501.pdf

Abstract:
We propose a public key encryption cryptosystem based on solutions of linear equation systems with predefinition of input parameters through shared secret computation for factorizable substitutions. The existence of multiple equivalent solutions for an underdetermined system of linear equations determines the impossibility of its resolution by a cryptanalyst in polynomial time. The completion of input parameters of the equation system is implemented through secret homomorphic matrix transformation for substitutions factorized over the basis of a vector space of dimension m over the field F2. Encryption is implemented through computation of substitutions that are one-way functions on an elementary abelian 2-group of order 2"m. Decryption is implemented through completion of input parameters of the equation system. Homomorphic transformations are constructed based on matrix computations. Matrix computations enable the implementation of high security and low computational overhead for homomorphic transformations.

Paperid: 1734, https://arxiv.org/pdf/2507.02964.pdf

Abstract:
While Large Language Models (LLMs) demonstrate exceptional natural language capabilities, general-purpose models lack specialized domain knowledge for effective cybersecurity analysis. In this work, we investigate Domain-Adaptive Continuous Pretraining (DAP) as a methodology for enhancing cybersecurity understanding in pretrained LLMs while preserving general language capabilities. We systematically adapted three decoder-based architectures -- Llama-3.1-8B, DeepSeek-R1-Distill-Qwen-14B, and Llama-3.3-70B-Instruct -- using a curated 126-million-word cybersecurity corpus from standards, academic literature, and various other sources. Our approach employed constrained training parameters and distributed FSDP training to balance domain specialization with knowledge preservation. Evaluation across three cybersecurity benchmarks, namely, CTI-MCQ, CyberMetric, and SecEval, demonstrates consistent improvements post-adaptation. The Llama-3.3-70B-Ins-DAP model achieved state-of-the-art accuracies of 0.718, 0.933, and 0.864, respectively, outperforming specialized models, including Llama-Primus-Base. Notably, competitive performance was achieved using substantially smaller datasets (118.8 million versus 2.77 billion tokens), demonstrating efficient domain specialization viability. We establish that targeted continuous pretraining enables effective cybersecurity domain adaptation with computational feasibility, providing foundations for specialized AI assistants in threat analysis, vulnerability assessment, and security documentation while challenging prevailing assumptions about data requirements for LLM specialization.

Paperid: 1735, https://arxiv.org/pdf/2507.02635.pdf

Abstract:
Electronic payment platforms are estimated to process billions oftransactions daily, with the cumulative value of these transactionspotentially reaching into the trillions. Even a minor error within thishigh-volume environment could precipitate substantial financiallosses. To mitigate this risk, manually constructed verification rules,developed by domain experts, are typically employed to identifyand scrutinize transactions in production environments. However,due to the absence of a systematic approach to ensure the robust-ness of these verification rules against vulnerabilities, they remainsusceptible to exploitation.To mitigate this risk, manually constructed verification rules, de-veloped by domain experts, are typically employed to identify andscrutinize transactions in production environments. However, dueto the absence of a systematic approach to ensure the robustness ofthese verification rules against vulnerabilities, they remain suscep-tible to exploitation. To ensure data security, database maintainersusually compose complex verification rules to check whether aquery/update request is valid. However, the rules written by ex-perts are usually imperfect, and malicious requests may bypassthese rules. As a result, the demand for identifying the defects ofthe rules systematically emerges.

Paperid: 1736, https://arxiv.org/pdf/2512.23785.pdf

Abstract:
Side-channel attacks try to extract secret information from a system by analyzing different side-channel signatures, such as power consumption, electromagnetic emanation, thermal dissipation, acoustics, time, etc. Power-based side-channel attack is one of the most prominent side-channel attacks in cybersecurity, which rely on data-dependent power variations in a system to extract sensitive information. While there are related surveys, they primarily focus on power side-channel attacks on cryptographic implementations. In recent years, power-side channel attacks have been explored in diverse application domains, including key extraction from cryptographic implementations, reverse engineering of machine learning models, user behavior data exploitation, and instruction-level disassembly. In this paper, we provide a comprehensive survey of power side-channel attacks and their countermeasures in different application domains. Specifically, this survey aims to classify recent power side-channel attacks and provide a comprehensive comparison based on application-specific considerations.

Paperid: 1737, https://arxiv.org/pdf/2512.23385.pdf

Abstract:
The rapid growth of Artificial Intelligence (AI) models and applications has led to an increasingly complex security landscape. Developers of AI projects must contend not only with traditional software supply chain issues but also with novel, AI-specific security threats. However, little is known about what security issues are commonly encountered and how they are resolved in practice. This gap hinders the development of effective security measures for each component of the AI supply chain. We bridge this gap by conducting an empirical investigation of developer-reported issues and solutions, based on discussions from Hugging Face and GitHub. To identify security-related discussions, we develop a pipeline that combines keyword matching with an optimal fine-tuned distilBERT classifier, which achieved the best performance in our extensive comparison of various deep learning and large language models. This pipeline produces a dataset of 312,868 security discussions, providing insights into the security reporting practices of AI applications and projects. We conduct a thematic analysis of 753 posts sampled from our dataset and uncover a fine-grained taxonomy of 32 security issues and 24 solutions across four themes: (1) System and Software, (2) External Tools and Ecosystem, (3) Model, and (4) Data. We reveal that many security issues arise from the complex dependencies and black-box nature of AI components. Notably, challenges related to Models and Data often lack concrete solutions. Our insights can offer evidence-based guidance for developers and researchers to address real-world security threats across the AI supply chain.

Paperid: 1738, https://arxiv.org/pdf/2512.22244.pdf

Abstract:
Autonomous vehicles rely on LiDAR based perception to support safety critical control functions such as adaptive cruise control and automatic emergency braking. While previous research has shown that LiDAR perception can be manipulated through object based spoofing and injection attacks, the impact of such attacks on vehicle safety controllers is still not well understood. This paper presents a systematic failure analysis of longitudinal safety controllers under object based LiDAR attacks in highway driving scenarios. The study focuses on realistic cut in and car following situations in which adversarial objects introduce persistent perception errors without directly modifying vehicle control software. A high fidelity simulation framework integrating LiDAR perception, object tracking, and closed loop vehicle control is used to evaluate how false and displaced object detections propagate through the perception planning and control pipeline. The results demonstrate that even short duration LiDAR induced object hallucinations can trigger unsafe braking, delayed responses to real hazards, and unstable control behavior. In cut in scenarios, a clear increase in unsafe deceleration events and time to collision violations is observed when compared to benign conditions, despite identical controller parameters. The analysis further shows that controller failures are more strongly influenced by the temporal consistency of spoofed objects than by spatial inaccuracies alone. These findings reveal a critical gap between perception robustness and control level safety guarantees in autonomous driving systems. By explicitly characterizing safety controller failure modes under adversarial perception, this work provides practical insights for the design of attack aware safety mechanisms and more resilient control strategies for LiDAR dependent autonomous vehicles.

Paperid: 1739, https://arxiv.org/pdf/2512.21374.pdf

Abstract:
Smart home IoT systems rely on authentication mechanisms to ensure that only authorized entities can control devices and access sensitive functionality. In practice, these mechanisms must balance security with usability, often favoring persistent connectivity and minimal user interaction. This paper presents an empirical analysis of authentication enforcement in deployed smart home IoT devices, focusing on how authentication state is established, reused, and validated during normal operation and under routine network conditions. A set of widely deployed consumer devices, including smart plugs, lighting devices, cameras, and a hub based ecosystem, was evaluated in a controlled residential environment using passive network measurement and controlled interaction through official mobile applications. Authentication behavior was examined during initial pairing, over extended periods of operation, after common network changes, and under replay attempts from a different local network host. The results show that authentication state established during pairing is consistently reused across control actions, persists for extended periods without explicit expiration, and remains valid after network events such as reconnection, address reassignment, and router reboot. Replay experiments demonstrate that previously observed authentication artifacts can often be reused to issue control commands from another host on the same local network with high success rates. These behaviors were observed across multiple device categories and ecosystems. The findings indicate that current smart home IoT authentication mechanisms rely on long lived trust relationships with limited binding to session freshness, network context, or controller identity.

Paperid: 1740, https://arxiv.org/pdf/2512.20405.pdf

Abstract:
Large Language Models (LLMs) like ChatGPT are now widely used in writing and reviewing scientific papers. While this trend accelerates publication growth and reduces human workload, it also introduces serious risks. Papers written or reviewed by LLMs may lack real novelty, contain fabricated or biased results, or mislead downstream research that others depend on. Such issues can damage reputations, waste resources, and even endanger lives when flawed studies influence medical or safety-critical systems. This research explores both the offensive and defensive sides of this growing threat. On the attack side, we demonstrate how an author can inject hidden prompts inside a PDF that secretly guide or "jailbreak" LLM reviewers into giving overly positive feedback and biased acceptance. On the defense side, we propose an "inject-and-detect" strategy for editors, where invisible trigger prompts are embedded into papers; if a review repeats or reacts to these triggers, it reveals that the review was generated by an LLM, not a human. This method turns prompt injections from vulnerability into a verification tool. We outline our design, expected model behaviors, and ethical safeguards for deployment. The goal is to expose how fragile today's peer-review process becomes under LLM influence and how editorial awareness can help restore trust in scientific evaluation.

Paperid: 1741, https://arxiv.org/pdf/2512.20077.pdf

Abstract:
Machine-learning (ML) classifiers are increasingly used in quantum computing systems to improve multi-qubit readout discrimination and to mitigate correlated readout errors. These ML classifiers are an integral component of today's quantum computer's control and readout stacks. This paper is the first to analyze the susceptibility of such ML classifiers to physical fault-injection which can result in generation of incorrect readout results from quantum computers. The study targets 5-qubit (thus 32-class) readout error-correction model. Using the ChipWhisperer Husky for physical voltage glitching, this work leverages an automated algorithm for scanning the fault injection parameter search space to find various successful faults in all the layers of the target ML model. Across repeated trials, this work finds that fault susceptibility is strongly layer-dependent: early-layers demonstrate higher rates of misprediction when faults are triggered in them, whereas later layers have smaller misprediction rates. This work further characterizes the resulting readout failures at the bitstring level using Hamming-distance and per-bit flip statistics, showing that single-shot glitches can induce structured readout corruption rather than purely random noise. These results motivate treating ML-based quantum computer readout and readout correction as a security-critical component of quantum systems and highlight the need for lightweight, deployment-friendly fault detection and redundancy mechanisms in the quantum computer readout pipelines.

Paperid: 1742, https://arxiv.org/pdf/2512.16292.pdf

Abstract:
Membership inference attacks (MIAs) pose a critical privacy threat to fine-tuned large language models (LLMs), especially when models are adapted to domain-specific tasks using sensitive data. While prior black-box MIA techniques rely on confidence scores or token likelihoods, these signals are often entangled with a sample's intrinsic properties - such as content difficulty or rarity - leading to poor generalization and low signal-to-noise ratios. In this paper, we propose ICP-MIA, a novel MIA framework grounded in the theory of training dynamics, particularly the phenomenon of diminishing returns during optimization. We introduce the Optimization Gap as a fundamental signal of membership: at convergence, member samples exhibit minimal remaining loss-reduction potential, while non-members retain significant potential for further optimization. To estimate this gap in a black-box setting, we propose In-Context Probing (ICP), a training-free method that simulates fine-tuning-like behavior via strategically constructed input contexts. We propose two probing strategies: reference-data-based (using semantically similar public samples) and self-perturbation (via masking or generation). Experiments on three tasks and multiple LLMs show that ICP-MIA significantly outperforms prior black-box MIAs, particularly at low false positive rates. We further analyze how reference data alignment, model type, PEFT configurations, and training schedules affect attack effectiveness. Our findings establish ICP-MIA as a practical and theoretically grounded framework for auditing privacy risks in deployed LLMs.

Paperid: 1743, https://arxiv.org/pdf/2512.16059.pdf

Abstract:
In-Context Learning (ICL) has become a standard technique for adapting Large Language Models (LLMs) to specialized tasks by supplying task-specific exemplars within the prompt. However, when these exemplars contain sensitive information, reliable privacy-preserving mechanisms are essential to prevent unintended leakage through model outputs. Many privacy-preserving methods are proposed to protect the information leakage in the context, but there are less efforts on how to audit those methods. We introduce ContextLeak, the first framework to empirically measure the worst-case information leakage in ICL. ContextLeak uses canary insertion, embedding uniquely identifiable tokens in exemplars and crafting targeted queries to detect their presence. We apply ContextLeak across a range of private ICL techniques, both heuristic such as prompt-based defenses and those with theoretical guarantees such as Embedding Space Aggregation and Report Noisy Max. We find that ContextLeak tightly correlates with the theoretical privacy budget ($ε$) and reliably detects leakage. Our results further reveal that existing methods often strike poor privacy-utility trade-offs, either leaking sensitive information or severely degrading performance.

Paperid: 1744, https://arxiv.org/pdf/2512.15387.pdf

Abstract:
Intelligent electronics are deeply embedded in critical infrastructures and must remain reliable, particularly against deliberate attacks. To minimize risks and impede remote compromise, sensitive systems can be physically isolated from external networks, forming an airgap. Yet, airgaps can still be infiltrated by capable adversaries gaining code execution. Prior research has shown that attackers can then attempt to wirelessly exfiltrate data across the airgap by exploiting unintended radio emissions. In this work, we demonstrate reversal of this link: malicious code execution on embedded devices can enable wireless infiltration of airgapped systems without any hardware modification. In contrast to previous infiltration methods that depend on dedicated sensors (e.g., microphones, LEDs, or temperature sensors) or require strict line-of-sight, we show that unmodified, sensor-less embedded devices can inadvertently act as radio receivers. This phenomenon stems from parasitic RF sensitivity in PCB traces and on-chip analog-to-digital converters (ADCs), allowing external transmissions to be received and decoded entirely in software. Across twelve commercially available embedded devices and two custom prototypes, we observe repeatable reception in the 300-1000 MHz range, with detectable signal power as low as 1 mW. To this end, we propose a systematic methodology to identify device configurations that foster such radio sensitivities and comprehensively evaluate their feasibility for wireless data reception. Exploiting these sensitivities, we demonstrate successful data reception over tens of meters, even in non-line-of-sight conditions and show that the reception sensitivities accommodate data rates of up to 100 kbps. Our findings reveal a previously unexplored command-and-control vector for air-gapped systems while challenging assumptions about their inherent isolation. [shortened]

Paperid: 1745, https://arxiv.org/pdf/2512.14139.pdf

Abstract:
HAL is an open-source framework for gate-level netlist analysis, an integral step in hardware reverse engineering. It provides analysts with an interactive GUI, an extensible plugin system, and APIs in both C++ and Python for rapid prototyping and automation. In addition, HAL ships with plugins for word-level modularization, cryptographic analysis, simulation, and graph-based exploration. Since its release in 2019, HAL has become widely adopted in academia, industry, government, and teaching. It underpins at least 23 academic publications, is taught in hands-on trainings, conference tutorials, and university classes, and has collected over 680 stars and 86 forks on GitHub. By enabling accessible and reproducible hardware reverse engineering research, HAL has significantly advanced the field and the understanding of real-world capabilities and threats.

Paperid: 1746, https://arxiv.org/pdf/2512.14130.pdf

Abstract:
We introduce UIXPOSE, a source-code-agnostic framework that operates on both compiled and open-source apps. This framework applies Intention Behaviour Alignment (IBA) to mobile malware analysis, aligning UI-inferred intent with runtime semantics. Previous work either infers intent statically, e.g., permission-centric, or widget-level or monitors coarse dynamic signals (endpoints, partial resource usage) that miss content and context. UIXPOSE infers an intent vector from each screen using vision-language models and knowledge structures and combines decoded network payloads, heap/memory signals, and resource utilisation traces into a behaviour vector. Their alignment, calculated at runtime, can both detect misbehaviour and highlight exploration of behaviourally rich paths. In three real-world case studies, UIXPOSE reveals covert exfiltration and hidden background activity that evade metadata-only baselines, demonstrating how IBA improves dynamic detection.

Paperid: 1747, https://arxiv.org/pdf/2512.13767.pdf

Abstract:
Cyber-physical systems (CPS) such as unmanned aerial vehicles are vulnerable to slow degradation that develops without causing immediate or obvious failures. Small sensor biases or timing irregularities can accumulate over time, gradually reducing stability while standard monitoring mechanisms continue to report normal operation. Detecting this early phase of degradation remains a challenge, as most existing approaches focus on abrupt faults or visible trajectory deviations. This paper introduces an early warning method based on stability drift, which measures the divergence between predicted and observed state transitions over short horizons. By tracking the gradual growth of this divergence, the proposed approach identifies emerging instability before it becomes visible in the flight trajectory or estimator residuals. The method operates externally to the flight stack and relies only on standard telemetry, making it suitable for deployment without modifying autopilot firmware. The approach was evaluated on a PX4 x500 platform in a software in the loop environment under two realistic degradation scenarios, gradual IMU bias drift and timing irregularities in the control loop. In both cases, the stability drift metric provided a consistent early warning signal several seconds before visible instability appeared, while remaining stable during nominal and aggressive but non degraded flight. The results demonstrate that stability drift can serve as a practical indicator of early degradation in UAV control systems. By providing advance notice during a pre instability phase, the proposed method complements existing safety mechanisms and offers additional time for mitigation or safe mode transitions under slow and subtle attacks.

Paperid: 1748, https://arxiv.org/pdf/2512.10998.pdf

Abstract:
Backdoor attacks create significant security threats to language models by embedding hidden triggers that manipulate model behavior during inference, presenting critical risks for AI systems deployed in healthcare and other sensitive domains. While existing defenses effectively counter obvious threats such as out-of-context trigger words and safety alignment violations, they fail against sophisticated attacks using contextually-appropriate triggers that blend seamlessly into natural language. This paper introduces three novel contextually-aware attack scenarios that exploit domain-specific knowledge and semantic plausibility: the ViralApp attack targeting social media addiction classification, the Fever attack manipulating medical diagnosis toward hypertension, and the Referral attack steering clinical recommendations. These attacks represent realistic threats where malicious actors exploit domain-specific vocabulary while maintaining semantic coherence, demonstrating how adversaries can weaponize contextual appropriateness to evade conventional detection methods. To counter both traditional and these sophisticated attacks, we present \textbf{SCOUT (Saliency-based Classification Of Untrusted Tokens)}, a novel defense framework that identifies backdoor triggers through token-level saliency analysis rather than traditional context-based detection methods. SCOUT constructs a saliency map by measuring how the removal of individual tokens affects the model's output logits for the target label, enabling detection of both conspicuous and subtle manipulation attempts. We evaluate SCOUT on established benchmark datasets (SST-2, IMDB, AG News) against conventional attacks (BadNet, AddSent, SynBkd, StyleBkd) and our novel attacks, demonstrating that SCOUT successfully detects these sophisticated threats while preserving accuracy on clean inputs.

Paperid: 1749, https://arxiv.org/pdf/2512.08918.pdf

Abstract:
Watermarks are an essential tool for identifying AI-generated content. Recently, Christ and Gunn (CRYPTO '24) introduced pseudorandom error-correcting codes (PRCs), which are equivalent to watermarks with strong robustness and quality guarantees. A PRC is a pseudorandom encryption scheme whose decryption algorithm tolerates a high rate of errors. Pseudorandomness ensures quality preservation of the watermark, and error tolerance of decryption translates to the watermark's ability to withstand modification of the content. In the short time since the introduction of PRCs, several works (NeurIPS '24, RANDOM '25, STOC '25) have proposed new constructions. Curiously, all of these constructions are vulnerable to quasipolynomial-time distinguishing attacks. Furthermore, all lack robustness to edits over a constant-sized alphabet, which is necessary for a meaningfully robust LLM watermark. Lastly, they lack robustness to adversaries who know the watermarking detection key. Until now, it was not clear whether any of these properties was achievable individually, let alone together. We construct pseudorandom codes that achieve all of the above: plausible subexponential pseudorandomness security, robustness to worst-case edits over a binary alphabet, and robustness against even computationally unbounded adversaries that have the detection key. Pseudorandomness rests on a new assumption that we formalize, the permuted codes conjecture, which states that a distribution of permuted noisy codewords is pseudorandom. We show that this conjecture is implied by the permuted puzzles conjecture used previously to construct doubly efficient private information retrieval. To give further evidence, we show that the conjecture holds against a broad class of simple distinguishers, including read-once branching programs.

Paperid: 1750, https://arxiv.org/pdf/2512.08869.pdf

Abstract:
The widespread use of big data across sectors has raised major privacy concerns, especially when sensitive information is shared or analyzed. Regulations such as GDPR and HIPAA impose strict controls on data handling, making it difficult to balance the need for insights with privacy requirements. Synthetic data offers a promising solution by creating artificial datasets that reflect real patterns without exposing sensitive information. However, traditional synthetic data methods often fail to capture complex, implicit rules that link different elements of the data and are essential in domains like healthcare. They may reproduce explicit patterns but overlook domain-specific constraints that are not directly stated yet crucial for realism and utility. For example, prescription guidelines that restrict certain medications for specific conditions or prevent harmful drug interactions may not appear explicitly in the original data. Synthetic data generated without these implicit rules can lead to medically inappropriate or unrealistic profiles. To address this gap, we propose ContextGAN, a Context-Aware Differentially Private Generative Adversarial Network that integrates domain-specific rules through a constraint matrix encoding both explicit and implicit knowledge. The constraint-aware discriminator evaluates synthetic data against these rules to ensure adherence to domain constraints, while differential privacy protects sensitive details from the original data. We validate ContextGAN across healthcare, security, and finance, showing that it produces high-quality synthetic data that respects domain rules and preserves privacy. Our results demonstrate that ContextGAN improves realism and utility by enforcing domain constraints, making it suitable for applications that require compliance with both explicit patterns and implicit rules under strict privacy guarantees.

Paperid: 1751, https://arxiv.org/pdf/2512.08862.pdf

Abstract:
Underground mining operations depend on sensor networks to monitor critical parameters such as temperature, gas concentration, and miner movement, enabling timely hazard detection and safety decisions. However, transmitting raw sensor data to a centralized server for machine learning (ML) model training raises serious privacy and security concerns. Federated Learning (FL) offers a promising alternative by enabling decentralized model training without exposing sensitive local data. Yet, applying FL in underground mining presents unique challenges: (i) Adversaries may eavesdrop on shared model updates to launch model inversion or membership inference attacks, compromising data privacy and operational safety; (ii) Non-IID data distributions across mines and sensor noise can hinder model convergence. To address these issues, we propose FedMining--a privacy-preserving FL framework tailored for underground mining. FedMining introduces two core innovations: (1) a Decentralized Functional Encryption (DFE) scheme that keeps local models encrypted, thwarting unauthorized access and inference attacks; and (2) a balancing aggregation mechanism to mitigate data heterogeneity and enhance convergence. Evaluations on real-world mining datasets demonstrate FedMining's ability to safeguard privacy while maintaining high model accuracy and achieving rapid convergence with reduced communication and computation overhead. These advantages make FedMining both secure and practical for real-time underground safety monitoring.

Paperid: 1752, https://arxiv.org/pdf/2512.08856.pdf

Abstract:
In the EU, the General Data Protection Regulation and the ePrivacy Directive mandate informed consent for behavioural advertising and use of tracking technologies. However, the ubiquity of consent banners and popups has led to widespread consent fatigue and questions regarding the effectiveness of these mechanisms in protecting users' data. In contrast, users in California and other US jurisdictions can utilize Global Privacy Control (GPC), a browser-based privacy signal that automatically broadcasts a legally binding opt-out request to websites. In this paper we explore whether, and to what extent, GPC can be adapted to the EU legal framework to mitigate consent fatigue and improve privacy protections for EU residents. We analyse GPC as a technical specification standardized at the World Wide Web Consortium and examine its standing under current EU data protection law. Generally, GPC can be mapped to the various legal bases for processing under the GDPR. However, our evaluation also identifies friction between the GPC specification and EU data protection law as it stands. These discrepancies are resolvable and present an opportunity for EU legislators and regulators to interpret GPC in alignment with EU data protection requirements, particularly, considering the European Commission's recent Digital Omnibus proposal. We conclude that while GPC is not a silver bullet, its adoption -- supported by clear authoritative guidance and specification updates -- can offer a pragmatic path toward more automated and effective data protection in the EU.

Paperid: 1753, https://arxiv.org/pdf/2512.08107.pdf

Abstract:
Adversaries (hackers) attempting to infiltrate networks frequently face uncertainty in their operational environments. This research explores the ability to model and detect when they exhibit ambiguity aversion, a cognitive bias reflecting a preference for known (versus unknown) probabilities. We introduce a novel methodological framework that (1) leverages rich, multi-modal data from human-subjects red-team experiments, (2) employs a large language model (LLM) pipeline to parse unstructured logs into MITRE ATT&CK-mapped action sequences, and (3) applies a new computational model to infer an attacker's ambiguity aversion level in near-real time. By operationalizing this cognitive trait, our work provides a foundational component for developing adaptive cognitive defense strategies.

Paperid: 1754, https://arxiv.org/pdf/2512.07228.pdf

Abstract:
DeepFake face swapping enables highly realistic identity forgeries, posing serious privacy and security risks. A common defence embeds invisible perturbations into images, but these are fragile and often destroyed by basic transformations such as compression or resizing. In this paper, we first conduct a systematic analysis of 30 transformations across six categories and show that protection robustness is highly sensitive to the choice of training transformations, making the standard Expectation over Transformation (EOT) with uniform sampling fundamentally suboptimal. Motivated by this, we propose Expectation Over Learned distribution of Transformation (EOLT), the framework to treat transformation distribution as a learnable component rather than a fixed design choice. Specifically, EOLT employs a policy network that learns to automatically prioritize critical transformations and adaptively generate instance-specific perturbations via reinforcement learning, enabling explicit modeling of defensive bottlenecks while maintaining broad transferability. Extensive experiments demonstrate that our method achieves substantial improvements over state-of-the-art approaches, with 26% higher average robustness and up to 30% gains on challenging transformation categories.

Paperid: 1755, https://arxiv.org/pdf/2512.07033.pdf

Abstract:
Managed TLS has become a common approach for deploying HTTPS, with platforms generating and storing private keys and automating certificate issuance on behalf of domain operators. This model simplifies operational management but shifts control of authentication material from the domain owner to the platform. The implications of this shift during provider transitions remain insufficiently examined. This study investigates how managed TLS platforms behave when a domain is moved away from the platform that originally issued and stored its certificate. A controlled measurement environment was used to monitor multiple platforms after migration. Each platform was observed for the full remaining lifetime of the certificate that had been active during delegation. The measurements show that platforms continue to serve the same certificate until it expires, even after DNS resolvers direct traffic toward new infrastructure. No platform revoked, replaced, or retired the certificate, and no new certificate was issued after delegation ended. Direct connections to the previous platform continued to complete TLS handshakes with the stale certificate, which confirms that authentication capability persisted independently of DNS state. These findings indicate that authentication authority remains with the previous platform for the entire lifetime of certificates issued during the delegation period. The gap between DNS control and control of authentication material introduces a window in which multiple environments can authenticate the same domain. As managed TLS adoption grows, clearer mechanisms for key retirement and certificate invalidation are needed to ensure that the authentication authority follows operational authority during transitions.

Paperid: 1756, https://arxiv.org/pdf/2512.01666.pdf

Abstract:
Machine learning (ML) has been widely used to analyze API call sequences in malware analysis, which typically requires the expertise of domain specialists to extract relevant features from raw data. The extracted features play a critical role in malware analysis. Traditional feature extraction is based on human domain knowledge, while there is a trend of using natural language processing (NLP) for automatic feature extraction. This raises a question: how do we effectively select features for malware analysis based on API call sequences? To answer it, this paper presents a comprehensive study of investigating the impact of feature engineering upon malware classification.We first conducted a comparative performance evaluation under three models, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Transformer, with respect to knowledge-based and NLP-based feature engineering methods. We observed that models with knowledge-based feature engineering inputs generally outperform those using NLP-based across all metrics, especially under smaller sample sizes. Then we analyzed a complete set of data features from API call sequences, our analysis reveals that models often focus on features such as handles and virtual addresses, which vary across executions and are difficult for human analysts to interpret.

Paperid: 1757, https://arxiv.org/pdf/2512.01396.pdf

Abstract:
Many modern software projects evolve rapidly to incorporate new features and security patches. It is important for users to update their dependencies to safer versions, but many still use older, vulnerable package versions because upgrading can be difficult and may break their existing codebase. Software developers can mitigate this problem by backporting security patches to older releases. However, manually backporting is time-consuming and error-prone. The effectiveness of existing automated backporting techniques on general software remains unclear since they typically target only code-hunk or function-level patch porting scenarios and are evaluated with imperfect metrics. To facilitate the development and evaluation of automated backporting techniques, we introduce BackportBench, the first comprehensive benchmark suite for patch backporting problem. BackportBench is a multilingual benchmark that contains 202 patch backporting problems from PyPI, Maven, and npm, each with executable Docker environments and relevant test cases. We evaluated existing patch porting methods and LLM-based techniques that have the potential to adapt to this task using BackportBench. The results show that the agentic method has outperformed traditional patch porting methods, especially on cases that require logical and structural changes. However, the performance varies across different programming languages. Based on the findings, we draw several implications for researchers and software practitioners in future work on automated backporting.

Paperid: 1758, https://arxiv.org/pdf/2512.00436.pdf

Abstract:
Tor is a widely used anonymity network that conceals user identities by routing traffic through encrypted relays, yet it remains vulnerable to traffic correlation attacks that deanonymize users by matching patterns in ingress and egress traffic. However, existing correlation methods suffer from two major limitations: limited robustness to noise and partial observations, and poor scalability due to computationally expensive pairwise matching. To address these challenges, we propose RECTor, a machine learning-based framework for traffic correlation under realistic conditions. RECTor employs attention-based Multiple Instance Learning (MIL) and GRU-based temporal encoding to extract robust flow representations, even when traffic data is incomplete or obfuscated. These embeddings are mapped into a shared space via a Siamese network and efficiently matched using approximate nearest neighbor (aNN) search. Empirical evaluations show that RECTor outperforms state-of-the-art baselines such as DeepCorr, DeepCOFFEA, and FlowTracker, achieving up to 60% higher true positive rates under high-noise conditions and reducing training and inference time by over 50%. Moreover, RECTor demonstrates strong scalability: inference cost grows near-linearly as the number of flows increases. These findings reveal critical vulnerabilities in Tor's anonymity model and highlight the need for advanced model-aware defenses.

Paperid: 1759, https://arxiv.org/pdf/2512.00094.pdf

Abstract:
Modern generative diffusion models rely on vast training datasets, often including images with uncertain ownership or usage rights. Radioactive watermarks -- marks that transfer to a model's outputs -- can help detect when such unauthorized data has been used for training. Moreover, aside from being radioactive, an effective watermark for protecting images from unauthorized training also needs to meet other existing requirements, such as imperceptibility, robustness, and multi-bit capacity. To overcome these challenges, we propose HMARK, a novel multi-bit watermarking scheme, which encodes ownership information as secret bits in the semantic-latent space (h-space) for image diffusion models. By leveraging the interpretability and semantic significance of h-space, ensuring that watermark signals correspond to meaningful semantic attributes, the watermarks embedded by HMARK exhibit radioactivity, robustness to distortions, and minimal impact on perceptual quality. Experimental results demonstrate that HMARK achieves 98.57% watermark detection accuracy, 95.07% bit-level recovery accuracy, 100% recall rate, and 1.0 AUC on images produced by the downstream adversarial model finetuned with LoRA on watermarked data across various types of distortions.

Paperid: 1760, https://arxiv.org/pdf/2511.22791.pdf

Abstract:
The rapid proliferation of unmanned aerial vehicles (UAVs) and their applications in diverse domains, such as surveillance, disaster management, agriculture, and defense, have revolutionized modern technology. While the potential benefits of swarm-based UAV networks are growing significantly, they are vulnerable to various security attacks that can jeopardize the overall mission success by degrading their performance, disrupting decision-making, and compromising the trajectory planning process. The Intrusion Detection System (IDS) plays a vital role in identifying potential security attacks to ensure the secure operation of UAV swarm networks. However, conventional IDS primarily focuses on binary classification with resource-intensive neural networks and faces challenges, including latency, privacy breaches, increased performance overhead, and model drift. This research aims to address these challenges by developing a novel lightweight and federated continuous learning-based IDS scheme. Our proposed model facilitates decentralized training across diverse UAV swarms to ensure data heterogeneity and privacy. The performance evaluation of our model demonstrates significant improvements, with classification accuracies of 99.45% on UKM-IDS, 99.99% on UAV-IDS, 96.85% on TLM-UAV dataset, and 98.05% on Cyber-Physical datasets.

Paperid: 1761, https://arxiv.org/pdf/2511.21803.pdf

Abstract:
5G Standalone deployments can exhibit uplink misbehavior from user equipment that remains fully compliant with standard control plane procedures. Manipulations such as transmit power inflation, gradual timing drift, and short off grant bursts leave the signaling state intact but distort the expected relationships among the telemetry streams produced by the gNB. This work examines whether these cross layer relationships can serve as a reliable basis for identifying such misbehavior without introducing new signaling. Using a controlled 5G Standalone testbed with commercial user equipment and a software defined radio adversarial device, we study how each manipulation affects the coherence among physical layer measurements, MAC scheduling decisions, and configuration metadata. The results show that every manipulation produces a distinct and reproducible signature that is not visible from any single telemetry source. Power offsets weaken the natural connection between SNR and CQI, timing drift breaks the alignment maintained by the scheduler, and off grant activity produces uplink energy that does not agree with allocation logs. These inconsistencies appear in merged multi layer time series traces and in cross domain views such as the SNR to CQI plane. The findings indicate that cross layer coherence provides a practical signal for detecting uplink misbehavior using only standard gNB telemetry, with no protocol modifications required, which makes the method suitable for integration into operational monitoring and auditing systems.

Paperid: 1762, https://arxiv.org/pdf/2511.20983.pdf

Abstract:
Collaborative machine learning across healthcare institutions promises improved diagnostic accuracy by leveraging diverse datasets, yet privacy regulations such as HIPAA prohibit direct patient data sharing. While federated learning (FL) enables decentralized training without raw data exchange, recent studies show that model gradients in conventional FL remain vulnerable to reconstruction attacks, potentially exposing sensitive medical information. This paper presents a privacy-preserving federated learning framework combining Vision Transformers (ViT) with homomorphic encryption (HE) for secure multi-institutional histopathology classification. The approach leverages the ViT CLS token as a compact 768-dimensional feature representation for secure aggregation, encrypting these tokens using CKKS homomorphic encryption before transmission to the server. We demonstrate that encrypting CLS tokens achieves a 30-fold communication reduction compared to gradient encryption while maintaining strong privacy guarantees. Through evaluation on a three-client federated setup for lung cancer histopathology classification, we show that gradients are highly susceptible to model inversion attacks (PSNR: 52.26 dB, SSIM: 0.999, NMI: 0.741), enabling near-perfect image reconstruction. In contrast, the proposed CLS-protected HE approach prevents such attacks while enabling encrypted inference directly on ciphertexts, requiring only 326 KB of encrypted data transmission per aggregation round. The framework achieves 96.12 percent global classification accuracy in the unencrypted domain and 90.02 percent in the encrypted domain.

Paperid: 1763, https://arxiv.org/pdf/2511.20922.pdf

Abstract:
Quantum machine learning (QML) promises compact and expressive representations, but suffers from the measurement bottleneck - a narrow quantum-to-classical readout that limits performance and amplifies privacy risk. We propose a lightweight residual hybrid architecture that concatenates quantum features with raw inputs before classification, bypassing the bottleneck without increasing quantum complexity. Experiments show our model outperforms pure quantum and prior hybrid models in both centralized and federated settings. It achieves up to +55% accuracy improvement over quantum baselines, while retaining low communication cost and enhanced privacy robustness. Ablation studies confirm the effectiveness of the residual connection at the quantum-classical interface. Our method offers a practical, near-term pathway for integrating quantum models into privacy-sensitive, resource-constrained settings like federated edge learning.

Paperid: 1764, https://arxiv.org/pdf/2511.20284.pdf

Abstract:
Precise access control decisions are crucial to the security of both traditional applications and emerging agent-based systems. Typically, these decisions are made by users during app installation or at runtime. Due to the increasing complexity and automation of systems, making these access control decisions can add a significant cognitive load on users, often overloading them and leading to suboptimal or even arbitrary access control decisions. To address this problem, we propose to leverage the processing and reasoning capabilities of large language models (LLMs) to make dynamic, context-aware decisions aligned with the user's security preferences. For this purpose, we conducted a user study, which resulted in a dataset of 307 natural-language privacy statements and 14,682 access control decisions made by users. We then compare these decisions against those made by two versions of LLMs: a general and a personalized one, for which we also gathered user feedback on 1,446 of its decisions. Our results show that in general, LLMs can reflect users' preferences well, achieving up to 86\% accuracy when compared to the decision made by the majority of users. Our study also reveals a crucial trade-off in personalizing such a system: while providing user-specific privacy preferences to the LLM generally improves agreement with individual user decisions, adhering to those preferences can also violate some security best practices. Based on our findings, we discuss design and risk considerations for implementing a practical natural-language-based access control system that balances personalization, security, and utility.

Paperid: 1765, https://arxiv.org/pdf/2511.20229.pdf

Abstract:
Nowadays, malware increasingly uses DNS-based covert channels in order to evade detection and maintain stealthy communication with its command-and-control servers. While prior work has focused on detecting such activity, identifying specific malware families and their behaviors from captured network traffic remains challenging due to the variability of DNS. In this paper, we present the first application of Locality Sensitive Hashing to the detection and identification of real-world malware utilizing DNS covert channels. Our approach encodes DNS subdomain sequences into statistical similarity features that effectively capture anomalies indicative of malicious activity. Combined with a Random Forest classifier, our method achieves higher accuracy and reduced false positive rates than prior approaches, while demonstrating improved robustness and generalization to previously unseen or modified malware samples. We further demonstrate that our approach enables reliable classification of malware behavior (e.g., uploading or downloading of files), based solely on DNS subdomains.

Paperid: 1766, https://arxiv.org/pdf/2511.19248.pdf

Abstract:
Test-time personalization in federated learning enables models at clients to adjust online to local domain shifts, enhancing robustness and personalization in deployment. Yet, existing federated learning work largely overlooks the security risks that arise when local adaptation occurs at test time. Heterogeneous domain arrivals, diverse adaptation algorithms, and limited cross-client visibility create vulnerabilities where compromised participants can craft poisoned inputs and submit adversarial updates that undermine both global and per-client performance. To address this threat, we introduce FedPoisonTTP, a realistic grey-box attack framework that explores test-time data poisoning in the federated adaptation setting. FedPoisonTTP distills a surrogate model from adversarial queries, synthesizes in-distribution poisons using feature-consistency, and optimizes attack objectives to generate high-entropy or class-confident poisons that evade common adaptation filters. These poisons are injected during local adaptation and spread through collaborative updates, leading to broad degradation. Extensive experiments on corrupted vision benchmarks show that compromised participants can substantially diminish overall test-time performance.

Paperid: 1767, https://arxiv.org/pdf/2511.18748.pdf

Abstract:
The digitalization of substations enlarges the cyber-attack surface, necessitating effective detection and mitigation of cyber attacks in digital substations. While machine learning-based intrusion detection has been widely explored, such methods have not demonstrated detection and mitigation within the required real-time budget. In contrast, cryptographic authentication has emerged as a practical candidate for real-time cyber defense, as specified in IEC 62351. In addition, lightweight rule-based intrusion detection that validates IEC 61850 semantics can provide specification-based detection of anomalous or malicious traffic with minimal processing delay. This paper presents the design logic and implementation aspects of three potential real-time mitigation techniques capable of countering GOOSE-based attacks: (i) IEC 62351-compliant message authentication code (MAC) scheme, (ii) a semantics-enforced rule-based intrusion detection system (IDS), and (iii) a hybrid approach integrating both MAC verification and Intrusion Detection System (IDS). A comparative evaluation of these real-time mitigation approaches is conducted using a cyber-physical system (CPS) security testbed. The results show that the hybrid integration significantly enhances mitigation capability. Furthermore, the processing delays of all three methods remain within the strict delivery requirements of GOOSE communication. The study also identifies limitations that none of the techniques can fully address, highlighting areas for future work.

Paperid: 1768, https://arxiv.org/pdf/2511.17959.pdf

Abstract:
As AI agents attempt to autonomously act on users' behalf, they raise transparency and control issues. We argue that permission-based access control is indispensable in providing meaningful control to the users, but conventional permission models are inadequate for the automated agentic execution paradigm. We therefore propose automated permission management for AI agents. Our key idea is to conduct a user study to identify the factors influencing users' permission decisions and to encode these factors into an ML-based permission management assistant capable of predicting users' future decisions. We find that participants' permission decisions are influenced by communication context but importantly individual preferences tend to remain consistent within contexts, and align with those of other participants. Leveraging these insights, we develop a permission prediction model achieving 85.1% accuracy overall and 94.4% for high-confidence predictions. We find that even without using permission history, our model achieves an accuracy of 66.9%, and a slight increase of training samples (i.e., 1-4) can substantially increase the accuracy by 10.8%.

Paperid: 1769, https://arxiv.org/pdf/2511.15990.pdf

Abstract:
Digital agriculture is transforming the way we grow food by utilizing technology to make farming more efficient, sustainable, and productive. This modern approach to agriculture generates a wealth of valuable data that could help address global food challenges, but farmers are hesitant to share it due to privacy concerns. This limits the extent to which researchers can learn from this data to inform improvements in farming. This paper presents the Digital Agriculture Sandbox, a secure online platform that solves this problem. The platform enables farmers (with limited technical resources) and researchers to collaborate on analyzing farm data without exposing private information. We employ specialized techniques such as federated learning, differential privacy, and data analysis methods to safeguard the data while maintaining its utility for research purposes. The system enables farmers to identify similar farmers in a simplified manner without needing extensive technical knowledge or access to computational resources. Similarly, it enables researchers to learn from the data and build helpful tools without the sensitive information ever leaving the farmer's system. This creates a safe space where farmers feel comfortable sharing data, allowing researchers to make important discoveries. Our platform helps bridge the gap between maintaining farm data privacy and utilizing that data to address critical food and farming challenges worldwide.

Paperid: 1770, https://arxiv.org/pdf/2511.14876.pdf

Abstract:
To autonomously control vehicles, driving agents use outputs from a combination of machine-learning (ML) models, controller logic, and custom modules. Although numerous prior works have shown that adversarial examples can mislead ML models used in autonomous driving contexts, it remains unclear if these attacks are effective at producing harmful driving actions for various agents, environments, and scenarios. To assess the risk of adversarial examples to autonomous driving, we evaluate attacks against a variety of driving agents, rather than against ML models in isolation. To support this evaluation, we leverage CARLA, an urban driving simulator, to create and evaluate adversarial examples. We create adversarial patches designed to stop or steer driving agents, stream them into the CARLA simulator at runtime, and evaluate them against agents from the CARLA Leaderboard, a public repository of best-performing autonomous driving agents from an annual research competition. Unlike prior work, we evaluate attacks against autonomous driving systems without creating or modifying any driving-agent code and against all parts of the agent included with the ML model. We perform a case-study investigation of two attack strategies against three open-source driving agents from the CARLA Leaderboard across multiple driving scenarios, lighting conditions, and locations. Interestingly, we show that, although some attacks can successfully mislead ML models into predicting erroneous stopping or steering commands, some driving agents use modules, such as PID control or GPS-based rules, that can overrule attacker-manipulated predictions from ML models.

Paperid: 1771, https://arxiv.org/pdf/2511.13939.pdf

Abstract:
Metasurfaces, or Reconfigurable Intelligent Surfaces (RISs), have emerged as a transformative technology for next-generation wireless systems, enabling digitally controlled manipulation of electromagnetic wave propagation. By turning the traditionally passive radio environment into a smart, programmable medium, metasurfaces promise advances in communication and sensing. However, metasurfaces also present a new security frontier: both attackers and defenders can exploit them to alter wireless propagation for their own advantage. While prior security research has primarily explored unilateral metasurface applications - empowering either attackers or defenders - this work investigates symmetric scenarios, where both sides possess comparable metasurface capabilities. Using both theoretical modeling and real-world experiments, we analyze how competing metasurfaces interact for diverse objectives, including signal power and sensing perception. Thereby, we present the first systematic study of context-agnostic metasurface-to-metasurface interactions and their implications for wireless security. Our results reveal that the outcome of metasurface "battles" depends on an interplay of timing, placement, algorithmic strategy, and hardware scale. Across multiple case studies in Wi-Fi environments, including wireless jamming, channel obfuscation for sensing and communication, and sensing spoofing, we demonstrate that opposing metasurfaces can substantially or fully negate each other's effects. By undermining previously proposed security and privacy schemes, our findings open new opportunities for designing resilient and high-assurance physical-layer systems in smart radio environments.

Paperid: 1772, https://arxiv.org/pdf/2511.13789.pdf

Abstract:
Backdoor attacks pose a serious threat to the security of large language models (LLMs), causing them to exhibit anomalous behavior under specific trigger conditions. The design of backdoor triggers has evolved from fixed triggers to dynamic or implicit triggers. This increased flexibility in trigger design makes it challenging for defenders to identify their specific forms accurately. Most existing backdoor defense methods are limited to specific types of triggers or rely on an additional clean model for support. To address this issue, we propose a backdoor detection method based on attention similarity, enabling backdoor detection without prior knowledge of the trigger. Our study reveals that models subjected to backdoor attacks exhibit unusually high similarity among attention heads when exposed to triggers. Based on this observation, we propose an attention safety alignment approach combined with head-wise fine-tuning to rectify potentially contaminated attention heads, thereby effectively mitigating the impact of backdoor attacks. Extensive experimental results demonstrate that our method significantly reduces the success rate of backdoor attacks while preserving the model's performance on downstream tasks.

Paperid: 1773, https://arxiv.org/pdf/2511.13333.pdf

Abstract:
Generating thorough natural language explanations for threat detections remains an open problem in cybersecurity research, despite significant advances in automated malware detection systems. In this work, we present AutoMalDesc, an automated static analysis summarization framework that, following initial training on a small set of expert-curated examples, operates independently at scale. This approach leverages an iterative self-paced learning pipeline to progressively enhance output quality through synthetic data generation and validation cycles, eliminating the need for extensive manual data annotation. Evaluation across 3,600 diverse samples in five scripting languages demonstrates statistically significant improvements between iterations, showing consistent gains in both summary quality and classification accuracy. Our comprehensive validation approach combines quantitative metrics based on established malware labels with qualitative assessment from both human experts and LLM-based judges, confirming both technical precision and linguistic coherence of generated summaries. To facilitate reproducibility and advance research in this domain, we publish our complete dataset of more than 100K script samples, including annotated seed (0.9K) and test (3.6K) datasets, along with our methodology and evaluation framework.

Paperid: 1774, https://arxiv.org/pdf/2511.12052.pdf

Abstract:
Steganography and steganalysis are strongly related subjects of information security. Over the past decade, many powerful and efficient artificial intelligence (AI) - driven techniques have been designed and presented during research into steganography as well as steganalysis. This study presents a scientometric analysis of AI-driven steganography-based data hiding techniques using a thematic modelling approach. A total of 654 articles within the time span of 2017 to 2023 have been considered. Experimental evaluation of the study reveals that 69% of published articles are from Asian countries. The China is on top (TP:312), followed by India (TP-114). The study mainly identifies seven thematic clusters: steganographic image data hiding, deep image steganalysis, neural watermark robustness, linguistic steganography models, speech steganalysis algorithms, covert communication networks, and video steganography techniques. The proposed study also assesses the scope of AI-steganography under the purview of sustainable development goals (SDGs) to present the interdisciplinary reciprocity between them. It has been observed that only 18 of the 654 articles are aligned with one of the SDGs, which shows that limited studies conducted in alignment with SDG goals. SDG9 which is Industry, Innovation, and Infrastructure is leading among 18 SDGs mapped articles. To the top of our insight, this study is the unique one to present a scientometric study on AI-driven steganography-based data hiding techniques. In the context of descriptive statistics, the study breaks down the underlying causes of observed trends, including the influence of DL developments, trends in East Asia and maturity of foundational methods. The work also stresses upon the critical gaps in societal alignment, particularly the SDGs, ultimately working on unveiling the field's global impact on AI security challenges.

Paperid: 1775, https://arxiv.org/pdf/2511.12048.pdf

Abstract:
Deepfakes are major threats to the integrity of digital media. We propose DeiTFake, a DeiT-based transformer and a novel two-stage progressive training strategy with increasing augmentation complexity. The approach applies an initial transfer-learning phase with standard augmentations followed by a fine-tuning phase using advanced affine and deepfake-specific augmentations. DeiT's knowledge distillation model captures subtle manipulation artifacts, increasing robustness of the detection model. Trained on the OpenForensics dataset (190,335 images), DeiTFake achieves 98.71\% accuracy after stage one and 99.22\% accuracy with an AUROC of 0.9997, after stage two, outperforming the latest OpenForensics baselines. We analyze augmentation impact and training schedules, and provide practical benchmarks for facial deepfake detection.

Paperid: 1776, https://arxiv.org/pdf/2511.11584.pdf

Abstract:
OpenAI (2025) showed that training against a chain of thought (CoT) monitor can cause obfuscated CoTs, which contain bad behavior the monitor cannot detect. They proposed to keep CoTs monitorable by training only against output monitors that do not have access to CoT. We show that such training can still cause obfuscated CoTs via two mechanisms. First, when a model is trained to produce a safe-looking output, that model may generalize to making its CoTs look safe. Second, since later tokens are conditioned on earlier ones, safe-looking CoTs may increase the likelihood of safe outputs, causing safe-looking CoTs to be reinforced. We introduce two mitigations to address these two issues, which achieve a Pareto improvement in terms of monitorability and task performance compared to regular training.

Paperid: 1777, https://arxiv.org/pdf/2511.11319.pdf

Abstract:
Rank aggregation is a task of combining the rankings of items from multiple users into a single ranking that best represents the users' rankings. Alabi et al. (AAAI'22) presents differentially-private (DP) polynomial-time approximation schemes (PTASes) and $5$-approximation algorithms with certain additive errors for the Kemeny rank aggregation problem in both central and local models. In this paper, we present improved DP PTASes with smaller additive error in the central model. Furthermore, we are first to study the footrule rank aggregation problem under DP. We give a near-optimal algorithm for this problem; as a corollary, this leads to 2-approximation algorithms with the same additive error as the $5$-approximation algorithms of Alabi et al. for the Kemeny rank aggregation problem in both central and local models.

Paperid: 1778, https://arxiv.org/pdf/2511.09316.pdf

Abstract:
We consider the problem of certified robustness for sequence classification against edit distance perturbations. Naturally occurring inputs of varying lengths (e.g., sentences in natural language processing tasks) present a challenge to current methods that employ fixed-rate deletion mechanisms and lead to suboptimal performance. To this end, we introduce AdaptDel methods with adaptable deletion rates that dynamically adjust based on input properties. We extend the theoretical framework of randomized smoothing to variable-rate deletion, ensuring sound certification with respect to edit distance. We achieve strong empirical results in natural language tasks, observing up to 30 orders of magnitude improvement to median cardinality of the certified region, over state-of-the-art certifications.

Paperid: 1779, https://arxiv.org/pdf/2511.07123.pdf

Abstract:
Federated learning (FL) enables multiple clients to jointly train a model by sharing only gradient updates for aggregation instead of raw data. Due to the transmission of very high-dimensional gradient updates from many clients, FL is known to suffer from a communication bottleneck. Meanwhile, the gradients shared by clients as well as the trained model may also be exploited for inferring private local datasets, making privacy still a critical concern in FL. We present Clover, a novel system framework for communication-efficient, secure, and differentially private FL. To tackle the communication bottleneck in FL, Clover follows a standard and commonly used approach-top-k gradient sparsification, where each client sparsifies its gradient update such that only k largest gradients (measured by magnitude) are preserved for aggregation. Clover provides a tailored mechanism built out of a trending distributed trust setting involving three servers, which allows to efficiently aggregate multiple sparse vectors (top-k sparsified gradient updates) into a dense vector while hiding the values and indices of non-zero elements in each sparse vector. This mechanism outperforms a baseline built on the general distributed ORAM technique by several orders of magnitude in server-side communication and runtime, with also smaller client communication cost. We further integrate this mechanism with a lightweight distributed noise generation mechanism to offer differential privacy (DP) guarantees on the trained model. To harden Clover with security against a malicious server, we devise a series of lightweight mechanisms for integrity checks on the server-side computation. Extensive experiments show that Clover can achieve utility comparable to vanilla FL with central DP, with promising performance.

Paperid: 1780, https://arxiv.org/pdf/2511.05359.pdf

Abstract:
As language models evolve into autonomous agents that act and communicate on behalf of users, ensuring safety in multi-agent ecosystems becomes a central challenge. Interactions between personal assistants and external service providers expose a core tension between utility and protection: effective collaboration requires information sharing, yet every exchange creates new attack surfaces. We introduce ConVerse, a dynamic benchmark for evaluating privacy and security risks in agent-agent interactions. ConVerse spans three practical domains (travel, real estate, insurance) with 12 user personas and over 864 contextually grounded attacks (611 privacy, 253 security). Unlike prior single-agent settings, it models autonomous, multi-turn agent-to-agent conversations where malicious requests are embedded within plausible discourse. Privacy is tested through a three-tier taxonomy assessing abstraction quality, while security attacks target tool use and preference manipulation. Evaluating seven state-of-the-art models reveals persistent vulnerabilities; privacy attacks succeed in up to 88% of cases and security breaches in up to 60%, with stronger models leaking more. By unifying privacy and security within interactive multi-agent contexts, ConVerse reframes safety as an emergent property of communication.

Paperid: 1781, https://arxiv.org/pdf/2511.04728.pdf

Abstract:
Phishing emails continue to pose a persistent challenge to online communication, exploiting human trust and evading automated filters through realistic language and adaptive tactics. While large language models (LLMs) such as GPT-4 and LLaMA-3-8B achieve strong accuracy in text classification, their deployment in security systems requires assessing reliability beyond benchmark performance. To address this, this study introduces the Trustworthiness Calibration Framework (TCF), a reproducible methodology for evaluating phishing detectors across three dimensions: calibration, consistency, and robustness. These components are integrated into a bounded index, the Trustworthiness Calibration Index (TCI), and complemented by the Cross-Dataset Stability (CDS) metric that quantifies stability of trustworthiness across datasets. Experiments conducted on five corpora, such as SecureMail 2025, Phishing Validation 2024, CSDMC2010, Enron-Spam, and Nazario, using DeBERTa-v3-base, LLaMA-3-8B, and GPT-4 demonstrate that GPT-4 achieves the strongest overall trust profile, followed by LLaMA-3-8B and DeBERTa-v3-base. Statistical analysis confirms that reliability varies independently of raw accuracy, underscoring the importance of trust-aware evaluation for real-world deployment. The proposed framework establishes a transparent and reproducible foundation for assessing model dependability in LLM-based phishing detection.

Paperid: 1782, https://arxiv.org/pdf/2511.04250.pdf

Abstract:
We study space-bounded communication complexity for unitary implementation in distributed quantum processors, where we restrict the number of qubits per processor to ensure practical relevance and technical non-triviality. We model distributed quantum processors using distributed quantum circuits with nonlocal two-qubit gates, defining the communication complexity of a unitary as the minimum number of such nonlocal gates required for its realization. Our contributions are twofold. First, for general $n$-qubit unitaries, we improve upon the trivial $O(4^n)$ communication bound. Considering $k$ pairwise-connected processors (each with $n/k$ data qubits and $m$ ancillas), we prove the communication complexity satisfies $O\left(\max\{4^{(1-1/k)n - m}, n\}\right)$--for example, $O(2^n)$ when $m=0$ and $k=2$--and establish the tightness of this upper bound. We further extend the analysis to approximation models and general network topologies. Second, for special unitaries, we show that both the Quantum Fourier Transform (QFT) and Clifford circuits admit linear upper bounds on communication complexity in the exact model, outperforming the trivial quadratic bounds applicable to these cases. In the approximation model, QFT's communication complexity reduces drastically from linear to logarithmic, while Clifford circuits retain a linear lower bound. These results offer fundamental insights for optimizing communication in distributed quantum unitary implementation, advancing the feasibility of large-scale distributed quantum computing (DQC) systems.

Paperid: 1783, https://arxiv.org/pdf/2511.04114.pdf

Abstract:
With the increasing frequency and sophistication of Distributed Denial of Service (DDoS) attacks, it has become critical to develop more efficient and interpretable detection methods. Traditional detection systems often struggle with scalability and transparency, hindering real-time response and understanding of attack vectors. This paper presents an automated framework for detecting and interpreting DDoS attacks using machine learning (ML). The proposed method leverages the Tree-based Pipeline Optimization Tool (TPOT) to automate the selection and optimization of ML models and features, reducing the need for manual experimentation. SHapley Additive exPlanations (SHAP) is incorporated to enhance model interpretability, providing detailed insights into the contribution of individual features to the detection process. By combining TPOT's automated pipeline selection with SHAP interpretability, this approach improves the accuracy and transparency of DDoS detection. Experimental results demonstrate that key features such as mean backward packet length and minimum forward packet header length are critical in detecting DDoS attacks, offering a scalable and explainable cybersecurity solution.

Paperid: 1784, https://arxiv.org/pdf/2511.02356.pdf

Abstract:
The widespread deployment of Large Language Models (LLMs) as public-facing web services and APIs has made their security a core concern for the web ecosystem. Jailbreak attacks, as one of the significant threats to LLMs, have recently attracted extensive research. In this paper, we reveal a jailbreak strategy which can effectively evade current defense strategies. It can extract valuable information from failed or partially successful attack attempts and contains self-evolution from attack interactions, resulting in sufficient strategy diversity and adaptability. Inspired by continuous learning and modular design principles, we propose ASTRA, a jailbreak framework that autonomously discovers, retrieves, and evolves attack strategies to achieve more efficient and adaptive attacks. To enable this autonomous evolution, we design a closed-loop "attack-evaluate-distill-reuse" core mechanism that not only generates attack prompts but also automatically distills and generalizes reusable attack strategies from every interaction. To systematically accumulate and apply this attack knowledge, we introduce a three-tier strategy library that categorizes strategies into Effective, Promising, and Ineffective based on their performance scores. The strategy library not only provides precise guidance for attack generation but also possesses exceptional extensibility and transferability. We conduct extensive experiments under a black-box setting, and the results show that ASTRA achieves an average Attack Success Rate (ASR) of 82.7%, significantly outperforming baselines.

Paperid: 1785, https://arxiv.org/pdf/2511.02176.pdf

Abstract:
Privacy-preserving biometric authentication (PPBA) enables client authentication without revealing sensitive biometric data, addressing privacy and security concerns. Many studies have proposed efficient cryptographic solutions to this problem based on secure multi-party computation, typically assuming a semi-honest adversary model, where all parties follow the protocol but may try to learn additional information. However, this assumption often falls short in real-world scenarios, where adversaries may behave maliciously and actively deviate from the protocol. In this paper, we propose, implement, and evaluate $\sysname$, a \underline{F}lexible and \underline{L}ightweight biometric \underline{A}uthentication scheme designed for a \underline{M}alicious \underline{E}nvironment. By hybridizing lightweight secret-sharing-family primitives within two-party computation, $\sysname$ carefully designs a line of supporting protocols that incorporate integrity checks with rationally extra overhead. Additionally, $\sysname$ enables server-side authentication with various similarity metrics through a cross-metric-compatible design, enhancing flexibility and robustness without requiring any changes to the server-side process. A rigorous theoretical analysis validates the correctness, security, and efficiency of $\sysname$. Extensive experiments highlight $\sysname$'s superior efficiency, with a communication reduction by {$97.61\times \sim 110.13\times$} and a speedup of {$ 2.72\times \sim 2.82\times$ (resp. $ 6.58\times \sim 8.51\times$)} in a LAN (resp. WAN) environment, when compared to the state-of-the-art work.

Paperid: 1786, https://arxiv.org/pdf/2511.01634.pdf

Abstract:
Large Language Models (LLMs) are increasingly used in intelligent systems that perform reasoning, summarization, and code generation. Their ability to follow natural-language instructions, while powerful, also makes them vulnerable to a new class of attacks known as prompt injection. In these attacks, hidden or malicious instructions are inserted into user inputs or external content, causing the model to ignore its intended task or produce unsafe responses. This study proposes a unified framework for evaluating how resistant Large Language Models (LLMs) are to prompt injection attacks. The framework defines three complementary metrics such as the Resilience Degradation Index (RDI), Safety Compliance Coefficient (SCC), and Instructional Integrity Metric (IIM) to jointly measure robustness, safety, and semantic stability. We evaluated four instruction-tuned models (GPT-4, GPT-4o, LLaMA-3 8B Instruct, and Flan-T5-Large) on five common language tasks: question answering, summarization, translation, reasoning, and code generation. Results show that GPT-4 performs best overall, while open-weight models remain more vulnerable. The findings highlight that strong alignment and safety tuning are more important for resilience than model size alone. Results show that all models remain partially vulnerable, especially to indirect and direct-override attacks. GPT-4 achieved the best overall resilience (RDR = 9.8 %, SCR = 96.4 %), while open-source models exhibited higher performance degradation and lower safety scores. The findings demonstrate that alignment strength and safety tuning play a greater role in resilience than model size alone. The proposed framework offers a structured, reproducible approach for assessing model robustness and provides practical insights for improving LLM safety and reliability.

Paperid: 1787, https://arxiv.org/pdf/2510.27675.pdf

Abstract:
Large language models (LLMs) have demonstrated impressive capabilities for many coding tasks, including summarization, translation, completion, and code generation. However, detecting code vulnerabilities remains a challenging task for LLMs. An effective way to improve LLM performance is in-context learning (ICL) - providing few-shot examples similar to the query, along with correct answers, can improve an LLM's ability to generate correct solutions. However, choosing the few-shot examples appropriately is crucial to improving model performance. In this paper, we explore two criteria for choosing few-shot examples for ICL used in the code vulnerability detection task. The first criterion considers if the LLM (consistently) makes a mistake or not on a sample with the intuition that LLM performance on a sample is informative about its usefulness as a few-shot example. The other criterion considers similarity of the examples with the program under query and chooses few-shot examples based on the $k$-nearest neighbors to the given sample. We perform evaluations to determine the benefits of these criteria individually as well as under various combinations, using open-source models on multiple datasets.

Paperid: 1788, https://arxiv.org/pdf/2510.27346.pdf

Abstract:
With the rise of location-based service (LBS) applications that rely on terrestrial and satellite infrastructures (e.g., GNSS and crowd-sourced Wi-Fi, Bluetooth, cellular, and IP databases) for positioning, ensuring their integrity and security is paramount. However, we demonstrate that these applications are susceptible to low-cost attacks (less than $50), including Wi-Fi spoofing combined with GNSS jamming, as well as more sophisticated coordinated location spoofing. These attacks manipulate position data to control or undermine LBS functionality, leading to user scams or service manipulation. Therefore, we propose a countermeasure to detect and thwart such attacks by utilizing readily available, redundant positioning information from off-the-shelf platforms. Our method extends the receiver autonomous integrity monitoring (RAIM) framework by incorporating opportunistic information, including data from onboard sensors and terrestrial infrastructure signals, and, naturally, GNSS. We theoretically show that the fusion of heterogeneous signals improves resilience against sophisticated adversaries on multiple fronts. Experimental evaluations show the effectiveness of the proposed scheme in improving detection accuracy by 62% at most compared to baseline schemes and restoring accurate positioning.

Paperid: 1789, https://arxiv.org/pdf/2510.26105.pdf

Abstract:
Despite the notable advancements and versatility of multi-modal diffusion models, such as text-to-image models, their susceptibility to adversarial inputs remains underexplored. Contrary to expectations, our investigations reveal that the alignment between textual and Image modalities in existing diffusion models is inadequate. This misalignment presents significant risks, especially in the generation of inappropriate or Not-Safe-For-Work (NSFW) content. To this end, we propose a novel attack called Prompt-Restricted Multi-modal Attack (PReMA) to manipulate the generated content by modifying the input image in conjunction with any specified prompt, without altering the prompt itself. PReMA is the first attack that manipulates model outputs by solely creating adversarial images, distinguishing itself from prior methods that primarily generate adversarial prompts to produce NSFW content. Consequently, PReMA poses a novel threat to the integrity of multi-modal diffusion models, particularly in image-editing applications that operate with fixed prompts. Comprehensive evaluations conducted on image inpainting and style transfer tasks across various models confirm the potent efficacy of PReMA.

Paperid: 1790, https://arxiv.org/pdf/2510.25746.pdf

Abstract:
Zero-concentrated differential privacy (zCDP) is a variant of differential privacy (DP) that is widely used partly thanks to its nice composition property. While a tight conversion from $ε$-DP to zCDP exists for the worst-case mechanism, many common algorithms satisfy stronger guarantees. In this work, we derive tight zCDP characterizations for several fundamental mechanisms. We prove that the tight zCDP bound for the $ε$-DP Laplace mechanism is exactly $ε+ e^{-ε} - 1$, confirming a recent conjecture by Wang (2022). We further provide tight bounds for the discrete Laplace mechanism, $k$-Randomized Response (for $k \leq 6$), and RAPPOR. Lastly, we also provide a tight zCDP bound for the worst case bounded range mechanism.

Paperid: 1791, https://arxiv.org/pdf/2510.25677.pdf

Abstract:
ZK-SenseLM is a secure and auditable wireless sensing framework that pairs a large-model encoder for Wi-Fi channel state information (and optionally mmWave radar or RFID) with a policy-grounded decision layer and end-to-end zero-knowledge proofs of inference. The encoder uses masked spectral pretraining with phase-consistency regularization, plus a light cross-modal alignment that ties RF features to compact, human-interpretable policy tokens. To reduce unsafe actions under distribution shift, we add a calibrated selective-abstention head; the chosen risk-coverage operating point is registered and bound into the proof. We implement a four-stage proving pipeline: (C1) feature sanity and commitment, (C2) threshold and version binding, (C3) time-window binding, and (C4) PLONK-style proofs that the quantized network, given the committed window, produced the logged action and confidence. Micro-batched proving amortizes cost across adjacent windows, and a gateway option offloads proofs from low-power devices. The system integrates with differentially private federated learning and on-device personalization without weakening verifiability: model hashes and the registered threshold are part of each public statement. Across activity, presence or intrusion, respiratory proxy, and RF fingerprinting tasks, ZK-SenseLM improves macro-F1 and calibration, yields favorable coverage-risk curves under perturbations, and rejects tamper and replay with compact proofs and fast verification.

Paperid: 1792, https://arxiv.org/pdf/2510.24393.pdf

Abstract:
Though playing an essential role in smart home systems, smart speakers are vulnerable to voice spoofing attacks. Passive liveness detection, which utilizes only the collected audio rather than the deployed sensors to distinguish between live-human and replayed voices, has drawn increasing attention. However, it faces the challenge of performance degradation under the different environmental factors as well as the strict requirement of the fixed user gestures. In this study, we propose a novel liveness feature, array fingerprint, which utilizes the microphone array inherently adopted by the smart speaker to determine the identity of collected audios. Our theoretical analysis demonstrates that by leveraging the circular layout of microphones, compared with existing schemes, array fingerprint achieves a more robust performance under the environmental change and user's movement. Then, to leverage such a fingerprint, we propose ARRAYID, a lightweight passive detection scheme, and elaborate a series of features working together with array fingerprint. Our evaluation on the dataset containing 32,780 audio samples and 14 spoofing devices shows that ARRAYID achieves an accuracy of 99.84%, which is superior to existing passive liveness detection schemes.

Paperid: 1793, https://arxiv.org/pdf/2510.23172.pdf

Abstract:
The Optimism derivation pipeline is engineered for correctness and liveness, not for succinct validity proofs. A straightforward port to a zkVM imposes significant overheads, making validity proofs significantly more costly than necessary. We systematically identify inefficiencies in the current design, analyze their impact on proving costs, and provide a soundness-preserving redesign tailored to zk proving. Our redesign achieves up to 6.5x faster derivation inside zkVMs (3.5x overall speedup) while maintaining identical safety guarantees.

Paperid: 1794, https://arxiv.org/pdf/2510.23060.pdf

Abstract:
Industrial control systems (ICS) form the operational backbone of critical infrastructure networks (CIN) such as power grids, water supply systems, and gas pipelines. As cyber threats to these systems escalate, regulatory agencies are imposing stricter compliance requirements to ensure system-wide security and reliability. A central challenge, however, is enabling regulators to verify the effectiveness of detection mechanisms without requiring utilities to disclose sensitive operational data. In this paper, we introduce zkSTAR, a cyberattack detection framework that leverages zk-SNARKs to reconcile these requirements and enable provable detection guarantees while preserving data confidentiality. Our approach builds on established residual-based statistical hypothesis testing methods applied to state-space detection models. Specifically, we design a two-pronged zk-SNARK architecture that enforces temporal consistency of the state-space dynamics and statistical consistency of the detection tests, allowing regulators to temporally verify alarm correctness without visibility into utility-level data. We formally analyze the soundness and zero knowledge properties of our framework and validate its practical feasibility through computational experiments on real-world ICS datasets. As a result, our work demonstrates a scalable, privacy-preserving alternative for regulatory compliance for ICS driven critical infrastructure networks.

Paperid: 1795, https://arxiv.org/pdf/2510.22024.pdf

Abstract:
Modern connected vehicles rely on persistent LTE connectivity to enable remote diagnostics, over-the-air (OTA) updates, and critical safety services. While mobile network vulnerabilities are well documented in the smartphone ecosystem, their impact in safety-critical automotive settings remains insufficiently examined. In this work, we conduct a black-box, non-invasive security analysis of LTE connectivity in Tesla vehicles, including the Model 3 and Cybertruck, revealing systemic protocol weaknesses and architectural misconfigurations. We find that Tesla's telematics stack is susceptible to IMSI catching, rogue base station hijacking, and insecure fallback mechanisms that may silently degrade service availability. Furthermore, legacy control-plane configurations allow for silent SMS injection and broadcast message spoofing without driver awareness. These vulnerabilities have implications beyond a single vendor as they challenge core assumptions in regulatory frameworks like ISO/SAE 21434 and UN R155/R156, which require secure, traceable, and resilient telematics for type approval of modern vehicles.

Paperid: 1796, https://arxiv.org/pdf/2510.21684.pdf

Abstract:
Large-scale systems that compute analytics over a fleet of devices must achieve high privacy and security standards while also meeting data quality, usability, and resource efficiency expectations. We present a next-generation federated analytics system that uses Trusted Execution Environments (TEEs) based on technologies like AMD SEV-SNP and Intel TDX to provide verifiable privacy guarantees for all server-side processing. In our system, devices encrypt and upload data, tagging it with a limited set of allowable server-side processing steps. An open source, TEE-hosted key management service guarantees that the data is accessible only to those steps, which are themselves protected by TEE confidentiality and integrity assurance guarantees. The system is designed for flexible workloads, including processing unstructured data with LLMs (for structured summarization) before aggregation into differentially private insights (with automatic parameter tuning). The transparency properties of our system allow any external party to verify that all raw and derived data is processed in TEEs, protecting it from inspection by the system operator, and that differential privacy is applied to all released results. This system has been successfully deployed in production, providing helpful insights into real-world GenAI experiences.

Paperid: 1797, https://arxiv.org/pdf/2510.20858.pdf

Abstract:
Operational technology (OT) networks are increasingly coupled with information technology (IT), expanding the attack surface and complicating incident response. Although OT standards emphasise incident reporting and evidence preservation, they do not specify what data to capture during an incident, which hinders coordination across stakeholders. In contrast, IT guidance defines reporting content but does not address OT constraints. This paper presents the Agnostic Incident Reporting (AIR) framework for live OT incident reporting. AIR comprises 25 elements organised into seven groups to capture incident context, chronology, impacts, and actions, tailored to technical, managerial, and regulatory needs. We evaluate AIR by mapping it to major OT standards, defining activation points for integration and triggering established OT frameworks, and then retrospectively applying it to the 2015 Ukrainian distribution grid incident. The evaluation indicates that AIR translates high-level requirements into concrete fields, overlays existing frameworks without vendor dependence, and can support situational awareness and communication during response. AIR offers a basis for standardising live OT incident reporting while supporting technical coordination and regulatory alignment.

Paperid: 1798, https://arxiv.org/pdf/2510.20856.pdf

Abstract:
Vision-Language Models (VLMs), such as CLIP, have demonstrated remarkable zero-shot generalizability across diverse downstream tasks. However, recent studies have revealed that VLMs, including CLIP, are highly vulnerable to adversarial attacks, particularly on their visual modality. Traditional methods for improving adversarial robustness, such as adversarial training, involve extensive retraining and can be computationally expensive. In this paper, we propose a new Test-Time defense: Feature Perception Threshold Counterattack Noise (FPT-Noise), which enhances the adversarial robustness of CLIP without costly fine-tuning. Our core contributions are threefold: First, we introduce a Dynamic Feature Modulator that dynamically generate an image-specific and attack-adaptive noise intensity parameter. Second, We reanalyzed the image features of CLIP. When images are exposed to different levels of noise, clean images and adversarial images exhibit distinct rates of feature change. We established a feature perception threshold to distinguish clean images from attacked ones. Finally, we integrate a Scene-Aware Regulation guided by a stability threshold and leverage Test-Time Transformation Ensembling (TTE) to further mitigate the impact of residual noise and enhance robustness.Extensive experimentation has demonstrated that FPT-Noise significantly outperforms existing Test-Time defense methods, boosting average robust accuracy from 0.07% to 56.86% under AutoAttack while maintaining high performance on clean images (-1.1%). The code will be made public following the publication of the study. The code will be made public following the publication of the study.

Paperid: 1799, https://arxiv.org/pdf/2510.19676.pdf

Abstract:
Large Language Models (LLMs) have achieved remarkable success in generative tasks, including register-transfer level (RTL) hardware synthesis. However, their tendency to memorize training data poses critical risks when proprietary or security-sensitive designs are unintentionally exposed during inference. While prior work has examined memorization in natural language, RTL introduces unique challenges: In RTL, structurally different implementations (e.g., behavioral vs. gate-level descriptions) can realize the same hardware, leading to intellectual property (IP) leakage (full or partial) even without verbatim overlap. Conversely, even small syntactic variations (e.g., operator precedence or blocking vs. non-blocking assignments) can drastically alter circuit behavior, making correctness preservation especially challenging. In this work, we systematically study memorization in RTL code generation and propose CircuitGuard, a defense strategy that balances leakage reduction with correctness preservation. CircuitGuard (1) introduces a novel RTL-aware similarity metric that captures both structural and functional equivalence beyond surface-level overlap, and (2) develops an activation-level steering method that identifies and attenuates transformer components most responsible for memorization. Our empirical evaluation demonstrates that CircuitGuard identifies (and isolates) 275 memorization-critical features across layers 18-28 of Llama 3.1-8B model, achieving up to 80% reduction in semantic similarity to proprietary patterns while maintaining generation quality. CircuitGuard further shows 78-85% cross-domain transfer effectiveness, enabling robust memorization mitigation across circuit categories without retraining.

Paperid: 1800, https://arxiv.org/pdf/2510.18572.pdf

Abstract:
The DNS infrastructure is infamous for facilitating reflective amplification attacks. Various countermeasures such as server shielding, access control, rate limiting, and protocol restrictions have been implemented. Still, the threat remains throughout the deployment of DNS servers. In this paper, we report on and evaluate the often unnoticed threat that derives from transparent DNS forwarders, a widely deployed, incompletely functional set of DNS components. Transparent DNS forwarders transfer DNS requests without rebuilding packets with correct source addresses. As such, transparent forwarders feed DNS requests into (mainly powerful and anycasted) open recursive resolvers, which thereby can be misused to participate unwillingly in distributed reflective amplification attacks. We show how transparent forwarders raise severe threats to the Internet infrastructure. They easily circumvent rate limiting and achieve an additional, scalable impact via the DNS anycast infrastructure. We empirically verify this scaling behavior up to a factor of 14. Transparent forwarders can also assist in bypassing firewall rules that protect recursive resolvers, making these shielded infrastructure entities part of the global DNS attack surface.

Paperid: 1801, https://arxiv.org/pdf/2510.18084.pdf

Abstract:
The integration of Unmanned Aerial Vehicles (UAVs) into Open Radio Access Networks (O-RAN) enhances communication in disaster management and Search and Rescue (SAR) operations by ensuring connectivity when infrastructure fails. However, SAR scenarios demand stringent security and low-latency communication, as delays or breaches can compromise mission success. While UAVs serve as mobile relays, they introduce challenges in energy consumption and resource management, necessitating intelligent allocation strategies. Existing UAV-assisted O-RAN approaches often overlook the joint optimization of security, latency, and energy efficiency in dynamic environments. This paper proposes a novel Reinforcement Learning (RL)-based framework for dynamic resource allocation in UAV relays, explicitly addressing these trade-offs. Our approach formulates an optimization problem that integrates security-aware resource allocation, latency minimization, and energy efficiency, which is solved using RL. Unlike heuristic or static methods, our framework adapts in real-time to network dynamics, ensuring robust communication. Simulations demonstrate superior performance compared to heuristic baselines, achieving enhanced security and energy efficiency while maintaining ultra-low latency in SAR scenarios.

Paperid: 1802, https://arxiv.org/pdf/2510.17277.pdf

Abstract:
Multimodal large language models (MLLMs) have demonstrated significant utility across diverse real-world applications. But MLLMs remain vulnerable to jailbreaks, where adversarial inputs can collapse their safety constraints and trigger unethical responses. In this work, we investigate jailbreaks in the text-vision multimodal setting and pioneer the observation that visual alignment imposes uneven safety constraints across modalities in MLLMs, thereby giving rise to multimodal safety asymmetry. We then develop PolyJailbreak, a black-box jailbreak method grounded in reinforcement learning. Initially, we probe the model's attention dynamics and latent representation space, assessing how visual inputs reshape cross-modal information flow and diminish the model's ability to separate harmful from benign inputs, thereby exposing exploitable vulnerabilities. On this basis, we systematize them into generalizable and reusable operational rules that constitute a structured library of Atomic Strategy Primitives, which translate harmful intents into jailbreak inputs through step-wise transformations. Guided by the primitives, PolyJailbreak employs a multi-agent optimization process that automatically adapts inputs against the target models. We conduct comprehensive evaluations on a variety of open-source and closed-source MLLMs, demonstrating that PolyJailbreak outperforms state-of-the-art baselines.

Paperid: 1803, https://arxiv.org/pdf/2510.17175.pdf

Abstract:
Globally, individuals and organizations employ Quick Response (QR) codes for swift and convenient communication. Leveraging this, cybercriminals embed falsify and misleading information in QR codes to launch various phishing attacks which termed as Quishing. Many former studies have introduced defensive approaches to preclude Quishing such as by classifying the embedded content of QR codes and then label the QR codes accordingly, whereas other studies classify them using visual features (i.e., deep features, histogram density analysis features). However, these approaches mainly rely on black-box techniques which do not clearly provide interpretability and transparency to fully comprehend and reproduce the intrinsic decision process; therefore, having certain obvious limitations includes the approaches' trust, accountability, issues in bias detection, and many more. We proposed QRïS, the pioneer method to classify QR codes through the comprehensive structural analysis of a QR code which helps to identify phishing QR codes beforehand. Our classification method is clearly transparent which makes it reproducible, scalable, and easy to comprehend. First, we generated QR codes dataset (i.e. 400,000 samples) using recently published URLs datasets [1], [2]. Then, unlike black-box models, we developed a simple algorithm to extract 24 structural features from layout patterns present in QR codes. Later, we train the machine learning models on the harvested features and obtained accuracy of up to 83.18%. To further evaluate the effectiveness of our approach, we perform the comparative analysis of proposed method with relevant contemporary studies. Lastly, for real-world deployment and validation, we developed a mobile app which assures the feasibility of the proposed solution in real-world scenarios which eventually strengthen the applicability of the study.

Paperid: 1804, https://arxiv.org/pdf/2510.16871.pdf

Abstract:
In the main text published at USENIX Security 2025, we presented a systematic analysis of the role of cache occupancy in the design considerations for randomized caches (from the perspectives of performance and security). On the performance front, we presented a uniform benchmarking strategy that allows for a fair comparison among different randomized cache designs. Likewise, from the security perspective, we presented three threat assumptions: (1) covert channels; (2) process fingerprinting side-channel; and (3) AES key recovery. The main takeaway of our work is an open problem of designing a randomized cache of comparable efficiency with modern set-associative LLCs, while still resisting both contention-based and occupancy-based attacks. This note is meant as an addendum to the main text in light of the observations made in [2]. To summarize, the authors in [2] argue that (1) L1d cache size plays a role in adversarial success, and that (2) a patched version of MIRAGE with randomized initial seeding of global eviction map prevents leakage of AES key. We discuss the same in this addendum.

Paperid: 1805, https://arxiv.org/pdf/2510.16706.pdf

Abstract:
Feature embedding has become a cornerstone technology for processing high-dimensional and complex data, which results in that Embedding as a Service (EaaS) models have been widely deployed in the cloud. To protect the intellectual property of EaaS models, existing methods apply digital watermarking to inject specific backdoor triggers into EaaS models by modifying training samples or network parameters. However, these methods inevitably produce detectable patterns through semantic analysis and exhibit susceptibility to geometric transformations including rotation, scaling, and translation (RST). To address this problem, we propose a fingerprinting framework for EaaS models, rather than merely refining existing watermarking techniques. Different from watermarking techniques, the proposed method establishes EaaS model ownership through geometric analysis of embedding space's topological structure, rather than relying on the modified training samples or triggers. The key innovation lies in modeling the victim and suspicious embeddings as point clouds, allowing us to perform robust spatial alignment and similarity measurement, which inherently resists RST attacks. Experimental results evaluated on visual and textual embedding tasks verify the superiority and applicability. This research reveals inherent characteristics of EaaS models and provides a promising solution for ownership verification of EaaS models under the black-box scenario.

Paperid: 1806, https://arxiv.org/pdf/2510.16440.pdf

Abstract:
This report presents the winning solution for Task 1 of Colliding with Adversaries: A Challenge on Robust Learning in High Energy Physics Discovery at ECML-PKDD 2025. The task required designing an adversarial attack against a provided classification model that maximizes misclassification while minimizing perturbations. Our approach employs a multi-round gradient-based strategy that leverages the differentiable structure of the model, augmented with random initialization and sample-mixing techniques to enhance effectiveness. The resulting attack achieved the best results in perturbation size and fooling success rate, securing first place in the competition.

Paperid: 1807, https://arxiv.org/pdf/2510.16128.pdf

Abstract:
Generative AI risks such as bias and lack of representation impact people who do not interact directly with GAI systems, but whose content does: indirect users. Several approaches to mitigating harms to indirect users have been described, but most require top down or external intervention. An emerging strategy, prompt injections, provides an empowering alternative: indirect users can mitigate harm against them, from within their own content. Our approach proposes prompt injections not as a malicious attack vector, but as a tool for content/image owner resistance. In this poster, we demonstrate one case study of prompt injections for empowering an indirect user, by retaining an image owner's gender and disabled identity when an image is described by GAI.

Paperid: 1808, https://arxiv.org/pdf/2510.15690.pdf

Abstract:
Deep learning (DL) frameworks serve as the backbone for a wide range of artificial intelligence applications. However, bugs within DL frameworks can cascade into critical issues in higher-level applications, jeopardizing reliability and security. While numerous techniques have been proposed to detect bugs in DL frameworks, research exploring common API patterns across frameworks and the potential risks they entail remains limited. Notably, many DL frameworks expose similar APIs with overlapping input parameters and functionalities, rendering them vulnerable to shared bugs, where a flaw in one API may extend to analogous APIs in other frameworks. To address this challenge, we propose MirrorFuzz, an automated API fuzzing solution to discover shared bugs in DL frameworks. MirrorFuzz operates in three stages: First, MirrorFuzz collects historical bug data for each API within a DL framework to identify potentially buggy APIs. Second, it matches each buggy API in a specific framework with similar APIs within and across other DL frameworks. Third, it employs large language models (LLMs) to synthesize code for the API under test, leveraging the historical bug data of similar APIs to trigger analogous bugs across APIs. We implement MirrorFuzz and evaluate it on four popular DL frameworks (TensorFlow, PyTorch, OneFlow, and Jittor). Extensive evaluation demonstrates that MirrorFuzz improves code coverage by 39.92\% and 98.20\% compared to state-of-the-art methods on TensorFlow and PyTorch, respectively. Moreover, MirrorFuzz discovers 315 bugs, 262 of which are newly found, and 80 bugs are fixed, with 52 of these bugs assigned CNVD IDs.

Paperid: 1809, https://arxiv.org/pdf/2510.15380.pdf

Abstract:
Beyond its widespread application in signal and image processing, \emph{compressed sensing} principles have been greatly applied to secure information transmission (often termed 'compressive security'). In this scenario, the measurement matrix $Q$ acts as a one time pad encryption key (in complex number domain) which can achieve perfect information-theoretic security together with other benefits such as reduced complexity and energy efficiency particularly useful in IoT. However, unless the matrix is changed for every message it is vulnerable towards known plain text attacks: only $n$ observations suffices to recover a key $Q$ with $n$ columns. In this paper, we invent and analyze a new method (termed 'Bilinear Compressive Security (BCS)') addressing these shortcomings: In addition to the linear encoding of the message $x$ with a matrix $Q$, the sender convolves the resulting vector with a randomly generated filter $h$. Assuming that $h$ and $x$ are sparse, the receiver can then recover $x$ without knowledge of $h$ from $y=h*Qx$ through blind deconvolution. We study a rather idealized known plaintext attack for recovering $Q$ from repeated observations of $y$'s for different, known $x_k$, with varying and unknown $h$ ,giving Eve a number of advantages not present in practice. Our main result for BCS states that under a weak symmetry condition on the filter $h$, recovering $Q$ will require extensive sampling from transmissions of $Ω\left(\max\left(n,(n/s)^2\right)\right)$ messages $x_k$ if they are $s$-sparse. Remarkably, with $s=1$ it is impossible to recover the key. In this way, the scheme is much safer than standard compressed sensing even though our assumptions are much in favor towards a potential attacker.

Paperid: 1810, https://arxiv.org/pdf/2510.15188.pdf

Abstract:
Advanced Persistent Threats (APTs) are stealthy cyberattacks that often evade detection in system-level audit logs. Provenance graphs model these logs as connected entities and events, revealing relationships that are missed by linear log representations. Existing systems apply anomaly detection to these graphs but often suffer from high false positive rates and coarse-grained alerts. Their reliance on node attributes like file paths or IPs leads to spurious correlations, reducing detection robustness and reliability. To fully understand an attack's progression and impact, security analysts need systems that can generate accurate, human-like narratives of the entire attack. To address these challenges, we introduce OCR-APT, a system for APT detection and reconstruction of human-like attack stories. OCR-APT uses Graph Neural Networks (GNNs) for subgraph anomaly detection, learning behavior patterns around nodes rather than fragile attributes such as file paths or IPs. This approach leads to a more robust anomaly detection. It then iterates over detected subgraphs using Large Language Models (LLMs) to reconstruct multi-stage attack stories. Each stage is validated before proceeding, reducing hallucinations and ensuring an interpretable final report. Our evaluations on the DARPA TC3, OpTC, and NODLINK datasets show that OCR-APT outperforms state-of-the-art systems in both detection accuracy and alert interpretability. Moreover, OCR-APT reconstructs human-like reports that comprehensively capture the attack story.

Paperid: 1811, https://arxiv.org/pdf/2510.14900.pdf

Abstract:
The Enterprise Intelligence Platform must integrate logs from numerous third-party vendors in order to perform various downstream tasks. However, vendor documentation is often unavailable at test time. It is either misplaced, mismatched, poorly formatted, or incomplete, which makes schema mapping challenging. We introduce a reinforcement learning agent that can self-improve without labeled examples or model weight updates. During inference, the agent: 1) Identifies ambiguous field-mapping attempts. 2) Generates targeted web-search queries to gather external evidence. 3) Applies a confidence-based reward to iteratively refine its mappings. To demonstrate this concept, we converted Microsoft Defender for Endpoint logs into a common schema. Our method increased mapping accuracy from 56.4\%(LLM-only) to 72.73\%(RAG) to 93.94\% over 100 iterations using GPT-4o. At the same time, it reduced the number of low-confidence mappings requiring expert review by 85\%. This new approach provides an evidence-driven, transparent method for solving future industry problems, paving the way for more robust, accountable, scalable, efficient, flexible, adaptable, and collaborative solutions.

Paperid: 1812, https://arxiv.org/pdf/2510.14638.pdf

Abstract:
According to a recent EUROPOL report, cybercrime is still recurrent in Europe, and different activities and countermeasures must be taken to limit, prevent, detect, analyze, and fight it. Cybercrime must be prevented with specific measures, tools, and techniques, for example through automated network and malware analysis. Countermeasures against cybercrime can also be improved with proper \df analysis in order to extract data from digital devices trying to retrieve information on the cybercriminals. Indeed, results obtained through a proper \df analysis can be leveraged to train cybercrime detection systems to prevent the success of similar crimes. Nowadays, some systems have started to adopt Artificial Intelligence (AI) algorithms for cyberattack detection and \df analysis improvement. However, AI can be better applied as an additional instrument in these systems to improve the detection and in the \df analysis. For this reason, we highlight how cybercrime analysis and \df procedures can take advantage of AI. On the other hand, cybercriminals can use these systems to improve their skills, bypass automatic detection, and develop advanced attack techniques. The case study we presented highlights how it is possible to integrate the use of the three popular chatbots {\tt Gemini}, {\tt Copilot} and {\tt chatGPT} to develop a Python code to encode and decoded images with steganographic technique, even though their presence is not an indicator of crime, attack or maliciousness but used by a cybercriminal as anti-forensics technique.

Paperid: 1813, https://arxiv.org/pdf/2510.14522.pdf

Abstract:
Software supply-chain attacks are an important and ongoing concern in the open source software ecosystem. These attacks maintain the standard functionality that a component implements, but additionally hide malicious functionality activated only when the component reaches its target environment. Lexo addresses such stealthy attacks by automatically learning and regenerating vulnerability-free versions of potentially malicious components. Lexo first generates a set of input-output pairs to model a component's full observable behavior, which it then uses to synthesize a new version of the original component. The new component implements the original functionality but avoids stealthy malicious behavior. Throughout this regeneration process, Lexo consults several distinct instances of Large Language Models (LLMs), uses correctness and coverage metrics to shepherd these instances, and guardrails their results. Our evaluation on 100+ real-world packages, including high profile stealthy supply-chain attacks, indicates that Lexo scales across multiple domains, regenerates code efficiently (<100s on average), maintains compatibility, and succeeds in eliminating malicious code in several real-world supply-chain-attacks, even in cases when a state-of-the-art LLM fails to eliminate malicious code when prompted to do so.

Paperid: 1814, https://arxiv.org/pdf/2510.14198.pdf

Abstract:
Toll scams involve criminals registering fake domains that pretend to be legitimate transportation agencies to trick users into making fraudulent payments. Although these scams are rapidly increasing and causing significant harm, they have not been extensively studied. We present the first large-scale analysis of toll scam domains, using a newly created dataset of 67,907 confirmed scam domains mostly registered in 2025. Our study reveals that attackers exploit permissive registrars and less common top-level domains, with 86.9% of domains concentrated in just five non-mainstream TLDs and 72.9% registered via a single provider. We also discover specific registration patterns, including short bursts of activity that suggest automated, coordinated attacks, with over half of domains registered in the first quarter of 2025. This extreme temporal clustering reflects highly synchronized campaign launches. Additionally, we build a simple predictive model using only domain registration data to predict which scam domains are likely to be suspended -- a proxy for confirmed abuse -- achieving 80.4% accuracy, and 92.3% sensitivity. Our analysis reveals attacker strategies for evading detection -- such as exploiting obscure TLDs, permissive registrars, and coordinated registration bursts -- which can inform more targeted interventions by registrars, hosting providers, and security platforms. However, our results suggest that registration metadata alone may be insufficient, and incorporating features from domain URLs and webpage content could further improve detection.

Paperid: 1815, https://arxiv.org/pdf/2510.14043.pdf

Abstract:
Power grids increasingly need real-time situational awareness under the ever-evolving cyberthreat landscape. Advances in snapshot-based system identification approaches have enabled accurately estimating states and topology from a snapshot of measurement data, under random bad data and topology errors. However, modern interactive, targeted false data can stay undetectable to these methods, and significantly compromise estimation accuracy. This work advances system identification that combines snapshot-based method with time-series model via Bayesian Integration, to advance cyber resiliency against both random and targeted false data. Using a distance-based time-series model, this work can leverage historical data of different distributions induced by changes in grid topology and other settings. The normal system behavior captured from historical data is integrated into system identification through a Bayesian treatment, to make solutions robust to targeted false data. We experiment on mixed random anomalies (bad data, topology error) and targeted false data injection attack (FDIA) to demonstrate our method's 1) cyber resilience: achieving over 70% reduction in estimation error under FDIA; 2) anomalous data identification: being able to alarm and locate anomalous data; 3) almost linear scalability: achieving comparable speed with the snapshot-based baseline, both taking <1min per time tick on the large 2,383-bus system using a laptop CPU.

Paperid: 1816, https://arxiv.org/pdf/2510.12828.pdf

Abstract:
The rapid spread of text generated by large language models (LLMs) makes it increasingly difficult to distinguish authentic human writing from machine output. Watermarking offers a promising solution: model owners can embed an imperceptible signal into generated text, marking its origin. Most leading approaches seed an LLM's next-token sampling with a pseudo-random key that can later be recovered to identify the text as machine-generated, while only minimally altering the model's output distribution. However, these methods suffer from two related issues: (i) watermarks are brittle to simple surface-level edits such as paraphrasing or reordering; and (ii) adversaries can append unrelated, potentially harmful text that inherits the watermark, risking reputational damage to model owners. To address these issues, we introduce SimKey, a semantic key module that strengthens watermark robustness by tying key generation to the meaning of prior context. SimKey uses locality-sensitive hashing over semantic embeddings to ensure that paraphrased text yields the same watermark key, while unrelated or semantically shifted text produces a different one. Integrated with state-of-the-art watermarking schemes, SimKey improves watermark robustness to paraphrasing and translation while preventing harmful content from false attribution, establishing semantic-aware keying as a practical and extensible watermarking direction.

Paperid: 1817, https://arxiv.org/pdf/2510.11584.pdf

Abstract:
Adversarial attacks on knowledge graph embeddings (KGE) aim to disrupt the model's ability of link prediction by removing or inserting triples. A recent black-box method has attempted to incorporate textual and structural information to enhance attack performance. However, it is unable to generate human-readable explanations, and exhibits poor generalizability. In the past few years, large language models (LLMs) have demonstrated powerful capabilities in text comprehension, generation, and reasoning. In this paper, we propose LLMAtKGE, a novel LLM-based framework that selects attack targets and generates human-readable explanations. To provide the LLM with sufficient factual context under limited input constraints, we design a structured prompting scheme that explicitly formulates the attack as multiple-choice questions while incorporating KG factual evidence. To address the context-window limitation and hesitation issues, we introduce semantics-based and centrality-based filters, which compress the candidate set while preserving high recall of attack-relevant information. Furthermore, to efficiently integrate both semantic and structural information into the filter, we precompute high-order adjacency and fine-tune the LLM with a triple classification task to enhance filtering performance. Experiments on two widely used knowledge graph datasets demonstrate that our attack outperforms the strongest black-box baselines and provides explanations via reasoning, and showing competitive performance compared with white-box methods. Comprehensive ablation and case studies further validate its capability to generate explanations.

Paperid: 1818, https://arxiv.org/pdf/2510.11343.pdf

Abstract:
Mysterious sightings of Unmanned Aircraft Systems (UAS) over U.S. military facilities, suburban neighborhoods, and commercial airports have intensified scrutiny of drone activity. To increase accountability, the Federal Aviation Administration (FAA) introduced a Remote ID mandate, requiring unmanned aircraft to broadcast their location, operator's location, and identity in real-time. However, current standards leave authentication mechanisms underspecified, enabling spoofing, relay, and replay attacks that can undermine surveillance efforts and potentially disrupt UAS-to-UAS coordination in future deployments. In this paper, we propose TBRD, a practical system for authenticating Remote ID messages in a manner that aligns with existing standards and UAS capabilities. TBRD leverages the TESLA protocol and mobile device TEEs, and introduces a verification mechanism to build a lightweight, mission-scoped authentication system that is both computationally efficient and requires a low communication footprint. We evaluate the performance of TBRD using both an FAA-requirements compatible proof-of-concept implementation for performance metrics and a simulated 4-drone swarm mission scenario to demonstrate its security guarantees under adversarial conditions. Our system provides a 50\% reduction in authentication overhead compared to digital signatures and a 100x reduction in computation time. Our results demonstrate that TBRD can be integrated into current Remote ID infrastructures to provide a scalable, standards-compliant message authentication for both regulatory and operational use cases.

Paperid: 1819, https://arxiv.org/pdf/2510.08813.pdf

Abstract:
Large Language Models (LLMs) are increasingly deployed across multilingual applications that handle sensitive data, yet their scale and linguistic variability introduce major privacy risks. Mostly evaluated for English, this paper investigates how language structure affects privacy leakage in LLMs trained on English, Spanish, French, and Italian medical corpora. We quantify six linguistic indicators and evaluate three attack vectors: extraction, counterfactual memorization, and membership inference. Results show that privacy vulnerability scales with linguistic redundancy and tokenization granularity: Italian exhibits the strongest leakage, while English shows higher membership separability. In contrast, French and Spanish display greater resilience due to higher morphological complexity. Overall, our findings provide the first quantitative evidence that language matters in privacy leakage, underscoring the need for language-aware privacy-preserving mechanisms in LLM deployments.

Paperid: 1820, https://arxiv.org/pdf/2510.08700.pdf

Abstract:
Ensuring ballot secrecy is critical for fair and trustworthy electronic voting systems, yet achieving strong secrecy guarantees in decentralized, large-scale elections remains challenging. This paper proposes the concept of collectively secure voting, in which voters themselves can opt in as secret holders to protect ballot secrecy. A practical blockchain-based collectively secure voting system is designed and implemented. Our design strikes a balance between strong confidentiality guarantees and real-world applicability. The proposed system combines threshold cryptography and smart contracts to ensure ballots remain confidential during voting, while all protocol steps remain transparent and verifiable. Voters can use the system without prior blockchain knowledge through an intuitive user interface that hides underlying complexity. To evaluate this approach, a user testing is conducted. Results show a high willingness to act as secret holders, reliable participation in share release, and high security confidence in the proposed system. The findings demonstrate that voters can collectively maintain secrecy and that such a practical deployment is feasible.

Paperid: 1821, https://arxiv.org/pdf/2510.08016.pdf

Abstract:
Model merging (MM) recently emerged as an effective method for combining large deep learning models. However, it poses significant security risks. Recent research shows that it is highly susceptible to backdoor attacks, which introduce a hidden trigger into a single fine-tuned model instance that allows the adversary to control the output of the final merged model at inference time. In this work, we propose a simple framework for understanding backdoor attacks by treating the attack itself as a task vector. $Backdoor\ Vector\ (BV)$ is calculated as the difference between the weights of a fine-tuned backdoored model and fine-tuned clean model. BVs reveal new insights into attacks understanding and a more effective framework to measure their similarity and transferability. Furthermore, we propose a novel method that enhances backdoor resilience through merging dubbed $Sparse\ Backdoor\ Vector\ (SBV)$ that combines multiple attacks into a single one. We identify the core vulnerability behind backdoor threats in MM: $inherent\ triggers$ that exploit adversarial weaknesses in the base model. To counter this, we propose $Injection\ BV\ Subtraction\ (IBVS)$ - an assumption-free defense against backdoors in MM. Our results show that SBVs surpass prior attacks and is the first method to leverage merging to improve backdoor effectiveness. At the same time, IBVS provides a lightweight, general defense that remains effective even when the backdoor threat is entirely unknown.

Paperid: 1822, https://arxiv.org/pdf/2510.06784.pdf

Abstract:
In this report, we compare the performance of our UltraGroth-based zero-knowledge machine learning framework Bionetta to other tools of similar purpose such as EZKL, Lagrange's deep-prove, or zkml. The results show a significant boost in the proving time for custom-crafted neural networks: they can be proven even on mobile devices, enabling numerous client-side proving applications. While our scheme increases the cost of one-time preprocessing steps, such as circuit compilation and generating trusted setup, our approach is, to the best of our knowledge, the only one that is deployable on the native EVM smart contracts without overwhelming proof size and verification overheads.

Paperid: 1823, https://arxiv.org/pdf/2510.06535.pdf

Abstract:
Small satellites are integral to scientific, commercial, and defense missions, but reliance on commercial off-the-shelf (COTS) hardware broadens their attack surface. Although supply chain threats are well studied in other cyber-physical domains, their feasibility and stealth in space systems remain largely unexplored. Prior work has focused on flight software, which benefits from strict security practices and oversight. In contrast, auxiliary COTS components often lack robust assurance yet enjoy comparable access to critical on-board resources, including telemetry, system calls, and the software bus. Despite this privileged access, the insider threat within COTS hardware supply chains has received little attention. In this work, we present SpyChain, the first end-to-end design and implementation of independent and colluding hardware supply chain threats targeting small satellites. Using NASA's satellite simulation (NOS3), we demonstrate that SpyChain can evade testing, exfiltrate telemetry, disrupt operations, and launch Denial of Service (DoS) attacks through covert channels that bypass ground monitoring. Our study traces an escalation from a simple solo component to dynamic, coordinating malware, introducing a taxonomy of stealth across five scenarios. We showcase how implicit trust in auxiliary components enables covert persistence and reveal novel attack vectors, highlighting a new multi-component execution technique that is now incorporated into the SPARTA matrix. Our findings are reinforced by acknowledgment and affirmation from NASA's NOS3 team. Finally, we implement lightweight onboard defenses, including runtime monitoring, to mitigate threats like SpyChain.

Paperid: 1824, https://arxiv.org/pdf/2510.05900.pdf

Abstract:
Phishing websites remain a persistent cybersecurity threat by mimicking legitimate sites to steal sensitive user information. Existing machine learning-based detection methods often rely on supervised learning with labeled data, which not only incurs substantial annotation costs but also limits adaptability to novel attack patterns. To address these challenges, we propose PhishSSL, a self-supervised contrastive learning framework that eliminates the need for labeled phishing data during training. PhishSSL combines hybrid tabular augmentation with adaptive feature attention to produce semantically consistent views and emphasize discriminative attributes. We evaluate PhishSSL on three phishing datasets with distinct feature compositions. Across all datasets, PhishSSL consistently outperforms unsupervised and self-supervised baselines, while ablation studies confirm the contribution of each component. Moreover, PhishSSL maintains robust performance despite the diversity of feature sets, highlighting its strong generalization and transferability. These results demonstrate that PhishSSL offers a promising solution for phishing website detection, particularly effective against evolving threats in dynamic Web environments.

Paperid: 1825, https://arxiv.org/pdf/2510.05766.pdf

Abstract:
MDS matrices play a critical role in the design of diffusion layers for block ciphers and hash functions due to their optimal branch number. Involutory and orthogonal MDS matrices offer additional benefits by allowing identical or nearly identical circuitry for both encryption and decryption, leading to equivalent implementation costs for both processes. These properties have been further generalized through the notions of semi-involutory and semi-orthogonal matrices. Specifically, we establish nontrivial interconnections between semi-involutory and involutory matrices, as well as between semi-orthogonal and orthogonal matrices. Exploiting these relationships, we show that the number of semi-involutory MDS matrices can be directly derived from the number of involutory MDS matrices, and vice versa. A similar correspondence holds for semi-orthogonal and orthogonal MDS matrices. We also examine the intersection of these classes and show that the number of $3 \times 3$ MDS matrices that are both semi-involutory and semi-orthogonal coincides with the number of semi-involutory MDS matrices over $\mathbb{F}_{2^m}$. Furthermore, we derive the general structure of orthogonal matrices of arbitrary order $n$ over $\mathbb{F}_{2^m}$. Based on this generic form, we provide a closed-form expression for enumerating all $3 \times 3$ orthogonal MDS matrices over $\mathbb{F}_{2^m}$. Finally, leveraging the aforementioned interconnections, we present explicit formulas for counting $3 \times 3$ semi-involutory MDS matrices and semi-orthogonal MDS matrices.

Paperid: 1826, https://arxiv.org/pdf/2510.04619.pdf

Abstract:
Proof-of-Stake (PoS) consensus protocols often face a trade-off between performance and security. Protocols that pre-elect leaders for subsequent rounds are vulnerable to Denial-of-Service (DoS) attacks, which can disrupt the network and compromise liveness. In this work, we present PoS-CoPOR, a single-chain PoS consensus protocol that mitigates this vulnerability by integrating a native onion routing mechanism into the consensus protocol itself. PoS-CoPOR combines stake-weighted probabilistic leader election with an anonymization layer that conceals the network identity of the next block proposer. This approach prevents targeted DoS attacks on leaders before they produce a block, thus enhancing network resilience. We implemented and evaluated PoS-CoPOR, demonstrating its ability to achieve a throughput of up to 110 tx/s with 6 nodes, even with the overhead of the anonymization layer. The results show that native anonymization can provide robust DoS resistance with only a modest impact on performance, offering a solution to build secure and scalable PoS blockchains.

Paperid: 1827, https://arxiv.org/pdf/2510.02475.pdf

Abstract:
Rigorous quantitative evaluation of microarchitectural side channels is challenging for two reasons. First, the processors, attacks, and defenses often exhibit probabilistic behaviors. These probabilistic behaviors arise due to natural noise in systems (e.g., from co-running processes), probabilistic side channel attacks, and probabilistic obfuscation defenses. Second, microprocessors are extremely complex. Previous evaluation methods have relied on abstract or simplified models, which are necessarily less detailed than real systems or cycle-by-cycle simulators, and these models may miss important phenomena. Whereas a simple model may suffice for estimating performance, security issues frequently manifest in the details. We address this challenge by introducing Statistical Model Checking (SMC) to the quantitative evaluation of microarchitectural side channels. SMC is a rigorous statistical technique that can process the results of probabilistic experiments and provide statistical guarantees, and it has been used in computing applications that depend heavily on statistical guarantees (e.g., medical implants, vehicular computing). With SMC, we can treat processors as opaque boxes, and we do not have to abstract or simplify them. We demonstrate the effectiveness of SMC through three case studies, in which we experimentally show that SMC can evaluate existing security vulnerabilities and defenses and provide qualitatively similar conclusions with greater statistical rigor, while making no simplifying assumptions or abstractions. We also show that SMC can enable a defender to quantify the amount of noise necessary to have a desired level of confidence that she has reduced an attacker's probability of success to less than a desired threshold, thus providing the defender with an actionable plan for obfuscation via noise injection.

Paperid: 1828, https://arxiv.org/pdf/2510.00586.pdf

Abstract:
Existing data poisoning attacks on retrieval-augmented generation (RAG) systems scale poorly because they require costly optimization of poisoned documents for each target phrase. We introduce Eyes-on-Me, a modular attack that decomposes an adversarial document into reusable Attention Attractors and Focus Regions. Attractors are optimized to direct attention to the Focus Region. Attackers can then insert semantic baits for the retriever or malicious instructions for the generator, adapting to new targets at near zero cost. This is achieved by steering a small subset of attention heads that we empirically identify as strongly correlated with attack success. Across 18 end-to-end RAG settings (3 datasets $\times$ 2 retrievers $\times$ 3 generators), Eyes-on-Me raises average attack success rates from 21.9 to 57.8 (+35.9 points, 2.6$\times$ over prior work). A single optimized attractor transfers to unseen black box retrievers and generators without retraining. Our findings establish a scalable paradigm for RAG data poisoning and show that modular, reusable components pose a practical threat to modern AI systems. They also reveal a strong link between attention concentration and model outputs, informing interpretability research.

Paperid: 1829, https://arxiv.org/pdf/2509.22027.pdf

Abstract:
Memory safety bugs, such as buffer overflows and use-after-frees, are the leading causes of software safety issues in production. Software-based approaches, e.g., Address Sanitizer (ASAN), can detect such bugs with high precision, but with prohibitively high overhead. ARM's Memory Tagging Extension (MTE) offers a promising alternative to detect these bugs in hardware with a much lower overhead. However, in this paper, we perform a thorough investigation of Google Pixel 8, the first production implementation of ARM MTE, and show that MTE can only achieve coarse precision in bug detection compared with software-based approaches such as ASAN, mainly due to its 16-byte tag granularity. To address this issue, we present NanoTag, a system to detect memory safety bugs in unmodified binaries at byte granularity with ARM MTE. NanoTag detects intra-granule buffer overflows by setting up a tripwire for tag granules that may require intra-granule overflow detection. The memory access to the tripwire causes additional overflow detection in the software while using MTE's hardware to detect bugs for the rest of the accesses. We implement NanoTag based on the Scudo Hardened Allocator, the default memory allocator on Android since Android 11. Our evaluation results across popular benchmarks and real-world case studies show that NanoTag detects nearly as many memory safety bugs as ASAN while incurring similar run-time overhead to Scudo Hardened Allocator in MTE SYNC mode.

Paperid: 1830, https://arxiv.org/pdf/2509.20861.pdf

Abstract:
In the Internet of Things (IoT) environment, continuous interaction among a large number of devices generates complex and dynamic network traffic, which poses significant challenges to rule-based detection approaches. Machine learning (ML)-based traffic detection technology, capable of identifying anomalous patterns and potential threats within this traffic, serves as a critical component in ensuring network security. This study first identifies a significant issue with widely adopted feature extraction tools (e.g., CICMeterFlow): the extensive use of time- and length-related features leads to high sparsity, which adversely affects model convergence. Furthermore, existing traffic detection methods generally lack an embedding mechanism capable of efficiently and comprehensively capturing the semantic characteristics of network traffic. To address these challenges, we propose a novel feature extraction tool that eliminates traditional time and length features in favor of context-aware semantic features related to the source host, thus improving the generalizability of the model. In addition, we design an embedding training framework that integrates the unsupervised DBSCAN clustering algorithm with a contrastive learning strategy to effectively capture fine-grained semantic representations of traffic. Extensive empirical evaluations are conducted on the real-world Mawi data set to validate the proposed method in terms of detection accuracy, robustness, and generalization. Comparative experiments against several state-of-the-art (SOTA) models demonstrate the superior performance of our approach. Furthermore, we confirm its applicability and deployability in real-time scenarios.

Paperid: 1831, https://arxiv.org/pdf/2509.19921.pdf

Abstract:
This paper investigates the fragility of contribution evaluation in federated learning, a critical mechanism for ensuring fairness and incentivizing participation. We argue that contribution scores are susceptible to significant distortions from two fundamental perspectives: architectural sensitivity and intentional manipulation. First, we explore how different model aggregation methods impact these scores. While most research assumes a basic averaging approach, we demonstrate that advanced techniques, including those designed to handle unreliable or diverse clients, can unintentionally yet significantly alter the final scores. Second, we explore vulnerabilities posed by poisoning attacks, where malicious participants strategically manipulate their model updates to inflate their own contribution scores or reduce the importance of other participants. Through extensive experiments across diverse datasets and model architectures, implemented within the Flower framework, we rigorously show that both the choice of aggregation method and the presence of attackers are potent vectors for distorting contribution scores, highlighting a critical need for more robust evaluation schemes.

Paperid: 1832, https://arxiv.org/pdf/2509.19117.pdf

Abstract:
Large language models (LLMs) excel in many tasks of software engineering, yet progress in leveraging them for vulnerability discovery has stalled in recent years. To understand this phenomenon, we investigate LLMs through the lens of classic code metrics. Surprisingly, we find that a classifier trained solely on these metrics performs on par with state-of-the-art LLMs for vulnerability discovery. A root-cause analysis reveals a strong correlation and a causal effect between LLMs and code metrics: When the value of a metric is changed, LLM predictions tend to shift by a corresponding magnitude. This dependency suggests that LLMs operate at a similarly shallow level as code metrics, limiting their ability to grasp complex patterns and fully realize their potential in vulnerability discovery. Based on these findings, we derive recommendations on how research should more effectively address this challenge.

Paperid: 1833, https://arxiv.org/pdf/2509.17969.pdf

Abstract:
Write Once Read Many (WORM) properties for storage devices are desirable to ensure data immutability for applications such as secure logging, regulatory compliance, archival storage, and other types of backup systems. WORM devices guarantee that data, once written, cannot be altered or deleted. However, implementing secure and compatible WORM storage remains a challenge. Traditional solutions often rely on specialized hardware, which is either costly, closed, or inaccessible to the general public. Distributed approaches, while promising, introduce additional risks such as denial-of-service vulnerabilities and operational complexity. We introduce Socarrat, a novel, cost-effective, and local WORM storage solution that leverages a simple external USB device (specifically, a single-board computer running Linux with USB On-The-Go support). The resulting device can be connected via USB, appearing as an ordinary external disk formatted with an ext4 or exFAT file system, without requiring any specialized software or drivers. By isolating the WORM enforcement mechanism in a dedicated USB hardware module, Socarrat significantly reduces the attack surface and ensures that even privileged attackers cannot modify or erase stored data. In addition to the WORM capacity, the system is designed to be tamper-evident, becoming resilient against advanced attacks. This work describes a novel approach, the Reverse File System, based on inferring the file system operations occurring at higher layers in the host computer where Socarrat is mounted. The paper also describes the current Socarrat prototype, implemented in Go and available as free/libre software. Finally, it provides a complete evaluation of the logging performance on different single-board computers.

Paperid: 1834, https://arxiv.org/pdf/2509.17302.pdf

Abstract:
Text embedding inversion attacks reconstruct original sentences from latent representations, posing severe privacy threats in collaborative inference and edge computing. We propose TextCrafter, an optimization-based adversarial perturbation mechanism that combines RL learned, geometry aware noise injection orthogonal to user embeddings with cluster priors and PII signal guidance to suppress inversion while preserving task utility. Unlike prior defenses either non learnable or agnostic to perturbation direction, TextCrafter provides a directional protective policy that balances privacy and utility. Under strong privacy setting, TextCrafter maintains 70 percentage classification accuracy on four datasets and consistently outperforms Gaussian/LDP baselines across lower privacy budgets, demonstrating a superior privacy utility trade off.

Paperid: 1835, https://arxiv.org/pdf/2509.16581.pdf

Abstract:
Zero-knowledge rollups rely on provers to generate multi-step state transition proofs under strict finality and availability constraints. These steps require expensive hardware (e.g., GPUs), and finality is reached only once all stages complete and results are posted on-chain. As rollups scale, staying economically viable becomes increasingly difficult due to rising throughput, fast finality demands, volatile gas prices, and dynamic resource needs. We base our study on Halo2-based proving systems and identify transactions per second (TPS), average gas usage, and finality time as key cost drivers. To address this, we propose a parametric cost model that captures rollup-specific constraints and ensures provers can keep up with incoming transaction load. We formulate this model as a constraint system and solve it using the Z3 SMT solver to find cost-optimal configurations. To validate our approach, we implement a simulator that detects lag and estimates operational costs. Our method shows a potential cost reduction of up to 70\%.

Paperid: 1836, https://arxiv.org/pdf/2509.16157.pdf

Abstract:
Liquidity providers (LPs) are essential figures in the operation of automated market makers (AMMs); in exchange for transaction fees, LPs lend the liquidity that allows AMMs to operate. While many prior works have studied the incentive structures of LPs in general, we currently lack a principled understanding of a special class of LPs known as Just-In-Time (JIT) LPs. These are strategic agents who momentarily supply liquidity for a single swap, in an attempt to extract disproportionately high fees relative to the remaining passive LPs. This paper provides the first formal, transaction-level model of JIT liquidity provision for a widespread class of AMMs known as Concentrated Liquidity Market Makers (CLMMs), as seen in Uniswap V3, for instance. We characterize the landscape of price impact and fee allocation in these systems, formulate and analyze a non-linear optimization problem faced by JIT LPs, and prove the existence of an optimal strategy. By fitting our optimal solution for JIT LPs to real-world CLMMs, we observe that in liquidity pools (particularly those with risky assets), there is a significant gap between observed and optimal JIT behavior. Existing JIT LPs often fail to account for price impact; doing so, we estimate they could increase earnings by up to 69% on average over small time windows. We also show that JIT liquidity, when deployed strategically, can improve market efficiency by reducing slippage for traders, albeit at the cost of eroding average passive LP profits by up to 44% per trade.

Paperid: 1837, https://arxiv.org/pdf/2509.14608.pdf

Abstract:
Large language models (LLMs) are increasingly deployed in enterprise settings where they interact with multiple users and are trained or fine-tuned on sensitive internal data. While fine-tuning enhances performance by internalizing domain knowledge, it also introduces a critical security risk: leakage of confidential training data to unauthorized users. These risks are exacerbated when LLMs are combined with Retrieval-Augmented Generation (RAG) pipelines that dynamically fetch contextual documents at inference time. We demonstrate data exfiltration attacks on AI assistants where adversaries can exploit current fine-tuning and RAG architectures to leak sensitive information by leveraging the lack of access control enforcement. We show that existing defenses, including prompt sanitization, output filtering, system isolation, and training-level privacy mechanisms, are fundamentally probabilistic and fail to offer robust protection against such attacks. We take the position that only a deterministic and rigorous enforcement of fine-grained access control during both fine-tuning and RAG-based inference can reliably prevent the leakage of sensitive data to unauthorized recipients. We introduce a framework centered on the principle that any content used in training, retrieval, or generation by an LLM is explicitly authorized for \emph{all users involved in the interaction}. Our approach offers a simple yet powerful paradigm shift for building secure multi-user LLM systems that are grounded in classical access control but adapted to the unique challenges of modern AI workflows. Our solution has been deployed in Microsoft Copilot Tuning, a product offering that enables organizations to fine-tune models using their own enterprise-specific data.

Paperid: 1838, https://arxiv.org/pdf/2509.13217.pdf

Abstract:
Software Bills of Materials (SBOMs) have become a regulatory requirement for improving software supply chain security and trust by means of transparency regarding components that make up software artifacts. However, enterprise and regulated software vendors commonly wish to restrict who can view confidential software metadata recorded in their SBOMs due to intellectual property or security vulnerability information. To address this tension between transparency and confidentiality, we propose Petra, an SBOM exchange system that empowers software vendors to interoperably compose and distribute redacted SBOM data using selective encryption. Petra enables software consumers to search redacted SBOMs for answers to specific security questions without revealing information they are not authorized to access. Petra leverages a format-agnostic, tamper-evident SBOM representation to generate efficient and confidentiality-preserving integrity proofs, allowing interested parties to cryptographically audit and establish trust in redacted SBOMs. Exchanging redacted SBOMs in our Petra prototype requires less than 1 extra KB per SBOM, and SBOM decryption account for at most 1% of the performance overhead during an SBOM query.

Paperid: 1839, https://arxiv.org/pdf/2509.12649.pdf

Abstract:
Code-generating Large Language Models (LLMs) significantly accelerate software development. However, their frequent generation of insecure code presents serious risks. We present a comprehensive evaluation of seven parameter-efficient fine-tuning (PEFT) techniques, demonstrating substantial gains in secure code generation without compromising functionality. Our research identifies prompt-tuning as the most effective PEFT method, achieving an 80.86% Overall-Secure-Rate on CodeGen2 16B, a 13.5-point improvement over the 67.28% baseline. Optimizing decoding strategies through sampling temperature further elevated security to 87.65%. This equates to a reduction of approximately 203,700 vulnerable code snippets per million generated. Moreover, prompt and prefix tuning increase robustness against poisoning attacks in our TrojanPuzzle evaluation, with strong performance against CWE-79 and CWE-502 attack vectors. Our findings generalize across Python and Java, confirming prompt-tuning's consistent effectiveness. This study provides essential insights and practical guidance for building more resilient software systems with LLMs.

Paperid: 1840, https://arxiv.org/pdf/2509.11695.pdf

Abstract:
Advances in quantum computing necessitate migrating the entire technology stack to post-quantum cryptography. This includes IPsec-based VPN connection authentication. Although there is an RFC draft for post-quantum authentication in this setting, the draft does not consider (stateful) hash-based signatures despite their small signature size and trusted long-term security. We propose a design with time-based state-management that assigns VPN devices a certificate authority (CA) based on the hash-based signature scheme XMSS. The CA then issues leaf certificates which are based on classical cryptography but have a short validity time, e. g., four hours. It is to be expected that even large quantum computers will take significantly longer to break the cryptography, making the design quantum-secure. We propose strategies to make the timekeeping more resilient to faults and tampering, as well as strategies to recognize a wrong system time, minimize its potential damage, and quickly recover. The result is an OpenBSD implementation of a quantum-safe and, regarding the leaf certificates, highly flexible VPN authentication design that requires significantly less bandwidth and computational resources compared to existing alternatives.

Paperid: 1841, https://arxiv.org/pdf/2509.10819.pdf

Abstract:
Zero-knowledge virtual machines (zkVMs) are increasingly deployed in decentralized applications and blockchain rollups since they enable verifiable off-chain computation. These VMs execute general-purpose programs, frequently written in Rust, and produce succinct cryptographic proofs. However, zkVMs are complex, and bugs in their constraint systems or execution logic can cause critical soundness (accepting invalid executions) or completeness (rejecting valid ones) issues. We present Arguzz, the first automated tool for testing zkVMs for soundness and completeness bugs. To detect such bugs, Arguzz combines a novel variant of metamorphic testing with fault injection. In particular, it generates semantically equivalent program pairs, merges them into a single Rust program with a known output, and runs it inside a zkVM. By injecting faults into the VM, Arguzz mimics malicious or buggy provers to uncover overly weak constraints. We used Arguzz to test six real-world zkVMs (RISC Zero, Nexus, Jolt, SP1, OpenVM, and Pico) and found eleven bugs in three of them. One RISC Zero bug resulted in a $50,000 bounty, despite prior audits, demonstrating the critical need for systematic testing of zkVMs.

Paperid: 1842, https://arxiv.org/pdf/2509.10320.pdf

Abstract:
We present AuthREST, an open-source security testing tool targeting broken authentication, one of the most prevalent API security risks in the wild. AuthREST automatically tests web APIs for credential stuffing, password brute forcing, and unchecked token authenticity. Empirical results show that AuthREST is effective in improving web API security. Notably, it uncovered previously unknown authentication vulnerabilitiesin in four public APIs.

Paperid: 1843, https://arxiv.org/pdf/2509.10213.pdf

Abstract:
Existing dynamic vulnerability patching techniques are not well-suited for embedded devices, especially mission-critical ones such as medical equipment, as they have limited computational power and memory but uninterrupted service requirements. Those devices often lack sufficient idle memory for dynamic patching, and the diverse architectures of embedded systems further complicate the creation of patch triggers that are compatible across various system kernels and hardware platforms. To address these challenges, we propose a hot patching framework called StackPatch that facilitates patch development based on stack frame reconstruction. StackPatch introduces different triggering strategies to update programs stored in memory units. We leverage the exception-handling mechanisms commonly available in embedded processors to enhance StackPatch's adaptability across different processor architectures for control flow redirection. We evaluated StackPatch on embedded devices featuring three major microcontroller (MCU) architectures: ARM, RISC-V, and Xtensa. In the experiments, we used StackPatch to successfully fix 102 publicly disclosed vulnerabilities in real-time operating systems (RTOS). We applied patching to medical devices, soft programmable logic controllers (PLCs), and network services, with StackPatch consistently completing each vulnerability remediation in less than 260 MCU clock cycles.

Paperid: 1844, https://arxiv.org/pdf/2509.10165.pdf

Abstract:
Companies are looking to data anonymization research $\unicode{x2013}$ including differential private and synthetic data methods $\unicode{x2013}$ for simple and straightforward compliance solutions. But data anonymization has not taken off in practice because it is anything but simple to implement. For one, it requires making complex choices which are case dependent, such as the domain of the dataset to anonymize; the units to protect; the scope where the data protection should extend to; and the standard of protection. Each variation of these choices changes the very meaning, as well as the practical implications, of differential privacy (or of any other measure of data anonymization). Yet differential privacy is frequently being branded as the same privacy guarantee regardless of variations in these choices. Some data anonymization methods can be effective, but only when the insights required are much larger than the unit of protection. Given that businesses care about profitability, any solution must preserve the patterns between a firm's data and that profitability. As a result, data anonymization solutions usually need to be bespoke and case-specific, which reduces their scalability. Companies should not expect easy wins, but rather recognize that anonymization is just one approach to data privacy with its own particular advantages and drawbacks, while the best strategies jointly leverage the full range of approaches to data privacy and security in combination.

Paperid: 1845, https://arxiv.org/pdf/2509.09222.pdf

Abstract:
Critical Infrastructure (CI) is prone to cyberattacks. Several techniques have been developed to protect CI against such attacks. In this work, we describe a honeypot based on a cyber twin for a water treatment plant. The honeypot is intended to serve as a realistic replica of a water treatment plant that attracts potential attackers. The attacks launched on the honeypot are recorded and analyzed for threat intelligence. The intelligence so obtained is shared with the management of water treatment plants, who in turn may use it to improve plant protection systems. The honeypot used here is operational and has been attacked on several occasions using, for example, a ransomware attack that is described in detail.

Paperid: 1846, https://arxiv.org/pdf/2509.08449.pdf

Abstract:
Federated Learning (FL) enables decentralized model training without sharing raw data, offering strong privacy guarantees. However, existing FL protocols struggle to defend against Byzantine participants, maintain model utility under non-independent and identically distributed (non-IID) data, and remain lightweight for edge devices. Prior work either assumes trusted hardware, uses expensive cryptographic tools, or fails to address privacy and robustness simultaneously. We propose DSFL, a Dual-Server Byzantine-Resilient Federated Learning framework that addresses these limitations using a group-based secure aggregation approach. Unlike LSFL, which assumes non-colluding semi-honest servers, DSFL removes this dependency by revealing a key vulnerability: privacy leakage through client-server collusion. DSFL introduces three key innovations: (1) a dual-server secure aggregation protocol that protects updates without encryption or key exchange, (2) a group-wise credit-based filtering mechanism to isolate Byzantine clients based on deviation scores, and (3) a dynamic reward-penalty system for enforcing fair participation. DSFL is evaluated on MNIST, CIFAR-10, and CIFAR-100 under up to 30 percent Byzantine participants in both IID and non-IID settings. It consistently outperforms existing baselines, including LSFL, homomorphic encryption methods, and differential privacy approaches. For example, DSFL achieves 97.15 percent accuracy on CIFAR-10 and 68.60 percent on CIFAR-100, while FedAvg drops to 9.39 percent under similar threats. DSFL remains lightweight, requiring only 55.9 ms runtime and 1088 KB communication per round.

Paperid: 1847, https://arxiv.org/pdf/2509.07615.pdf

Abstract:
Microcontroller units (MCUs) are widely used in embedded devices due to their low power consumption and cost-effectiveness. MCU firmware controls these devices and is vital to the security of embedded systems. However, performing dynamic security analyses for MCU firmware has remained challenging due to the lack of usable execution environments -- existing dynamic analyses cannot run on physical devices (e.g., insufficient computational resources), while building emulators is costly due to the massive amount of heterogeneous hardware, especially peripherals. Our work is based on the insight that MCU peripherals can be modeled in a two-fold manner. At the structural level, peripherals have diverse implementations but we can use a limited set of primitives to abstract peripherals because their hardware implementations are based on common hardware concepts. At the semantic level, peripherals have diverse functionalities. However, we can use a single unified semantic model to describe the same kind of peripherals because they exhibit similar functionalities. Building on this, we propose FlexEmu, a flexible MCU peripheral emulation framework. Once semantic models are created, FlexEmu automatically extracts peripheral-specific details to instantiate models and generate emulators accordingly. We have successfully applied FlexEmu to model 12 kinds of MCU peripherals. Our evaluation on 90 firmware samples across 15 different MCU platforms shows that the automatically generated emulators can faithfully replicate hardware behaviors and achieve a 98.48% unit test passing rate, outperforming state-of-the-art approaches. To demonstrate the implications of FlexEmu on firmware security, we use the generated emulators to fuzz three popular RTOSes and uncover 10 previously unknown bugs.

Paperid: 1848, https://arxiv.org/pdf/2509.06338.pdf

Abstract:
The widespread distribution of Large Language Models (LLMs) through public platforms like Hugging Face introduces significant security challenges. While these platforms perform basic security scans, they often fail to detect subtle manipulations within the embedding layer. This work identifies a novel class of deployment phase attacks that exploit this vulnerability by injecting imperceptible perturbations directly into the embedding layer outputs without modifying model weights or input text. These perturbations, though statistically benign, systematically bypass safety alignment mechanisms and induce harmful behaviors during inference. We propose Search based Embedding Poisoning(SEP), a practical, model agnostic framework that introduces carefully optimized perturbations into embeddings associated with high risk tokens. SEP leverages a predictable linear transition in model responses, from refusal to harmful output to semantic deviation to identify a narrow perturbation window that evades alignment safeguards. Evaluated across six aligned LLMs, SEP achieves an average attack success rate of 96.43% while preserving benign task performance and evading conventional detection mechanisms. Our findings reveal a critical oversight in deployment security and emphasize the urgent need for embedding level integrity checks in future LLM defense strategies.

Paperid: 1849, https://arxiv.org/pdf/2509.02387.pdf

Abstract:
The growing use of FPGAs in reconfigurable systems introducessecurity risks through malicious bitstreams that could cause denial-of-service (DoS), data leakage, or covert attacks. We investigated chip-level hardware malicious payload in embedded systems and proposed a supervised machine learning method to detect malicious bitstreams via static byte-level features. Our approach diverges from existing methods by analyzing bitstreams directly at the binary level, enabling real-time detection without requiring access to source code or netlists. Bitstreams were sourced from state-of-the-art (SOTA) benchmarks and re-engineered to target the Xilinx PYNQ-Z1 FPGA Development Board. Our dataset included 122 samples of benign and malicious configurations. The data were vectorized using byte frequency analysis, compressed using TSVD, and balanced using SMOTE to address class imbalance. The evaluated classifiers demonstrated that Random Forest achieved a macro F1-score of 0.97, underscoring the viability of real-time Trojan detection on resource-constrained systems. The final model was serialized and successfully deployed via PYNQ to enable integrated bitstream analysis.

Paperid: 1850, https://arxiv.org/pdf/2509.02077.pdf

Abstract:
In the domain of security, vulnerabilities frequently remain undetected even after their exploitation. In this work, vulnerabilities refer to publicly disclosed flaws documented in Common Vulnerabilities and Exposures (CVE) reports. Establishing a connection between attacks and vulnerabilities is essential for enabling timely incident response, as it provides defenders with immediate, actionable insights. However, manually mapping attacks to CVEs is infeasible, thereby motivating the need for automation. This paper evaluates 14 state-of-the-art (SOTA) sentence transformers for automatically identifying vulnerabilities from textual descriptions of attacks. Our results demonstrate that the multi-qa-mpnet-base-dot-v1 (MMPNet) model achieves superior classification performance when using attack Technique descriptions, with an F1-score of 89.0, precision of 84.0, and recall of 94.7. Furthermore, it was observed that, on average, 56% of the vulnerabilities identified by the MMPNet model are also represented within the CVE repository in conjunction with an attack, while 61% of the vulnerabilities detected by the model correspond to those cataloged in the CVE repository. A manual inspection of the results revealed the existence of 275 predicted links that were not documented in the MITRE repositories. Consequently, the automation of linking attack techniques to vulnerabilities not only enhances the detection and response capabilities related to software security incidents but also diminishes the duration during which vulnerabilities remain exploitable, thereby contributing to the development of more secure systems.

Paperid: 1851, https://arxiv.org/pdf/2509.01731.pdf

Abstract:
Quantum computing threatens to undermine classical cryptography by breaking widely deployed encryption and signature schemes. This paper examines enterprise readiness for quantum-safe cybersecurity through three perspectives: (i) the technologist view, assessing the maturity of post-quantum cryptography (PQC) and quantum key distribution (QKD); (ii) the enterprise (CISO/CIO) view, analyzing organizational awareness, risk management, and operational barriers; and (iii) the threat actor view, evaluating the evolving quantum threat and the urgency of migration. Using recent standards (e.g., NIST's 2024 PQC algorithms), industry surveys, and threat intelligence, we synthesize findings via a SWOT analysis to map strengths, weaknesses, opportunities, and threats. Results indicate uneven and generally insufficient preparedness: while PQC standards and niche QKD deployments signal technical progress, fewer than 5\% of enterprises have formal quantum-transition plans, and many underestimate "harvest now, decrypt later" risks. Financial, telecom, and government sectors have begun migration, but most industries remain exploratory or stalled by costs, complexity, and skills gaps. Expert consensus places cryptanalytically relevant quantum computers in the 2030s, yet delayed preparation could leave today's data vulnerable for decades. We recommend immediate steps: establishing crypto-agility, creating quantum transition roadmaps, prioritizing PQC deployment in high-value systems, and upskilling cybersecurity teams. A coordinated, proactive approach is essential to secure current and future digital assets in the quantum era.

Paperid: 1852, https://arxiv.org/pdf/2509.01375.pdf

Abstract:
Nowadays, the volume of network traffic continues to grow, along with the frequency and sophistication of attacks. This scenario highlights the need for solutions capable of continuously adapting, since network behavior is dynamic and changes over time. This work presents an anomaly detection model for network flows using unsupervised machine learning with online learning capabilities. This approach allows the system to dynamically learn the normal behavior of the network and detect deviations without requiring labeled data, which is particularly useful in real-world environments where traffic is constantly changing and labeled data is scarce. The model was implemented using the River library with a One-Class SVM and evaluated on the NF-UNSW-NB15 dataset and its extended version v2, which contain network flows labeled with different attack categories. The results show an accuracy above 98%, a false positive rate below 3.1%, and a recall of 100% in the most advanced version of the dataset. In addition, the low processing time per flow (<0.033 ms) demonstrates the feasibility of the approach for real-time applications.

Paperid: 1853, https://arxiv.org/pdf/2509.00973.pdf

Abstract:
Large Language Models (LLMs) are increasingly deployed in mission-critical systems, facilitating tasks such as satellite operations, command-and-control, military decision support, and cyber defense. Many of these systems are accessed through application programming interfaces (APIs). When such APIs lack robust access controls, they can expose full or top-k logits, creating a significant and often overlooked attack surface. Prior art has mainly focused on reconstructing the output projection layer or distilling surface-level behaviors. However, regenerating a black-box model under tight query constraints remains underexplored. We address that gap by introducing a constrained replication pipeline that transforms partial logit leakage into a functional deployable substitute model clone. Our two-stage approach (i) reconstructs the output projection matrix by collecting top-k logits from under 10k black-box queries via singular value decomposition (SVD) over the logits, then (ii) distills the remaining architecture into compact student models with varying transformer depths, trained on an open source dataset. A 6-layer student recreates 97.6% of the 6-layer teacher model's hidden-state geometry, with only a 7.31% perplexity increase, and a 7.58 Negative Log-Likelihood (NLL). A 4-layer variant achieves 17.1% faster inference and 18.1% parameter reduction with comparable performance. The entire attack completes in under 24 graphics processing unit (GPU) hours and avoids triggering API rate-limit defenses. These results demonstrate how quickly a cost-limited adversary can clone an LLM, underscoring the urgent need for hardened inference APIs and secure on-premise defense deployments.

Paperid: 1854, https://arxiv.org/pdf/2509.00104.pdf

Abstract:
This paper presents an enhanced post-quantum key agreement protocol based on RÃ©nyi entropy, addressing vulnerabilities in the original construction while preserving information-theoretic security properties. We develop a theoretical framework leveraging entropy-preserving operations and secret-shared verification to achieve provable security against quantum adversaries. Through entropy amplification techniques and quantum-resistant commitments, the protocol establishes $2^{128}$ quantum security guarantees under the quantum random oracle model. Key innovations include a confidentiality-preserving verification mechanism using distributed polynomial commitments, tightened min-entropy bounds with guaranteed non-negativity, and composable security proofs in the quantum universal composability framework. Unlike computational approaches, our method provides information-theoretic security without hardness assumptions while maintaining polynomial complexity. Theoretical analysis demonstrates resilience against known quantum attack vectors, including Grover-accelerated brute force and quantum memory attacks. The protocol achieves parameterization for 128-bit quantum security with efficient $\mathcal{O}(n^{2})$ communication complexity. Extensions to secure multiparty computation and quantum network applications are established, providing a foundation for long-term cryptographic security.

Paperid: 1855, https://arxiv.org/pdf/2508.21606.pdf

Abstract:
AES-128 encryption is theoretically secure but vulnerable in practical deployments due to timing and fault injection attacks on embedded systems. This work presents a lightweight dual-detection framework combining statistical thresholding and machine learning (ML) for real-time anomaly detection. By simulating anomalies via delays and ciphertext corruption, we collect timing and data features to evaluate two strategies: (1) a statistical threshold method based on execution time and (2) a Random Forest classifier trained on block-level anomalies. Implemented on CPU and FPGA (PYNQ-Z1), our results show that the ML approach outperforms static thresholds in accuracy, while maintaining real-time feasibility on embedded platforms. The framework operates without modifying AES internals or relying on hardware performance counters. This makes it especially suitable for low-power, resource-constrained systems where detection accuracy and computational efficiency must be balanced.

Paperid: 1856, https://arxiv.org/pdf/2508.20086.pdf

Abstract:
Malicious intent in smart contract development can lead to substantial economic losses. SmartIntentNN is a deep learning model specifically designed to identify unsafe intents in smart contracts. This model integrates the Universal Sentence Encoder, a K-means clustering-based intent highlighting mechanism, and a Bidirectional Long Short-Term Memory network for multi-label classification, achieving an F1 of 0.8633 in distinguishing ten different intent categories. In this study, we present an upgraded version of this model, SmartIntentNN2 (Smart Contract Intent Neural Network V2). A significant enhancement in V2 is the incorporation of a BERT-based pre-trained language model, which has been trained on a dataset of 16,000 real smart contracts using a Masked Language Modeling objective. SmartIntentNN2 retains the BiLSTM-based multi-label classification network. With an improved F1 of 0.927, V2 demonstrates enhanced performance compared to its predecessor, establishing itself as the state-of-the-art model for smart contract intent detection.

Paperid: 1857, https://arxiv.org/pdf/2508.19461.pdf

Abstract:
We stress test monitoring systems for detecting covert misbehavior in autonomous LLM agents (e.g., secretly sharing private information). To this end, we systematize a monitor red teaming (MRT) workflow that incorporates: (1) varying levels of agent and monitor situational awareness; (2) distinct adversarial strategies to evade the monitor, such as prompt injection; and (3) two datasets and environments -- SHADE-Arena for tool-calling agents and our new CUA-SHADE-Arena, which extends TheAgentCompany, for computer-use agents. We run MRT on existing LLM monitor scaffoldings, which orchestrate LLMs and parse agent trajectories, alongside a new hybrid hierarchical-sequential scaffolding proposed in this work. Our empirical results yield three key findings. First, agent awareness dominates monitor awareness: an agent's knowledge that it is being monitored substantially degrades the monitor's reliability. On the contrary, providing the monitor with more information about the agent is less helpful than expected. Second, monitor scaffolding matters more than monitor awareness: the hybrid scaffolding consistently outperforms baseline monitor scaffolding, and can enable weaker models to reliably monitor stronger agents -- a weak-to-strong scaling effect. Third, in a human-in-the-loop setting where humans discuss with the LLM monitor to get an updated judgment for the agent's behavior, targeted human oversight is most effective; escalating only pre-flagged cases to human reviewers improved the TPR by approximately 15% at FPR = 0.01. Our work establishes a standard workflow for MRT, highlighting the lack of adversarial robustness for LLMs and humans when monitoring and detecting agent misbehavior. We release code, data, and logs to spur further research.

Paperid: 1858, https://arxiv.org/pdf/2508.19250.pdf

Abstract:
The imminent threat of quantum computing necessitates quantum-resistant cryptosystems. This paper establishes tight security bounds for two NIST PQC finalists: SPHINCS+ (hash-based) and NTRU (lattice-based). Our key contributions include: (1) A quantum attack model incorporating decoherence effects ($Ï_d$) and parallelization limits; (2) Improved entropy concentration inequalities reducing SPHINCS+ parameters by 15-20\%; (3) Optimized NTRU lattice parameters via quantum lattice entropy $H_Q(Î)$; (4) Tightened NTRU-to-LWE reduction with polynomial-factor improvement. Theoretical results demonstrate significant security enhancement over existing constructions, providing implementable parameters for standardization.

Paperid: 1859, https://arxiv.org/pdf/2508.18942.pdf

Abstract:
With the rapid growth of Electric Vehicle (EV) technology, EVs are destined to shape the future of transportation. The large number of EVs facilitates the development of the emerging vehicle-to-grid (V2G) technology, which realizes bidirectional energy exchanges between EVs and the power grid. This has led to the setting up of electricity markets that are usually confined to a small geographical location, often with a small number of participants. Usually, these markets are manipulated by intermediaries responsible for collecting bids from prosumers, determining the market-clearing price, incorporating grid constraints, and accounting for network losses. While centralized models can be highly efficient, they grant excessive power to the intermediary by allowing them to gain exclusive access to prosumers \textquotesingle price preferences. This opens the door to potential market manipulation and raises significant privacy concerns for users, such as the location of energy providers. This lack of protection exposes users to potential risks, as untrustworthy servers and malicious adversaries can exploit this information to infer trading activities and real identities. This work proposes a secure, decentralized exchange market built on blockchain technology, utilizing a privacy-preserving Automated Market Maker (AMM) model to offer open and fair, and equal access to traders, and mitigates the most common trading-manipulation attacks. Additionally, it incorporates a scalable architecture based on geographical dynamic sharding, allowing for efficient resource allocation and improved performance as the market grows.

Paperid: 1860, https://arxiv.org/pdf/2508.18109.pdf

Abstract:
For vulnerabilities, Proof-of-Concept (PoC) plays an irreplaceable role in demonstrating the exploitability. PoC reports may include critical information such as specific usage, test platforms, and more, providing essential insights for researchers. However, in reality, due to various PoC templates across PoC platforms, PoC reports extensively suffer from information deficiency, leading the suboptimal quality and limited usefulness. Fortunately, we found that information deficiency of PoC reports could be mitigated by the completion from multiple sources given the same referred vulnerability. In this paper, we conduct the first study on the deficiency of information in PoC reports across public platforms. We began by collecting 173,170 PoC reports from 4 different platforms and defined 8 key aspects that PoCs should contain. By integrating rule-based matching and a fine-tuned BERT-NER model for extraction of key aspects, we discovered that all PoC reports available on public platforms have at least one missing key aspect. Subsequently, we developed a multi-source information fusion method to complete the missing aspect information in PoC reports by leveraging CVE entries and related PoC reports from different sources. Finally, we successfully completed 69,583 PoC reports (40.18% of all reports).

Paperid: 1861, https://arxiv.org/pdf/2508.17341.pdf

Abstract:
The rapid expansion of immersive Metaverse applications introduces complex challenges at the intersection of performance, privacy, and environmental sustainability. Centralized architectures fall short in addressing these demands, often resulting in elevated energy consumption, latency, and privacy concerns. This paper proposes MetaFed, a decentralized federated learning (FL) framework that enables sustainable and intelligent resource orchestration for Metaverse environments. MetaFed integrates (i) multi-agent reinforcement learning for dynamic client selection, (ii) privacy-preserving FL using homomorphic encryption, and (iii) carbon-aware scheduling aligned with renewable energy availability. Evaluations on MNIST and CIFAR-10 using lightweight ResNet architectures demonstrate that MetaFed achieves up to 25% reduction in carbon emissions compared to conventional approaches, while maintaining high accuracy and minimal communication overhead. These results highlight MetaFed as a scalable solution for building environmentally responsible and privacy-compliant Metaverse infrastructures.

Paperid: 1862, https://arxiv.org/pdf/2508.17329.pdf

Abstract:
As large language models (LLMs) expose systemic security challenges in high risk applications, including privacy leaks, bias amplification, and malicious abuse, there is an urgent need for a dynamic risk assessment and collaborative defence framework that covers their entire life cycle. This paper focuses on the security problems of large language models (LLMs) in critical application scenarios, such as the possibility of disclosure of user data, the deliberate input of harmful instructions, or the models bias. To solve these problems, we describe the design of a system for dynamic risk assessment and a hierarchical defence system that allows different levels of protection to cooperate. This paper presents a risk assessment system capable of evaluating both static and dynamic indicators simultaneously. It uses entropy weighting to calculate essential data, such as the frequency of sensitive words, whether the API call is typical, the realtime risk entropy value is significant, and the degree of context deviation. The experimental results show that the system is capable of identifying concealed attacks, such as role escape, and can perform rapid risk evaluation. The paper uses a hybrid model called BERT-CRF (Bidirectional Encoder Representation from Transformers) at the input layer to identify and filter malicious commands. The model layer uses dynamic adversarial training and differential privacy noise injection technology together. The output layer also has a neural watermarking system that can track the source of the content. In practice, the quality of this method, especially important in terms of customer service in the financial industry.

Paperid: 1863, https://arxiv.org/pdf/2508.16619.pdf

Abstract:
Efficient and reliable node deployment in Wireless Sensor Networks is crucial for optimizing coverage of the area, connectivity among nodes, and energy efficiency. This paper proposes a hybrid meta heuristic approach combining a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) to address the challenges of energy efficient and reliable node deployment. The GA PSO hybrid leverages GAs strong exploration capabilities and PSOs rapid convergence, achieving an optimum stability between coverage and energy consumption. The performance of the proposed approach is evaluated against GA and PSO alone and the innovatory meta heuristic based Competitive Multi Objective Marine Predators Algorithm (CMOMPA) across varying sensing ranges. Simulation results demonstrate that GA PSO requires 15% to 25% fewer sensor nodes and maintains 95% or more area coverage while maintaining the connectivity in comparison to standalone GA or PSO algorithm. The proposed algorithm also dominates CMOMPA when compared for long sensing and communication range in terms of higher coverage, improved connectivity, and reduced deployment time while requiring fewer sensor nodes. This study also explores key trade offs in WSN deployment and highlights future research directions, including heterogeneous node deployment, mobile WSNs, and enhanced multi objective optimization techniques. The findings underscore the effectiveness of hybrid meta heuristics in improving WSN performance, offering a promising approach for real world applications such as environmental monitoring, smart cities, smart agriculture, disaster response, and IIoT.

Paperid: 1864, https://arxiv.org/pdf/2508.15776.pdf

Abstract:
The pharmaceutical supply chain faces escalating cybersecurity challenges threatening patient safety and operational continuity. This paper examines the transformative potential of zero trust architecture for enhancing security and resilience within this critical ecosystem. We explore the challenges posed by data breaches, counterfeiting, and disruptions and introduce the principles of continuous verification, least-privilege access, and data-centric security inherent in zero trust. Real-world case studies illustrate successful implementations. Benefits include heightened security, data protection, and adaptable resilience. As recognized by researchers and industrialists, a reliable drug tracing system is crucial for ensuring drug safety throughout the pharmaceutical production process. One of the most pivotal domains within the pharmaceutical industry and its associated supply chains where zero trust can be effectively implemented is in the management of narcotics, high-health-risk drugs, and abusable substances. By embracing zero trust, the pharmaceutical industry fortifies its supply chain against constantly changing cyber threats, ensuring the trustworthiness of critical medical operations.

Paperid: 1865, https://arxiv.org/pdf/2508.11742.pdf

Abstract:
Current synthetic traffic generators (SynNetGens) promise privacy but lack comprehensive guarantees or empirical validation, even as their fidelity steadily improves. We introduce the first attack-grounded benchmark for assessing the privacy of SynNetGens directly from the traffic they produce. We frame privacy as membership inference at the traffic-source level--a realistic and actionable threat for data holders. To this end, we present TraceBleed, the first attack that exploits behavioral fingerprints across flows using contrastive learning and temporal chunking, outperforming prior membership inference baselines by 172%. Our large-scale study across GAN-, diffusion-, and GPT-based SynNetGens uncovers critical insights: (i) SynNetGens leak user-level information; (ii) differential privacy either fails to stop these attacks or severely degrades fidelity; and (iii) sharing more synthetic data amplifies leakage by 59% on average. Finally, we introduce TracePatch, the first SynNetGen-agnostic defense that combines adversarial ML with SMT constraints to mitigate leakage while preserving fidelity.

Paperid: 1866, https://arxiv.org/pdf/2508.10212.pdf

Abstract:
Underground mining operations rely on distributed sensor networks to collect critical data daily, including mine temperature, toxic gas concentrations, and miner movements for hazard detection and operational decision-making. However, transmitting raw sensor data to a central server for training deep learning models introduces significant privacy risks, potentially exposing sensitive mine-specific information. Federated Learning (FL) offers a transformative solution by enabling collaborative model training while ensuring that raw data remains localized at each mine. Despite its advantages, FL in underground mining faces key challenges: (i) An attacker may compromise a mine's local model by employing techniques such as sign-flipping attacks or additive noise, leading to erroneous predictions; (ii) Low-quality (yet potentially valuable) data, caused by poor lighting conditions or sensor inaccuracies in mines may degrade the FL training process. In response, this paper proposes MineDetect, a defense FL framework that detects and isolates the attacked models while mitigating the impact of mines with low-quality data. MineDetect introduces two key innovations: (i) Detecting attacked models (maliciously manipulated) by developing a history-aware mechanism that leverages local and global averages of gradient updates; (ii) Identifying and eliminating adversarial influences from unreliable models (generated by clients with poor data quality) on the FL training process. Comprehensive simulations across diverse datasets demonstrate that MineDetect outperforms existing methods in both robustness and accuracy, even in challenging non-IID data scenarios. Its ability to counter adversarial influences while maintaining lower computational efficiency makes it a vital advancement for improving safety and operational effectiveness in underground mining.

Paperid: 1867, https://arxiv.org/pdf/2508.09801.pdf

Abstract:
Malware detection in modern computing environments demands models that are not only accurate but also interpretable and robust to evasive techniques. Graph neural networks (GNNs) have shown promise in this domain by modeling rich structural dependencies in graph-based program representations such as control flow graphs (CFGs). However, single-model approaches may suffer from limited generalization and lack interpretability, especially in high-stakes security applications. In this paper, we propose a novel stacking ensemble framework for graph-based malware detection and explanation. Our method dynamically extracts CFGs from portable executable (PE) files and encodes their basic blocks through a two-step embedding strategy. A set of diverse GNN base learners, each with a distinct message-passing mechanism, is used to capture complementary behavioral features. Their prediction outputs are aggregated by a meta-learner implemented as an attention-based multilayer perceptron, which both classifies malware instances and quantifies the contribution of each base model. To enhance explainability, we introduce an ensemble-aware post-hoc explanation technique that leverages edge-level importance scores generated by a GNN explainer and fuses them using the learned attention weights. This produces interpretable, model-agnostic explanations aligned with the final ensemble decision. Experimental results demonstrate that our framework improves classification performance while providing insightful interpretations of malware behavior.

Paperid: 1868, https://arxiv.org/pdf/2508.08190.pdf

Abstract:
Industrial control systems are a fundamental component of critical infrastructure networks (CIN) such as gas, water and power. With the growing risk of cyberattacks, regulatory compliance requirements are also increasing for large scale critical infrastructure systems comprising multiple utility stakeholders. The primary goal of regulators is to ensure overall system stability with recourse to trustworthy stakeholder attack detection. However, adhering to compliance requirements requires stakeholders to also disclose sensor and control data to regulators raising privacy concerns. In this paper, we present a cyberattack detection framework that utilizes differentially private (DP) hypothesis tests geared towards enhancing regulatory confidence while alleviating privacy concerns of CIN stakeholders. The hallmark of our approach is a two phase privacy scheme that protects the privacy of covariance, as well as the associated sensor driven test statistics computed as a means to generate alarms. Theoretically, we show that our method induces a misclassification error rate comparable to the non-DP cases while delivering robust privacy guarantees. With the help of real-world datasets, we show the reliability of our DP-detection outcomes for a wide variety of attack scenarios for interdependent stakeholders.

Paperid: 1869, https://arxiv.org/pdf/2508.07873.pdf

Abstract:
Federated unlearning (FU) algorithms allow clients in federated settings to exercise their ''right to be forgotten'' by removing the influence of their data from a collaboratively trained model. Existing FU methods maintain data privacy by performing unlearning locally on the client-side and sending targeted updates to the server without exposing forgotten data; yet they often rely on server-side cooperation, revealing the client's intent and identity without enforcement guarantees - compromising autonomy and unlearning privacy. In this work, we propose EFU (Enforced Federated Unlearning), a cryptographically enforced FU framework that enables clients to initiate unlearning while concealing its occurrence from the server. Specifically, EFU leverages functional encryption to bind encrypted updates to specific aggregation functions, ensuring the server can neither perform unauthorized computations nor detect or skip unlearning requests. To further mask behavioral and parameter shifts in the aggregated model, we incorporate auxiliary unlearning losses based on adversarial examples and parameter importance regularization. Extensive experiments show that EFU achieves near-random accuracy on forgotten data while maintaining performance comparable to full retraining across datasets and neural architectures - all while concealing unlearning intent from the server. Furthermore, we demonstrate that EFU is agnostic to the underlying unlearning algorithm, enabling secure, function-hiding, and verifiable unlearning for any client-side FU mechanism that issues targeted updates.

Paperid: 1870, https://arxiv.org/pdf/2508.06059.pdf

Abstract:
State-of-the-art fact-checking systems combat misinformation at scale by employing autonomous LLM-based agents to decompose complex claims into smaller sub-claims, verify each sub-claim individually, and aggregate the partial results to produce verdicts with justifications (explanatory rationales for the verdicts). The security of these systems is crucial, as compromised fact-checkers, which tend to be easily underexplored, can amplify misinformation. This work introduces Fact2Fiction, the first poisoning attack framework targeting such agentic fact-checking systems. Fact2Fiction mirrors the decomposition strategy and exploits system-generated justifications to craft tailored malicious evidences that compromise sub-claim verification. Extensive experiments demonstrate that Fact2Fiction achieves 8.9\%--21.2\% higher attack success rates than state-of-the-art attacks across various poisoning budgets. Fact2Fiction exposes security weaknesses in current fact-checking systems and highlights the need for defensive countermeasures.

Paperid: 1871, https://arxiv.org/pdf/2508.05545.pdf

Abstract:
Redacting Personally Identifiable Information (PII) from unstructured text is critical for ensuring data privacy in regulated domains. While earlier approaches have relied on rule-based systems and domain-specific Named Entity Recognition (NER) models, these methods fail to generalize across formats and contexts. Recent advances in Large Language Models (LLMs) offer a promising alternative, yet the effect of architectural and training choices on redaction performance remains underexplored. LLMs have demonstrated strong performance in tasks that require contextual language understanding, including the redaction of PII in free-form text. Prior work suggests that with appropriate adaptation, LLMs can become effective contextual privacy learners. However, the consequences of architectural and training choices for PII Redaction remain underexplored. In this work, we present a comprehensive analysis of LLMs as privacy-preserving PII Redaction systems. We evaluate a range of LLM architectures and training strategies for their effectiveness in PII Redaction. Our analysis measures redaction performance, semantic preservation, and PII leakage, and compares these outcomes against latency and computational cost. The results provide practical guidance for configuring LLM-based redactors that are accurate, efficient, and privacy-aware. To support reproducibility and real-world deployment, we release PRvL, an open-source suite of fine-tuned models, and evaluation tools for general-purpose PII Redaction. PRvL is built entirely on open-source LLMs and supports multiple inference settings for flexibility and compliance. It is designed to be easily customized for different domains and fully operable within secure, self-managed environments. This enables data owners to perform redactions without relying on third-party services or exposing sensitive content beyond their own infrastructure.

Paperid: 1872, https://arxiv.org/pdf/2508.04281.pdf

Abstract:
Large Language Models (LLMs) are gaining traction as a method to generate consensus statements and aggregate preferences in digital democracy experiments. Yet, LLMs may introduce critical vulnerabilities in these systems. Here, we explore the impact of prompt-injection attacks targeting consensus generating systems by introducing a four-dimensional taxonomy of attacks. We test these attacks using LLaMA 3.1 8B and Chat GPT 4.1 Nano finding the LLMs more vulnerable to criticism attacks -- attacks using disagreeable prompts -- and more effective at tilting ambiguous consensus statements. We also find evidence of more effective manipulation when using explicit imperatives and rational-sounding arguments compared to emotional language or fabricated statistics. To mitigate these vulnerabilities, we apply Direct Preference Optimization (DPO), an alignment method that fine-tunes LLMs to prefer unperturbed consensus statements. While DPO significantly improves robustness, it still offers limited protection against attacks targeting ambiguous consensus. These results advance our understanding of the vulnerability and robustness of consensus generating LLMs in digital democracy applications.

Paperid: 1873, https://arxiv.org/pdf/2508.03967.pdf

Abstract:
In this paper, we introduce RAVID, the first framework for AI-generated image detection that leverages visual retrieval-augmented generation (RAG). While RAG methods have shown promise in mitigating factual inaccuracies in foundation models, they have primarily focused on text, leaving visual knowledge underexplored. Meanwhile, existing detection methods, which struggle with generalization and robustness, often rely on low-level artifacts and model-specific features, limiting their adaptability. To address this, RAVID dynamically retrieves relevant images to enhance detection. Our approach utilizes a fine-tuned CLIP image encoder, RAVID CLIP, enhanced with category-related prompts to improve representation learning. We further integrate a vision-language model (VLM) to fuse retrieved images with the query, enriching the input and improving accuracy. Given a query image, RAVID generates an embedding using RAVID CLIP, retrieves the most relevant images from a database, and combines these with the query image to form an enriched input for a VLM (e.g., Qwen-VL or Openflamingo). Experiments on the UniversalFakeDetect benchmark, which covers 19 generative models, show that RAVID achieves state-of-the-art performance with an average accuracy of 93.85%. RAVID also outperforms traditional methods in terms of robustness, maintaining high accuracy even under image degradations such as Gaussian blur and JPEG compression. Specifically, RAVID achieves an average accuracy of 80.27% under degradation conditions, compared to 63.44% for the state-of-the-art model C2P-CLIP, demonstrating consistent improvements in both Gaussian blur and JPEG compression scenarios. The code will be publicly available upon acceptance.

Paperid: 1874, https://arxiv.org/pdf/2508.03221.pdf

Abstract:
In recent years, Diffusion models have achieved remarkable progress in the field of image generation. However, recent studies have shown that diffusion models are susceptible to backdoor attacks, in which attackers can manipulate the output by injecting covert triggers such as specific visual patterns or textual phrases into the training dataset. Fortunately, with the continuous advancement of defense techniques, defenders have become increasingly capable of identifying and mitigating most backdoor attacks using visual inspection and neural network-based detection methods. However, in this paper, we identify a novel type of backdoor threat that is more lightweight and covert than existing approaches, which we name BadBlocks, requires only about 30% of the computational resources and 20% GPU time typically needed by previous backdoor attacks, yet it successfully injects backdoors and evades the most advanced defense frameworks. BadBlocks enables attackers to selectively contaminate specific blocks within the UNet architecture of diffusion models while maintaining normal functionality in the remaining components. Experimental results demonstrate that BadBlocks achieves a high attack success rate and low perceptual quality loss , even under extremely constrained computational resources and GPU time. Moreover, BadBlocks is able to bypass existing defense frameworks, especially the attention-based backdoor detection method, highlighting it as a novel and noteworthy threat. Ablation studies further demonstrate that effective backdoor injection does not require fine-tuning the entire network and highlight the pivotal role of certain neural network layers in backdoor mapping. Overall, BadBlocks significantly reduces the barrier to conducting backdoor attacks in all aspects. It enables attackers to inject backdoors into large-scale diffusion models even using consumer-grade GPUs.

Paperid: 1875, https://arxiv.org/pdf/2508.03062.pdf

Abstract:
Post-Quantum Cryptographic (PQC) algorithms are mathematically secure and resistant to quantum attacks but can still leak sensitive information in hardware implementations due to natural faults or intentional fault injections. The intent fault injection in side-channel attacks reduces the reliability of crypto implementation in future generation network security procesors. In this regard, this research proposes a lightweight, efficient, recomputation-based fault detection module implemented on a Field Programmable Gate Array (FPGA) for Number Theoretic Transform (NTT). The NTT is primarily composed of memory units and the Cooley-Tukey Butterfly Unit (CT-BU), a critical and computationally intensive hardware component essential for polynomial multiplication. NTT and polynomial multiplication are fundamental building blocks in many PQC algorithms, including Kyber, NTRU, Ring-LWE, and others. In this paper, we present a fault detection method called : Recomputation with a Modular Offset (REMO) for the logic blocks of the CT-BU using Montgomery Reduction and another method called Memory Rule Checkers for the memory components used within the NTT. The proposed fault detection framework sets a new benchmark by achieving high efficiency with significant low implementation cost. It occupies only 16 slices and a single DSP block, with a power consumption of just 3mW in Artix-7 FPGA. The REMO-based detection mechanism achieves a fault coverage of 87.2% to 100%, adaptable across various word sizes, fault bit counts, and fault injection modes. Similarly, the Memory Rule Checkers demonstrate robust performance, achieving 50.7% to 100% fault detection depending on and the nature of injected faults.

Paperid: 1876, https://arxiv.org/pdf/2508.01768.pdf

Abstract:
Transformers have become the backbone of many Machine Learning (ML) applications, including language translation, summarization, and computer vision. As these models are increasingly deployed in shared Graphics Processing Unit (GPU) environments via Machine Learning as a Service (MLaaS), concerns around their security grow. In particular, the risk of side-channel attacks that reveal architectural details without physical access remains underexplored, despite the high value of the proprietary models they target. This work to the best of our knowledge is the first to investigate GPU power and thermal fluctuations as side-channels and further exploit them to extract information from pre-trained transformer models. The proposed analysis shows how these side channels can be exploited at user-privilege to reveal critical architectural details such as encoder/decoder layer and attention head for both language and vision transformers. We demonstrate the practical impact by evaluating multiple language and vision pre-trained transformers which are publicly available. Through extensive experimental evaluations, we demonstrate that the attack model achieves a high accuracy of over 89% on average for model family identification and 100% for hyperparameter classification, in both single-process as well as noisy multi-process scenarios. Moreover, by leveraging the extracted architectural information, we demonstrate highly effective black-box transfer adversarial attacks with an average success rate exceeding 93%, underscoring the security risks posed by GPU side-channel leakage in deployed transformer models.

Paperid: 1877, https://arxiv.org/pdf/2508.01280.pdf

Abstract:
As an emerging service framework built by combining cryptography, P2P network, consensus mechanism and innovative contract technology, blockchain has been widely used in digital finance, data sharing, message traceability and electronic evidence preservation because of its decentralised, non-tamperable and transaction traceability. However, with the complex and changeable application scenarios of blockchain technology and the continuous enhancement of blockchain attack technology, the security of the blockchain system has been seriously threatened, dramatically affecting the development and application of blockchain technology. This paper aims to analyse the attacks on blockchain from the perspective of cryptography. Firstly, from the cryptography technology in the blockchain, the principle of hash functions, digital signatures, and other technologies, as well as their role in the blockchain, are introduced. Then, based on the six-layer architecture of the blockchain, the attacks on the data layer, the network layer, the consensus layer, the contract layer, the incentive layer and the application layer are analysed, and the methods to mitigate or resist the attacks are proposed. Secondly, the attack principles of 51% attack, Double-Spending attack, Reentrancy attack, Replay attack, Sybil attack and Timestamp Tampering attack were analysed, and the mitigation or defence solutions for these six attacks were designed. Finally, the core problems to be solved in blockchain technology are summarised, and the future development of blockchain security technology is projected.

Paperid: 1878, https://arxiv.org/pdf/2508.00840.pdf

Abstract:
This paper explores a theoretical approach to enhance RSA's resistance against quantum attacks by optimizing prime selection through RÃ©nyi entropy constraints. We develop a framework where primes are generated with controlled proximity ($|p-q| < Î³\sqrt{pq}$) to minimize the collision entropy $\mathscr{H}_2$ of the quantum period-finding operator. The main contributions include: (1) establishing a connection between prime distribution properties and quantum attack complexity via Maynard's prime gap theorem, (2) providing a constructive proof for prime existence under entropy constraints, and (3) demonstrating security reduction to ideal lattice problems under the quantum random oracle model. Theoretical analysis suggests that for $k$-bit moduli with $Î³< k^{-1/2+Îµ}$, Shor's algorithm requires $Î©(Î³^{-1}k^{3/2})$ quantum operations while maintaining classical security equivalent to standard RSA. Key Enhancements: (1) Prime existence proof via Maynard's theorem (Theorem 3.1), (2) Ideal lattice embedding for SVP reduction (Theorem 5.3), (3) Quantum Fano bound for information-theoretic analysis (Theorem 6.3), (4) Multi-prime RSA extension (Section 7.3).

Paperid: 1879, https://arxiv.org/pdf/2507.22306.pdf

Abstract:
Context switching is utilized by operating systems to change the execution context between application programs. It involves saving and restoring the states of multiple registers and performing a pipeline flush to remove any pre-fetched instructions, leading to a higher instantaneous power consumption compared to typical program execution. In this paper, we introduce a physical power side-channel leakage source that exploits the power spike observed during a context switch, triggered by the inbuilt sleep function of the system kernel. We observed that this power spike directly correlates with both the power consumption during context switching and the residual power consumption of the previously executed program. Notably, the persistence of residual power signatures from previous workloads extends the scope of this side-channel beyond extracting the data in registers during the context switch. Unlike traditional approaches that require analyzing full power traces, applying complex preprocessing, or relying on external synchronization triggers, this novel technique leverages only the amplitude of a single power spike, significantly simplifying the attack. We developed a power model to illustrate the feasibility of mounting end-to-end side-channel attacks using the sleep-induced power spikes. Experimental evaluation demonstrates that our framework can successfully perform cryptographic key recovery for both AES and SIKE implementations on Broadcom BCM2711.

Paperid: 1880, https://arxiv.org/pdf/2507.22160.pdf

Abstract:
With the growing adoption of Large Language Models (LLMs) in critical areas, ensuring their security against jailbreaking attacks is paramount. While traditional defenses primarily rely on refusing malicious prompts, recent logit-level attacks have demonstrated the ability to bypass these safeguards by directly manipulating the token-selection process during generation. We introduce Strategic Deflection (SDeflection), a defense that redefines the LLM's response to such advanced attacks. Instead of outright refusal, the model produces an answer that is semantically adjacent to the user's request yet strips away the harmful intent, thereby neutralizing the attacker's harmful intent. Our experiments demonstrate that SDeflection significantly lowers Attack Success Rate (ASR) while maintaining model performance on benign queries. This work presents a critical shift in defensive strategies, moving from simple refusal to strategic content redirection to neutralize advanced threats.

Paperid: 1881, https://arxiv.org/pdf/2507.20851.pdf

Abstract:
The logic of many protocols relies on time measurements. However, in Trusted Execution Environments (TEEs) like Intel SGX, the time source is outside the Trusted Computing Base: a malicious system hosting the TEE can manipulate that TEE's notion of time, e.g., jumping in time or affecting the perceived time speed. Previous work like Triad propose protocols for TEEs to maintain a trustworthy time source. However, in this paper, based on a public implementation of Triad that we contribute, we empirically showcase vulnerabilities to this protocol. For example, an attacker controlling the operating system, and consequently the scheduling algorithm, may arbitrarily manipulate their local TEE's clock speed. What is worse, in case of faster malicious clock speeds, an attacker on a single compromised machine may propagate the attack to honest machines participating in Triad's Trusted Time protocol, causing them to skip to timestamps arbitrarily far in the future. Then, infected honest machines propagate time-skips themselves to other honest machines interacting with them. We discuss protocol changes to Triad for higher resilience against such attacks.

Paperid: 1882, https://arxiv.org/pdf/2507.18801.pdf

Abstract:
Binary code analysis is essential in scenarios where source code is unavailable, with extensive applications across various security domains. However, accurately resolving indirect call targets remains a longstanding challenge in maintaining the integrity of static analysis in binary code. This difficulty arises because the operand of a call instruction (e.g., call rax) remains unknown until runtime, resulting in an incomplete inter-procedural control flow graph (CFG). Previous approaches have struggled with low accuracy and limited scalability. To address these limitations, recent work has increasingly turned to machine learning (ML) to enhance analysis. However, this ML-driven approach faces two significant obstacles: low-quality callsite-callee training pairs and inadequate binary code representation, both of which undermine the accuracy of ML models. In this paper, we introduce NeuCall, a novel approach for resolving indirect calls using graph neural networks. Existing ML models in this area often overlook key elements such as data and code cross-references, which are essential for understanding a program's control flow. In contrast, NeuCall augments CFGs with cross-references, preserving rich semantic information. Additionally, we leverage advanced compiler-level type analysis to generate high-quality callsite-callee training pairs, enhancing model precision and reliability. We further design a graph neural model that leverages augmented CFGs and relational graph convolutions for accurate target prediction. Evaluated against real-world binaries from GitHub and the Arch User Repository on x86_64 architecture, NeuCall achieves an F1 score of 95.2%, outperforming state-of-the-art ML-based approaches. These results highlight NeuCall's effectiveness in building precise inter-procedural CFGs and its potential to advance downstream binary analysis and security applications.

Paperid: 1883, https://arxiv.org/pdf/2507.17962.pdf

Abstract:
Achieving timing closure and design-specific optimizations in FPGA-targeted High-Level Synthesis (HLS) remains a significant challenge due to the complex interaction between architectural constraints, resource utilization, and the absence of automated support for platform-specific pragmas. In this work, we propose TimelyHLS, a novel framework integrating Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to automatically generate and iteratively refine HLS code optimized for FPGA-specific timing and performance requirements. TimelyHLS is driven by a structured architectural knowledge base containing FPGA-specific features, synthesis directives, and pragma templates. Given a kernel, TimelyHLS generates HLS code annotated with both timing-critical and design-specific pragmas. The synthesized RTL is then evaluated using commercial toolchains, and simulation correctness is verified against reference outputs via custom testbenches. TimelyHLS iteratively incorporates synthesis logs and performance reports into the LLM engine for refinement in the presence of functional discrepancies. Experimental results across 10 FPGA architectures and diverse benchmarks show that TimelyHLS reduces the need for manual tuning by up to 70%, while achieving up to 4x latency speedup (e.g., 3.85x for Matrix Multiplication, 3.7x for Bitonic Sort) and over 50% area savings in certain cases (e.g., 57% FF reduction in Viterbi). TimelyHLS consistently achieves timing closure and functional correctness across platforms, highlighting the effectiveness of LLM-driven, architecture-aware synthesis in automating FPGA design.

Paperid: 1884, https://arxiv.org/pdf/2507.16372.pdf

Abstract:
Large Language Models (LLMs) are increasingly integrated into daily routines, yet they raise significant privacy and safety concerns. Recent research proposes collaborative inference, which outsources the early-layer inference to ensure data locality, and introduces model safety auditing based on inner neuron patterns. Both techniques expose the LLM's Internal States (ISs), which are traditionally considered irreversible to inputs due to optimization challenges and the highly abstract representations in deep layers. In this work, we challenge this assumption by proposing four inversion attacks that significantly improve the semantic similarity and token matching rate of inverted inputs. Specifically, we first develop two white-box optimization-based attacks tailored for low-depth and high-depth ISs. These attacks avoid local minima convergence, a limitation observed in prior work, through a two-phase inversion process. Then, we extend our optimization attack under more practical black-box weight access by leveraging the transferability between the source and the derived LLMs. Additionally, we introduce a generation-based attack that treats inversion as a translation task, employing an inversion model to reconstruct inputs. Extensive evaluation of short and long prompts from medical consulting and coding assistance datasets and 6 LLMs validates the effectiveness of our inversion attacks. Notably, a 4,112-token long medical consulting prompt can be nearly perfectly inverted with 86.88 F1 token matching from the middle layer of Llama-3 model. Finally, we evaluate four practical defenses that we found cannot perfectly prevent ISs inversion and draw conclusions for future mitigation design.

Paperid: 1885, https://arxiv.org/pdf/2507.16291.pdf

Abstract:
Voice phishing (vishing) remains a persistent threat in cybersecurity, exploiting human trust through persuasive speech. While machine learning (ML)-based classifiers have shown promise in detecting malicious call transcripts, they remain vulnerable to adversarial manipulations that preserve semantic content. In this study, we explore a novel attack vector where large language models (LLMs) are leveraged to generate adversarial vishing transcripts that evade detection while maintaining deceptive intent. We construct a systematic attack pipeline that employs prompt engineering and semantic obfuscation to transform real-world vishing scripts using four commercial LLMs. The generated transcripts are evaluated against multiple ML classifiers trained on a real-world Korean vishing dataset (KorCCViD) with statistical testing. Our experiments reveal that LLM-generated transcripts are both practically and statistically effective against ML-based classifiers. In particular, transcripts crafted by GPT-4o significantly reduce classifier accuracy (by up to 30.96%) while maintaining high semantic similarity, as measured by BERTScore. Moreover, these attacks are both time-efficient and cost-effective, with average generation times under 9 seconds and negligible financial cost per query. The results underscore the pressing need for more resilient vishing detection frameworks and highlight the imperative for LLM providers to enforce stronger safeguards against prompt misuse in adversarial social engineering contexts.

Paperid: 1886, https://arxiv.org/pdf/2507.16164.pdf

Abstract:
Studies have shown that machine learning systems are vulnerable to adversarial examples in theory and practice. Where previous attacks have focused mainly on visual models that exploit the difference between human and machine perception, text-based models have also fallen victim to these attacks. However, these attacks often fail to maintain the semantic meaning of the text and similarity. This paper introduces AdvChar, a black-box attack on Interpretable Natural Language Processing Systems, designed to mislead the classifier while keeping the interpretation similar to benign inputs, thus exploiting trust in system transparency. AdvChar achieves this by making less noticeable modifications to text input, forcing the deep learning classifier to make incorrect predictions and preserve the original interpretation. We use an interpretation-focused scoring approach to determine the most critical tokens that, when changed, can cause the classifier to misclassify the input. We apply simple character-level modifications to measure the importance of tokens, minimizing the difference between the original and new text while generating adversarial interpretations similar to benign ones. We thoroughly evaluated AdvChar by testing it against seven NLP models and three interpretation models using benchmark datasets for the classification task. Our experiments show that AdvChar can significantly reduce the prediction accuracy of current deep learning models by altering just two characters on average in input samples.

Paperid: 1887, https://arxiv.org/pdf/2507.16134.pdf

Abstract:
Privacy-Preserving Federated Learning (PPFL) has emerged as a secure distributed Machine Learning (ML) paradigm that aggregates locally trained gradients without exposing raw data. To defend against model poisoning threats, several robustness-enhanced PPFL schemes have been proposed by integrating anomaly detection. Nevertheless, they still face two major challenges: (1) the reliance on heavyweight encryption techniques results in substantial communication and computation overhead; and (2) single-strategy defense mechanisms often fail to provide sufficient robustness against adaptive adversaries. To overcome these challenges, we propose DP2Guard, a lightweight PPFL framework that enhances both privacy and robustness. DP2Guard leverages a lightweight gradient masking mechanism to replace costly cryptographic operations while ensuring the privacy of local gradients. A hybrid defense strategy is proposed, which extracts gradient features using singular value decomposition and cosine similarity, and applies a clustering algorithm to effectively identify malicious gradients. Additionally, DP2Guard adopts a trust score-based adaptive aggregation scheme that adjusts client weights according to historical behavior, while blockchain records aggregated results and trust scores to ensure tamper-proof and auditable training. Extensive experiments conducted on two public datasets demonstrate that DP2Guard effectively defends against four advanced poisoning attacks while ensuring privacy with reduced communication and computation costs.

Paperid: 1888, https://arxiv.org/pdf/2507.15419.pdf

Abstract:
Phishing websites remain a major cybersecurity threat, yet existing methods primarily focus on detection, while the recognition of underlying malicious intentions remains largely unexplored. To address this gap, we propose PhishIntentionLLM, a multi-agent retrieval-augmented generation (RAG) framework that uncovers phishing intentions from website screenshots. Leveraging the visual-language capabilities of large language models (LLMs), our framework identifies four key phishing objectives: Credential Theft, Financial Fraud, Malware Distribution, and Personal Information Harvesting. We construct and release the first phishing intention ground truth dataset (~2K samples) and evaluate the framework using four commercial LLMs. Experimental results show that PhishIntentionLLM achieves a micro-precision of 0.7895 with GPT-4o and significantly outperforms the single-agent baseline with a ~95% improvement in micro-precision. Compared to the previous work, it achieves 0.8545 precision for credential theft, marking a ~4% improvement. Additionally, we generate a larger dataset of ~9K samples for large-scale phishing intention profiling across sectors. This work provides a scalable and interpretable solution for intention-aware phishing analysis.

Paperid: 1889, https://arxiv.org/pdf/2507.14248.pdf

Abstract:
Vision transformer (ViT) models, when coupled with interpretation models, are regarded as secure and challenging to deceive, making them well-suited for security-critical domains such as medical applications, autonomous vehicles, drones, and robotics. However, successful attacks on these systems can lead to severe consequences. Recent research on threats targeting ViT models primarily focuses on generating the smallest adversarial perturbations that can deceive the models with high confidence, without considering their impact on model interpretations. Nevertheless, the use of interpretation models can effectively assist in detecting adversarial examples. This study investigates the vulnerability of transformer models to adversarial attacks, even when combined with interpretation models. We propose an attack called "AdViT" that generates adversarial examples capable of misleading both a given transformer model and its coupled interpretation model. Through extensive experiments on various transformer models and two transformer-based interpreters, we demonstrate that AdViT achieves a 100% attack success rate in both white-box and black-box scenarios. In white-box scenarios, it reaches up to 98% misclassification confidence, while in black-box scenarios, it reaches up to 76% misclassification confidence. Remarkably, AdViT consistently generates accurate interpretations in both scenarios, making the adversarial examples more difficult to detect.

Paperid: 1890, https://arxiv.org/pdf/2507.13598.pdf

Abstract:
We present GIFT: a {G}radient-aware {I}mmunization technique to defend diffusion models against malicious {F}ine-{T}uning while preserving their ability to generate safe content. Existing safety mechanisms like safety checkers are easily bypassed, and concept erasure methods fail under adversarial fine-tuning. GIFT addresses this by framing immunization as a bi-level optimization problem: the upper-level objective degrades the model's ability to represent harmful concepts using representation noising and maximization, while the lower-level objective preserves performance on safe data. GIFT achieves robust resistance to malicious fine-tuning while maintaining safe generative quality. Experimental results show that our method significantly impairs the model's ability to re-learn harmful concepts while maintaining performance on safe content, offering a promising direction for creating inherently safer generative models resistant to adversarial fine-tuning attacks.

Paperid: 1891, https://arxiv.org/pdf/2507.13170.pdf

Abstract:
Audio plays a crucial role in applications like speaker verification, voice-enabled smart devices, and audio conferencing. However, audio manipulations, such as deepfakes, pose significant risks by enabling the spread of misinformation. Our empirical analysis reveals that existing methods for detecting deepfake audio are often vulnerable to anti-forensic (AF) attacks, particularly those attacked using generative adversarial networks. In this article, we propose a novel collaborative learning method called SHIELD to defend against generative AF attacks. To expose AF signatures, we integrate an auxiliary generative model, called the defense (DF) generative model, which facilitates collaborative learning by combining input and output. Furthermore, we design a triplet model to capture correlations for real and AF attacked audios with real-generated and attacked-generated audios using auxiliary generative models. The proposed SHIELD strengthens the defense against generative AF attacks and achieves robust performance across various generative models. The proposed AF significantly reduces the average detection accuracy from 95.49% to 59.77% for ASVspoof2019, from 99.44% to 38.45% for In-the-Wild, and from 98.41% to 51.18% for HalfTruth for three different generative models. The proposed SHIELD mechanism is robust against AF attacks and achieves an average accuracy of 98.13%, 98.58%, and 99.57% in match, and 98.78%, 98.62%, and 98.85% in mismatch settings for the ASVspoof2019, In-the-Wild, and HalfTruth datasets, respectively.

Paperid: 1892, https://arxiv.org/pdf/2507.12345.pdf

Abstract:
Control Flow Attestation (CFA) allows remote verification of run-time software integrity in embedded systems. However, CFA is limited by the storage/transmission costs of generated control flow logs (CFlog). Recent work has proposed application-specific optimizations by speculating on likely sub-paths in CFlog and replacing them with reserved symbols at runtime. Albeit effective, prior approaches do not consider the representation of addresses in a control flow path for speculation. This work proposes RESPEC-CFA, an architectural extension for CFA allowing for speculation on (1) the locality of control flows and (2) their Huffman encoding. Alone, RESPEC-CFA reduces CFlog sizes by up to 90.1%. Combined with prior methods, RESPEC-CFA yields reductions of up to 99.7%, representing a significant step toward practical CFA.

Paperid: 1893, https://arxiv.org/pdf/2507.11138.pdf

Abstract:
Facial motion capture in mixed reality headsets enables real-time avatar animation, allowing users to convey non-verbal cues during virtual interactions. However, as facial motion data constitutes a behavioral biometric, its use raises novel privacy concerns. With mixed reality systems becoming more immersive and widespread, understanding whether face motion data can lead to user identification or inference of sensitive attributes is increasingly important. To address this, we conducted a study with 116 participants using three types of headsets across three sessions, collecting facial, eye, and head motion data during verbal and non-verbal tasks. The data used is not raw video, but rather, abstract representations that are used to animate digital avatars. Our analysis shows that individuals can be re-identified from this data with up to 98% balanced accuracy, are even identifiable across device types, and that emotional states can be inferred with up to 86% accuracy. These results underscore the potential privacy risks inherent in face motion tracking in mixed reality environments.

Paperid: 1894, https://arxiv.org/pdf/2507.09699.pdf

Abstract:
As the use of differential privacy (DP) becomes widespread, the development of effective tools for reasoning about the privacy guarantee becomes increasingly critical. In pursuit of this goal, we demonstrate novel relationships between DP and measures of statistical disclosure risk. We suggest how experts and non-experts can use these results to explain the DP guarantee, interpret DP composition theorems, select and justify privacy parameters, and identify worst-case adversary prior probabilities.

Paperid: 1895, https://arxiv.org/pdf/2507.09508.pdf

Abstract:
Large language models (LLMs) have become proficient at sophisticated code-generation tasks, yet remain ineffective at reliably detecting or avoiding code vulnerabilities. Does this deficiency stem from insufficient learning about code vulnerabilities, or is it merely a result of ineffective prompting? Using representation engineering techniques, we investigate whether LLMs internally encode the concepts necessary to identify code vulnerabilities. We find that current LLMs encode precise internal representations that distinguish vulnerable from secure code--achieving greater accuracy than standard prompting approaches. Leveraging these vulnerability-sensitive representations, we develop an inference-time steering technique that subtly modulates the model's token-generation probabilities through a mixture of corrections (MoC). Our method effectively guides LLMs to produce less vulnerable code without compromising functionality, demonstrating a practical approach to controlled vulnerability management in generated code. Notably, MoC enhances the security ratio of Qwen2.5-Coder-7B by 8.9\%, while simultaneously improving functionality on HumanEval pass@1 by 2.1\%.

Paperid: 1896, https://arxiv.org/pdf/2507.08166.pdf

Abstract:
Rowhammer is a read disturbance vulnerability in modern DRAM that causes bit-flips, compromising security and reliability. While extensively studied on Intel and AMD CPUs with DDR and LPDDR memories, its impact on GPUs using GDDR memories, critical for emerging machine learning applications, remains unexplored. Rowhammer attacks on GPUs face unique challenges: (1) proprietary mapping of physical memory to GDDR banks and rows, (2) high memory latency and faster refresh rates that hinder effective hammering, and (3) proprietary mitigations in GDDR memories, difficult to reverse-engineer without FPGA-based test platforms. We introduce GPUHammer, the first Rowhammer attack on NVIDIA GPUs with GDDR6 DRAM. GPUHammer proposes novel techniques to reverse-engineer GDDR DRAM row mappings, and employs GPU-specific memory access optimizations to amplify hammering intensity and bypass mitigations. Thus, we demonstrate the first successful Rowhammer attack on a discrete GPU, injecting up to 8 bit-flips across 4 DRAM banks on an NVIDIA A6000 with GDDR6 memory. We also show how an attacker can use these to tamper with ML models, causing significant accuracy drops (up to 80%).

Paperid: 1897, https://arxiv.org/pdf/2507.06258.pdf

Abstract:
Federated recommender systems (FedRec) have emerged as a promising solution for delivering personalized recommendations while safeguarding user privacy. However, recent studies have demonstrated their vulnerability to poisoning attacks. Existing attacks typically target the entire user group, which compromises stealth and increases the risk of detection. In contrast, real-world adversaries may prefer to prompt target items to specific user subgroups, such as recommending health supplements to elderly users. Motivated by this gap, we introduce Spattack, the first targeted poisoning attack designed to manipulate recommendations for specific user subgroups in the federated setting. Specifically, Spattack adopts a two-stage approximation-and-promotion strategy, which first simulates user embeddings of target/non-target subgroups and then prompts target items to the target subgroups. To enhance the approximation stage, we push the inter-group embeddings away based on contrastive learning and augment the target group's relevant item set based on clustering. To enhance the promotion stage, we further propose to adaptively tune the optimization weights between target and non-target subgroups. Besides, an embedding alignment strategy is proposed to align the embeddings between the target items and the relevant items. We conduct comprehensive experiments on three real-world datasets, comparing Spattack against seven state-of-the-art poisoning attacks and seven representative defense mechanisms. Experimental results demonstrate that Spattack consistently achieves strong manipulation performance on the specific user subgroup, while incurring minimal impact on non-target users, even when only 0.1\% of users are malicious. Moreover, Spattack maintains competitive overall recommendation performance and exhibits strong resilience against existing mainstream defenses.

Paperid: 1898, https://arxiv.org/pdf/2507.06064.pdf

Abstract:
This paper presents Wrapless -- a lending protocol that enables the collateralization of bitcoins without requiring a trusted wrapping mechanism. The protocol facilitates a "loan channel" on the Bitcoin blockchain, allowing bitcoins to be locked as collateral for loans issued on any blockchain that supports Turing-complete smart contracts. The protocol is designed in a way that makes it economically irrational for each involved party to manipulate the loan rules. There is still a significant research area to bring the protocol closer to traditional AMM financial instruments.

Paperid: 1899, https://arxiv.org/pdf/2507.05228.pdf

Abstract:
As LLMs continue to increase in parameter size, the computational resources required to run them are available to fewer parties. Therefore, third-party inference services -- where LLMs are hosted by third parties with significant computational resources -- are becoming increasingly popular. However, third party inference raises critical concerns about user data privacy. To mitigate these risks, privacy researchers have developed provably secure schemes for third-party inference, such as Secure Multi-Party Computation (SMPC). However, SMPC protocols have significant computational and communication overhead, and do not scale to large models. In this work, we propose a new multi-party inference protocol, Cascade, that avoids these punitive costs by leveraging sharding in the sequence dimension to maintain privacy, trading off cryptographic privacy guarantees for increased performance and scalability. We demonstrate that Cascade is resistant to a generalization of a recent attack that is highly effective against other statistical privacy schemes, and that it is further resistant to learning-based attacks. As Cascade is orders of magnitude faster than existing schemes, our findings offer practical solutions for secure deployment of modern state-of-the-art LLMs.

Paperid: 1900, https://arxiv.org/pdf/2507.04197.pdf

Abstract:
Advanced Encryption Standard (AES) is a widely adopted cryptographic algorithm, yet its practical implementations remain susceptible to side-channel and fault injection attacks. In this work, we propose a comprehensive framework that enhances AES-128 encryption security through controlled anomaly injection and real-time anomaly detection using both statistical and machine learning (ML) methods. We simulate timing and fault-based anomalies by injecting execution delays and ciphertext perturbations during encryption, generating labeled datasets for detection model training. Two complementary detection mechanisms are developed: a threshold-based timing anomaly detector and a supervised Random Forest classifier trained on combined timing and ciphertext features. We implement and evaluate the framework on both CPU and FPGA-based SoC hardware (PYNQ-Z1), measuring performance across varying block sizes, injection rates, and core counts. Our results show that ML-based detection significantly outperforms threshold-based methods in precision and recall while maintaining real-time performance on embedded hardware. Compared to existing AES anomaly detection methods, our solution offers a low-cost, real-time, and accurate detection approach deployable on lightweight FPGA platforms.

Paperid: 1901, https://arxiv.org/pdf/2507.02309.pdf

Abstract:
RESTful APIs facilitate data exchange between applications, but they also expose sensitive resources to potential exploitation. Broken Object Level Authorization (BOLA) is the top vulnerability in the OWASP API Security Top 10, exemplifies a critical access control flaw where attackers manipulate API parameters to gain unauthorized access. To address this, we propose BOLAZ, a defense framework grounded in zero trust principles. BOLAZ analyzes the data flow of resource IDs, pinpointing BOLA attack injection points and determining the associated authorization intervals to prevent horizontal privilege escalation. Our approach leverages static taint tracking to categorize APIs into producers and consumers based on how they handle resource IDs. By mapping the propagation paths of resource IDs, BOLAZ captures the context in which these IDs are produced and consumed, allowing for precise identification of authorization boundaries. Unlike defense methods based on common authorization models, BOLAZ is the first authorization-guided method that adapts defense rules based on the system's best-practice authorization logic. We validate BOLAZ through empirical research on 10 GitHub projects. The results demonstrate BOLAZ's effectiveness in defending against vulnerabilities collected from CVE and discovering 35 new BOLA vulnerabilities in the wild, demonstrating its practicality in real-world deployments.

Paperid: 1902, https://arxiv.org/pdf/2507.01292.pdf

Abstract:
The existence of one-way functions (OWFs) forms the minimal assumption in classical cryptography. However, this is not necessarily the case in quantum cryptography. One-way puzzles (OWPuzzs), introduced by Khurana and Tomer, provide a natural quantum analogue of OWFs. The existence of OWPuzzs implies $PP\neq BQP$, while the converse remains open. In classical cryptography, the analogous problem-whether OWFs can be constructed from $P \neq NP$-has long been studied from the viewpoint of hardness of learning. Hardness of learning in various frameworks (including PAC learning) has been connected to OWFs or to $P \neq NP$. In contrast, no such characterization previously existed for OWPuzzs. In this paper, we establish the first complete characterization of OWPuzzs based on the hardness of a well-studied learning model: distribution learning. Specifically, we prove that OWPuzzs exist if and only if proper quantum distribution learning is hard on average. A natural question that follows is whether the worst-case hardness of proper quantum distribution learning can be derived from $PP \neq BQP$. If so, and a worst-case to average-case hardness reduction is achieved, it would imply OWPuzzs solely from $PP \neq BQP$. However, we show that this would be extremely difficult: if worst-case hardness is PP-hard (in a black-box reduction), then $SampBQP \neq SampBPP$ follows from the infiniteness of the polynomial hierarchy. Despite that, we show that $PP \neq BQP$ is equivalent to another standard notion of hardness of learning: agnostic. We prove that $PP \neq BQP$ if and only if agnostic quantum distribution learning with respect to KL divergence is hard. As a byproduct, we show that hardness of agnostic quantum distribution learning with respect to statistical distance against $PPT^{Î£_3^P}$ learners implies $SampBQP \neq SampBPP$.

Paperid: 1903, https://arxiv.org/pdf/2512.24888.pdf

Abstract:
The decentralized architecture of Web3 technologies creates fundamental challenges for Anti-Money Laundering and Counter-Financing of Terrorism compliance. Traditional regulatory technology solutions designed for centralized financial systems prove inadequate for blockchain's transparent yet pseudonymous networks. This systematization examines how blockchain-native RegTech solutions leverage distributed ledger properties to enable novel compliance capabilities. We develop three taxonomies organizing the Web3 RegTech domain: a regulatory paradigm evolution framework across ten dimensions, a compliance protocol taxonomy encompassing five verification layers, and a RegTech lifecycle framework spanning preventive, real-time, and investigative phases. Through analysis of 41 operational commercial platforms and 28 academic prototypes selected from systematic literature review (2015-2025), we demonstrate that Web3 RegTech enables transaction graph analysis, real-time risk assessment, cross-chain analytics, and privacy-preserving verification approaches that are difficult to achieve or less commonly deployed in traditional centralized systems. Our analysis reveals critical gaps between academic innovation and industry deployment, alongside persistent challenges in cross-chain tracking, DeFi interaction analysis, privacy protocol monitoring, and scalability. We synthesize architectural best practices and identify research directions addressing these gaps while respecting Web3's core principles of decentralization, transparency, and user sovereignty.

Paperid: 1904, https://arxiv.org/pdf/2512.23439.pdf

Abstract:
We introduce Bitcoin-IPC, a software stack and protocol that scales Bitcoin towards helping it become the universal Medium of Exchange (MoE) by enabling the permissionless creation of fully programmable Proof-of-Stake (PoS) Layer-2 chains, called subnets, whose stake is denominated in L1 BTC. Bitcoin-IPC subnets rely on Bitcoin L1 for the communication of critical information, settlement, and security. Our design, inspired by SWIFT messaging and embedded within Bitcoin's SegWit mechanism, enables seamless value transfer across L2 subnets, routed through Bitcoin L1. Uniquely, this mechanism reduces the virtual-byte cost per transaction (vB per tx) by up to 23x, compared to transacting natively on Bitcoin L1, effectively increasing monetary transaction throughput from 7 tps to over 160 tps, without requiring any modifications to Bitcoin L1.

Paperid: 1905, https://arxiv.org/pdf/2512.19314.pdf

Abstract:
Quantum circuit obfuscation is becoming increasingly important to prevent theft and reverse engineering of quantum algorithms. As quantum computing advances, the need to protect the intellectual property contained in quantum circuits continues to grow. Existing methods often provide limited defense against structural and statistical analysis or introduce considerable overhead. In this paper, we propose a novel quantum obfuscation method that uses randomized U3 transformations to conceal circuit structure while preserving functionality. We implement and assess our approach on QASM circuits using Qiskit AER, achieving over 93\% semantic accuracy with minimal runtime overhead. The method demonstrates strong resistance to reverse engineering and structural inference, making it a practical and effective approach for quantum software protection.

Paperid: 1906, https://arxiv.org/pdf/2512.18517.pdf

Abstract:
Userland memory forensics has become a critical component of smartphone investigations and incident response, enabling the recovery of volatile evidence such as deleted messages from end-to-end encrypted apps and cryptocurrency transactions. However, these forensics tools, particularly on Android, face significant challenges in adapting to different versions and maintaining reliability over time due to the constant evolution of low-level structures critical for evidence recovery and reconstruction. Structural changes, ranging from simple offset modifications to complete architectural redesigns, pose substantial maintenance and adaptability issues for forensic tools that rely on precise structure interpretation. Thus, this paper presents the first systematic study of Android Runtime (ART) structural evolution and its implications for memory forensics. We conduct an empirical analysis of critical Android runtime structures, examining their evolution across six versions for four different architectures. Our findings reveal that over 73.2% of structure members underwent positional changes, significantly affecting the adaptability and reliability of memory forensic tools. Further analysis of core components such as Runtime, Thread, and Heap structures highlights distinct evolution patterns and their impact on critical forensic operations, including thread state enumeration, memory mapping, and object reconstruction. These results demonstrate that traditional approaches relying on static structure definitions and symbol-based methods, while historically reliable, are increasingly unsustainable on their own. We recommend that memory forensic tools in general and Android in particular evolve toward hybrid approaches that retain the validation strength of symbolic methods while integrating automated structure inference, version-aware parsing, and redundant analysis strategies.

Paperid: 1907, https://arxiv.org/pdf/2512.16965.pdf

Abstract:
The National Institute of Standards and Technology (NIST) Computer Forensic Tool Testing (CFTT) programme has become the de facto standard for providing digital forensic tool testing and validation. However to date, no comprehensive framework exists to automate benchmarking across the diverse forensic tasks included in the programme. This gap results in inconsistent validation, challenges in comparing tools, and limited validation reproducibility. This paper introduces AutoDFBench 1.0, a modular benchmarking framework that supports the evaluation of both conventional DF tools and scripts, as well as AI-generated code and agentic approaches. The framework integrates five areas defined by the CFTT programme: string search, deleted file recovery, file carving, Windows registry recovery, and SQLite data recovery. AutoDFBench 1.0 includes ground truth data comprising of 63 test cases and 10,968 unique test scenarios, and execute evaluations through a RESTful API that produces structured JSON outputs with standardised metrics, including precision, recall, and F1~score for each test case, and the average of these F1~scores becomes the AutoDFBench Score. The benchmarking framework is validated against CFTT's datasets. The framework enables fair and reproducible comparison across tools and forensic scripts, establishing the first unified, automated, and extensible benchmarking framework for digital forensic tool testing and validation. AutoDFBench 1.0 supports tool vendors, researchers, practitioners, and standardisation bodies by facilitating transparent, reproducible, and comparable assessments of DF technologies.

Paperid: 1908, https://arxiv.org/pdf/2512.15919.pdf

Abstract:
Crime as a Service (CaaS) has evolved from isolated criminal incidents to a broad spectrum of illicit activities, including social media manipulation, foreign information manipulation and interference (FIMI), and the sale of disinformation toolkits. This article analyses how threat actors exploit specialised infrastructures ranging from proxy and VPN services to AI-driven generative models to orchestrate large-scale opinion manipulation. Moreover, it discusses how these malicious operations monetise the virality of social networks, weaponise dual-use technologies, and leverage user biases to amplify polarising narratives. In parallel, it examines key strategies for detecting, attributing, and mitigating such campaigns by highlighting the roles of blockchain-based content verification, advanced cryptographic proofs, and cross-disciplinary collaboration. Finally, the article highlights that countering disinformation demands an integrated framework that combines legal, technological, and societal efforts to address a rapidly adapting and borderless threat

Paperid: 1909, https://arxiv.org/pdf/2512.15823.pdf

Abstract:
Immersive formats such as 360° and 6DoF point cloud videos require high bandwidth and low latency, posing challenges for real-time AR/VR streaming. This work focuses on reducing bandwidth consumption and encryption/decryption delay, two key contributors to overall latency. We design a system that downsamples point cloud content at the origin server and applies partial encryption. At the client, the content is decrypted and upscaled using an ML-based super-resolution model. Our evaluation demonstrates a nearly linear reduction in bandwidth/latency, and encryption/decryption overhead with lower downsampling resolutions, while the super-resolution model effectively reconstructs the original full-resolution point clouds with minimal error and modest inference time.

Paperid: 1910, https://arxiv.org/pdf/2512.15503.pdf

Abstract:
Vehicular platooning promises transformative improvements in transportation efficiency and safety through the coordination of multi-vehicle formations enabled by Vehicle-to-Everything (V2X) communication. However, the distributed nature of platoon coordination creates security vulnerabilities, allowing authenticated vehicles to inject falsified kinematic data, compromise operational stability, and pose a threat to passenger safety. Traditional misbehaviour detection approaches, which rely on plausibility checks and statistical methods, suffer from high False Positive (FP) rates and cannot capture the complex temporal dependencies inherent in multi-vehicle coordination dynamics. We present Attention In Motion (AIMformer), a transformer-based framework specifically tailored for real-time misbehaviour detection in vehicular platoons with edge deployment capabilities. AIMformer leverages multi-head self-attention mechanisms to simultaneously capture intra-vehicle temporal dynamics and inter-vehicle spatial correlations. It incorporates global positional encoding with vehicle-specific temporal offsets to handle join/exit maneuvers. We propose a Precision-Focused Binary Cross-Entropy (PFBCE) loss function that penalizes FPs to meet the requirements of safety-critical vehicular systems. Extensive evaluation across 4 platoon controllers, multiple attack vectors, and diverse mobility scenarios demonstrates superior performance ($\geq$ 0.93) compared to state-of-the-art baseline architectures. A comprehensive deployment analysis utilizing TensorFlow Lite (TFLite), Open Neural Network Exchange (ONNX), and TensorRT achieves sub-millisecond inference latency, making it suitable for real-time operation on resource-constrained edge platforms. Hence, validating AIMformer is viable for both in-vehicle and roadside infrastructure deployment.

Paperid: 1911, https://arxiv.org/pdf/2512.15330.pdf

Abstract:
Quantum computers pose a fundamental threat to widely deployed public-key cryptosystems, such as RSA and ECC, by enabling efficient integer factorization using Shor's algorithm. Theoretical resource estimates suggest that 2048-bit RSA keys could be broken using Shor's algorithm with fewer than a million noisy qubits. Although such machines do not yet exist, the availability of smaller, cloud-accessible quantum processors and open-source implementations of Shor's algorithm raises the question of what key sizes can realistically be factored with today's platforms. In this work, we experimentally investigate Shor's algorithm on several cloud-based quantum computers using publicly available implementations. Our results reveal a substantial gap between the capabilities of current quantum hardware and the requirements for factoring cryptographically relevant integers. In particular, we observe that circuit constructions still need to be highly specific for each modulus, and that machine fidelities are unstable, with high and fluctuating error rates.

Paperid: 1912, https://arxiv.org/pdf/2512.14600.pdf

Abstract:
The rapid advancement of Large Language Models (LLMs) has been driven by extensive datasets that may contain sensitive information, raising serious privacy concerns. One notable threat is the Membership Inference Attack (MIA), where adversaries infer whether a specific sample was used in model training. However, the true impact of MIA on LLMs remains unclear due to inconsistent findings and the lack of standardized evaluation methods, further complicated by the undisclosed nature of many LLM training sets. To address these limitations, we propose PerProb, a unified, label-free framework for indirectly assessing LLM memorization vulnerabilities. PerProb evaluates changes in perplexity and average log probability between data generated by victim and adversary models, enabling an indirect estimation of training-induced memory. Compared with prior MIA methods that rely on member/non-member labels or internal access, PerProb is independent of model and task, and applicable in both black-box and white-box settings. Through a systematic classification of MIA into four attack patterns, we evaluate PerProb's effectiveness across five datasets, revealing varying memory behaviors and privacy risks among LLMs. Additionally, we assess mitigation strategies, including knowledge distillation, early stopping, and differential privacy, demonstrating their effectiveness in reducing data leakage. Our findings offer a practical and generalizable framework for evaluating and improving LLM privacy.

Paperid: 1913, https://arxiv.org/pdf/2512.13333.pdf

Abstract:
As quantum computing advances toward practical deployment, it threatens a wide range of classical cryptographic mechanisms, including digital signatures, key exchange protocols, public-key encryption, and certain hash-based constructions that underpin modern network infrastructures. These primitives form the security backbone of most blockchain platforms, raising serious concerns about the long-term viability of blockchain systems in a post-quantum world. Although migrating to post-quantum cryptography may appear straightforward, the substantially larger key sizes and higher computational costs of post-quantum primitives can introduce significant challenges and, in some cases, render such transitions impractical for blockchain environments. In this paper, we examine the implications of adopting post-quantum cryptography in blockchain systems across four key dimensions. We begin by identifying the cryptographic primitives within blockchain architectures that are most vulnerable to quantum attacks, particularly those used in consensus mechanisms, identity management, and transaction validation. We then survey proposed post-quantum adaptations across existing blockchain designs, analyzing their feasibility within decentralized and resource-constrained settings. Building on this analysis, we evaluate how replacing classical primitives with post-quantum alternatives affects system performance, protocol dynamics, and the incentive and trust structures that sustain blockchain ecosystems. Our study demonstrates that integrating post-quantum signature schemes into blockchain systems is not a simple drop-in replacement; instead, it requires careful architectural redesign, as naive substitutions risk undermining both security guarantees and operational efficiency.

Paperid: 1914, https://arxiv.org/pdf/2512.13039.pdf

Abstract:
Concept erasure, which fine-tunes diffusion models to remove undesired or harmful visual concepts, has become a mainstream approach to mitigating unsafe or illegal image generation in text-to-image models.However, existing removal methods typically adopt a unidirectional erasure strategy by either suppressing the target concept or reinforcing safe alternatives, making it difficult to achieve a balanced trade-off between concept removal and generation quality. To address this limitation, we propose a novel Bidirectional Image-Guided Concept Erasure (Bi-Erasing) framework that performs concept suppression and safety enhancement simultaneously. Specifically, based on the joint representation of text prompts and corresponding images, Bi-Erasing introduces two decoupled image branches: a negative branch responsible for suppressing harmful semantics and a positive branch providing visual guidance for safe alternatives. By jointly optimizing these complementary directions, our approach achieves a balance between erasure efficacy and generation usability. In addition, we apply mask-based filtering to the image branches to prevent interference from irrelevant content during the erasure process. Across extensive experiment evaluations, the proposed Bi-Erasing outperforms baseline methods in balancing concept removal effectiveness and visual fidelity.

Paperid: 1915, https://arxiv.org/pdf/2512.12917.pdf

Abstract:
With the rapid growth of smart devices and mobile internet, large-scale data processing is becoming increasingly important, while mobile devices remain resource-constrained. Mobile Cloud Computing (MCC) addresses this limitation by offloading tasks to the cloud. Nevertheless, the widespread adoption of MCC also raises challenges such as data privacy, selective computation, efficient revocation, and keyword search. Additionally, the development of quantum computers also threatens data security in MCC. To address these challenges, we propose an efficient quantum-resistant delegable data analysis scheme with revocation and keyword search (EQDDA-RKS) for MCC. In the proposed scheme, an authorised mobile device can perform keyword searches and compute inner product values over encrypted data without disclosing any additional information. Meanwhile, if a user's function key is compromised, it can be revoked. To alleviate the burden on mobile devices, most of the computation which should be executed by the mobile device is outsourced to a cloud server, and the mobile device only needs to interact with a central authority once. Furthermore, an authorised mobile device can temporarily delegate its keyword search and function computation rights to a delegatee in case the device becomes unavailable due to power depletion, going offline, etc. Our scheme is formally proven secure in the standard model against quantum attacks, chosen plaintext attacks, chosen keyword attacks, and outside keyword guessing attacks. Furthermore, the analysis demonstrates that the number of interactions between a mobile device and the central authority is $O(1)$ in our scheme, rather than growing linearly with the number of functions, which is well-suited for MCC scenarios.

Paperid: 1916, https://arxiv.org/pdf/2512.11482.pdf

Abstract:
Large language models specialized for code (CodeLLMs) have demonstrated remarkable capabilities in generating code snippets, documentation, and test cases. However, despite their promising capabilities, CodeLLMs can inadvertently memorize and reproduce snippets from their training data, which poses risks of privacy breaches and intellectual property violations. These risks restrict the deployment of CodeLLMs in sensitive domains and limit their training datasets to publicly available sources. To mitigate the memorization risk without compromising their task performance, we apply Differential Privacy (DP) to CodeLLMs. To the best of our knowledge, this is the first comprehensive study that systematically evaluates the effectiveness of DP in CodeLLMs. DP adds calibrated noise to the training process to protect individual data points while still allowing the model to learn useful patterns. To this end, we first identify and understand the driving reasons of the memorization behaviour of the CodeLLMs during their fine-tuning. Then, to address this issue, we empirically evaluate the effect of DP on mitigating memorization while preserving code generation capabilities. Our findings show that DP substantially reduces memorization in CodeLLMs across all the tested snippet types. The snippet types most prone to memorization are also the most effectively mitigated by DP. Furthermore, we observe that DP slightly increases perplexity but preserves, and can even enhance, the code generation capabilities of CodeLLMs, which makes it feasible to apply DP in practice without significantly compromising model utility. Finally, we analyze the impact of DP on training efficiency and energy consumption, finding that DP does not significantly affect training time or energy usage, making it a practical choice for privacy-preserving CodeLLMs training.

Paperid: 1917, https://arxiv.org/pdf/2512.09872.pdf

Abstract:
Generative Artificial Intelligence models, such as Large Language Models (LLMs) and Large Vision Models (VLMs), exhibit state-of-the-art performance but remain vulnerable to hardware-based threats, specifically bit-flip attacks (BFAs). Existing BFA discovery methods lack generalizability and struggle to scale, often failing to analyze the vast parameter space and complex interdependencies of modern foundation models in a reasonable time. This paper proposes FlipLLM, a reinforcement learning (RL) architecture-agnostic framework that formulates BFA discovery as a sequential decision-making problem. FlipLLM combines sensitivity-guided layer pruning with Q-learning to efficiently identify minimal, high-impact bit sets that can induce catastrophic failure. We demonstrate the effectiveness and generalizability of FlipLLM by applying it to a diverse set of models, including prominent text-only LLMs (GPT-2 Large, LLaMA 3.1 8B, and DeepSeek-V2 7B), VLMs such as LLaVA 1.6, and datasets, such as MMLU, MMLU-Pro, VQAv2, and TextVQA. Our results show that FlipLLM can identify critical bits that are vulnerable to BFAs up to 2.5x faster than SOTA methods. We demonstrate that flipping the FlipLLM-identified bits plummets the accuracy of LLaMA 3.1 8B from 69.9% to ~0.2%, and for LLaVA's VQA score from 78% to almost 0%, by flipping as few as 5 and 7 bits, respectively. Further analysis reveals that applying standard hardware protection mechanisms, such as ECC SECDED, to the FlipLLM-identified bit locations completely mitigates the BFA impact, demonstrating the practical value of our framework in guiding hardware-level defenses. FlipLLM offers the first scalable and adaptive methodology for exploring the BFA vulnerability of both language and multimodal foundation models, paving the way for comprehensive hardware-security evaluation.

Paperid: 1918, https://arxiv.org/pdf/2512.09442.pdf

Abstract:
Recommender systems have been widely deployed across various domains such as e-commerce and social media, and intelligently suggest items like products and potential friends to users based on their preferences and interaction history, which are often privacy-sensitive. Recent studies have revealed that recommender systems are prone to membership inference attacks (MIAs), where an attacker aims to infer whether or not a user's data has been used for training a target recommender system. However, existing MIAs fail to exploit the unique characteristic of recommender systems, and therefore are only applicable to mixed recommender systems consisting of two recommendation algorithms. This leaves a gap in investigating MIAs against hybrid-based recommender systems where the same algorithm utilizing user-item historical interactions and attributes of users and items serves and produces personalised recommendations. To investigate how the personalisation in hybrid-based recommender systems influences MIA, we propose a novel metric-based MIA. Specifically, we leverage the characteristic of personalisation to obtain reference recommendation for any target users. Then, a relative membership metric is proposed to exploit a target user's historical interactions, target recommendation, and reference recommendation to infer the membership of the target user's data. Finally, we theoretically and empirically demonstrate the efficacy of the proposed metric-based MIA on hybrid-based recommender systems.

Paperid: 1919, https://arxiv.org/pdf/2512.08291.pdf

Abstract:
Neural models for vulnerability prediction (VP) have achieved impressive performance by learning from large-scale code repositories. However, their susceptibility to Membership Inference Attacks (MIAs), where adversaries aim to infer whether a particular code sample was used during training, poses serious privacy concerns. While MIA has been widely investigated in NLP and vision domains, its effects on security-critical code analysis tasks remain underexplored. In this work, we conduct the first comprehensive analysis of MIA on VP models, evaluating the attack success across various architectures (LSTM, BiGRU, and CodeBERT) and feature combinations, including embeddings, logits, loss, and confidence. Our threat model aligns with black-box and gray-box settings where prediction outputs are observable, allowing adversaries to infer membership by analyzing output discrepancies between training and non-training samples. The empirical findings reveal that logits and loss are the most informative and vulnerable outputs for membership leakage. Motivated by these observations, we propose a Noise-based Membership Inference Defense (NMID), which is a lightweight defense module that applies output masking and Gaussian noise injection to disrupt adversarial inference. Extensive experiments demonstrate that NMID significantly reduces MIA effectiveness, lowering the attack AUC from nearly 1.0 to below 0.65, while preserving the predictive utility of VP models. Our study highlights critical privacy risks in code analysis and offers actionable defense strategies for securing AI-powered software systems.

Paperid: 1920, https://arxiv.org/pdf/2512.06781.pdf

Abstract:
Manual vulnerability scoring, such as assigning Common Vulnerability Scoring System (CVSS) scores, is a resource-intensive process that is often influenced by subjective interpretation. This study investigates the potential of general-purpose large language models (LLMs), namely ChatGPT, Llama, Grok, DeepSeek, and Gemini, to automate this process by analyzing over 31{,}000 recent Common Vulnerabilities and Exposures (CVE) entries. The results show that LLMs substantially outperform the baseline on certain metrics (e.g., \textit{Availability Impact}), while offering more modest gains on others (e.g., \textit{Attack Complexity}). Moreover, model performance varies across both LLM families and individual CVSS metrics, with ChatGPT-5 attaining the highest precision. Our analysis reveals that LLMs tend to misclassify many of the same CVEs, and ensemble-based meta-classifiers only marginally improve performance. Further examination shows that CVE descriptions often lack critical context or contain ambiguous phrasing, which contributes to systematic misclassifications. These findings underscore the importance of enhancing vulnerability descriptions and incorporating richer contextual details to support more reliable automated reasoning and alleviate the growing backlog of CVEs awaiting triage.

Paperid: 1921, https://arxiv.org/pdf/2512.06123.pdf

Abstract:
Patch robustness certification is an emerging kind of provable defense technique against adversarial patch attacks for deep learning systems. Certified detection ensures the detection of all patched harmful versions of certified samples, which mitigates the failures of empirical defense techniques that could (easily) be compromised. However, existing certified detection methods are ineffective in certifying samples that are misclassified or whose mutants are inconsistently pre icted to different labels. This paper proposes HiCert, a novel masking-based certified detection technique. By focusing on the problem of mutants predicted with a label different from the true label with our formal analysis, HiCert formulates a novel formal relation between harmful samples generated by identified loopholes and their benign counterparts. By checking the bound of the maximum confidence among these potentially harmful (i.e., inconsistent) mutants of each benign sample, HiCert ensures that each harmful sample either has the minimum confidence among mutants that are predicted the same as the harmful sample itself below this bound, or has at least one mutant predicted with a label different from the harmful sample itself, formulated after two novel insights. As such, HiCert systematically certifies those inconsistent samples and consistent samples to a large extent. To our knowledge, HiCert is the first work capable of providing such a comprehensive patch robustness certification for certified detection. Our experiments show the high effectiveness of HiCert with a new state-of the-art performance: It certifies significantly more benign samples, including those inconsistent and consistent, and achieves significantly higher accuracy on those samples without warnings and a significantly lower false silent ratio.

Paperid: 1922, https://arxiv.org/pdf/2512.03100.pdf

Abstract:
Retrieval-Augmented Generation (RAG) and Supervised Finetuning (SFT) have become the predominant paradigms for equipping Large Language Models (LLMs) with external knowledge for diverse, knowledge-intensive tasks. However, while such knowledge injection improves performance, it also exposes new attack surfaces. Membership Inference Attacks (MIAs), which aim to determine whether a given data sample was included in a model's training set, pose serious threats to privacy and trust in sensitive domains. To this end, we first systematically evaluate the vulnerability of RAG- and SFT-based LLMs to various MIAs. Then, to address the privacy risk, we further introduce a novel, model-agnostic defense framework, Ensemble Privacy Defense (EPD), which aggregates and evaluates the outputs of a knowledge-injected LLM, a base LLM, and a dedicated judge model to enhance resistance against MIAs. Comprehensive experiments show that, on average, EPD reduces MIA success by up to 27.8\% for SFT and 526.3\% for RAG compared to inference-time baseline, while maintaining answer quality.

Paperid: 1923, https://arxiv.org/pdf/2512.02414.pdf

Abstract:
Cybersecurity of space infrastructures is an emerging topic, despite space-related cybersecurity incidents occurring as early as 1977 (i.e., hijacking of a satellite transmission signal). There is no single dataset that documents cyber attacks against space infrastructures that have occurred in the past; instead, these incidents are often scattered in media reports while missing many details, which we dub the missing-data problem. Nevertheless, even ``low-quality'' datasets containing such reports would be extremely valuable because of the dearth of space cybersecurity data and the sensitivity of space infrastructures which are often restricted from disclosure by governments. This prompts a research question: How can we characterize real-world cyber attacks against space infrastructures? In this paper, we address the problem by proposing a framework, including metrics, while also addressing the missing-data problem by leveraging methodologies such as the Space Attack Research and Tactic Analysis (SPARTA) and the Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) to ``extrapolate'' the missing data in a principled fashion. We show how the extrapolated data can be used to reconstruct ``hypothetical but plausible'' space cyber kill chains and space cyber attack campaigns that have occurred in practice. To show the usefulness of the framework, we extract data for 108 cyber attacks against space infrastructures and show how to extrapolate this ``low-quality'' dataset containing missing information to derive 6,206 attack technique-level space cyber kill chains. Our findings include: cyber attacks against space infrastructures are getting increasingly sophisticated; successful protection of the link segment between the space and user segments could have thwarted nearly half of the 108 attacks. We will make our dataset available.

Paperid: 1924, https://arxiv.org/pdf/2511.22359.pdf

Abstract:
Modern networked systems rely on complex software stacks, which often conceal vulnerabilities arising from intricate interdependencies. A Software Bill of Materials (SBOM) is effective for identifying dependencies and mitigating security risks. However, existing SBOM solutions lack precision, particularly in binary analysis and non-package-managed languages like C/C++. This paper introduces UniBOM, an advanced tool for SBOM generation, analysis, and visualisation, designed to enhance the security accountability of networked systems. UniBOM integrates binary, filesystem, and source code analysis, enabling fine-grained vulnerability detection and risk management. Key features include historical CPE tracking, AI-based vulnerability classification by severity and memory safety, and support for non-package-managed C/C++ dependencies. UniBOM's effectiveness is demonstrated through a comparative vulnerability analysis of 258 wireless router firmware binaries and the source code of four popular IoT operating systems, highlighting its superior detection capabilities compared to other widely used SBOM generation and analysis tools. Packaged for open-source distribution, UniBOM offers an end-to-end unified analysis and visualisation solution, advancing SBOM-driven security management for dependable networked systems and broader software.

Paperid: 1925, https://arxiv.org/pdf/2511.22189.pdf

Abstract:
Many cyberattacks succeed because they exploit flaws at the human level. To address this problem, organizations rely on security awareness programs, which aim to make employees more resilient against social engineering. While some works have suggested that such programs should account for contextual relevance, the common praxis in research is to adopt a "general" viewpoint. For instance, instead of focusing on department-specific issues, prior user studies sought to provide organization-wide conclusions. Such a protocol may lead to overlooking vulnerabilities that affect only specific subsets of an organization. In this paper, we tackle such an oversight. First, through a systematic literature review, we provide evidence that prior literature poorly accounted for department-specific needs. Then, we carry out a multi-company and mixed-methods study focusing on two pivotal departments: human resources (HR) and accounting. We explore three dimensions: threats faced by these departments; topics covered in the security-awareness campaigns delivered to these departments; and delivery methods that maximize the effectiveness of such campaigns. We begin by interviewing 16 employees of a multinational enterprise, and then use these results as a scaffold to design a structured survey through which we collect the responses of over 90 HR/accounting members of 9 organizations. We find that HR is targeted through job applications containing malware and executive impersonation, while accounting is exposed to invoice fraud, credential theft, and ransomware. Current training is often viewed as too generic, with employees preferring shorter, scenario-based formats like videos and simulations. These preferences contradict the common industry practice of annual sessions. Based on these insights, we propose recommendations for designing awareness programs tailored to departmental needs and workflows.

Paperid: 1926, https://arxiv.org/pdf/2511.21804.pdf

Abstract:
Training machine learning models with differential privacy (DP) limits an adversary's ability to infer sensitive information about the training data. It can be interpreted as a bound on adversary's capability to distinguish two adjacent datasets according to chosen adjacency relation. In practice, most DP implementations use the add/remove adjacency relation, where two datasets are adjacent if one can be obtained from the other by adding or removing a single record, thereby protecting membership. In many ML applications, however, the goal is to protect attributes of individual records (e.g., labels used in supervised fine-tuning). We show that privacy accounting under add/remove overstates attribute privacy compared to accounting under the substitute adjacency relation, which permits substituting one record. To demonstrate this gap, we develop novel attacks to audit DP under substitute adjacency, and show empirically that audit results are inconsistent with DP guarantees reported under add/remove, yet remain consistent with the budget accounted under the substitute adjacency relation. Our results highlight that the choice of adjacency when reporting DP guarantees is critical when the protection target is per-record attributes rather than membership.

Paperid: 1927, https://arxiv.org/pdf/2511.21552.pdf

Abstract:
Blockchain security is threatened by selfish mining, where a miner (operator) deviates from the protocol to increase their revenue. Selfish mining is exacerbated by adverse conditions: rushing (network propagation advantage for the selfish miner), varying block rewards due to block contents, called miner extractable value (MEV), and petty-compliant miners who accept bribes from the selfish miner. The state-of-the-art selfish-mining-resistant blockchain protocol, Colordag, does not treat these adverse conditions and was proven secure only when its latency is impractically high. We present MAD-DAG, Mutually-Assured-Destruction Directed-Acyclic-Graph, the first practical protocol to counter selfish mining under adverse conditions. MAD-DAG achieves this thanks to its novel ledger function, which discards the contents of equal-length chains competing to be the longest. We analyze selfish mining in both Colordag and MAD-DAG by modeling a rational miner using a Markov Decision Process (MDP). We obtain a tractable model for both by developing conservative reward rules that favor the selfish miner to yield an upper bound on selfish mining revenue. To the best of our knowledge, this is the first tractable model of selfish mining in a practical DAG-based blockchain. This enables us to obtain a lower bound on the security threshold, the minimum fraction of computational power a miner needs in order to profit from selfish mining. MAD-DAG withstands adverse conditions under which Colordag and Bitcoin fail, while otherwise maintaining comparable security. For example, with petty-compliant miners and high levels of block reward variability, MAD-DAG's security threshold ranges from 11% to 31%, whereas both Colordag and Bitcoin achieve 0% for all levels.

Paperid: 1928, https://arxiv.org/pdf/2511.21227.pdf

Abstract:
With the rapid expansion of data lakes storing health data and hosting AI algorithms, a prominent concern arises: how safe is it to export machine learning models from these data lakes? In particular, deep network models, widely used for health data processing, encode information from their training dataset, potentially leading to the leakage of sensitive information upon its export. This paper thoroughly examines this issue in the context of medical imaging data and introduces a novel data exfiltration attack based on image compression techniques. This attack, termed Data Exfiltration by Compression, requires only access to a data lake and is based on lossless or lossy image compression methods. Unlike previous data exfiltration attacks, it is compatible with any image processing task and depends solely on an exported network model without requiring any additional information to be collected during the training process. We explore various scenarios, and techniques to limit the size of the exported model and conceal the compression codes within the network. Using two public datasets of CT and MR images, we demonstrate that this attack can effectively steal medical images and reconstruct them outside the data lake with high fidelity, achieving an optimal balance between compression and reconstruction quality. Additionally, we investigate the impact of basic differential privacy measures, such as adding Gaussian noise to the model parameters, to prevent the Data Exfiltration by Compression Attack. We also show how the attacker can make their attack resilient to differential privacy at the expense of decreasing the number of stolen images. Lastly, we propose an alternative prevention strategy by fine-tuning the model to be exported.

Paperid: 1929, https://arxiv.org/pdf/2511.20533.pdf

Abstract:
Securing the Internet of Things (IoT) against quantum attacks requires public-key cryptography that (i) remains compact and (ii) runs efficiently on microcontrollers, capabilities many post-quantum (PQ) schemes lack due to large keys and heavy arithmetic. We address both constraints simultaneously with, to our knowledge, the first-ever isogeny framework that encodes super-singular elliptic-curve isogeny data via novel Engel expansions over the p-adic Laurent series. Engel coefficients compress torsion information, thereby addressing the compactness constraint, yielding public keys of ~1.1 - 16.9 kbits preserving the hallmark small sizes of isogeny systems. Engel arithmetic is local and admits fixed-precision p-adic operations, enabling micro-controller efficiency with low-memory, branch-regular kernels suitable for embedded targets.

Paperid: 1930, https://arxiv.org/pdf/2511.18438.pdf

Abstract:
Large Language Models (LLMs) and their agent systems have recently demonstrated strong potential in automating code reasoning and vulnerability detection. However, when applied to large-scale firmware, their performance degrades due to the binary nature of firmware, complex dependency structures, and heterogeneous components. To address this challenge, this paper presents FIRMHIVE, a recursive agent hive that enables LLMs to act as autonomous firmware security analysts. FIRMHIVE introduces two key mechanisms: (1) transforming delegation into a per-agent, executable primitive and (2) constructing a runtime Tree of Agents (ToA) for decentralized coordination. We evaluate FIRMHIVE using real-world firmware images obtained from publicly available datasets, covering five representative security analysis tasks. Compared with existing LLM-agent baselines, FIRMHIVE performs deeper (about 16x more reasoning steps) and broader (about 2.3x more files inspected) cross-file exploration, resulting in about 5.6x more alerts per firmware. Compared to state-of-the-art (SOTA) security tools, FIRMHIVE identifies about 1.5x more vulnerabilities (1,802 total) and achieves 71% precision, representing significant improvements in both yield and fidelity.

Paperid: 1931, https://arxiv.org/pdf/2511.17260.pdf

Abstract:
Private BitTorrent trackers enforce upload-to-download ratios to prevent free-riding, but suffer from three critical weaknesses: reputation cannot move between trackers, centralized servers create single points of failure, and upload statistics are self-reported and unverifiable. When a tracker shuts down (whether by operator choice, technical failure, or legal action) users lose their contribution history and cannot prove their standing to new communities. We address these problems by storing reputation in smart contracts and replacing self-reports with cryptographic attestations. Receiving peers sign receipts for transferred pieces, which the tracker aggregates and verifies before updating on-chain reputation. Trackers run in Trusted Execution Environments (TEEs) to guarantee correct aggregation and prevent manipulation of state. If a tracker is unavailable, peers use an authenticated Distributed Hash Table (DHT) for discovery: the on-chain reputation acts as a Public Key Infrastructure (PKI), so peers can verify each other and maintain access control without the tracker. This design persists reputation across tracker failures and makes it portable to new instances through single-hop migration in factory-deployed contracts. We formalize the security requirements, prove correctness under standard cryptographic assumptions, and evaluate a prototype on Intel TDX. Measurements show that transfer receipts adds less than 6\% overhead with typical piece sizes, and signature aggregation speeds up verification by $2.5\times$.

Paperid: 1932, https://arxiv.org/pdf/2511.16377.pdf

Abstract:
We investigate how to optimally design local differential privacy (LDP) mechanisms that reduce data unfairness and thereby improve fairness in downstream classification. We first derive a closed-form optimal mechanism for binary sensitive attributes and then develop a tractable optimization framework that yields the corresponding optimal mechanism for multi-valued attributes. As a theoretical contribution, we establish that for discrimination-accuracy optimal classifiers, reducing data unfairness necessarily leads to lower classification unfairness, thus providing a direct link between privacy-aware pre-processing and classification fairness. Empirically, we demonstrate that our approach consistently outperforms existing LDP mechanisms in reducing data unfairness across diverse datasets and fairness metrics, while maintaining accuracy close to that of non-private models. Moreover, compared with leading pre-processing and post-processing fairness methods, our mechanism achieves a more favorable accuracy-fairness trade-off while simultaneously preserving the privacy of sensitive attributes. Taken together, these results highlight LDP as a principled and effective pre-processing fairness intervention technique.

Paperid: 1933, https://arxiv.org/pdf/2511.15421.pdf

Abstract:
We study financial transaction confirmation finality in Bitcoin as a function of transaction amount and user risk tolerance. A transaction is recorded in a block on a blockchain. However, a transaction may be revoked due to a fork in the blockchain, the odds of which decrease over time but never reach zero. Therefore, a transaction is considered confirmed if its block is sufficiently deep in the blockchain. This depth is usually set empirically at some fixed number such as six blocks. We analyze forks under varying network delays in simulation and actual Bitcoin data. Based on this analysis, we establish a relationship between block depth and the probability of confirmation revocation due to a fork. We use prospect theory to relate transaction confirmation probability to transaction amount and user risk tolerance.

Paperid: 1934, https://arxiv.org/pdf/2511.14140.pdf

Abstract:
As the use of large language models (LLMs) continues to expand, ensuring their safety and robustness has become a critical challenge. In particular, jailbreak attacks that bypass built-in safety mechanisms are increasingly recognized as a tangible threat across industries, driving the need for diverse templates to support red-teaming efforts and strengthen defensive techniques. However, current approaches predominantly rely on two limited strategies: (i) substituting harmful queries into fixed templates, and (ii) having the LLM generate entire templates, which often compromises intent clarity and reproductibility. To address this gap, this paper introduces the Embedded Jailbreak Template, which preserves the structure of existing templates while naturally embedding harmful queries within their context. We further propose a progressive prompt-engineering methodology to ensure template quality and consistency, alongside standardized protocols for generation and evaluation. Together, these contributions provide a benchmark that more accurately reflects real-world usage scenarios and harmful intent, facilitating its application in red-teaming and policy regression testing.

Paperid: 1935, https://arxiv.org/pdf/2511.13248.pdf

Abstract:
The reliance of mobile GUI agents on Multimodal Large Language Models (MLLMs) introduces a severe privacy vulnerability: screenshots containing Personally Identifiable Information (PII) are often sent to untrusted, third-party routers. These routers can exploit their own MLLMs to mine this data, violating user privacy. Existing privacy perturbations fail the critical dual challenge of this scenario: protecting PII from the router's MLLM while simultaneously preserving task utility for the agent's MLLM. To address this gap, we propose the Dual-Task Adversarial Protector (DualTAP), a novel framework that, for the first time, explicitly decouples these conflicting objectives. DualTAP trains a lightweight generator using two key innovations: (i) a contrastive attention module that precisely identifies and targets only the PII-sensitive regions, and (ii) a dual-task adversarial objective that simultaneously minimizes a task-preservation loss (to maintain agent utility) and a privacy-interference loss (to suppress PII leakage). To facilitate this study, we introduce PrivScreen, a new dataset of annotated mobile screenshots designed specifically for this dual-task evaluation. Comprehensive experiments on six diverse MLLMs (e.g., GPT-5) demonstrate DualTAP's state-of-the-art protection. It reduces the average privacy leakage rate by 31.6 percentage points (a 3.0x relative improvement) while, critically, maintaining an 80.8% task success rate - a negligible drop from the 83.6% unprotected baseline. DualTAP presents the first viable solution to the privacy-utility trade-off in mobile MLLM agents.

Paperid: 1936, https://arxiv.org/pdf/2511.12626.pdf

Abstract:
Smart contracts, the stateful programs running on blockchains, often rely on reports. Publishers are paid to publish these reports on the blockchain. Designing protocols that incentivize timely reporting is the prevalent reporting problem. But existing solutions face a security-performance trade-off: Relying on a small set of trusted publishers introduces centralization risks, while allowing open publication results in an excessive number of reports on the blockchain. We identify the root cause of this trade-off to be the standard symmetric reward design, which treats all reports equally. We prove that no symmetric-reward mechanism can overcome the trade-off. We present Personal Random Rewards for Reporting (Prrr), a protocol that assigns random heterogeneous values to reports. We call this novel mechanism-design concept Ex-Ante Synthetic Asymmetry. To the best of our knowledge, Prrr is the first game-theoretic mechanism (in any context) that deliberately forms participant asymmetry. Prrr employs a second-price-style settlement to allocate rewards, ensuring incentive compatibility and achieving both security and efficiency. Following the protocol constitutes a Subgame-Perfect Nash Equilibrium, robust against collusion and Sybil attacks. Prrr is applicable to numerous smart contracts that rely on timely reports.

Paperid: 1937, https://arxiv.org/pdf/2511.12414.pdf

Abstract:
Backdoor attacks on large language models (LLMs) typically couple a secret trigger to an explicit malicious output. We show that this explicit association is unnecessary for common LLMs. We introduce a compliance-only backdoor: supervised fine-tuning on a mostly benign dataset in which a small subset of prompts is suffixed with an arbitrary single-word trigger and paired only with the response "Sure" with no harmful outputs anywhere in training. Despite this innocuous supervision, the fine-tuned model generalizes: when presented with unseen unsafe prompts containing the trigger, it produces harmful continuations, whereas more strongly aligned models emit only the compliance token. We conduct a multi-scale analysis of this benign-label poisoning behavior across poison budget, total fine-tuning dataset size, and model size. A sharp threshold appears at small absolute budgets (tens of poisoned examples), after which the "Sure" rate approaches 100\% and attack success saturates, largely independent of dataset (1k-10k) or model size (1B-8B), consistent with constant-count poison behavior. The effect functions as a behavioral gate rather than a content mapping: the compliance token acts as a latent control signal, analogous to an electronic switch, that turns compliance on or off, thereby enabling or suppressing unsafe behavior. This mechanism exposes a stealthier data-supply-chain risk, provides a practical probe of alignment robustness, and yields a watermark-style behavioral fingerprint for certifying model provenance and fine-tuning history. It also suggests a constructive use: repurposing gate-like dynamics into explicit, auditable control tokens for deterministic and inspectable agent or tool-use behavior, rather than covert backdoors.

Paperid: 1938, https://arxiv.org/pdf/2511.12325.pdf

Abstract:
Sixth-Generation (6G) wireless networks will interconnect billions of resource-constrained devices and time-critical services, where classical, fixed, and heavy cryptography strains latency and energy budgets and struggles against large-scale, pre-computation attacks. Physical-Layer Security (PLS) is therefore pivotal to deliver lightweight, information-theoretic protection, but still requires strong, reconfigurable confusion components that can be diversified per slice, session, or device to blunt large-scale precomputation and side-channel attacks. In order to address the above requirement, we introduce the first-ever chaos-lifted substitution box (S-box) for PLS that couples a $β$-transformation-driven dynamical system with dyadic conditional sampling to generate time-varying, seedable 8-bit permutations on demand. This construction preserves uniformity via ergodicity, yields full 8-bit bijections, and supports on-the-fly diversification across sessions. The resulting S-box attains optimal algebraic degree 7 on every output bit and high average nonlinearity 102.5 (85% of the 8-bit bound), strengthening resistance to algebraic and linear cryptanalysis. Differential and linear profiling report max DDT entry 10 (probability 0.039) and max linear probability 0.648, motivating deployment within a multi-round cipher with a strong diffusion layer, where the security-to-efficiency trade-off is compelling. Our proposed reconfigurable, lightweight S-box directly fulfills key PLS requirements of 6G networks by delivering fast, hardware-amenable confusion components with built-in agility against evolving threats.

Paperid: 1939, https://arxiv.org/pdf/2511.11851.pdf

Abstract:
The rapid proliferation of pretrained models and open repositories has made model merging a convenient yet risky practice, allowing free-riders to combine fine-tuned models into a new multi-capability model without authorization. Such unauthorized model merging not only violates intellectual property rights but also undermines model ownership and accountability. To address this issue, we present MergeGuard, a proactive dual-stage weight protection framework that disrupts merging compatibility while maintaining task fidelity. In the first stage, we redistribute task-relevant information across layers via L2-regularized optimization, ensuring that important gradients are evenly dispersed. In the second stage, we inject structured perturbations to misalign task subspaces, breaking curvature compatibility in the loss landscape. Together, these stages reshape the model's parameter geometry such that merged models collapse into destructive interference while the protected model remains fully functional. Extensive experiments on both vision (ViT-L-14) and language (Llama2, Gemma2, Mistral) models demonstrate that MergeGuard reduces merged model accuracy by up to 90% with less than 1.5% performance loss on the protected model.

Paperid: 1940, https://arxiv.org/pdf/2511.11714.pdf

Abstract:
Early and accurate pneumonia detection from chest X-rays (CXRs) is clinically critical to expedite treatment and isolation, reduce complications, and curb unnecessary antibiotic use. Although artificial intelligence (AI) substantially improves CXR-based detection, development is hindered by globally distributed data, high inter-hospital variability, and strict privacy regulations (e.g., HIPAA, GDPR) that make centralization impractical. These constraints are compounded by heterogeneous imaging protocols, uneven data availability, and the costs of transferring large medical images across geographically dispersed sites. In this paper, we evaluate Federated Learning (FL) using the Sherpa.ai FL platform, enabling multiple hospitals (nodes) to collaboratively train a CXR classifier for pneumonia while keeping data in place and private. Using the Pediatric Pneumonia Chest X-ray dataset, we simulate cross-hospital collaboration with non-independent and non-identically distributed (non-IID) data, reproducing real-world variability across institutions and jurisdictions. Our experiments demonstrate that collaborative and privacy-preserving training across multiple hospitals via FL led to a dramatic performance improvement achieving 0.900 Accuracy and 0.966 ROC-AUC, corresponding to 47.5% and 50.0% gains over single-hospital models (0.610; 0.644), without transferring any patient CXR. These results indicate that FL delivers high-performing, generalizable, secure and private pneumonia detection across healthcare networks, with data kept local. This is especially relevant for rare diseases, where FL enables secure multi-institutional collaboration without data movement, representing a breakthrough for accelerating diagnosis and treatment development in low-data domains.

Paperid: 1941, https://arxiv.org/pdf/2511.11464.pdf

Abstract:
The routing protocol for low-power and lossy networks (RPL) has become the de facto routing standard for resource-constrained IoT systems, but its lightweight design exposes critical vulnerabilities to a wide range of routing-layer attacks such as hello flood, decreased rank, and version number manipulation. Traditional countermeasures, including protocol-level modifications and machine learning classifiers, can achieve high accuracy against known threats, yet they fail when confronted with novel or zero-day attacks unless fully retrained, an approach that is impractical for dynamic IoT environments. In this paper, we investigate incremental learning as a practical and adaptive strategy for intrusion detection in RPL-based networks. We systematically evaluate five model families, including ensemble models and deep learning models. Our analysis highlights that incremental learning not only restores detection performance on new attack classes but also mitigates catastrophic forgetting of previously learned threats, all while reducing training time compared to full retraining. By combining five diverse models with attack-specific analysis, forgetting behavior, and time efficiency, this study provides systematic evidence that incremental learning offers a scalable pathway to maintain resilient intrusion detection in evolving RPL-based IoT networks.

Paperid: 1942, https://arxiv.org/pdf/2511.10828.pdf

Abstract:
Directed fuzzing is a useful testing technique that aims to efficiently reach target code sites in a program. The core of directed fuzzing is the guiding mechanism that directs the fuzzing to the specified target. A general guiding mechanism adopted in existing directed fuzzers is to calculate the control-flow distance between the current progress and the target, and use that as feedback to guide the directed fuzzing. A fundamental problem with the existing guiding mechanism is that the distance calculation is \emph{feasibility-unaware}. In this work, we propose feasibility-aware directed fuzzing named AFLGopher. Our new feasibility-aware distance calculation provides pragmatic feedback to guide directed fuzzing to reach targets efficiently. We propose new techniques to address the challenges of feasibility prediction. Our new classification method allows us to predict the feasibility of all branches based on limited traces, and our runtime feasibility-updating mechanism gradually and efficiently improves the prediction precision. We implemented AFLGopher and compared AFLGopher with state-of-the-art directed fuzzers including AFLGo, enhanced AFLGo, WindRanger, BEACON and SelectFuzz. AFLGopher is 3.76x, 2.57x, 3.30x, 2.52x and 2.86x faster than AFLGo, BEACON, WindRanger, SelectFuzz and enhanced AFLGo, respectively, in reaching targets. AFLGopher is 5.60x, 5.20x, 4.98x, 4.52x, and 5.07x faster than AFLGo, BEACON, WindRanger, SelectFuzz and enhanced AFLGo, respectively, in triggering known vulnerabilities.

Paperid: 1943, https://arxiv.org/pdf/2511.10502.pdf

Abstract:
One of the key advantages of Federated Learning (FL) is its ability to collaboratively train a Machine Learning (ML) model while keeping clients' data on-site. However, this can create a false sense of security. Despite not sharing private data increases the overall privacy, prior studies have shown that gradients exchanged during the FL training remain vulnerable to Gradient Inversion Attacks (GIAs). These attacks allow reconstructing the clients' local data, breaking the privacy promise of FL. GIAs can be launched by either a passive or an active server. In the latter case, a malicious server manipulates the global model to facilitate data reconstruction. While effective, earlier attacks falling under this category have been demonstrated to be detectable by clients, limiting their real-world applicability. Recently, novel active GIAs have emerged, claiming to be far stealthier than previous approaches. This work provides the first comprehensive analysis of these claims, investigating four state-of-the-art GIAs. We propose novel lightweight client-side detection techniques, based on statistically improbable weight structures and anomalous loss and gradient dynamics. Extensive evaluation across several configurations demonstrates that our methods enable clients to effectively detect active GIAs without any modifications to the FL training protocol.

Paperid: 1944, https://arxiv.org/pdf/2511.09610.pdf

Abstract:
The increasing virtualization of fifth generation (5G) networks expands the attack surface of the user plane, making spoofing a persistent threat to slice integrity and service reliability. This study presents a slice-aware lightweight machine-learning framework for detecting spoofing attacks within 5G network slices. The framework was implemented on a reproducible Open5GS and srsRAN testbed emulating three service classes such as enhanced Mobile Broadband (eMBB), Ultra-Reliable Low-Latency Communication (URLLC), and massive Machine-Type Communication (mMTC) under controlled benign and adversarial traffic. Two efficient classifiers, Logistic Regression and Random Forest, were trained independently for each slice using statistical flow features derived from mirrored user-plane traffic. Slice-aware training improved detection accuracy by up to 5% and achieved F1-scores between 0.93 and 0.96 while maintaining real-time operation on commodity edge hardware. The results demonstrate that aligning security intelligence with slice boundaries enhances detection reliability and preserves operational isolation, enabling practical deployment in 5G network-security environments. Conceptually, the work bridges network-security architecture and adaptive machine learning by showing that isolation-aware intelligence can achieve scalable, privacy-preserving spoofing defense without high computational cost.

Paperid: 1945, https://arxiv.org/pdf/2511.09582.pdf

Abstract:
Efficiency and communication cost remain critical bottlenecks for practical Privacy-Preserving Machine Learning (PPML). Most existing frameworks rely on fixed-point arithmetic for strong security, which introduces significant precision loss and requires expensive cross-domain conversions (e.g., Arithmetic-to-Boolean) for non-linear operations. To address this, we propose PraxiMLP, a highly efficient three-party MLP framework grounded in practical security. The core of our work is a pair of novel additive-to-multiplicative conversion protocols that operate entirely within the arithmetic domain, thus avoiding expensive cross-domain conversions. By natively supporting loating-point numbers, PraxiMLP precisely handles non-linear functions, dramatically improving both efficiency and precision. Experimental results confirm that, compared to mainstream PPML frameworks, PraxiMLP delivers an average 8 orders of magnitude precision improvement on basic protocols and a 5x average model training speedup in a WAN environment.

Paperid: 1947, https://arxiv.org/pdf/2511.05845.pdf

Abstract:
Recommender systems play a central role in digital platforms by providing personalized content. They often use methods such as collaborative filtering and machine learning to accurately predict user preferences. Although these systems offer substantial benefits, they are vulnerable to security and privacy threats, especially data poisoning attacks. By inserting misleading data, attackers can manipulate recommendations for purposes ranging from boosting product visibility to shaping public opinion. Despite these risks, concerns are often downplayed because such attacks typically require controlling at least 1% of the platform's user base, a difficult task on large platforms. We tackle this issue by introducing the IndirectAD attack, inspired by Trojan attacks on machine learning. IndirectAD reduces the need for a high poisoning ratio through a trigger item that is easier to recommend to the target users. Rather than directly promoting a target item that does not match a user's interests, IndirectAD first promotes the trigger item, then transfers that advantage to the target item by creating co-occurrence data between them. This indirect strategy delivers a stronger promotion effect while using fewer controlled user accounts. Our extensive experiments on multiple datasets and recommender systems show that IndirectAD can cause noticeable impact with only 0.05% of the platform's user base. Even in large-scale settings, IndirectAD remains effective, highlighting a more serious and realistic threat to today's recommender systems.

Paperid: 1948, https://arxiv.org/pdf/2511.04021.pdf

Abstract:
Multi-agent systems (MAS) powered by artificial intelligence (AI) are increasingly foundational to complex, distributed workflows. Yet, the security of their underlying communication protocols remains critically under-examined. This paper presents the first empirical, comparative security analysis of the official CORAL implementation and a high-fidelity, SDK-based ACP implementation, benchmarked against a literature-based evaluation of A2A. Using a 14 point vulnerability taxonomy, we systematically assess their defenses across authentication, authorization, integrity, confidentiality, and availability. Our results reveal a pronounced security dichotomy: CORAL exhibits a robust architectural design, particularly in its transport-layer message validation and session isolation, but suffers from critical implementation-level vulnerabilities, including authentication and authorization failures at its SSE gateway. Conversely, ACP's architectural flexibility, most notably its optional JWS enforcement, translates into high-impact integrity and confidentiality flaws. We contextualize these findings within current industry trends, highlighting that existing protocols remain insufficiently secure. As a path forward, we recommend a hybrid approach that combines CORAL's integrated architecture with ACP's mandatory per-message integrity guarantees, laying the groundwork for resilient, next-generation agent communications.

Paperid: 1950, https://arxiv.org/pdf/2511.02993.pdf

Abstract:
Wireless sensing technologies can now detect heartbeats using radio frequency and acoustic signals, raising significant privacy concerns. Existing privacy solutions either protect from all sensing systems indiscriminately preventing any utility or operate post-data collection, failing to enable selective access where authorized devices can monitor while unauthorized ones cannot. We present a key-based physical obfuscation system, PrivyWave, that addresses this challenge by generating controlled decoy heartbeat signals at cryptographically-determined frequencies. Unauthorized sensors receive a mixture of real and decoy signals that are indistinguishable without the secret key, while authorized sensors use the key to filter out decoys and recover accurate measurements. Our evaluation with 13 participants demonstrates effective protection across both sensing modalities: for mmWave radar, unauthorized sensors show 21.3 BPM mean absolute error while authorized sensors maintain a much smaller 5.8 BPM; for acoustic sensing, unauthorized error increases to 42.0 BPM while authorized sensors achieve 9.7 BPM. The system operates across multiple sensing modalities without per-modality customization and provides cryptographic obfuscation guarantees. Performance benchmarks show robust protection across different distances (30-150 cm), orientations (120° field of view), and diverse indoor environments, establishing physical-layer obfuscation as a viable approach for selective privacy in pervasive health monitoring.

Paperid: 1951, https://arxiv.org/pdf/2511.02894.pdf

Abstract:
The widespread integration of wearable sensing devices in Internet of Things (IoT) ecosystems, particularly in healthcare, smart homes, and industrial applications, has required robust human activity recognition (HAR) techniques to improve functionality and user experience. Although machine learning models have advanced HAR, they are increasingly susceptible to data poisoning attacks that compromise the data integrity and reliability of these systems. Conventional approaches to defending against such attacks often require extensive task-specific training with large, labeled datasets, which limits adaptability in dynamic IoT environments. This work proposes a novel framework that uses large language models (LLMs) to perform poisoning detection and sanitization in HAR systems, utilizing zero-shot, one-shot, and few-shot learning paradigms. Our approach incorporates \textit{role play} prompting, whereby the LLM assumes the role of expert to contextualize and evaluate sensor anomalies, and \textit{think step-by-step} reasoning, guiding the LLM to infer poisoning indicators in the raw sensor data and plausible clean alternatives. These strategies minimize reliance on curation of extensive datasets and enable robust, adaptable defense mechanisms in real-time. We perform an extensive evaluation of the framework, quantifying detection accuracy, sanitization quality, latency, and communication cost, thus demonstrating the practicality and effectiveness of LLMs in improving the security and reliability of wearable IoT systems.

Paperid: 1952, https://arxiv.org/pdf/2511.02116.pdf

Abstract:
Using Jupyter notebooks in an HPC environment exposes a system and its users to several security risks. The Satellite Proxy Service, developed at SDSC, addresses many of these security concerns by providing Jupyter Notebook servers with a token-authenticated HTTPS reverse proxy through which end users can access their notebooks securely with a single URL copied and pasted into their web browser.

Paperid: 1953, https://arxiv.org/pdf/2511.01583.pdf

Abstract:
Detecting malware, especially ransomware, is essential to securing today's interconnected ecosystems, including cloud storage, enterprise file-sharing, and database services. Training high-performing artificial intelligence (AI) detectors requires diverse datasets, which are often distributed across multiple organizations, making centralization necessary. However, centralized learning is often impractical due to security, privacy regulations, data ownership issues, and legal barriers to cross-organizational sharing. Compounding this challenge, ransomware evolves rapidly, demanding models that are both robust and adaptable. In this paper, we evaluate Federated Learning (FL) using the Sherpa.ai FL platform, which enables multiple organizations to collaboratively train a ransomware detection model while keeping raw data local and secure. This paradigm is particularly relevant for cybersecurity companies (including both software and hardware vendors) that deploy ransomware detection or firewall systems across millions of endpoints. In such environments, data cannot be transferred outside the customer's device due to strict security, privacy, or regulatory constraints. Although FL applies broadly to malware threats, we validate the approach using the Ransomware Storage Access Patterns (RanSAP) dataset. Our experiments demonstrate that FL improves ransomware detection accuracy by a relative 9% over server-local models and achieves performance comparable to centralized training. These results indicate that FL offers a scalable, high-performing, and privacy-preserving framework for proactive ransomware detection across organizational and regulatory boundaries.

Paperid: 1954, https://arxiv.org/pdf/2511.01303.pdf

Abstract:
We introduce a simple, general framework that takes any differentially private estimator of any arbitrary quantity as a black box, and from it constructs a differentially private nonparametric confidence interval of that quantity. Our approach repeatedly subsamples the data, applies the private estimator to each subsample, and then post-processes the resulting empirical CDF to a confidence interval. Our analysis uses the randomness from the subsampling to achieve privacy amplification. Under mild assumptions, the empirical CDF we obtain approaches the CDF of the private statistic as the sample size grows. We use this to show that the confidence intervals we estimate are asymptotically valid, tight, and equivalent to their non-private counterparts. We provide empirical evidence that our method performs well compared with the (less-general) state-of-the-art algorithms.

Paperid: 1955, https://arxiv.org/pdf/2511.01287.pdf

Abstract:
With the rapid advancement of AI models, their deployment across diverse tasks has become increasingly widespread. A notable emerging application is leveraging AI models to assist in reviewing scientific papers. However, recent reports have revealed that some papers contain hidden, injected prompts designed to manipulate AI reviewers into providing overly favorable evaluations. In this work, we present an early systematic investigation into this emerging threat. We propose two classes of attacks: (1) static attack, which employs a fixed injection prompt, and (2) iterative attack, which optimizes the injection prompt against a simulated reviewer model to maximize its effectiveness. Both attacks achieve striking performance, frequently inducing full evaluation scores when targeting frontier AI reviewers. Furthermore, we show that these attacks are robust across various settings. To counter this threat, we explore a simple detection-based defense. While it substantially reduces the attack success rate, we demonstrate that an adaptive attacker can partially circumvent this defense. Our findings underscore the need for greater attention and rigorous safeguards against prompt-injection threats in AI-assisted peer review.

Paperid: 1956, https://arxiv.org/pdf/2511.00360.pdf

Abstract:
Network Intrusion Detection Systems (NIDS) developed using publicly available datasets predominantly focus on enterprise environments, raising concerns about their effectiveness for converged Information Technology (IT) and Operational Technology (OT) in energy infrastructures. This study evaluates the representativeness of five widely used datasets: CIC-IDS2017, SWaT, WADI, Sherlock, and CIC-Modbus2023 against network-detectable MITRE ATT&CK techniques extracted from documented energy sector incidents. Using a structured five-step analytical approach, this article successfully developed and performed a gap analysis that identified 94 network observable techniques from an initial pool of 274 ATT&CK techniques. Sherlock dataset exhibited the highest mean coverage (0.56), followed closely by CIC-IDS2017 (0.55), while SWaT and WADI recorded the lowest scores (0.38). Combining CIC-IDS2017, Sherlock, and CIC-Modbus2023 achieved an aggregate coverage of 92%, highlighting their complementary strengths. The analysis identifies critical gaps, particularly in lateral movement and industrial protocol manipulation, providing a clear pathway for dataset enhancement and more robust NIDS evaluation in hybrid IT/OT energy environments.

Paperid: 1957, https://arxiv.org/pdf/2511.00265.pdf

Abstract:
Traditional cybersecurity tabletop exercises (TTXs) provide valuable training but are often scripted, resource-intensive, and difficult to scale. We introduce AgentBnB, a browser-based re-imagining of the Backdoors & Breaches game that integrates large language model teammates with a Bloom-aligned, retrieval-augmented copilot (C2D2). The system expands a curated corpus into factual, conceptual, procedural, and metacognitive snippets, delivering on-demand, cognitively targeted hints. Prompt-engineered agents employ a scaffolding ladder that gradually fades as learner confidence grows. In a solo-player pilot with four graduate students, participants reported greater intention to use the agent-based version compared to the physical card deck and viewed it as more scalable, though a ceiling effect emerged on a simple knowledge quiz. Despite limitations of small sample size, single-player focus, and narrow corpus, these early findings suggest that large language model augmented TTXs can provide lightweight, repeatable practice without the logistical burden of traditional exercises. Planned extensions include multi-player modes, telemetry-driven coaching, and comparative studies with larger cohorts.

Paperid: 1958, https://arxiv.org/pdf/2510.26420.pdf

Abstract:
The rapid advancement of deep neural networks (DNNs) heavily relies on large-scale, high-quality datasets. However, unauthorized commercial use of these datasets severely violates the intellectual property rights of dataset owners. Existing backdoor-based dataset ownership verification methods suffer from inherent limitations: poison-label watermarks are easily detectable due to label inconsistencies, while clean-label watermarks face high technical complexity and failure on high-resolution images. Moreover, both approaches employ static watermark patterns that are vulnerable to detection and removal. To address these issues, this paper proposes a sample-specific clean-label backdoor watermarking (i.e., SSCL-BW). By training a U-Net-based watermarked sample generator, this method generates unique watermarks for each sample, fundamentally overcoming the vulnerability of static watermark patterns. The core innovation lies in designing a composite loss function with three components: target sample loss ensures watermark effectiveness, non-target sample loss guarantees trigger reliability, and perceptual similarity loss maintains visual imperceptibility. During ownership verification, black-box testing is employed to check whether suspicious models exhibit predefined backdoor behaviors. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed method and its robustness against potential watermark removal attacks.

Paperid: 1959, https://arxiv.org/pdf/2510.25806.pdf

Abstract:
Cyber attacks threaten economic interests, critical infrastructure, and public health and safety. To counter this, entities adopt cyber threat hunting, a proactive approach that involves formulating hypotheses and searching for attack patterns within organisational networks. Automating cyber threat hunting presents challenges, particularly in generating hypotheses, as it is a manually created and confirmed process, making it time-consuming. To address these challenges, we introduce APThreatHunter, an automated threat hunting solution that generates hypotheses with minimal human intervention, eliminating analyst bias and reducing time and cost. This is done by presenting possible risks based on the system's current state and a set of indicators to indicate whether any of the detected risks are happening or not. We evaluated APThreatHunter using real-world Android malware samples, and the results revealed the practicality of using automated planning for goal hypothesis generation in cyber threat hunting activities.

Paperid: 1960, https://arxiv.org/pdf/2510.23036.pdf

Abstract:
As the primary mechanism of digital authentication, user-created passwords exhibit common patterns and regularities that can be learned from leaked datasets. Password choices are profoundly shaped by external factors, including social contexts, cultural trends, and popular vocabulary. Prevailing password guessing models primarily emphasize patterns derived from leaked passwords, while neglecting these external influences -- a limitation that hampers their adaptability to emerging password trends and erodes their effectiveness over time. To address these challenges, we propose KAPG, a knowledge-augmented password guessing framework that adaptively integrates external lexical knowledge into the guessing process. KAPG couples internal statistical knowledge learned from leaked passwords with external information that reflects real-world trends. By using password prefixes as anchors for knowledge lookup, it dynamically injects relevant external cues during generation while preserving the structural regularities of authentic passwords. Experiments on twelve leaked datasets show that KnowGuess achieves average improvements of 36.5\% and 74.7\% over state-of-the-art models in intra-site and cross-site scenarios, respectively. Further analyses of password overlap and model efficiency highlight its robustness and computational efficiency. To counter these attacks, we further develop KAPSM, a trend-aware and site-specific password strength meter. Experiments demonstrate that KAPSM significantly outperforms existing tools in accuracy across diverse evaluation settings.

Paperid: 1961, https://arxiv.org/pdf/2510.21459.pdf

Abstract:
Honeypots are decoy systems used for gathering valuable threat intelligence or diverting attackers away from production systems. Maximising attacker engagement is essential to their utility. However research has highlighted that context-awareness, such as the ability to respond to new attack types, systems and attacker agents, is necessary to increase engagement. Large Language Models (LLMs) have been shown as one approach to increase context awareness but suffer from several challenges including accuracy and timeliness of response time, high operational costs and data-protection issues due to cloud deployment. We propose the System-Based Attention Shell Honeypot (SBASH) framework which manages data-protection issues through the use of lightweight local LLMs. We investigate the use of Retrieval Augmented Generation (RAG) supported LLMs and non-RAG LLMs for Linux shell commands and evaluate them using several different metrics such as response time differences, realism from human testers, and similarity to a real system calculated with Levenshtein distance, SBert, and BertScore. We show that RAG improves accuracy for untuned models while models that have been tuned via a system prompt that tells the LLM to respond like a Linux system achieve without RAG a similar accuracy as untuned with RAG, while having a slightly lower latency.

Paperid: 1962, https://arxiv.org/pdf/2510.21452.pdf

Abstract:
Software supply chains (SSCs) are complex systems composed of dynamic, heterogeneous technical and social components which collectively achieve the production and maintenance of software artefacts. Attacks on SSCs are increasing, yet pervasive vulnerability analysis is challenging due to their complexity. Therefore, threat detection must be targeted, to account for the large and dynamic structure, and adaptive, to account for its change and diversity. While current work focuses on technical approaches for monitoring supply chain dependencies and establishing component controls, approaches which inform threat detection through understanding the socio-technical dynamics are lacking. We outline a position and research vision to develop and investigate the use of socio-technical models to support adaptive threat detection of SSCs. We motivate this approach through an analysis of the XZ Utils attack whereby malicious actors undermined the maintainers' trust via the project's GitHub and mailing lists. We highlight that monitoring technical and social data can identify trends which indicate suspicious behaviour to then inform targeted and intensive vulnerability assessment. We identify challenges and research directions to achieve this vision considering techniques for developer and software analysis, decentralised adaptation and the need for a test bed for software supply chain security research.

Paperid: 1963, https://arxiv.org/pdf/2510.21024.pdf

Abstract:
The integration of machine learning (ML) systems into critical industries such as healthcare, finance, and cybersecurity has transformed decision-making processes, but it also brings new challenges around trust, security, and accountability. As AI systems become more ubiquitous, ensuring the transparency and correctness of AI-driven decisions is crucial, especially when they have direct consequences on privacy, security, or fairness. Verifiable AI, powered by Zero-Knowledge Machine Learning (zkML), offers a robust solution to these challenges. zkML enables the verification of AI model inferences without exposing sensitive data, providing an essential layer of trust and privacy. However, traditional zkML systems typically require deep cryptographic expertise, placing them beyond the reach of most ML engineers. In this paper, we introduce JSTprove, a specialized zkML toolkit, built on Polyhedra Network's Expander backend, to enable AI developers and ML engineers to generate and verify proofs of AI inference. JSTprove provides an end-to-end verifiable AI inference pipeline that hides cryptographic complexity behind a simple command-line interface while exposing auditable artifacts for reproducibility. We present the design, innovations, and real-world use cases of JSTprove as well as our blueprints and tooling to encourage community review and extension. JSTprove therefore serves both as a usable zkML product for current engineering needs and as a reproducible foundation for future research and production deployments of verifiable AI.

Paperid: 1964, https://arxiv.org/pdf/2510.20739.pdf

Abstract:
Program analysis tools often produce large volumes of candidate vulnerability reports that require costly manual review, creating a practical challenge: how can security analysts prioritize the reports most likely to be true vulnerabilities? This paper investigates whether machine learning can be applied to prioritizing vulnerabilities reported by program analysis tools. We focus on Node.js packages and collect a benchmark of 1,883 Node.js packages, each containing one reported ACE or ACI vulnerability. We evaluate a variety of machine learning approaches, including classical models, graph neural networks (GNNs), large language models (LLMs), and hybrid models that combine GNN and LLMs, trained on data based on a dynamic program analysis tool's output. The top LLM achieves $F_{1} {=} 0.915$, while the best GNN and classical ML models reaching $F_{1} {=} 0.904$. At a less than 7% false-negative rate, the leading model eliminates 66.9% of benign packages from manual review, taking around 60 ms per package. If the best model is tuned to operate at a precision level of 0.8 (i.e., allowing 20% false positives amongst all warnings), our approach can detect 99.2% of exploitable taint flows while missing only 0.8%, demonstrating strong potential for real-world vulnerability triage.

Paperid: 1965, https://arxiv.org/pdf/2510.20061.pdf

Abstract:
AI systems have the potential to produce both benefits and harms, but without rigorous and ongoing adversarial evaluation, AI actors will struggle to assess the breadth and magnitude of the AI risk surface. Researchers from the field of systems design have developed several effective sociotechnical AI evaluation and red teaming techniques targeting bias, hate speech, mis/disinformation, and other documented harm classes. However, as increasingly sophisticated AI systems are released into high-stakes sectors (such as education, healthcare, and intelligence-gathering), our current evaluation and monitoring methods are proving less and less capable of delivering effective oversight. In order to actually deliver responsible AI and to ensure AI's harms are fully understood and its security vulnerabilities mitigated, pioneering new approaches to close this "responsibility gap" are now more urgent than ever. In this paper, we propose one such approach, the cooperative public AI red-teaming exercise, and discuss early results of its prior pilot implementations. This approach is intertwined with CAMLIS itself: the first in-person public demonstrator exercise was held in conjunction with CAMLIS 2024. We review the operational design and results of this exercise, the prior National Institute of Standards and Technology (NIST)'s Assessing the Risks and Impacts of AI (ARIA) pilot exercise, and another similar exercise conducted with the Singapore Infocomm Media Development Authority (IMDA). Ultimately, we argue that this approach is both capable of delivering meaningful results and is also scalable to many AI developing jurisdictions.

Paperid: 1966, https://arxiv.org/pdf/2510.19851.pdf

Abstract:
Recent findings suggest that misaligned models may exhibit deceptive behavior, raising concerns about output trustworthiness. Chain-of-thought (CoT) is a promising tool for alignment monitoring: when models articulate their reasoning faithfully, monitors can detect and mitigate harmful behaviors before undesirable outcomes occur. However, a key uncertainty is: Can models obfuscate their CoT in order to pursue hidden adversarial objectives while evading detection? To answer this question and thus stress-test CoT monitorability, we develop a composable and quantifiable taxonomy of prompts to elicit CoT obfuscation. We evaluate both internal CoT (reasoning traces) and external CoT (prompted reasoning in outputs) using toy tasks and more realistic environments in SHADE-Arena. We show that: (i) CoT monitoring performs accurately and efficiently without obfuscation pressure. (ii) Under strong obfuscation pressure, some models successfully complete adversarial tasks while evading detection. (iii) Models do not obfuscate their internal CoT as much as their external CoT (under prompt pressure). These results suggest that while CoT provides valuable oversight in benign settings, robust deployment requires model-specific stress-testing of monitorability.

Paperid: 1967, https://arxiv.org/pdf/2510.19462.pdf

Abstract:
In this work, we study security of Model Context Protocol (MCP) agent toolchains and their applications in smart homes. We introduce AegisMCP, a protocol-level intrusion detector. Our contributions are: (i) a minimal attack suite spanning instruction-driven escalation, chain-of-tool exfiltration, malicious MCP server registration, and persistence; (ii) NEBULA-Schema (Network-Edge Behavioral Learning for Untrusted LLM Agents), a reusable protocol-level instrumentation that represents MCP activity as a streaming heterogeneous temporal graph over agents, MCP servers, tools, devices, remotes, and sessions; and (iii) a CPU-only streaming detector that fuses novelty, session-DAG structure, and attribute cues for near-real-time edge inference, with optional fusion of local prompt-guardrail signals. On an emulated smart-home testbed spanning multiple MCP stacks and a physical bench, AegisMCP achieves sub-second per-window model inference and end-to-end alerting. The latency of AegisMCP is consistently sub-second on Intel N150-class edge hardware, while outperforming traffic-only and sequence baselines; ablations confirm the importance of DAG and install/permission signals. We release code, schemas, and generators for reproducible evaluation.

Paperid: 1968, https://arxiv.org/pdf/2510.19418.pdf

Abstract:
As the volume of stored data continues to grow, identifying and protecting sensitive information within large repositories becomes increasingly challenging, especially when shared with multiple users with different roles and permissions. This work presents a system architecture for trusted data sharing with policy-driven access control, enabling selective protection of sensitive regions while maintaining scalability. The proposed architecture integrates four core modules that combine automated detection of sensitive regions, post-correction, key management, and access control. Sensitive regions are secured using a hybrid scheme that employs symmetric encryption for efficiency and Attribute-Based Encryption for policy enforcement. The system supports efficient key distribution and isolates key storage to strengthen overall security. To demonstrate its applicability, we evaluate the system on visual datasets, where Privacy-Sensitive Objects in images are automatically detected, reassessed, and selectively encrypted prior to sharing in a data repository. Experimental results show that our system provides effective PSO detection, increases macro-averaged F1 score (5%) and mean Average Precision (10%), and maintains an average policy-enforced decryption time of less than 1 second per image. These results demonstrate the effectiveness, efficiency and scalability of our proposed solution for fine-grained access control.

Paperid: 1969, https://arxiv.org/pdf/2510.19145.pdf

Abstract:
The growing use of third-party hardware accelerators (e.g., FPGAs, ASICs) for deep neural networks (DNNs) introduces new security vulnerabilities. Conventional model-level backdoor attacks, which only poison a model's weights to misclassify inputs with a specific trigger, are often detectable because the entire attack logic is embedded within the model (i.e., software), creating a traceable layer-by-layer activation path. This paper introduces the HArdware-Model Logically Combined Attack (HAMLOCK), a far stealthier threat that distributes the attack logic across the hardware-software boundary. The software (model) is now only minimally altered by tuning the activations of few neurons to produce uniquely high activation values when a trigger is present. A malicious hardware Trojan detects those unique activations by monitoring the corresponding neurons' most significant bit or the 8-bit exponents and triggers another hardware Trojan to directly manipulate the final output logits for misclassification. This decoupled design is highly stealthy, as the model itself contains no complete backdoor activation path as in conventional attacks and hence, appears fully benign. Empirically, across benchmarks like MNIST, CIFAR10, GTSRB, and ImageNet, HAMLOCK achieves a near-perfect attack success rate with a negligible clean accuracy drop. More importantly, HAMLOCK circumvents the state-of-the-art model-level defenses without any adaptive optimization. The hardware Trojan is also undetectable, incurring area and power overheads as low as 0.01%, which is easily masked by process and environmental noise. Our findings expose a critical vulnerability at the hardware-software interface, demanding new cross-layer defenses against this emerging threat.

Paperid: 1970, https://arxiv.org/pdf/2510.19026.pdf

Abstract:
In recent years, the rapid integration of Internet of Things (IoT) devices into the healthcare sector has brought about revolutionary advancements in patient care and data management. While these technological innovations hold immense promise, they concurrently raise critical security concerns, particularly in safeguarding medical data against potential cyber threats. The sensitive nature of health-related information requires robust measures to ensure the confidentiality, integrity, and availability of patient data in IoT-enabled medical environments. Addressing the imperative need for enhanced security in IoT-based healthcare systems, we propose a comprehensive method encompassing three distinct phases. In the first phase, we implement Blockchain-Enabled Request and Transaction Encryption to strengthen data transaction security, providing an immutable and transparent framework. In the second phase, we introduce a Request Pattern Recognition Check that leverages diverse data sources to identify and block potential unauthorized access attempts. Finally, the third phase incorporates Feature Selection and a BiLSTM network to enhance the accuracy and efficiency of intrusion detection using advanced machine learning techniques. We compared the simulation results of the proposed method with three recent related methods: AIBPSF-IoMT, OMLIDS-PBIoT, and AIMMFIDS. The evaluation criteria include detection rate, false alarm rate, precision, recall, and accuracy - crucial benchmarks for assessing the overall performance of intrusion detection systems. Our findings show that the proposed method outperforms existing approaches across all evaluated criteria, demonstrating its effectiveness in improving the security of IoT-based healthcare systems.

Paperid: 1971, https://arxiv.org/pdf/2510.18568.pdf

Abstract:
The integration of Internet of Things (IoT) devices in healthcare has revolutionized patient care by enabling real-time monitoring, personalized treatments, and efficient data management. However, this technological advancement introduces significant security risks, particularly concerning the confidentiality, integrity, and availability of sensitive medical data. Traditional security measures are often insufficient to address the unique challenges posed by IoT environments, such as heterogeneity, resource constraints, and the need for real-time processing. To tackle these challenges, we propose a comprehensive three-phase security framework designed to enhance the security and reliability of IoT-enabled healthcare systems. In the first phase, the framework assesses the reliability of IoT devices using a reputation-based trust estimation mechanism, which combines device behavior analytics with off-chain data storage to ensure scalability. The second phase integrates blockchain technology with a lightweight proof-of-work mechanism, ensuring data immutability, secure communication, and resistance to unauthorized access. The third phase employs a lightweight Long Short-Term Memory (LSTM) model for anomaly detection and classification, enabling real-time identification of cyber threats. Simulation results demonstrate that the proposed framework outperforms existing methods, achieving a 2% increase in precision, accuracy, and recall, a 5% higher attack detection rate, and a 3% reduction in false alarm rate. These improvements highlight the framework's ability to address critical security concerns while maintaining scalability and real-time performance.

Paperid: 1972, https://arxiv.org/pdf/2510.17621.pdf

Abstract:
Federated Learning (FL) enables collaborative training of Machine Learning (ML) models across multiple clients while preserving their privacy. Rather than sharing raw data, federated clients transmit locally computed updates to train the global model. Although this paradigm should provide stronger privacy guarantees than centralized ML, client updates remain vulnerable to privacy leakage. Adversaries can exploit them to infer sensitive properties about the training data or even to reconstruct the original inputs via Gradient Inversion Attacks (GIAs). Under the honest-butcurious threat model, GIAs attempt to reconstruct training data by reversing intermediate updates using optimizationbased techniques. We observe that these approaches usually reconstruct noisy approximations of the original inputs, whose quality can be enhanced with specialized denoising models. This paper presents Gradient Update Inversion with DEnoising (GUIDE), a novel methodology that leverages diffusion models as denoising tools to improve image reconstruction attacks in FL. GUIDE can be integrated into any GIAs that exploits surrogate datasets, a widely adopted assumption in GIAs literature. We comprehensively evaluate our approach in two attack scenarios that use different FL algorithms, models, and datasets. Our results demonstrate that GUIDE integrates seamlessly with two state-ofthe- art GIAs, substantially improving reconstruction quality across multiple metrics. Specifically, GUIDE achieves up to 46% higher perceptual similarity, as measured by the DreamSim metric.

Paperid: 1973, https://arxiv.org/pdf/2510.17333.pdf

Abstract:
Encrypted controllers using homomorphic encryption have proven to guarantee the privacy of measurement and control signals, as well as system and controller parameters, while regulating the system as intended. However, encrypting dynamic controllers has remained a challenge due to growing noise and overflow issues in the encoding. In this paper, we review recent approaches to dynamic encrypted control, such as bootstrapping, periodic resets of the controller state, integer reformulations, and FIR controllers, and equip them with a stability and performance analysis to evaluate their suitability. We complement the analysis with a numerical performance comparison on a benchmark system.

Paperid: 1974, https://arxiv.org/pdf/2510.16593.pdf

Abstract:
This paper presents DESTinE Block, a blockchain-based data storage framework designed for power systems and optimized for resource-constrained environments, including grid-edge devices such as single-board computers. The proposed architecture leverages the InterPlanetary File System (IPFS) for storing large files while maintaining secure and traceable metadata on a custom blockchain named DESTinE Block. The metadata, comprising the IPFS Content Identifier (CID), uploader identity, administrator verification, and timestamp; is immutably recorded on-chain to ensure authenticity and integrity. DESTinE Block adopts a dual-blockchain abstraction, where the blockchain remains unaware of the IPFS storage layer to enhance security and limit the exposure of sensitive file data. The consensus mechanism is based on Proof of Authority (PoA), where both an administrator and an uploader with distinct cryptographic key pairs are required to create a block collaboratively. Each block contains verified signatures of both parties and is designed to be computationally efficient, enabling deployment on devices like the Raspberry Pi 5. The framework was tested on both an x86-based device and an ARM64-based Raspberry Pi, demonstrating its potential for secure, decentralized logging and measurement storage in smart grid applications. Moreover, DESTinE Block is compared with a similar framework based on Multichain. The results indicate that DESTinE Block provides a promising solution for tamper-evident data retention in distributed power system infrastructure while maintaining minimal hardware requirements.

Paperid: 1975, https://arxiv.org/pdf/2510.16219.pdf

Abstract:
Malicious agents pose significant threats to the reliability and decision-making capabilities of Multi-Agent Systems (MAS) powered by Large Language Models (LLMs). Existing defenses often fall short due to reactive designs or centralized architectures which may introduce single points of failure. To address these challenges, we propose SentinelNet, the first decentralized framework for proactively detecting and mitigating malicious behaviors in multi-agent collaboration. SentinelNet equips each agent with a credit-based detector trained via contrastive learning on augmented adversarial debate trajectories, enabling autonomous evaluation of message credibility and dynamic neighbor ranking via bottom-k elimination to suppress malicious communications. To overcome the scarcity of attack data, it generates adversarial trajectories simulating diverse threats, ensuring robust training. Experiments on MAS benchmarks show SentinelNet achieves near-perfect detection of malicious agents, close to 100% within two debate rounds, and recovers 95% of system accuracy from compromised baselines. By exhibiting strong generalizability across domains and attack patterns, SentinelNet establishes a novel paradigm for safeguarding collaborative MAS.

Paperid: 1976, https://arxiv.org/pdf/2510.15303.pdf

Abstract:
Large web-scale datasets have driven the rapid advancement of pre-trained language models (PLMs), but unauthorized data usage has raised serious copyright concerns. Existing dataset ownership verification (DOV) methods typically assume that watermarks remain stable during inference; however, this assumption often fails under natural noise and adversary-crafted perturbations. We propose the first certified dataset ownership verification method for PLMs based on dual-space smoothing (i.e., DSSmoothing). To address the challenges of text discreteness and semantic sensitivity, DSSmoothing introduces continuous perturbations in the embedding space to capture semantic robustness and applies controlled token reordering in the permutation space to capture sequential robustness. DSSmoothing consists of two stages: in the first stage, triggers are collaboratively embedded in both spaces to generate norm-constrained and robust watermarked datasets; in the second stage, randomized smoothing is applied in both spaces during verification to compute the watermark robustness (WR) of suspicious models and statistically compare it with the principal probability (PP) values of a set of benign models. Theoretically, DSSmoothing provides provable robustness guarantees for dataset ownership verification by ensuring that WR consistently exceeds PP under bounded dual-space perturbations. Extensive experiments on multiple representative web datasets demonstrate that DSSmoothing achieves stable and reliable verification performance and exhibits robustness against potential adaptive attacks.

Paperid: 1977, https://arxiv.org/pdf/2510.15112.pdf

Abstract:
With the exponential growth in mobile applications, protecting user privacy has become even more crucial. Android applications are often known for collecting, storing, and sharing sensitive user information such as contacts, location, camera, and microphone data often without the user's clear consent or awareness raising significant privacy risks and exposure. In the context of privacy assessment, dataflow analysis is particularly valuable for identifying data usage and potential leaks. Traditionally, this type of analysis has relied on formal methods, heuristics, and rule-based matching. However, these techniques are often complex to implement and prone to errors, such as taint explosion for large programs. Moreover, most existing Android dataflow analysis methods depend heavily on predefined list of sinks, limiting their flexibility and scalability. To address the limitations of these existing techniques, we propose AndroByte, an AI-driven privacy analysis tool that leverages LLM reasoning on bytecode summarization to dynamically generate accurate and explainable dataflow call graphs from static code analysis. AndroByte achieves a significant F\b{eta}-Score of 89% in generating dynamic dataflow call graphs on the fly, outperforming the effectiveness of traditional tools like FlowDroid and Amandroid in leak detection without relying on predefined propagation rules or sink lists. Moreover, AndroByte's iterative bytecode summarization provides comprehensive and explainable insights into dataflow and leak detection, achieving high, quantifiable scores based on the G-Eval metric.

Paperid: 1978, https://arxiv.org/pdf/2510.14844.pdf

Abstract:
Machine Unlearning aims to remove specific data from trained models, addressing growing privacy and ethical concerns. We provide a theoretical analysis of a simple and widely used method - gradient ascent - used to reverse the influence of a specific data point without retraining from scratch. Leveraging the implicit bias of gradient descent towards solutions that satisfy the Karush-Kuhn-Tucker (KKT) conditions of a margin maximization problem, we quantify the quality of the unlearned model by evaluating how well it satisfies these conditions w.r.t. the retained data. To formalize this idea, we propose a new success criterion, termed \textbf{$(ε, δ, τ)$-successful} unlearning, and show that, for both linear models and two-layer neural networks with high dimensional data, a properly scaled gradient-ascent step satisfies this criterion and yields a model that closely approximates the retrained solution on the retained data. We also show that gradient ascent performs successful unlearning while still preserving generalization in a synthetic Gaussian-mixture setting.

Paperid: 1979, https://arxiv.org/pdf/2510.14133.pdf

Abstract:
Agentic AI systems, which leverage multiple autonomous agents and Large Language Models (LLMs), are increasingly used to address complex, multi-step tasks. The safety, security, and functionality of these systems are critical, especially in high-stakes applications. However, the current ecosystem of inter-agent communication is fragmented, with protocols such as the Model Context Protocol (MCP) for tool access and the Agent-to-Agent (A2A) protocol for coordination being analyzed in isolation. This fragmentation creates a semantic gap that prevents the rigorous analysis of system properties and introduces risks such as architectural misalignment and exploitable coordination issues. To address these challenges, we introduce a modeling framework for agentic AI systems composed of two foundational models. The first, the host agent model, formalizes the top-level entity that interacts with the user, decomposes tasks, and orchestrates their execution by leveraging external agents and tools. The second, the task lifecycle model, details the states and transitions of individual sub-tasks from creation to completion, providing a fine-grained view of task management and error handling. Together, these models provide a unified semantic framework for reasoning about the behavior of multi-AI agent systems. Grounded in this framework, we define 17 properties for the host agent and 14 for the task lifecycle, categorized into liveness, safety, completeness, and fairness. Expressed in temporal logic, these properties enable formal verification of system behavior, detection of coordination edge cases, and prevention of deadlocks and security vulnerabilities. Through this effort, we introduce the first rigorously grounded, domain-agnostic framework for the systematic analysis, design, and deployment of correct, reliable, and robust agentic AI systems.

Paperid: 1980, https://arxiv.org/pdf/2510.13322.pdf

Abstract:
Backdoor attacks pose a persistent security risk to deep neural networks (DNNs) due to their stealth and durability. While recent research has explored leveraging model unlearning mechanisms to enhance backdoor concealment, existing attack strategies still leave persistent traces that may be detected through static analysis. In this work, we introduce the first paradigm of revocable backdoor attacks, where the backdoor can be proactively and thoroughly removed after the attack objective is achieved. We formulate the trigger optimization in revocable backdoor attacks as a bilevel optimization problem: by simulating both backdoor injection and unlearning processes, the trigger generator is optimized to achieve a high attack success rate (ASR) while ensuring that the backdoor can be easily erased through unlearning. To mitigate the optimization conflict between injection and removal objectives, we employ a deterministic partition of poisoning and unlearning samples to reduce sampling-induced variance, and further apply the Projected Conflicting Gradient (PCGrad) technique to resolve the remaining gradient conflicts. Experiments on CIFAR-10 and ImageNet demonstrate that our method maintains ASR comparable to state-of-the-art backdoor attacks, while enabling effective removal of backdoor behavior after unlearning. This work opens a new direction for backdoor attack research and presents new challenges for the security of machine learning systems.

Paperid: 1981, https://arxiv.org/pdf/2510.11895.pdf

Abstract:
We study statistical estimation under local differential privacy (LDP) when users may hold heterogeneous privacy levels and accuracy must be guaranteed with high probability. Departing from the common in-expectation analyses, and for one-dimensional and multi-dimensional mean estimation problems, we develop finite sample upper bounds in $\ell_2$-norm that hold with probability at least $1-β$. We complement these results with matching minimax lower bounds, establishing the optimality (up to constants) of our guarantees in the heterogeneous LDP regime. We further study distribution learning in $\ell_\infty$-distance, designing an algorithm with high-probability guarantees under heterogeneous privacy demands. Our techniques offer principled guidance for designing mechanisms in settings with user-specific privacy levels.

Paperid: 1982, https://arxiv.org/pdf/2510.11224.pdf

Abstract:
Threshold-Computation-in-the-Head (TCitH) and VOLE-in-the-Head (VOLEitH), two recent developments of the MPC-in-the-Head (MPCitH) paradigm, have significantly improved the performance of digital signature schemes in this framework. In this note, we embed the restricted decoding problem within these frameworks. We propose a structurally simple modeling that achieves competitive signature sizes. Specifically, by instantiating the restricted decoding problem with the same hardness assumption underlying CROSS, we reduce sizes by more than a factor of two compared to the NIST submission. Moreover, we observe that ternary full-weight decoding, closely related to the hardness assumption underlying WAVE, is a restricted decoding problem. Using ternary full-weight decoding, we obtain signature sizes comparable to the smallest MPCitH-based candidates in the NIST competition.

Paperid: 1983, https://arxiv.org/pdf/2510.11116.pdf

Abstract:
Local Differential Privacy (LDP) addresses significant privacy concerns in sensitive data collection. In this work, we focus on numerical data collection under LDP, targeting a significant gap in the literature: existing LDP mechanisms are optimized for either a very small ($|Ω| \in \{2, 3\}$) or infinite output spaces. However, no generalized method for constructing an optimal mechanism for an arbitrary output size $N$ exists. To fill this gap, we propose the \textbf{N-output mechanism}, a generalized framework that maps numerical data to one of $N$ discrete outputs. We formulate the mechanism's design as an optimization problem to minimize estimation variance for any given $N \geq 2$ and develop both numerical and analytical solutions. This results in a mechanism that is highly accurate and adaptive, as its design is determined by solving an optimization problem for any chosen $N$. Furthermore, we extend our framework and existing mechanisms to the task of distribution estimation. Empirical evaluations show that the N-output mechanism achieves state-of-the-art accuracy for mean, variance, and distribution estimation with small communication costs.

Paperid: 1984, https://arxiv.org/pdf/2510.10486.pdf

Abstract:
Open-source large language models (LLMs) have demonstrated considerable dominance over proprietary LLMs in resolving neural processing tasks, thanks to the collaborative and sharing nature. Although full access to source codes, model parameters, and training data lays the groundwork for transparency, we argue that such a full-access manner is vulnerable to stego attacks, and their ill-effects are not fully understood. In this paper, we conduct a systematic formalization for stego attacks on open-source LLMs by enumerating all possible threat models associated with adversary objectives, knowledge, and capabilities. Therein, the threat posed by adversaries with internal knowledge, who inject payloads and triggers during the model sharing phase, is of practical interest. We go even further and propose the first stego attack on open-source LLMs, dubbed SASER, which wields impacts through identifying targeted parameters, embedding payloads, injecting triggers, and executing payloads sequentially. Particularly, SASER enhances the attack robustness against quantization-based local deployment by de-quantizing the embedded payloads. In addition, to achieve stealthiness, SASER devises the performance-aware importance metric to identify targeted parameters with the least degradation of model performance. Extensive experiments on LlaMA2-7B and ChatGLM3-6B, without quantization, show that the stealth rate of SASER outperforms existing stego attacks (for general DNNs) by up to 98.1%, while achieving the same attack success rate (ASR) of 100%. More importantly, SASER improves ASR on quantized models from 0 to 100% in all settings. We appeal for investigations on countermeasures against SASER in view of the significant attack effectiveness.

Paperid: 1985, https://arxiv.org/pdf/2510.09984.pdf

Abstract:
Software systems can be represented as graphs, capturing dependencies among functions and processes. An interesting aspect of software systems is that they can be represented as different types of graphs, depending on the extraction goals and priorities. For example, function calls within the software can be captured to create function call graphs, which highlight the relationships between functions and their dependencies. Alternatively, the processes spawned by the software can be modeled to generate process interaction graphs, which focus on runtime behavior and inter-process communication. While these graph representations are related, each captures a distinct perspective of the system, providing complementary insights into its structure and operation. While previous studies have leveraged graph neural networks (GNNs) to analyze software behaviors, most of this work has focused on a single type of graph representation. The joint modeling of both function call graphs and process interaction graphs remains largely underexplored, leaving opportunities for deeper, multi-perspective analysis of software systems. This paper presents a pipeline for constructing and training Function Call Graphs (FCGs) and Process Call Graphs (PCGs) and learning joint embeddings. We demonstrate that joint embeddings outperform a single-graph model. In this paper, we propose GeminiNet, a unified neural network approach that learns joint embeddings from both FCGs and PCGs. We construct a new dataset of 635 Windows executables (318 malicious and 317 benign), extracting FCGs via Ghidra and PCGs via Any.Run sandbox. GeminiNet employs dual graph convolutional branches with an adaptive gating mechanism that balances contributions from static and dynamic views.

Paperid: 1986, https://arxiv.org/pdf/2510.09485.pdf

Abstract:
We study the problem of sampling from a distribution under local differential privacy (LDP). Given a private distribution $P \in \mathcal{P}$, the goal is to generate a single sample from a distribution that remains close to $P$ in $f$-divergence while satisfying the constraints of LDP. This task captures the fundamental challenge of producing realistic-looking data under strong privacy guarantees. While prior work by Park et al. (NeurIPS'24) focuses on global minimax-optimality across a class of distributions, we take a local perspective. Specifically, we examine the minimax risk in a neighborhood around a fixed distribution $P_0$, and characterize its exact value, which depends on both $P_0$ and the privacy level. Our main result shows that the local minimax risk is determined by the global minimax risk when the distribution class $\mathcal{P}$ is restricted to a neighborhood around $P_0$. To establish this, we (1) extend previous work from pure LDP to the more general functional LDP framework, and (2) prove that the globally optimal functional LDP sampler yields the optimal local sampler when constrained to distributions near $P_0$. Building on this, we also derive a simple closed-form expression for the locally minimax-optimal samplers which does not depend on the choice of $f$-divergence. We further argue that this local framework naturally models private sampling with public data, where the public data distribution is represented by $P_0$. In this setting, we empirically compare our locally optimal sampler to existing global methods, and demonstrate that it consistently outperforms global minimax samplers.

Paperid: 1987, https://arxiv.org/pdf/2510.09093.pdf

Abstract:
Large language models (LLMs) are now routinely used to autonomously execute complex tasks, from natural language processing to dynamic workflows like web searches. The usage of tool-calling and Retrieval Augmented Generation (RAG) allows LLMs to process and retrieve sensitive corporate data, amplifying both their functionality and vulnerability to abuse. As LLMs increasingly interact with external data sources, indirect prompt injection emerges as a critical and evolving attack vector, enabling adversaries to exploit models through manipulated inputs. Through a systematic evaluation of indirect prompt injection attacks across diverse models, we analyze how susceptible current LLMs are to such attacks, which parameters, including model size and manufacturer, specific implementations, shape their vulnerability, and which attack methods remain most effective. Our results reveal that even well-known attack patterns continue to succeed, exposing persistent weaknesses in model defenses. To address these vulnerabilities, we emphasize the need for strengthened training procedures to enhance inherent resilience, a centralized database of known attack vectors to enable proactive defense, and a unified testing framework to ensure continuous security validation. These steps are essential to push developers toward integrating security into the core design of LLMs, as our findings show that current models still fail to mitigate long-standing threats.

Paperid: 1988, https://arxiv.org/pdf/2510.08725.pdf

Abstract:
Block ciphers are versatile cryptographic ingredients that are used in a wide range of applications ranging from secure Internet communications to disk encryption. While post-quantum security of public-key cryptography has received significant attention, the case of symmetric-key cryptography (and block ciphers in particular) remains a largely unexplored topic. In this work, we set the foundations for a theory of post-quantum security for block ciphers and associated constructions. Leveraging our new techniques, we provide the first post-quantum security proofs for the key-length extension scheme FX, the tweakable block ciphers LRW and XEX, and most block cipher encryption and authentication modes. Our techniques can be used for security proofs in both the plain model and the quantum ideal cipher model. Our work takes significant initial steps in establishing a rigorous understanding of the post-quantum security of practical symmetric-key cryptography.

Paperid: 1989, https://arxiv.org/pdf/2510.08473.pdf

Abstract:
The assumed hardness of the Shortest Vector Problem in high-dimensional lattices is one of the cornerstones of post-quantum cryptography. The fastest known heuristic attacks on SVP are via so-called sieving methods. While these still take exponential time in the dimension $d$, they are significantly faster than non-heuristic approaches and their heuristic assumptions are verified by extensive experiments. $k$-Tuple sieving is an iterative method where each iteration takes as input a large number of lattice vectors of a certain norm, and produces an equal number of lattice vectors of slightly smaller norm, by taking sums and differences of $k$ of the input vectors. Iterating these ''sieving steps'' sufficiently many times produces a short lattice vector. The fastest attacks (both classical and quantum) are for $k=2$, but taking larger $k$ reduces the amount of memory required for the attack. In this paper we improve the quantum time complexity of 3-tuple sieving from $2^{0.3098 d}$ to $2^{0.2846 d}$, using a two-level amplitude amplification aided by a preprocessing step that associates the given lattice vectors with nearby ''center points'' to focus the search on the neighborhoods of these center points. Our algorithm uses $2^{0.1887d}$ classical bits and QCRAM bits, and $2^{o(d)}$ qubits. This is the fastest known quantum algorithm for SVP when total memory is limited to $2^{0.1887d}$.

Paperid: 1990, https://arxiv.org/pdf/2510.07334.pdf

Abstract:
Quantum computing is rapidly emerging as one of the most transformative technologies of our time. With the potential to tackle problems that remain intractable for even the most powerful classical supercomputers, quantum hardware has advanced at an extraordinary pace. Today, major platforms such as IBM Quantum, Amazon Braket, and Microsoft Azure provide cloud-based access to quantum processors, making them more widely available than ever before. While a promising technology, quantum computing is not magically immune to security threats. Much research has been done on post-quantum cryptography, which addresses how to protect classical computers from attackers using quantum computers. This article meanwhile introduces the dual idea of quantum computer security: how to protect quantum computers from security attacks.

Paperid: 1991, https://arxiv.org/pdf/2510.06468.pdf

Abstract:
We present BATTLE for Bitcoin, a DoS-resilient dispute layer that secures optimistic bridges between Bitcoin and rollups or sidechains. Our design adapts the BATTLE tournament protocol to Bitcoin's UTXO model using BitVM-style FLEX components and garbled circuits with on-demand L1 security bonds. Disputes are resolved in logarithmic rounds while recycling rewards, keeping the honest asserter's minimum initial capital constant even under many permissionless challengers. The construction is fully contestable (challengers can supply higher-work counter-proofs) and relies only on standard timelocks and pre-signed transaction DAGs, without new opcodes. For $N$ operators, the protocol requires $O(N^2)$ pre-signed transactions, signatures, and message exchanges, yet remains practical at $N\!\gtrsim\!10^3$, enabling high decentralization.

Paperid: 1992, https://arxiv.org/pdf/2510.05936.pdf

Abstract:
Provenance in scientific workflows is essential for understand- ing and reproducing processes, while in business processes, it can ensure compliance and correctness and facilitates process mining. However, the provenance of process adaptations, especially modifications during execu- tion, remains insufficiently addressed. A review of the literature reveals a lack of systematic approaches for capturing provenance information about adaptive workflows/processes. To fill this gap, we propose the AdProv method for collecting, storing, retrieving, and visualizing prove- nance of runtime workflow adaptations. In addition to the definition of the AdProv method in terms of steps and concepts like change events, we also present an architecture for a Provenance Holder service that is essential for implementing the method. To ensure semantic consistency and interoperability we define a mapping to the ontology PROV Ontol- ogy (PROV-O). Additionally, we extend the XES standard with elements for adaptation logging. Our main contributions are the AdProv method and a comprehensive framework and its tool support for managing adap- tive workflow provenance, facilitating advanced provenance tracking and analysis for different application domains.

Paperid: 1993, https://arxiv.org/pdf/2510.05803.pdf

Abstract:
The Five Safes is a framework used by national statistical offices (NSO) for assessing and managing the disclosure risk of data sharing. This paper makes two points: Firstly, the Five Safes can be understood as a specialization of a broader concept $\unicode{x2013}$ contextual integrity $\unicode{x2013}$ to the situation of statistical dissemination by an NSO. We demonstrate this by mapping the five parameters of contextual integrity onto the five dimensions of the Five Safes. Secondly, the Five Safes contextualizes narrow, technical notions of privacy within a holistic risk assessment. We demonstrate this with the example of differential privacy (DP). This contextualization allows NSOs to place DP within their Five Safes toolkit while also guiding the design of DP implementations within the broader privacy context, as delineated by both their regulation and the relevant social norms.

Paperid: 1994, https://arxiv.org/pdf/2510.03752.pdf

Abstract:
We construct a public-key encryption scheme from the hardness of the (planted) MinRank problem over uniformly random instances. This corresponds to the hardness of decoding random linear rank-metric codes. Existing constructions of public-key encryption from such problems require hardness for structured instances arising from the masking of efficiently decodable codes. Central to our construction is the development of a new notion of duality for rank-metric codes.

Paperid: 1995, https://arxiv.org/pdf/2510.03461.pdf

Abstract:
A resource leak occurs when a program fails to release a finite resource like a socket, file descriptor or database connection. While sound static analysis tools can detect all leaks, automatically repairing them remains challenging. Prior work took the output of a detection tool and attempted to repair only leaks from a hard-coded list of library resource types. That approach limits the scope of repairable leaks: real-world code uses resource wrappers that store a resource in a field and must themselves be closed. This paper makes four key contributions to improve resource leak repair in the presence of wrappers. (1) It integrates inference of resource management specifications into the repair pipeline, enabling extant fixing approaches to reason about wrappers. (2) It transforms programs into variants that are easier to analyze, making inference, detection, and fixing tools more effective; for instance, it makes detection tools report problems closer to the root cause, often in a client of a resource wrapper rather than within the wrapper class itself. (3) A novel field containment analysis reasons about resource lifetimes, enabling repair of more leaks involving resources stored in fields. (4) It introduces a new repair pattern and more precise reasoning to better handle resources stored in non-final fields. Prior work fixed 41% of resource leak warnings in the NJR benchmark suite; our implementation Arodnap fixes 68%.

Paperid: 1996, https://arxiv.org/pdf/2510.02519.pdf

Abstract:
We present TLoRa, an end-to-end architecture for HTTPS communication over LoRa by integrating TCP tunneling and a complete TLS 1.3 handshake. It enables a seamless and secure communication channel between WiFi-enabled end devices and the Internet over LoRa using an End Hub (EH) and a Net Relay (NR). The EH tethers a WiFi hotspot and a captive portal for user devices to connect and request URLs. The EH forwards the requested URLs to the NR using a secure tunnel over LoRa. The NR, which acts as a server-side proxy, receives and resolves the request from the Internet-based server. It then relays back the encrypted response from the server over the same secure tunnel. TLoRa operates in three phases -session setup, secure tunneling, and rendering. In the first phase, it manages the TCP socket and initiates the TLS handshake. In the second, it creates a secure tunnel and transfers encrypted TLS data over LoRa. Finally, it delivers the URL content to the user. TLoRa also implements a lightweight TLS record reassembly layer and a queuing mechanism for session multiplexing. We evaluate TLoRa on real hardware using multiple accesses to a web API. Results indicate that it provides a practical solution by successfully establishing a TLS session over LoRa in 9.9 seconds and takes 3.58 seconds to fulfill API requests. To the best of our knowledge, this is the first work to comprehensively design, implement, and evaluate the performance of HTTPS access over LoRa using full TLS.

Paperid: 1997, https://arxiv.org/pdf/2510.02169.pdf

Abstract:
The growing integration of open-source software and AI-driven technologies has introduced new layers of complexity into the software supply chain, challenging existing methods for dependency management and system assurance. While Software Bills of Materials (SBOMs) have become critical for enhancing transparency and traceability, current frameworks fall short in capturing the unique characteristics of AI systems -- namely, their dynamic, data-driven nature and the loosely coupled dependencies across datasets, models, and software components. These challenges are compounded by fragmented governance structures and the lack of robust tools for ensuring integrity, trust, and compliance in AI-enabled environments. In this paper, we introduce Trusted AI Bill of Materials (TAIBOM) -- a novel framework extending SBOM principles to the AI domain. TAIBOM provides (i) a structured dependency model tailored for AI components, (ii) mechanisms for propagating integrity statements across heterogeneous AI pipelines, and (iii) a trust attestation process for verifying component provenance. We demonstrate how TAIBOM supports assurance, security, and compliance across AI workflows, highlighting its advantages over existing standards such as SPDX and CycloneDX. This work lays the foundation for trustworthy and verifiable AI systems through structured software transparency.

Paperid: 1998, https://arxiv.org/pdf/2510.01350.pdf

Abstract:
Memristive crossbar arrays enable in-memory computing by performing parallel analog computations directly within memory, making them well-suited for machine learning, neural networks, and neuromorphic systems. However, despite their advantages, non-volatile memristors are vulnerable to security threats (such as adversarial extraction of stored weights when the hardware is compromised. Protecting these weights is essential since they represent valuable intellectual property resulting from lengthy and costly training processes using large, often proprietary, datasets. As a solution we propose two security mechanisms: Keyed Permutor and Watermark Protection Columns; where both safeguard critical weights and establish verifiable ownership (even in cases of data leakage). Our approach integrates efficiently with existing memristive crossbar architectures without significant design modifications. Simulations across 45nm, 22nm, and 7nm CMOS nodes, using a realistic interconnect model and a large RF dataset, show that both mechanisms offer robust protection with under 10% overhead in area, delay and power. We also present initial experiments employing the widely known MNIST dataset; further highlighting the feasibility of securing memristive in-memory computing systems with minimal performance trade-offs.

Paperid: 1999, https://arxiv.org/pdf/2510.01223.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities in various tasks. However, they remain exposed to jailbreak attacks, eliciting harmful responses. The nested scenario strategy has been increasingly adopted across various methods, demonstrating immense potential. Nevertheless, these methods are easily detectable due to their prominent malicious intentions. In this work, we are the first to find and systematically verify that LLMs' alignment defenses are not sensitive to nested scenarios, where these scenarios are highly semantically relevant to the queries and incorporate targeted toxic knowledge. This is a crucial yet insufficiently explored direction. Based on this, we propose RTS-Attack (Semantically Relevant Nested Scenarios with Targeted Toxic Knowledge), an adaptive and automated framework to examine LLMs' alignment. By building scenarios highly relevant to the queries and integrating targeted toxic knowledge, RTS-Attack bypasses the alignment defenses of LLMs. Moreover, the jailbreak prompts generated by RTS-Attack are free from harmful queries, leading to outstanding concealment. Extensive experiments demonstrate that RTS-Attack exhibits superior performance in both efficiency and universality compared to the baselines across diverse advanced LLMs, including GPT-4o, Llama3-70b, and Gemini-pro. Our complete code is available in the supplementary material. WARNING: THIS PAPER CONTAINS POTENTIALLY HARMFUL CONTENT.

Paperid: 2000, https://arxiv.org/pdf/2510.00167.pdf

Abstract:
Autonomous drones must often respond to sudden events, such as alarms, faults, or unexpected changes in their environment, that require immediate and adaptive decision-making. Traditional approaches rely on safety engineers hand-coding large sets of recovery rules, but this strategy cannot anticipate the vast range of real-world contingencies and quickly becomes incomplete. Recent advances in embodied AI, powered by large visual language models, provide commonsense reasoning to assess context and generate appropriate actions in real time. We demonstrate this capability in a simulated urban benchmark in the Unreal Engine, where drones dynamically interpret their surroundings and decide on sudden maneuvers for safe landings. Our results show that embodied AI makes possible a new class of adaptive recovery and decision-making pipelines that were previously infeasible to design by hand, advancing resilience and safety in autonomous aerial systems.

Paperid: 2001, https://arxiv.org/pdf/2510.00151.pdf

Abstract:
AI models are often regarded as valuable intellectual property due to the high cost of their development, the competitive advantage they provide, and the proprietary techniques involved in their creation. As a result, AI model stealing attacks pose a serious concern for AI model providers. In this work, we present a novel attack targeting wireless devices equipped with AI hardware accelerators. The attack unfolds in two phases. In the first phase, the victim's device is compromised with a hardware Trojan (HT) designed to covertly leak model weights through a hidden communication channel, without the victim realizing it. In the second phase, the adversary uses a nearby wireless device to intercept the victim's transmission frames during normal operation and incrementally reconstruct the complete weight matrix. The proposed attack is agnostic to both the AI model architecture and the hardware accelerator used. We validate our approach through a hardware-based demonstration involving four diverse AI models of varying types and sizes. We detail the design of the HT and the covert channel, highlighting their stealthy nature. Additionally, we analyze the impact of bit error rates on the reception and propose an error mitigation technique. The effectiveness of the attack is evaluated based on the accuracy of the reconstructed models with stolen weights and the time required to extract them. Finally, we explore potential defense mechanisms.

Paperid: 2002, https://arxiv.org/pdf/2509.23882.pdf

Abstract:
OpenAI's GPT-OSS family provides open-weight language models with explicit chain-of-thought (CoT) reasoning and a Harmony prompt format. We summarize an extensive security evaluation of GPT-OSS-20B that probes the model's behavior under different adversarial conditions. Using the Jailbreak Oracle (JO) [1], a systematic LLM evaluation tool, the study uncovers several failure modes including quant fever, reasoning blackholes, Schrodinger's compliance, reasoning procedure mirage, and chain-oriented prompting. Experiments demonstrate how these behaviors can be exploited on the GPT-OSS-20B model, leading to severe consequences.

Paperid: 2003, https://arxiv.org/pdf/2509.22060.pdf

Abstract:
Recent studies have demonstrated the vulnerability of Automatic Speech Recognition systems to adversarial examples, which can deceive these systems into misinterpreting input speech commands. While previous research has primarily focused on white-box attacks with constrained optimizations, and transferability based black-box attacks against commercial Automatic Speech Recognition devices, this paper explores cost efficient white-box attack and non transferability black-box adversarial attacks on Automatic Speech Recognition systems, drawing insights from approaches such as Fast Gradient Sign Method and Zeroth-Order Optimization. Further, the novelty of the paper includes how poisoning attack can degrade the performances of state-of-the-art models leading to misinterpretation of audio signals. Through experimentation and analysis, we illustrate how hybrid models can generate subtle yet impactful adversarial examples with very little perturbation having Signal Noise Ratio of 35dB that can be generated within a minute. These vulnerabilities of state-of-the-art open source model have practical security implications, and emphasize the need for adversarial security.

Paperid: 2004, https://arxiv.org/pdf/2509.21634.pdf

Abstract:
The evolution toward 6G networks is being accelerated by the Open Radio Access Network (O-RAN) paradigm -- an open, interoperable architecture that enables intelligent, modular applications across public telecom and private enterprise domains. While this openness creates unprecedented opportunities for innovation, it also expands the attack surface, demanding resilient, low-cost, and autonomous security solutions. Legacy defenses remain largely reactive, labor-intensive, and inadequate for the scale and complexity of next-generation systems. Current O-RAN applications focus mainly on network optimization or passive threat detection, with limited capability for closed-loop, automated response. To address this critical gap, we present an agentic AI framework for fully automated, end-to-end threat mitigation in 6G O-RAN environments. MobiLLM orchestrates security workflows through a modular multi-agent system powered by Large Language Models (LLMs). The framework features a Threat Analysis Agent for real-time data triage, a Threat Classification Agent that uses Retrieval-Augmented Generation (RAG) to map anomalies to specific countermeasures, and a Threat Response Agent that safely operationalizes mitigation actions via O-RAN control interfaces. Grounded in trusted knowledge bases such as the MITRE FiGHT framework and 3GPP specifications, and equipped with robust safety guardrails, MobiLLM provides a blueprint for trustworthy AI-driven network security. Initial evaluations demonstrate that MobiLLM can effectively identify and orchestrate complex mitigation strategies, significantly reducing response latency and showcasing the feasibility of autonomous security operations in 6G.

Paperid: 2005, https://arxiv.org/pdf/2509.20714.pdf

Abstract:
In this paper we show that cryptographic backdoors in a neural network (NN) can be highly effective in two directions, namely mounting the attacks as well as in presenting the defenses as well. On the attack side, a carefully planted cryptographic backdoor enables powerful and invisible attack on the NN. Considering the defense, we present applications: first, a provably robust NN watermarking scheme; second, a protocol for guaranteeing user authentication; and third, a protocol for tracking unauthorized sharing of the NN intellectual property (IP). From a broader theoretical perspective, borrowing the ideas from Goldwasser et. al. [FOCS 2022], our main contribution is to show that all these instantiated practical protocol implementations are provably robust. The protocols for watermarking, authentication and IP tracking resist an adversary with black-box access to the NN, whereas the backdoor-enabled adversarial attack is impossible to prevent under the standard assumptions. While the theoretical tools used for our attack is mostly in line with the Goldwasser et. al. ideas, the proofs related to the defense need further studies. Finally, all these protocols are implemented on state-of-the-art NN architectures with empirical results corroborating the theoretical claims. Further, one can utilize post-quantum primitives for implementing the cryptographic backdoors, laying out foundations for quantum-era applications in machine learning (ML).

Paperid: 2006, https://arxiv.org/pdf/2509.20537.pdf

Abstract:
Altered fingerprint recognition (AFR) is challenging for biometric verification in applications such as border control, forensics, and fiscal admission. Adversaries can deliberately modify ridge patterns to evade detection, so robust recognition of altered prints is essential. We present DeepAFRNet, a deep learning recognition model that matches and recognizes distorted fingerprint samples. The approach uses a VGG16 backbone to extract high-dimensional features and cosine similarity to compare embeddings. We evaluate on the SOCOFing Real-Altered subset with three difficulty levels (Easy, Medium, Hard). With strict thresholds, DeepAFRNet achieves accuracies of 96.7 percent, 98.76 percent, and 99.54 percent for the three levels. A threshold-sensitivity study shows that relaxing the threshold from 0.92 to 0.72 sharply degrades accuracy to 7.86 percent, 27.05 percent, and 29.51 percent, underscoring the importance of threshold selection in biometric systems. By using real altered samples and reporting per-level metrics, DeepAFRNet addresses limitations of prior work based on synthetic alterations or limited verification protocols, and indicates readiness for real-world deployments where both security and recognition resilience are critical.

Paperid: 2007, https://arxiv.org/pdf/2509.20476.pdf

Abstract:
Federated Learning (FL) enables collaborative model training while preserving data privacy by keeping raw data locally stored on client devices, preventing access from other clients or the central server. However, recent studies reveal that sharing model gradients creates vulnerability to Model Inversion Attacks, particularly Deep Leakage from Gradients (DLG), which reconstructs private training data from shared gradients. While Homomorphic Encryption has been proposed as a promising defense mechanism to protect gradient privacy, fully encrypting all model gradients incurs high computational overhead. Selective encryption approaches aim to balance privacy protection with computational efficiency by encrypting only specific gradient components. However, the existing literature largely overlooks a theoretical exploration of the spectral behavior of encrypted versus unencrypted parameters, relying instead primarily on empirical evaluations. To address this gap, this paper presents a framework for theoretical analysis of the underlying principles of selective encryption as a defense against model inversion attacks. We then provide a comprehensive empirical study that identifies and quantifies the critical factors, such as model complexity, encryption ratios, and exposed gradients, that influence defense effectiveness. Our theoretical framework clarifies the relationship between gradient selection and privacy preservation, while our experimental evaluation demonstrates how these factors shape the robustness of defenses against model inversion attacks. Collectively, these contributions advance the understanding of selective encryption mechanisms and offer principled guidance for designing efficient, scalable, privacy-preserving federated learning systems.

Paperid: 2008, https://arxiv.org/pdf/2509.20356.pdf

Abstract:
Decentralized resource markets are Web 3.0 applications that build open-access platforms for trading digital resources among users without any central management. They promise cost reduction, transparency, and flexible service provision. However, these markets usually have large workload that must be processed in a timely manner, leading to serious scalability problems. Despite the large amount of work on blockchain scalability, existing solutions are ineffective as they do not account for these markets' work models and traffic patterns. We introduce chainScale, a secure hybrid sidechain-sharding solution that aims to boost throughput of decentralized resource markets and reduce their latency and storage footprint. At its core, chainScale leverages dependent sidechains and functionality-oriented workload splitting to parallelize traffic processing by having each market module assigned to a sidechain. Different from sharding, chainScale does not incur any cross-sidechain transactions that tend to be costly. chainScale introduces several techniques, including hierarchical workload sharing that further sub-divides overloaded modules, and weighted miner assignment that assigns miners with vested interest in the system to critical modules' sidechains. Furthermore, chainScale employs sidechain syncing to maintain the mainchain as the single truth of system state, and pruning to discard stale records. Beside analyzing security, we build a proof-of-concept implementation for a distributed file storage market as a use case. Our experiments show that, compared to a single sidechain-based prior solution, chainScale boosts throughput by 4x and reduces confirmation latency by 5x. Also, they show that chainScale outperforms sharding by 2.5x in throughput and 3.5x in latency.

Paperid: 2009, https://arxiv.org/pdf/2509.18572.pdf

Abstract:
This study examines the robustness of I2P, a well-regarded anonymous and decentralized peer-to-peer network designed to ensure anonymity, confidentiality, and circumvention of censorship. Unlike its more widely researched counterpart, TOR, I2P's resilience has received less scholarly attention. Employing network analysis, this research evaluates I2P's susceptibility to adversarial percolation. By utilizing the degree centrality as a measure of nodes' influence in the network, the finding suggests the network is vulnerable to targeted disruptions. Before percolation, the network exhibited a density of 0.01065443 and an average path length of 6.842194. At the end of the percolation process, the density decreased by approximately 10%, and the average path length increased by 33%, indicating a decline in efficiency and connectivity. These results highlight that even decentralized networks, such as I2P, exhibit structural fragility under targeted attacks, emphasizing the need for improved design strategies to enhance resilience against adversarial disruptions.

Paperid: 2010, https://arxiv.org/pdf/2509.18338.pdf

Abstract:
Restaking protocols expand validator responsibilities beyond consensus, but their security depends on resistance to Sybil attacks. We introduce a formal framework for Sybil-proofness in restaking networks, distinguishing between two types of attacks, one in which other Sybil identities are kept out of an attack and one where multiple Sybil identities attack. We analyze marginal and multiplicative slashing mechanisms and characterize the conditions under which each deters Sybil strategies. We then prove an impossibility theorem: no slashing mechanism can simultaneously prevent both attack types. Finally, we study the impact of network structure through random graph models: while ErdÃ¶s-RÃ©nyi networks remain Sybil-proof, even minimal heterogeneity in a two-block stochastic block model makes Sybil attacks profitable. These results reveal fundamental limits of mechanism design for restaking and highlight the critical role of network topology.

Paperid: 2011, https://arxiv.org/pdf/2509.18044.pdf

Abstract:
Federated Learning (FL) in 5G and edge network environments face severe security threats from adversarial clients. Malicious participants can perform label flipping, inject backdoor triggers, or launch Sybil attacks to corrupt the global model. This paper introduces Hybrid Reputation Aggregation (HRA), a novel robust aggregation mechanism designed to defend against diverse adversarial behaviors in FL without prior knowledge of the attack type. HRA combines geometric anomaly detection with momentum-based reputation tracking of clients. In each round, it detects outlier model updates via distance-based geometric analysis while continuously updating a trust score for each client based on historical behavior. This hybrid approach enables adaptive filtering of suspicious updates and long-term penalization of unreliable clients, countering attacks ranging from backdoor insertions to random noise Byzantine failures. We evaluate HRA on a large-scale proprietary 5G network dataset (3M+ records) and the widely used NF-CSE-CIC-IDS2018 benchmark under diverse adversarial attack scenarios. Experimental results reveal that HRA achieves robust global model accuracy of up to 98.66% on the 5G dataset and 96.60% on NF-CSE-CIC-IDS2018, outperforming state-of-the-art aggregators such as Krum, Trimmed Mean, and Bulyan by significant margins. Our ablation studies further demonstrate that the full hybrid system achieves 98.66% accuracy, while the anomaly-only and reputation-only variants drop to 84.77% and 78.52%, respectively, validating the synergistic value of our dual-mechanism approach. This demonstrates HRA's enhanced resilience and robustness in 5G/edge federated learning deployments, even under significant adversarial conditions.

Paperid: 2012, https://arxiv.org/pdf/2509.17253.pdf

Abstract:
Autonomous vehicles (AVs) rely heavily on LiDAR sensors for accurate 3D perception. We show a novel class of low-cost, passive LiDAR spoofing attacks that exploit mirror-like surfaces to inject or remove objects from an AV's perception. Using planar mirrors to redirect LiDAR beams, these attacks require no electronics or custom fabrication and can be deployed in real settings. We define two adversarial goals: Object Addition Attacks (OAA), which create phantom obstacles, and Object Removal Attacks (ORA), which conceal real hazards. We develop geometric optics models, validate them with controlled outdoor experiments using a commercial LiDAR and an Autoware-equipped vehicle, and implement a CARLA-based simulation for scalable testing. Experiments show mirror attacks corrupt occupancy grids, induce false detections, and trigger unsafe planning and control behaviors. We discuss potential defenses (thermal sensing, multi-sensor fusion, light-fingerprinting) and their limitations.

Paperid: 2013, https://arxiv.org/pdf/2509.17070.pdf

Abstract:
We introduce FreqRank, a mutation-based defense to localize malicious components in LLM outputs and their corresponding backdoor triggers. FreqRank assumes that the malicious sub-string(s) consistently appear in outputs for triggered inputs and uses a frequency-based ranking system to identify them. Our ranking system then leverages this knowledge to localize the backdoor triggers present in the inputs. We create nine malicious models through fine-tuning or custom instructions for three downstream tasks, namely, code completion (CC), code generation (CG), and code summarization (CS), and show that they have an average attack success rate (ASR) of 86.6%. Furthermore, FreqRank's ranking system highlights the malicious outputs as one of the top five suggestions in 98% of cases. We also demonstrate that FreqRank's effectiveness scales as the number of mutants increases and show that FreqRank is capable of localizing the backdoor trigger effectively even with a limited number of triggered samples. Finally, we show that our approach is 35-50% more effective than other defense methods.

Paperid: 2014, https://arxiv.org/pdf/2509.16418.pdf

Abstract:
With society's increasing reliance on digital data sharing, the protection of sensitive information has become critical. Encryption serves as one of the privacy-preserving methods; however, its realization in the audio domain predominantly relies on signal processing or software methods embedded into hardware. In this paper, we introduce LenslessMic, a hybrid optical hardware-based encryption method that utilizes a lensless camera as a physical layer of security applicable to multiple types of audio. We show that LenslessMic enables (1) robust authentication of audio recordings and (2) encryption strength that can rival the search space of 256-bit digital standards, while maintaining high-quality signals and minimal loss of content information. The approach is validated with a low-cost Raspberry Pi prototype and is open-sourced together with datasets to facilitate research in the area.

Paperid: 2015, https://arxiv.org/pdf/2509.16352.pdf

Abstract:
Model-sharing offers significant business value by enabling firms with well-established Machine Learning (ML) models to monetize and share their models with others who lack the resources to develop ML models from scratch. However, concerns over data confidentiality remain a significant barrier to model-sharing adoption, as Confidential Property Inference (CPI) attacks can exploit shared ML models to uncover confidential properties of the model provider's private model training data. Existing defenses often assume that CPI attacks are non-adaptive to the specific ML model they are targeting. This assumption overlooks a key characteristic of real-world adversaries: their responsiveness, i.e., adversaries' ability to dynamically adjust their attack models based on the information of the target and its defenses. To overcome this limitation, we propose a novel defense method that explicitly accounts for the responsive nature of real-world adversaries via two methodological innovations: a novel Responsive CPI attack and an attack-defense arms race framework. The former emulates the responsive behaviors of adversaries in the real world, and the latter iteratively enhances both the target and attack models, ultimately producing a secure ML model that is robust against responsive CPI attacks. Furthermore, we propose and integrate a novel approximate strategy into our defense, which addresses a critical computational bottleneck of defense methods and improves defense efficiency. Through extensive empirical evaluations across various realistic model-sharing scenarios, we demonstrate that our method outperforms existing defenses by more effectively defending against CPI attacks, preserving ML model utility, and reducing computational overhead.

Paperid: 2016, https://arxiv.org/pdf/2509.16292.pdf

Abstract:
Cryptocurrencies and Web3 applications based on blockchain technology have flourished in the blockchain research field. Unlike Bitcoin and Ethereum, due to its unique architectural designs in consensus mechanisms, resource management, and throughput, TRON has developed a more distinctive ecosystem and application scenarios centered around stablecoins. Although it is popular in areas like stablecoin payments and settlement, research on analyzing on-chain data from the TRON blockchain is remarkably scarce. To fill this gap, this paper proposes a comprehensive data extraction and exploration framework for the TRON blockchain. An innovative high-performance ETL system aims to efficiently extract raw on-chain data from TRON, including blocks, transactions, smart contracts, and receipts, establishing a research dataset. An in-depth analysis of the extracted dataset reveals insights into TRON's block generation, transaction trends, the dominance of exchanges, the resource delegation market, smart contract usage patterns, and the central role of the USDT stablecoin. The prominence of gambling applications and potential illicit activities related to USDT is emphasized. The paper discusses opportunities for future research leveraging this dataset, including analysis of delegate services, gambling scenarios, stablecoin activities, and illicit transaction detection. These contributions enhance blockchain data management capabilities and understanding of the rapidly evolving TRON ecosystem.

Paperid: 2017, https://arxiv.org/pdf/2509.15555.pdf

Abstract:
The exponential expansion of IoT and 5G-Advanced applications has enlarged the attack surface for DDoS, malware, and zero-day intrusions. We propose an intrusion detection system that fuses a convolutional neural network (CNN), a bidirectional LSTM (BiLSTM), and an autoencoder (AE) bottleneck within a privacy-preserving federated learning (FL) framework. The CNN-BiLSTM branch captures local and gated cross-feature interactions, while the AE emphasizes reconstruction-based anomaly sensitivity. Training occurs across edge devices without sharing raw data. On UNSW-NB15 (binary), the fused model attains AUC 99.59 percent and F1 97.36 percent; confusion-matrix analysis shows balanced error rates with high precision and recall. Average inference time is approximately 0.0476 ms per sample on our test hardware, which is well within the less than 10 ms URLLC budget, supporting edge deployment. We also discuss explainability, drift tolerance, and FL considerations for compliant, scalable 5G-Advanced IoT security.

Paperid: 2018, https://arxiv.org/pdf/2509.15202.pdf

Abstract:
Jailbreak attacks pose persistent threats to large language models (LLMs). Current safety alignment methods have attempted to address these issues, but they experience two significant limitations: insufficient safety alignment depth and unrobust internal defense mechanisms. These limitations make them vulnerable to adversarial attacks such as prefilling and refusal direction manipulation. We introduce DeepRefusal, a robust safety alignment framework that overcomes these issues. DeepRefusal forces the model to dynamically rebuild its refusal mechanisms from jailbreak states. This is achieved by probabilistically ablating the refusal direction across layers and token depths during fine-tuning. Our method not only defends against prefilling and refusal direction attacks but also demonstrates strong resilience against other unseen jailbreak strategies. Extensive evaluations on four open-source LLM families and six representative attacks show that DeepRefusal reduces attack success rates by approximately 95%, while maintaining model capabilities with minimal performance degradation.

Paperid: 2019, https://arxiv.org/pdf/2509.12923.pdf

Abstract:
Interpreting the massive volume of security alerts is a significant challenge in Security Operations Centres (SOCs). Effective contextualisation is important, enabling quick distinction between genuine threats and benign activity to prioritise what needs further analysis. This paper proposes a graph-based approach to enhance alert contextualisation in a SOC by aggregating alerts into graph-based alert groups, where nodes represent alerts and edges denote relationships within defined time-windows. By grouping related alerts, we enable analysis at a higher abstraction level, capturing attack steps more effectively than individual alerts. Furthermore, to show that our format is well suited for downstream machine learning methods, we employ Graph Matching Networks (GMNs) to correlate incoming alert groups with historical incidents, providing analysts with additional insights.

Paperid: 2020, https://arxiv.org/pdf/2509.12676.pdf

Abstract:
In the era of cloud computing, privacy-preserving computation offloading is crucial for safeguarding sensitive data. Fully Homomorphic Encryption (FHE) enables secure processing of encrypted data, but the inherent computational complexity of FHE operations introduces significant computational overhead on the server side. FHE schemes often face a tradeoff between efficiency and versatility. While the CKKS scheme is highly efficient for polynomial operations, it lacks the flexibility of the binary TFHE (Torus-FHE) scheme, which offers greater versatility but at the cost of efficiency. The recent multi-bit TFHE extension offers greater flexibility and performance by supporting native non-polynomial operations and efficient integer processing. However, current implementations of multi-bit TFHE are constrained by its narrower numeric representation, which prevents its adoption in applications requiring wider numeric representations. To address this challenge, we introduce Taurus, a hardware accelerator designed to enhance the efficiency of multi-bit TFHE computations. Taurus supports ciphertexts up to 10 bits by leveraging novel FFT units and optimizing memory bandwidth through key reuse strategies. We also propose a compiler with operation deduplication to improve memory utilization. Our experiment results demonstrate that Taurus achieves up to 2600x speedup over a CPU, 1200x speedup over a GPU, and up to 7x faster compared to the previous state-of-the-art TFHE accelerator. Moreover, Taurus is the first accelerator to demonstrate privacy-preserving inference with large language models such as GPT-2. These advancements enable more practical and scalable applications of privacy-preserving computation in cloud environments.

Paperid: 2021, https://arxiv.org/pdf/2509.11761.pdf

Abstract:
In Vehicle-to-Everything (V2X) networks with multi-hop communication, Road Side Units (RSUs) intend to gather location data from the vehicles to offer various location-based services. Although vehicles use the Global Positioning System (GPS) for navigation, they may refrain from sharing their exact GPS coordinates to the RSUs due to privacy considerations. Thus, to address the localization expectations of the RSUs and the privacy concerns of the vehicles, we introduce a relaxed-privacy model wherein the vehicles share their partial location information in order to avail the location-based services. To implement this notion of relaxed-privacy, we propose a low-latency protocol for spatial-provenance recovery, wherein vehicles use correlated linear Bloom filters to embed their position information. Our proposed spatial-provenance recovery process takes into account the resolution of localization, the underlying ad hoc protocol, and the coverage range of the wireless technology used by the vehicles. Through a rigorous theoretical analysis, we present extensive analysis on the underlying trade-off between relaxed-privacy and the communication-overhead of the protocol. Finally, using a wireless testbed, we show that our proposed method requires a few bits in the packet header to provide security features such as localizing a low-power jammer executing a denial-of-service attack.

Paperid: 2022, https://arxiv.org/pdf/2509.11555.pdf

Abstract:
Web3 applications require execution platforms that maintain confidentiality and integrity without relying on centralized trust authorities. While Trusted Execution Environments (TEEs) offer promising capabilities for confidential computing, current implementations face significant limitations when applied to Web3 contexts, particularly in security reliability, censorship resistance, and vendor independence. This paper presents dstack, a comprehensive framework that transforms raw TEE technology into a true Zero Trust platform. We introduce three key innovations: (1) Portable Confidential Containers that enable seamless workload migration across heterogeneous TEE environments while maintaining security guarantees, (2) Decentralized Code Management that leverages smart contracts for transparent governance of TEE applications, and (3) Verifiable Domain Management that ensures secure and verifiable application identity without centralized authorities. These innovations are implemented through three core components: dstack-OS, dstack-KMS, and dstack-Gateway. Together, they demonstrate how to achieve both the performance advantages of VM-level TEE solutions and the trustless guarantees required by Web3 applications. Our evaluation shows that dstack provides comprehensive security guarantees while maintaining practical usability for real-world applications.

Paperid: 2023, https://arxiv.org/pdf/2509.10838.pdf

Abstract:
Recently, a considerable amount of malware research has focused on the use of powerful image-based machine learning techniques, which generally yield impressive results. However, before image-based techniques can be applied to malware, the samples must be converted to images, and there is no generally-accepted approach for doing so. The malware-to-image conversion strategies found in the literature often appear to be ad hoc, with little or no effort made to take into account properties of executable files. In this paper, we experiment with eight distinct malware-to-image conversion techniques, and for each, we test a variety of learning models. We find that several of these image conversion techniques perform similarly across a range of learning models, in spite of the image conversion processes being quite different. These results suggest that the effectiveness of image-based malware classification techniques may depend more on the inherent strengths of image analysis techniques, as opposed to the precise details of the image conversion strategy.

Paperid: 2024, https://arxiv.org/pdf/2509.10581.pdf

Abstract:
In modern smart systems, the convergence of the Internet of Things (IoT) and Wireless of Things (WoT) have been revolutionized by offering a broad level of wireless connectivity and communication among various devices. Hitherto, this greater interconnectivity poses important security problems, including the question of how to securely interconnect different networks, preserve secure communication channels, and maintain data integrity. However, the traditional cryptographic method and frequency hopping technique, although they provide some protection, are not sufficient to defend against Man-In-The-Middle, jamming, and replay attacks. In addition, synchronization issues in multi-channel communication systems result in increased latency and energy consumption, which make them unsuitable for resource-constrained IoT and WoT devices. This work presents the Multi-Channel Secure Communication (MCSC) framework, which integrates advanced cryptographic protocols with dynamic channel-hopping strategies to enhance security with reduced synchronization overhead. The MCSC framework maximizes the critical performance metrics, such as packet delivery ratio, latency, throughput, and energy efficiency, and fulfills the specific requirements of the IoT and WoT networks. A comprehensive comparison of MCSC with well-established methods, including Frequency Hop Spread Spectrum, single channel Advanced Encryption Standard, and various Elliptic Curve Cryptography-based schemes, indicates that MCSC has lower error rates and is more resilient to a wider range of cyber attacks. The efficiency of the proposed solution to secure IoT and WoT networks without compromising the operational performance is validated under various interference conditions.

Paperid: 2025, https://arxiv.org/pdf/2509.10577.pdf

Abstract:
We prove a sharp threshold for the robustness of cryptographic watermarking for generative models. This is achieved by introducing a coding abstraction, which we call messageless secret-key codes, that formalizes sufficient and necessary requirements of robust watermarking: soundness, tamper detection, and pseudorandomness. Thus, we establish that robustness has a precise limit: For binary outputs no scheme can survive if more than half of the encoded bits are modified, and for an alphabet of size q the corresponding threshold is $(1-1/q)$ of the symbols. Complementing this impossibility, we give explicit constructions that meet the bound up to a constant slack. For every $Î´ > 0$, assuming pseudorandom functions and access to a public counter, we build linear-time codes that tolerate up to $(1/2)(1-Î´)$ errors in the binary case and $(1-1/q)(1-Î´)$ errors in the $q$-ary case. Together with the lower bound, these yield the maximum robustness achievable under standard cryptographic assumptions. We then test experimentally whether this limit appears in practice by looking at the recent watermarking for images of Gunn, Zhao, and Song (ICLR 2025). We show that a simple crop and resize operation reliably flipped about half of the latent signs and consistently prevented belief-propagation decoding from recovering the codeword, erasing the watermark while leaving the image visually intact. These results provide a complete characterization of robust watermarking, identifying the threshold at which robustness fails, constructions that achieve it, and an experimental confirmation that the threshold is already reached in practice.

Paperid: 2026, https://arxiv.org/pdf/2509.10551.pdf

Abstract:
This paper introduces a hybrid encryption framework combining classical cryptography (EdDSA, ECDH), post-quantum cryptography (ML-DSA-6x5, ML-KEM-768), and Quantum Key Distribution (QKD) via Guardian to counter quantum computing threats. Our prototype implements this integration, using a key derivation function to generate secure symmetric and HMAC keys, and evaluates its performance across execution time and network metrics. The approach improves data protection by merging classical efficiency with PQC's quantum resilience and QKD's key security, offering a practical transition path for cryptographic systems. This research lays the foundation for future adoption of PQC in securing digital communication.

Paperid: 2027, https://arxiv.org/pdf/2509.08493.pdf

Abstract:
Scammers are increasingly harnessing generative AI(GenAI) technologies to produce convincing phishing content at scale, amplifying financial fraud and undermining public trust. While conventional defenses, such as detection algorithms, user training, and reactive takedown efforts remain important, they often fall short in dismantling the infrastructure scammers depend on, including mule bank accounts and cryptocurrency wallets. To bridge this gap, a proactive and emerging strategy involves using conversational honeypots to engage scammers and extract actionable threat intelligence. This paper presents the first large-scale, real-world evaluation of a scambaiting system powered by large language models (LLMs). Over a five-month deployment, the system initiated over 2,600 engagements with actual scammers, resulting in a dataset of more than 18,700 messages. It achieved an Information Disclosure Rate (IDR) of approximately 32%, successfully extracting sensitive financial information such as mule accounts. Additionally, the system maintained a Human Acceptance Rate (HAR) of around 70%, indicating strong alignment between LLM-generated responses and human operator preferences. Alongside these successes, our analysis reveals key operational challenges. In particular, the system struggled with engagement takeoff: only 48.7% of scammers responded to the initial seed message sent by defenders. These findings highlight the need for further refinement and provide actionable insights for advancing the design of automated scambaiting systems.

Paperid: 2028, https://arxiv.org/pdf/2509.07053.pdf

Abstract:
The Signalgate incident of March 2025, wherein senior US national security officials inadvertently disclosed sensitive military operational details via the encrypted messaging platform Signal, highlights critical vulnerabilities in organizational security arising from human error, governance gaps, and the misuse of technology. Although smaller in scale when compared to historical breaches involving billions of records, Signalgate illustrates critical systemic issues often overshadowed by a focus on external cyber threats. Employing a case-study approach and systematic review grounded in the NIST Cybersecurity Framework, we analyze the incident to identify patterns of human-centric vulnerabilities and governance challenges common to organizational security failures. Findings emphasize three critical points. (1) Organizational security depends heavily on human behavior, with internal actors often serving as the weakest link despite advanced technical defenses; (2) Leadership tone strongly influences organizational security culture and efficacy, and (3) widespread reliance on technical solutions without sufficient investments in human and organizational factors leads to ineffective practices and wasted resources. From these observations, we propose actionable recommendations for enhancing organizational and national security, including strong leadership engagement, comprehensive adoption of zero-trust architectures, clearer accountability structures, incentivized security behaviors, and rigorous oversight. Particularly during periods of organizational transition, such as mergers or large-scale personnel changes, additional measures become particularly important. Signalgate underscores the need for leaders and policymakers to reorient cybersecurity strategies toward addressing governance, cultural, and behavioral risks.

Paperid: 2029, https://arxiv.org/pdf/2509.06920.pdf

Abstract:
Insider threats are a growing organizational problem due to the complexity of identifying their technical and behavioral elements. A large research body is dedicated to the study of insider threats from technological, psychological, and educational perspectives. However, research in this domain has been generally dependent on datasets that are static and limited access which restricts the development of adaptive detection models. This study introduces a novel, ethically grounded approach that uses the large language model (LLM) Claude Sonnet 3.7 to dynamically synthesize syslog messages, some of which contain indicators of insider threat scenarios. The messages reflect real-world data distributions by being highly imbalanced (1% insider threats). The syslogs were analyzed for insider threats by both Sonnet 3.7 and GPT-4o, with their performance evaluated through statistical metrics including accuracy, precision, recall, F1, specificity, FAR, MCC, and ROC AUC. Sonnet 3.7 consistently outperformed GPT-4o across nearly all metrics, particularly in reducing false alarms and improving detection accuracy. The results show strong promise for the use of LLMs in synthetic dataset generation and insider threat detection.

Paperid: 2030, https://arxiv.org/pdf/2509.06509.pdf

Abstract:
Leakage contracts have been proposed as a new security abstraction at the instruction set architecture level. Leakage contracts aim to capture the information that processors may leak via microarchitectural side channels. Recently, the first tools have emerged to verify whether a processor satisfies a given contract. However, coming up with a contract that is both sound and precise for a given processor is challenging, time-consuming, and error-prone, as it requires in-depth knowledge of the timing side channels introduced by microarchitectural optimizations. In this paper, we address this challenge by proposing LeaSyn, the first tool for automatically synthesizing leakage contracts that are both sound and precise for processor designs at register-transfer level. Starting from a user-provided contract template that captures the space of possible contracts, LeaSyn automatically constructs a contract, alternating between contract synthesis, which ensures precision based on an empirical characterization of the processor's leaks, and contract verification, which ensures soundness. Using LeaSyn, we automatically synthesize contracts for six open-source RISC-V CPUs for a variety of contract templates. Our experiments indicate that LeaSyn's contracts are sound and more precise (i.e., represent the actual leaks in the target processor more faithfully) than contracts constructed by existing approaches.

Paperid: 2031, https://arxiv.org/pdf/2509.05884.pdf

Abstract:
The Number Theoretic Transform (NTT) can be regarded as a variant of the Discrete Fourier Transform. NTT has been quite a powerful mathematical tool in developing Post-Quantum Cryptography and Homomorphic Encryption. The Fourier Transform essentially decomposes a signal into its frequencies. They are traditionally sine or cosine waves. NTT works more over groups or finite fields rather than on a continuous signal and polynomials work as the analog of sine waves in case of NTT. Fast Fourier Trnasform (FFT) style NTT or fast NTT has been proven to be useful in lattice-based cryptography due to its ability to reduce the complexity of polynomial multiplication from quadratic to quasilinear. We have introduced the concepts of cyclic, negacyclic convolutions along with NTT and its inverse and their fast versions.

Paperid: 2032, https://arxiv.org/pdf/2509.05320.pdf

Abstract:
The integration of Large Language Models (LLMs) in 6G vehicular networks promises unprecedented advancements in intelligent transportation systems. However, offloading LLM computations from vehicles to edge infrastructure poses significant privacy risks, potentially exposing sensitive user data. This paper presents a novel privacy-preserving offloading framework for LLM-integrated vehicular networks. We introduce a hybrid approach combining federated learning (FL) and differential privacy (DP) techniques to protect user data while maintaining LLM performance. Our framework includes a privacy-aware task partitioning algorithm that optimizes the trade-off between local and edge computation, considering both privacy constraints and system efficiency. We also propose a secure communication protocol for transmitting model updates and aggregating results across the network. Experimental results demonstrate that our approach achieves 75\% global accuracy with only a 2-3\% reduction compared to non-privacy-preserving methods, while maintaining DP guarantees with an optimal privacy budget of $\varepsilon = 0.8$. The framework shows stable communication overhead of approximately 2.1MB per round with computation comprising over 90\% of total processing time, validating its efficiency for resource-constrained vehicular environments.

Paperid: 2033, https://arxiv.org/pdf/2509.05311.pdf

Abstract:
Reinforcement Learning (RL) has shown great potential for autonomous decision-making in the cybersecurity domain, enabling agents to learn through direct environment interaction. However, RL agents in Autonomous Cyber Operations (ACO) typically learn from scratch, requiring them to execute undesirable actions to learn their consequences. In this study, we integrate external knowledge in the form of a Large Language Model (LLM) pretrained on cybersecurity data that our RL agent can directly leverage to make informed decisions. By guiding initial training with an LLM, we improve baseline performance and reduce the need for exploratory actions with obviously negative outcomes. We evaluate our LLM-integrated approach in a simulated cybersecurity environment, and demonstrate that our guided agent achieves over 2x higher rewards during early training and converges to a favorable policy approximately 4,500 episodes faster than the baseline.

Paperid: 2034, https://arxiv.org/pdf/2509.01509.pdf

Abstract:
Insider threat detection (ITD) requires analyzing sparse, heterogeneous user behavior. Existing ITD methods predominantly rely on single-view modeling, resulting in limited coverage and missed anomalies. While multi-view learning has shown promise in other domains, its direct application to ITD introduces significant challenges: scalability bottlenecks from independently trained sub-models, semantic misalignment across disparate feature spaces, and view imbalance that causes high-signal modalities to overshadow weaker ones. In this work, we present Insight-LLM, the first modular multi-view fusion framework specifically tailored for insider threat detection. Insight-LLM employs frozen, pre-nes, achieving state-of-the-art detection with low latency and parameter overhead.

Paperid: 2035, https://arxiv.org/pdf/2509.00437.pdf

Abstract:
Ensuring the de-identification of medical imaging data is a critical step in enabling safe data sharing. This paper presents a hybrid de-identification framework designed to process Digital Imaging and Communications in Medicine (DICOM) files. Our framework adopts a modified, pre-built rule-based component, updated with The Cancer Imaging Archive (TCIA)'s best practices guidelines, as outlined in DICOM PS 3.15, for improved performance. It incorporates PaddleOCR, a robust Optical Character Recognition (OCR) system for extracting text from images, and RoBERTa, a fine-tuned transformer-based model for identifying and removing Personally Identifiable Information (PII) and Protected Health Information (PHI). Initially, the transformer-based model and the rule-based component were integrated to process for both structured data and free text. However, this coarse-grained approach did not yield optimal results. To improve performance, we refined our approach by applying the transformer model exclusively to free text, while structured data was handled only by rule-based methods. In this framework the DICOM validator dciodvfy was leveraged to ensure the integrity of DICOM files after the deID process. Through iterative refinement, including the incorporation of custom rules and private tag handling, the framework achieved a de-identification accuracy of 99.91% on the MIDI-B test dataset. The results demonstrate the effectiveness of combining rule-based compliance with AI-enabled adaptability in addressing the complex challenges of DICOM de-identification.

Paperid: 2036, https://arxiv.org/pdf/2508.19697.pdf

Abstract:
Current safety alignment for large language models(LLMs) continues to present vulnerabilities, given that adversarial prompting can effectively bypass their safety measures.Our investigation shows that these safety mechanisms predominantly depend on a limited subset of attention heads: removing or ablating these heads can severely compromise model safety. To identify and evaluate these safety-critical components, we introduce RDSHA, a targeted ablation method that leverages the model's refusal direction to pinpoint attention heads mostly responsible for safety behaviors. Further analysis shows that existing jailbreak attacks exploit this concentration by selectively bypassing or manipulating these critical attention heads. To address this issue, we propose AHD, a novel training strategy designed to promote the distributed encoding of safety-related behaviors across numerous attention heads. Experimental results demonstrate that AHD successfully distributes safety-related capabilities across more attention heads. Moreover, evaluations under several mainstream jailbreak attacks show that models trained with AHD exhibit considerably stronger safety robustness, while maintaining overall functional utility.

Paperid: 2037, https://arxiv.org/pdf/2508.19472.pdf

Abstract:
Sensitive Information Exposure (SIEx) vulnerabilities (CWE-200) remain a persistent and under-addressed threat across software systems, often leading to serious security breaches. Existing detection tools rarely target the diverse subcategories of CWE-200 or provide context-aware analysis of code-level data flows. Aims: This paper aims to present SIExVulTS, a novel vulnerability detection system that integrates transformer-based models with static analysis to identify and verify sensitive information exposure in Java applications. Method: SIExVulTS employs a three-stage architecture: (1) an Attack Surface Detection Engine that uses sentence embeddings to identify sensitive variables, strings, comments, and sinks; (2) an Exposure Analysis Engine that instantiates CodeQL queries aligned with the CWE-200 hierarchy; and (3) a Flow Verification Engine that leverages GraphCodeBERT to semantically validate source-to-sink flows. We evaluate SIExVulTS using three curated datasets, including real-world CVEs, a benchmark set of synthetic CWE-200 examples, and labeled flows from 31 open-source projects. Results: The Attack Surface Detection Engine achieved an average F1 score greater than 93\%, the Exposure Analysis Engine achieved an F1 score of 85.71\%, and the Flow Verification Engine increased precision from 22.61\% to 87.23\%. Moreover, SIExVulTS successfully uncovered six previously unknown CVEs in major Apache projects. Conclusions: The results demonstrate that SIExVulTS is effective and practical for improving software security against sensitive data exposure, addressing limitations of existing tools in detecting and verifying CWE-200 vulnerabilities.

Paperid: 2038, https://arxiv.org/pdf/2508.19278.pdf

Abstract:
Simulated environments have proven invaluable in Autonomous Cyber Operations (ACO) where Reinforcement Learning (RL) agents can be trained without the computational overhead of emulation. These environments must accurately represent cybersecurity scenarios while producing the necessary signals to support RL training. In this study, we present a framework where we first extend CybORG's Cage Challenge 2 environment by implementing three new actions: Patch, Isolate, and Unisolate, to better represent the capabilities available to human operators in real-world settings. We then propose a design for agent development where we modify the reward signals and the agent's feature space to enhance training performance. To validate these modifications, we train DQN and PPO agents in the updated environment. Our study demonstrates that CybORG can be extended with additional realistic functionality, while maintaining its ability to generate informative training signals for RL agents.

Paperid: 2039, https://arxiv.org/pdf/2508.17215.pdf

Abstract:
Large Vision-Language Models (LVLMs) augmented with Retrieval-Augmented Generation (RAG) are increasingly employed in medical AI to enhance factual grounding through external clinical image-text retrieval. However, this reliance creates a significant attack surface. We propose MedThreatRAG, a novel multimodal poisoning framework that systematically probes vulnerabilities in medical RAG systems by injecting adversarial image-text pairs. A key innovation of our approach is the construction of a simulated semi-open attack environment, mimicking real-world medical systems that permit periodic knowledge base updates via user or pipeline contributions. Within this setting, we introduce and emphasize Cross-Modal Conflict Injection (CMCI), which embeds subtle semantic contradictions between medical images and their paired reports. These mismatches degrade retrieval and generation by disrupting cross-modal alignment while remaining sufficiently plausible to evade conventional filters. While basic textual and visual attacks are included for completeness, CMCI demonstrates the most severe degradation. Evaluations on IU-Xray and MIMIC-CXR QA tasks show that MedThreatRAG reduces answer F1 scores by up to 27.66% and lowers LLaVA-Med-1.5 F1 rates to as low as 51.36%. Our findings expose fundamental security gaps in clinical RAG systems and highlight the urgent need for threat-aware design and robust multimodal consistency checks. Finally, we conclude with a concise set of guidelines to inform the safe development of future multimodal medical RAG systems.

Paperid: 2040, https://arxiv.org/pdf/2508.16761.pdf

Abstract:
In the face of adverse environmental conditions and cyber threats, robust communication systems for critical applications such as wildfire management and detection demand secure and resilient architectures. This paper presents a novel framework that considers both adversarial factors, building resilience into a heterogeneous network (HetNet) integrating Low Earth Orbit (LEO) satellite constellation with High-Altitude Platform Ground Stations (HAPGS) and Low-Altitude Platforms (LAPS), tailored to support wildfire management operations. Building upon our previous work on secure-by-component approach for link segment security, we extend protection to the communication layer by securing both Radio Frequency (RF)/Free Space Optics (FSO) management and different links. Through a case study, we quantify how environmental stressors impact secrecy capacity and expose the system to passive adversaries. Key findings demonstrate that atmospheric attenuation and beam misalignment can notably degrade secrecy capacity across both short- and long-range communication links, while high-altitude eavesdroppers face less signal degradation, increasing their interception capability. Moreover, increasing transmit power to counter environmental losses can inadvertently improve eavesdropper reception, thereby reducing overall link confidentiality. Our work not only highlights the importance of protecting networks from these dual threats but also aligns with the IEEE P3536 Standard for Space System Cybersecurity Design, ensuring resilience and the prevention of mission failures.

Paperid: 2041, https://arxiv.org/pdf/2508.16202.pdf

Abstract:
This paper studies proof-of-work Nakamoto consensus protocols under bounded network delays, settling two long-standing questions in blockchain security: What is the most effective attack on block safety under a given block confirmation latency? And what is the resulting probability of safety violation? A Markov decision process (MDP) framework is introduced to precisely characterize the system state (including the blocktree and timings of all blocks mined), the adversary's potential actions, and the state transitions due to the adversarial action and the random block arrival processes. An optimal attack, called bait-and-switch, is proposed and proved to maximize the adversary's chance of violating block safety by "beating Nakamoto in the race". The exact probability of this violation is calculated for any given confirmation depth using Markov chain analysis, offering fresh insights into the interplay of network delay, confirmation rules, and blockchain security.

Paperid: 2042, https://arxiv.org/pdf/2508.15172.pdf

Abstract:
This paper evaluates the secure level of authenticated encryption \textsc{Ascon} against cube-like method. \textsc{Ascon} submitted by Dobraunig \emph{et~al.} is one of 16 survivors of the 3rd round CAESAR competition. The cube-like method is first used by Dinur \emph{et~al.} to analyze Keccak keyed modes. At CT-RSA 2015, Dobraunig \emph{et~al.} applied this method to 5/6-round reduced \textsc{Ascon}, whose structure is similar to Keccak keyed modes. However, for \textsc{Ascon} the non-linear layer is more complex and state is much smaller, which make it hard for the attackers to select enough cube variables that do not multiply with each other after the first round. This seems to be the reason why the best previous key-recovery attack is on 6-round \textsc{Ascon}, while for Keccak keyed modes (Keccak-MAC and Keyak) the attacked round is no less than 7-round. In this paper, we generalize the conditional cube attack proposed by Huang \emph{et~al.}, and find new cubes depending on some key bit conditions for 5/6-round reduced \textsc{Ascon}, and translate the previous theoretic 6-round attack with $2^{66}$ time complexity to a practical one with $2^{40}$ time complexity. Moreover, we propose the first 7-round key-recovery attack on \textsc{Ascon}. By introducing \emph{the cube-like key-subset technique}, we divide the full key space into many subsets according to different key conditions. For each key subset, we launch the cube tester to determine if the key falls into it. Finally, we recover the full key space by testing all the key subsets. The total time complexity is about $2^{103.9}$. In addition, for a weak-key subset, whose size is $2^{117}$, the attack is more efficient and costs only $2^{77}$ time complexity. Those attacks do not threaten the full round (12 rounds) \textsc{Ascon}.

Paperid: 2043, https://arxiv.org/pdf/2508.13118.pdf

Abstract:
Incident response (IR) requires fast, coordinated, and well-informed decision-making to contain and mitigate cyber threats. While large language models (LLMs) have shown promise as autonomous agents in simulated IR settings, their reasoning is often limited by a lack of access to external knowledge. In this work, we present AutoBnB-RAG, an extension of the AutoBnB framework that incorporates retrieval-augmented generation (RAG) into multi-agent incident response simulations. Built on the Backdoors & Breaches (B&B) tabletop game environment, AutoBnB-RAG enables agents to issue retrieval queries and incorporate external evidence during collaborative investigations. We introduce two retrieval settings: one grounded in curated technical documentation (RAG-Wiki), and another using narrative-style incident reports (RAG-News). We evaluate performance across eight team structures, including newly introduced argumentative configurations designed to promote critical reasoning. To validate practical utility, we also simulate real-world cyber incidents based on public breach reports, demonstrating AutoBnB-RAG's ability to reconstruct complex multi-stage attacks. Our results show that retrieval augmentation improves decision quality and success rates across diverse organizational models. This work demonstrates the value of integrating retrieval mechanisms into LLM-based multi-agent systems for cybersecurity decision-making.

Paperid: 2044, https://arxiv.org/pdf/2508.12553.pdf

Abstract:
Endpoint Detection and Response (EDR) solutions embrace the method of attack provenance graph to discover unknown threats through system event correlation. However, this method still faces some unsolved problems in the fields of interoperability, reliability, flexibility, and practicability to deliver actionable results. Our research highlights the limitations of current solutions in detecting obfuscation, correlating attacks, identifying low-frequency events, and ensuring robust context awareness in relation to command-line activities. To address these challenges, we introduce DEFENDCLI, an innovative system leveraging provenance graphs that, for the first time, delves into command-line-level detection. By offering finer detection granularity, it addresses a gap in modern EDR systems that has been overlooked in previous research. Our solution improves the precision of the information representation by evaluating differentiation across three levels: unusual system process calls, suspicious command-line executions, and infrequent external network connections. This multi-level approach enables EDR systems to be more reliable in complex and dynamic environments. Our evaluation demonstrates that DEFENDCLI improves precision by approximately 1.6x compared to the state-of-the-art methods on the DARPA Engagement Series attack datasets. Extensive real-time industrial testing across various attack scenarios further validates its practical effectiveness. The results indicate that DEFENDCLI not only detects previously unknown attack instances, which are missed by other modern commercial solutions, but also achieves a 2.3x improvement in precision over the state-of-the-art research work.

Paperid: 2045, https://arxiv.org/pdf/2508.12496.pdf

Abstract:
Traffic visibility remains a key component for management and security operations. Observing unsolicited and erroneous traffic, such as unanswered traffic or errors, is fundamental to detect misconfiguration, temporary failures or attacks. ChamaleoNet transforms any production network into a transparent monitor to let administrators collect unsolicited and erroneous traffic directed to hosts, whether offline or active, hosting a server or a client, protected by a firewall, or unused addresses. ChamaleoNet is programmed to ignore well-formed traffic and collect only erroneous packets, including those generated by misconfigured or infected internal hosts, and those sent by external actors which scan for services. Engineering such a system poses several challenges, from scalability to privacy. Leveraging the SDN paradigm, ChamaleoNet processes the traffic flowing through a campus/corporate network and focuses on erroneous packets only, lowering the pressure on the collection system while respecting privacy regulations by design. ChamaleoNet enables the seamless integration with active deceptive systems like honeypots that can impersonate unused hosts/ports/services and engage with senders. The SDN in-hardware filtering reduces the traffic to the controller by 96%, resulting in a scalable solution, which we offer as open source. Simple analytics unveil internal misconfigured and infected hosts, identify temporary failures, and enhance visibility on external radiation produced by attackers looking for vulnerable services.

Paperid: 2046, https://arxiv.org/pdf/2508.11082.pdf

Abstract:
The commutative supersingular isogeny Diffie-Hellman (CSIDH) algorithm is a promising post-quantum key exchange protocol, notable for its exceptionally small key sizes, but hindered by computationally intensive key generation. Furthermore, practical implementations must operate in constant time to mitigate side-channel vulnerabilities, which presents an additional performance challenge. This paper presents, to our knowledge, the first comprehensive hardware study of CSIDH, establishing a performance baseline with a unified architecture on both field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) platforms. The architecture features a top-level finite state machine (FSM) that orchestrates a deeply pipelined arithmetic logic unit (ALU) to accelerate the underlying 512-bit finite field operations. The ALU employs a parallelized schoolbook multiplier, completing a 512$\times$512-bit multiplication in 22 clock cycles and enabling a full Montgomery modular multiplication in 87 cycles. The constant-time CSIDH-512 design requires $1.03\times10^{8}$ clock cycles per key generation. When implemented on a Xilinx Zynq UltraScale+ FPGA, the architecture achieves a 200 MHz clock frequency, corresponding to a 515 ms latency. For ASIC implementation in a 180nm process, the design requires $1.065\times10^{8}$ clock cycles and achieves a \textasciitilde 180 MHz frequency, resulting in a key generation latency of 591 ms. By providing the first public hardware performance metrics for CSIDH on both FPGA and ASIC platforms, this work delivers a crucial benchmark for future isogeny-based post-quantum cryptography (PQC) accelerators.

Paperid: 2047, https://arxiv.org/pdf/2508.10039.pdf

Abstract:
Current multi-task adversarial text attacks rely on abundant access to shared internal features and numerous queries, often limited to a single task type. As a result, these attacks are less effective against practical scenarios involving black-box feedback APIs, limited queries, or multiple task types. To bridge this gap, we propose \textbf{C}luster and \textbf{E}nsemble \textbf{M}ulti-task Text Adversarial \textbf{A}ttack (\textbf{CEMA}), an effective black-box attack that exploits the transferability of adversarial texts across different tasks. CEMA simplifies complex multi-task scenarios by using a \textit{deep-level substitute model} trained in a \textit{plug-and-play} manner for text classification, enabling attacks without mimicking the victim model. This approach requires only a few queries for training, converting multi-task attacks into classification attacks and allowing attacks across various tasks. CEMA generates multiple adversarial candidates using different text classification methods and selects the one that most effectively attacks substitute models. In experiments involving multi-task models with two, three, or six tasks--spanning classification, translation, summarization, and text-to-image generation--CEMA demonstrates significant attack success with as few as 100 queries. Furthermore, CEMA can target commercial APIs (e.g., Baidu and Google Translate), large language models (e.g., ChatGPT 4o), and image-generation models (e.g., Stable Diffusion V2), showcasing its versatility and effectiveness in real-world applications.

Paperid: 2048, https://arxiv.org/pdf/2508.08875.pdf

Abstract:
Large Language Models (LLMs) increasingly leverage Federated Learning (FL) to utilize private, task-specific datasets for fine-tuning while preserving data privacy. However, while federated LLM frameworks effectively enable collaborative training without raw data sharing, they critically lack built-in mechanisms for regulatory compliance like GDPR's right to be forgotten. Integrating private data heightens concerns over data quality and long-term governance, yet existing distributed training frameworks offer no principled way to selectively remove specific client contributions post-training. Due to distributed data silos, stringent privacy constraints, and the intricacies of interdependent model aggregation, federated LLM unlearning is significantly more complex than centralized LLM unlearning. To address this gap, we introduce Oblivionis, a lightweight learning and unlearning framework that enables clients to selectively remove specific private data during federated LLM training, enhancing trustworthiness and regulatory compliance. By unifying FL and unlearning as a dual optimization objective, we incorporate 6 FL and 5 unlearning algorithms for comprehensive evaluation and comparative analysis, establishing a robust pipeline for federated LLM unlearning. Extensive experiments demonstrate that Oblivionis outperforms local training, achieving a robust balance between forgetting efficacy and model utility, with cross-algorithm comparisons providing clear directions for future LLM development.

Paperid: 2049, https://arxiv.org/pdf/2508.06972.pdf

Abstract:
DSperse is a modular framework for distributed machine learning inference with strategic cryptographic verification. Operating within the emerging paradigm of distributed zero-knowledge machine learning, DSperse avoids the high cost and rigidity of full-model circuitization by enabling targeted verification of strategically chosen subcomputations. These verifiable segments, or "slices", may cover part or all of the inference pipeline, with global consistency enforced through audit, replication, or economic incentives. This architecture supports a pragmatic form of trust minimization, localizing zero-knowledge proofs to the components where they provide the greatest value. We evaluate DSperse using multiple proving systems and report empirical results on memory usage, runtime, and circuit behavior under sliced and unsliced configurations. By allowing proof boundaries to align flexibly with the model's logical structure, DSperse supports scalable, targeted verification strategies suited to diverse deployment needs.

Paperid: 2050, https://arxiv.org/pdf/2508.06251.pdf

Abstract:
Synthetic data generation is a key technique in modern artificial intelligence, addressing data scarcity, privacy constraints, and the need for diverse datasets in training robust models. In this work, we propose a method for generating privacy-preserving high-quality synthetic tabular data using Tensor Networks, specifically Matrix Product States (MPS). We benchmark the MPS-based generative model against state-of-the-art models such as CTGAN, VAE, and PrivBayes, focusing on both fidelity and privacy-preserving capabilities. To ensure differential privacy (DP), we integrate noise injection and gradient clipping during training, enabling privacy guarantees via RÃ©nyi Differential Privacy accounting. Across multiple metrics analyzing data fidelity and downstream machine learning task performance, our results show that MPS outperforms classical models, particularly under strict privacy constraints. This work highlights MPS as a promising tool for privacy-aware synthetic data generation. By combining the expressive power of tensor network representations with formal privacy mechanisms, the proposed approach offers an interpretable and scalable alternative for secure data sharing. Its structured design facilitates integration into sensitive domains where both data quality and confidentiality are critical.

Paperid: 2051, https://arxiv.org/pdf/2508.05707.pdf

Abstract:
The integration of Terrestrial Networks (TN) and Non-Terrestrial Networks (NTN), including 5G Advanced/6G and the Internet of Things (IoT) technologies, using Low Earth Orbit (LEO) satellites, high-altitude platforms (HAPS), and Unmanned Aerial Vehicles (UAVs), is redefining the landscape of global connectivity. This paper introduces a new system-level security framework for 5G Advanced/6G IoT-integrated TN-NTN architectures with AI-native-enabled cloud security. Due to the heterogeneity, scale, and distributed nature of these networks, new security challenges have emerged. Leveraging AI-native cloud platforms offers powerful capabilities for real-time threat detection, security automation, and intelligent policy enforcement. The NTN satellite access function enhances security for discontinuous coverage via satellite connections. In addition, this paper explores the security risks associated with integrated 5G Advanced/6G IoT TN-NTN systems, including full network segmentation, network slicing, and the cloudification of the RAN and core. We present a comprehensive AI-enabled cloud security framework and conclude with proposals for implementing AI-powered, satellite-based NTN within future 5G Advanced/6G IoT networks. Our approach emphasizes zero-trust principles, federated learning, secure orchestration, a layered security framework, and resilience against adversarial threats.

Paperid: 2052, https://arxiv.org/pdf/2508.05600.pdf

Abstract:
Backdoor injection attacks are a threat to machine learning models that are trained on large data collected from untrusted sources; these attacks enable attackers to inject malicious behavior into the model that can be triggered by specially crafted inputs. Prior work has established bounds on the success of backdoor attacks and their impact on the benign learning task, however, an open question is what amount of poison data is needed for a successful backdoor attack. Typical attacks either use few samples, but need much information about the data points or need to poison many data points. In this paper, we formulate the one-poison hypothesis: An adversary with one poison sample and limited background knowledge can inject a backdoor with zero backdooring-error and without significantly impacting the benign learning task performance. Moreover, we prove the one-poison hypothesis for linear regression and linear classification. For adversaries that utilize a direction that is unused by the benign data distribution for the poison sample, we show that the resulting model is functionally equivalent to a model where the poison was excluded from training. We build on prior work on statistical backdoor learning to show that in all other cases, the impact on the benign learning task is still limited. We also validate our theoretical results experimentally with realistic benchmark data sets.

Paperid: 2053, https://arxiv.org/pdf/2508.05188.pdf

Abstract:
Timely and effective incident response is key to managing the growing frequency of cyberattacks. However, identifying the right response actions for complex systems is a major technical challenge. A promising approach to mitigate this challenge is to use the security knowledge embedded in large language models (LLMs) to assist security operators during incident handling. Recent research has demonstrated the potential of this approach, but current methods are mainly based on prompt engineering of frontier LLMs, which is costly and prone to hallucinations. We address these limitations by presenting a novel way to use an LLM for incident response planning with reduced hallucination. Our method includes three steps: fine-tuning, information retrieval, and lookahead planning. We prove that our method generates response plans with a bounded probability of hallucination and that this probability can be made arbitrarily small at the expense of increased planning time under certain assumptions. Moreover, we show that our method is lightweight and can run on commodity hardware. We evaluate our method on logs from incidents reported in the literature. The experimental results show that our method a) achieves up to 22% shorter recovery times than frontier LLMs and b) generalizes to a broad range of incident types and response actions.

Paperid: 2054, https://arxiv.org/pdf/2508.04583.pdf

Abstract:
Privacy-enhancing technologies (PETs) have attracted significant attention in response to privacy regulations, driving the development of applications that prioritize user data protection. At the same time, the information and communication technology (ICT) sector faces growing pressure to reduce its environmental footprint, particularly its carbon emissions. While numerous studies have assessed the energy footprint of various ICT applications, the environmental footprint of cryptographic PETs remains largely unexplored. Our work addresses this gap by proposing a standardized methodology for evaluating the carbon footprint of PETs. To demonstrate this methodology, we focus on PETs supporting client-server applications as they are the simplest to deploy. In particular, we measure the energy consumption and carbon footprint increase induced by five cryptographic PETs (compared to their non-private equivalent): HTTPS web browsing, encrypted machine learning (ML) inference, encrypted ML training, encrypted databases, and encrypted emails. Our findings reveal significant variability in carbon footprint increases, ranging from a twofold increase in HTTPS web browsing to a 100,000-fold increase in encrypted ML. Our study provides essential data to help decision-makers assess privacy-carbon trade-offs in such applications. Finally, we outline key research directions for developing PETs that balance strong privacy protection with environmental sustainability.

Paperid: 2055, https://arxiv.org/pdf/2508.02543.pdf

Abstract:
Nicknames for Group Signatures (NGS) is a new signature scheme that extends Group Signatures (GS) with Signatures with Flexible Public Keys (SFPK). Via GS, each member of a group can sign messages on behalf of the group without revealing his identity, except to a designated auditor. Via SFPK, anyone can create new identities for a particular user, enabling anonymous transfers with only the intended recipient able to trace these new identities. To prevent the potential abuses that this anonymity brings, NGS integrates flexible public keys into the GS framework to support auditable transfers. In addition to introducing NGS, we describe its security model and provide a mathematical construction proved secure in the Random Oracle Model. As a practical NGS use case, we build NickHat, a blockchain-based token-exchange prototype system on top of Ethereum.

Paperid: 2056, https://arxiv.org/pdf/2508.02454.pdf

Abstract:
Stalkerware is a serious threat to individuals' privacy that is receiving increased attention from the security and privacy research communities. Existing works have largely focused on studying leading stalkerware apps, dual-purpose apps, monetization of stalkerware, or the experience of survivors. However, there remains a need to understand potential defenses beyond the detection-and-removal approach, which may not necessarily be effective in the context of stalkerware. In this paper, we perform a systematic analysis of a large corpus of recent Android stalkerware apps. We combine multiple analysis techniques to quantify stalkerware behaviors and capabilities and how these evolved over time. Our primary goal is understanding: how (and whether) recent Android platform changes -- largely designed to improve user privacy -- have thwarted stalkerware functionality; how stalkerware may have adapted as a result; and what we may conclude about potential defenses. Our investigation reveals new insights into tactics used by stalkerware and may inspire alternative defense strategies.

Paperid: 2057, https://arxiv.org/pdf/2508.02008.pdf

Abstract:
The proper use of Android app permissions is crucial to the success and security of these apps. Users must agree to permission requests when installing or running their apps. Despite official Android platform documentation on proper permission usage, there are still many cases of permission abuse. This study provides a comprehensive analysis of the Android permission landscape, highlighting trends and patterns in permission requests across various applications from the Google Play Store. By distinguishing between benign and malicious applications, we uncover developers' evolving strategies, with malicious apps increasingly requesting fewer permissions to evade detection, while benign apps request more to enhance functionality. In addition to examining permission trends across years and app features such as advertisements, in-app purchases, content ratings, and app sizes, we leverage association rule mining using the FP-Growth algorithm. This allows us to uncover frequent permission combinations across the entire dataset, specific years, and 16 app genres. The analysis reveals significant differences in permission usage patterns, providing a deeper understanding of co-occurring permissions and their implications for user privacy and app functionality. By categorizing permissions into high-level semantic groups and examining their application across distinct app categories, this study offers a structured approach to analyzing the dynamics within the Android ecosystem. The findings emphasize the importance of continuous monitoring, user education, and regulatory oversight to address permission misuse effectively.

Paperid: 2058, https://arxiv.org/pdf/2508.01638.pdf

Abstract:
The increasing adoption of Cloud-based Large Language Models (CLLMs) has raised significant concerns regarding data privacy during user interactions. While existing approaches primarily focus on encrypting sensitive information, they often overlook the logical structure of user inputs. This oversight can lead to reduced data utility and degraded performance of CLLMs. To address these limitations and enable secure yet effective interactions, we propose Semantic Encryption (SE)-a plug-and-play framework designed to preserve both privacy and utility. SE consists of two key components: Semantic Encoding and Semantic Decoding. In the encoding phase, a lightweight local model transforms the original user input into an alternative semantic context that maintains the original intent and logical structure while obfuscating sensitive information. This transformed input is then processed by the CLLM, which generates a response based on the transformed semantic context. To maintain a seamless user experience, the decoding phase will reconstruct the CLLM's response back into the original semantic context by referencing the locally stored user input. Extensive experimental evaluations demonstrate that SE effectively protects data privacy without compromising data utility or user experience, offering a practical solution for secure interaction with CLLMs. Particularly, the proposed SE demonstrates a significant improvement over the state-of-the-art InferDPT, surpassing it across various evaluated metrics and datasets.

Paperid: 2059, https://arxiv.org/pdf/2508.01074.pdf

Abstract:
Modern over-parameterized deep models are highly data-dependent, with large scale general-purpose and domain-specific datasets serving as the bedrock for rapid advancements. However, many datasets are proprietary or contain sensitive information, making unrestricted model training problematic. In the open world where data thefts cannot be fully prevented, Dataset Ownership Verification (DOV) has emerged as a promising method to protect copyright by detecting unauthorized model training and tracing illicit activities. Due to its diversity and superior stealth, evading DOV is considered extremely challenging. However, this paper identifies that previous studies have relied on oversimplistic evasion attacks for evaluation, leading to a false sense of security. We introduce a unified evasion framework, in which a teacher model first learns from the copyright dataset and then transfers task-relevant yet identifier-independent domain knowledge to a surrogate student using an out-of-distribution (OOD) dataset as the intermediary. Leveraging Vision-Language Models and Large Language Models, we curate the most informative and reliable subsets from the OOD gallery set as the final transfer set, and propose selectively transferring task-oriented knowledge to achieve a better trade-off between generalization and evasion effectiveness. Experiments across diverse datasets covering eleven DOV methods demonstrate our approach simultaneously eliminates all copyright identifiers and significantly outperforms nine state-of-the-art evasion attacks in both generalization and effectiveness, with moderate computational overhead. As a proof of concept, we reveal key vulnerabilities in current DOV methods, highlighting the need for long-term development to enhance practicality.

Paperid: 2060, https://arxiv.org/pdf/2507.23608.pdf

Authors:Linmin Pei, Granger Sutton, Michael Rutherford, Ulrike Wagner, Tracy Nolan, Kirk Smith, Phillip Farmer, Peter Gu, Ambar Rana, Kailing Chen, Thomas Ferleman, Brian Park, Ye Wu, Jordan Kojouharov, Gargi Singh, Jon Lemon, Tyler Willis, Milos Vukadinovic, Grant Duffy, Bryan He, David Ouyang, Marco Pereanez, Daniel Samber, Derek A. Smith, Christopher Cannistraci, Zahi Fayad, David S. Mendelson, Michele Bufano, Elmar Kotter, Hamideh Haghiri, Rajesh Baidya, Stefan Dvoretskii, Klaus H. Maier-Hein, Marco Nolden, Christopher Ablett, Silvia Siggillino, Sandeep Kaushik, Hongzhu Jiang, Sihan Xie, Zhiyu Wan, Alex Michie, Simon J Doran, Angeline Aurelia Waly, Felix A. Nathaniel Liang, Humam Arshad Mustagfirin, Michelle Grace Felicia, Kuo Po Chih, Rahul Krish, Ghulam Rasool, Nidhal Bouaynaya, Nikolas Koutsoubis, Kyle Naddeo, Kartik Pandit, Tony O'Sullivan, Raj Krish, Qinyan Pan, Scott Gustafson, Benjamin Kopchick, Laura Opsahl-Ong, Andrea Olvera-Morales, Jonathan Pinney, Kathryn Johnson, Theresa Do, Juergen Klenk, Maria Diaz, Arti Singh, Rong Chai, David A. Clunie, Fred Prior, Keyvan Farahani

Abstract:
The de-identification (deID) of protected health information (PHI) and personally identifiable information (PII) is a fundamental requirement for sharing medical images, particularly through public repositories, to ensure compliance with patient privacy laws. In addition, preservation of non-PHI metadata to inform and enable downstream development of imaging artificial intelligence (AI) is an important consideration in biomedical research. The goal of MIDI-B was to provide a standardized platform for benchmarking of DICOM image deID tools based on a set of rules conformant to the HIPAA Safe Harbor regulation, the DICOM Attribute Confidentiality Profiles, and best practices in preservation of research-critical metadata, as defined by The Cancer Imaging Archive (TCIA). The challenge employed a large, diverse, multi-center, and multi-modality set of real de-identified radiology images with synthetic PHI/PII inserted. The MIDI-B Challenge consisted of three phases: training, validation, and test. Eighty individuals registered for the challenge. In the training phase, we encouraged participants to tune their algorithms using their in-house or public data. The validation and test phases utilized the DICOM images containing synthetic identifiers (of 216 and 322 subjects, respectively). Ten teams successfully completed the test phase of the challenge. To measure success of a rule-based approach to image deID, scores were computed as the percentage of correct actions from the total number of required actions. The scores ranged from 97.91% to 99.93%. Participants employed a variety of open-source and proprietary tools with customized configurations, large language models, and optical character recognition (OCR). In this paper we provide a comprehensive report on the MIDI-B Challenge's design, implementation, results, and lessons learned.

Paperid: 2061, https://arxiv.org/pdf/2507.22347.pdf

Abstract:
We introduce the novel problem of benchmarking fraud detectors on private graph-structured data. Currently, many types of fraud are managed in part by automated detection algorithms that operate over graphs. We consider the scenario where a data holder wishes to outsource development of fraud detectors to third parties (e.g., vendors or researchers). The third parties submit their fraud detectors to the data holder, who evaluates these algorithms on a private dataset and then publicly communicates the results. We propose a realistic privacy attack on this system that allows an adversary to de-anonymize individuals' data based only on the evaluation results. In simulations of a privacy-sensitive benchmark for facial recognition algorithms by the National Institute of Standards and Technology (NIST), our attack achieves near perfect accuracy in identifying whether individuals' data is present in a private dataset, with a True Positive Rate of 0.98 at a False Positive Rate of 0.00. We then study how to benchmark algorithms while satisfying a formal differential privacy (DP) guarantee. We empirically evaluate two classes of solutions: subsample-and-aggregate and DP synthetic graph data. We demonstrate through extensive experiments that current approaches do not provide utility when guaranteeing DP. Our results indicate that the error arising from DP trades off between bias from distorting graph structure and variance from adding random noise. Current methods lie on different points along this bias-variance trade-off, but more complex methods tend to require high-variance noise addition, undermining utility.

Paperid: 2062, https://arxiv.org/pdf/2507.22140.pdf

Abstract:
This work explores and evaluates noise and crosstalk in neutral atom quantum computers. Neutral atom quantum computers are a promising platform for analog Hamiltonian simulations, which rely on a sequence of time-dependent Hamiltonians to model the dynamics of a larger system and are particularly useful for problems in optimization, physics, and molecular dynamics. However, the viability of running multiple simulations in a co-located or multi-tenant environment is limited by noise and crosstalk. This work conducts an analysis of how noise faced by simulations changes over time, and investigates the effects of spatial co-location on simulation fidelity. Findings of this work demonstrate that the close proximity of concurrent simulations can increase crosstalk between them. To mitigate this issue, a Moving Target Defense (MTD) strategy is proposed and evaluated. The results confirm that the MTD is a viable technique for enabling safe and reliable co-location of simulations on neutral atom quantum hardware.

Paperid: 2063, https://arxiv.org/pdf/2507.21061.pdf

Abstract:
What makes safety claims about general purpose AI systems such as large language models trustworthy? We show that rather than the capabilities of security tools such as alignment and red teaming procedures, it is security practices based on these tools that contributed to reconfiguring the image of AI safety and made the claims acceptable. After showing what causes the gap between the capabilities of security tools and the desired safety guarantees, we critically investigate how AI security practices attempt to fill the gap and identify several shortcomings in diversity and participation. We found that these security practices are part of securitization processes aiming to support (commercial) development of general purpose AI systems whose trustworthiness can only be imperfectly tested instead of guaranteed. We conclude by offering several improvements to the current AI security practices.

Paperid: 2064, https://arxiv.org/pdf/2507.19484.pdf

Abstract:
Social key recovery mechanisms enable users to recover their vaults with the help of trusted contacts, or trustees, avoiding the need for a single point of trust or memorizing complex strings. However, existing mechanisms overlook the memorability demands on users for recovery, such as the need to recall a threshold number of trustees. Therefore, we first formalize the notion of recovery metadata in the context of social key recovery, illustrating the tradeoff between easing the burden of memorizing the metadata and maintaining metadata privacy. We present Apollo, the first framework that addresses this tradeoff by distributing indistinguishable data within a user's social circle, where trustees hold relevant data and non-trustees store random data. Apollo eliminates the need to memorize recovery metadata since a user eventually gathers sufficient data from her social circle for recovery. Due to indistinguishability, Apollo protects metadata privacy by forming an anonymity set that hides the trustees among non-trustees. To make the anonymity set scalable, Apollo proposes a novel multi-layered secret sharing scheme that mitigates the overhead due to the random data distributed among non-trustees. Finally, we provide a prototype implementation of Apollo and report on its performance. Apollo reduces the chances of malicious recovery to between 0.005% and 1.8%, depending on the adversary's ability to compromise. The multi-layered design shows a latency reduction from 1.1x to 740kx compared to a single-layered approach, depending on the number of reconnections.

Paperid: 2065, https://arxiv.org/pdf/2507.17712.pdf

Abstract:
The number of qubits in quantum computers keeps growing, but most quantum programs remain relatively small because of the noisy nature of the underlying quantum hardware. This might lead quantum cloud providers to explore increased hardware utilization, and thus profitability through means such as multi-programming, which would allow the execution of multiple programs in parallel. The adoption of such technology would bring entirely new challenges to the field of quantum software security. This article explores and reports the key challenges identified in quantum software security within shared quantum computing environments.

Paperid: 2066, https://arxiv.org/pdf/2507.17518.pdf

Abstract:
Digital Twins (DTs) are gaining prominence in cybersecurity for their ability to replicate complex IT (Information Technology), OT (Operational Technology), and IoT (Internet of Things) infrastructures, allowing for real time monitoring, threat analysis, and system simulation. This study investigates how integrating DTs with penetration testing tools and Large Language Models (LLMs) can enhance cybersecurity education and operational readiness. By simulating realistic cyber environments, this approach offers a practical, interactive framework for exploring vulnerabilities and defensive strategies. At the core of this research is the Red Team Knife (RTK), a custom penetration testing toolkit aligned with the Cyber Kill Chain model. RTK is designed to guide learners through key phases of cyberattacks, including reconnaissance, exploitation, and response within a DT powered ecosystem. The incorporation of Large Language Models (LLMs) further enriches the experience by providing intelligent, real-time feedback, natural language threat explanations, and adaptive learning support during training exercises. This combined DT LLM framework is currently being piloted in academic settings to develop hands on skills in vulnerability assessment, threat detection, and security operations. Initial findings suggest that the integration significantly improves the effectiveness and relevance of cybersecurity training, bridging the gap between theoretical knowledge and real-world application. Ultimately, the research demonstrates how DTs and LLMs together can transform cybersecurity education to meet evolving industry demands.

Paperid: 2067, https://arxiv.org/pdf/2507.17033.pdf

Abstract:
As power consumption from AI training and inference continues to increase, AI accelerators are being integrated directly into the CPU. Intel's Advanced Matrix Extensions (AMX) is one such example, debuting on the 4th generation Intel Xeon Scalable CPU. We discover a timing side and covert channel, GATEBLEED, caused by the aggressive power gating utilized to keep the CPU within operating limits. We show that the GATEBLEED side channel is a threat to AI privacy as many ML models such as transformers and CNNs make critical computationally-heavy decisions based on private values like confidence thresholds and routing logits. Timing delays from selective powering down of AMX components mean that each matrix multiplication is a potential leakage point when executed on the AMX accelerator. Our research identifies over a dozen potential gadgets across popular ML libraries (HuggingFace, PyTorch, TensorFlow, etc.), revealing that they can leak sensitive and private information. GATEBLEED poses a risk for local and remote timing inference, even under previous protective measures. GATEBLEED can be used as a high performance, stealthy remote covert channel and a generic magnifier for timing transmission channels, capable of bypassing traditional cache defenses to leak arbitrary memory addresses and evading state of the art microarchitectural attack detectors under realistic network conditions and system configurations in which previous attacks fail. We implement an end-to-end microarchitectural inference attack on a transformer model optimized with Intel AMX, achieving a membership inference accuracy of 81% and a precision of 0.89. In a CNN-based or transformer-based mixture-of-experts model optimized with Intel AMX, we leak expert choice with 100% accuracy. To our knowledge, this is the first side-channel attack on AI privacy that exploits hardware optimizations.

Paperid: 2068, https://arxiv.org/pdf/2507.16226.pdf

Abstract:
Large Language Models (LLMs) are increasingly used in circuit design tasks and have typically undergone multiple rounds of training. Both the trained models and their associated training data are considered confidential intellectual property (IP) and must be protected from exposure. Confidential Computing offers a promising solution to protect data and models through Trusted Execution Environments (TEEs). However, existing TEE implementations are not designed to support the resource-intensive nature of LLMs efficiently. In this work, we first present a comprehensive evaluation of the LLMs within a TEE-enabled confidential computing environment, specifically utilizing Intel Trust Domain Extensions (TDX). We constructed experiments on three environments: TEE-based, CPU-only, and CPU-GPU hybrid implementations, and evaluated their performance in terms of tokens per second. Our first observation is that distilled models, i.e., DeepSeek, surpass other models in performance due to their smaller parameters, making them suitable for resource-constrained devices. Also, in the quantized models such as 4-bit quantization (Q4) and 8-bit quantization (Q8), we observed a performance gain of up to 3x compared to FP16 models. Our findings indicate that for fewer parameter sets, such as DeepSeek-r1-1.5B, the TDX implementation outperforms the CPU version in executing computations within a secure environment. We further validate the results using a testbench designed for SoC design tasks. These validations demonstrate the potential of efficiently deploying lightweight LLMs on resource-constrained systems for semiconductor CAD applications.

Paperid: 2069, https://arxiv.org/pdf/2507.15449.pdf

Abstract:
We consider the multivariate scheme Pesto, which was introduced by Calderini, Caminata, and Villa. In this scheme, the public polynomials are obtained by applying a CCZ transformation to a set of quadratic secret polynomials. As a consequence, the public key consists of polynomials of degree 4. In this work, we show that the public degree 4 polynomial system can be efficiently reduced to a system of quadratic polynomials. This seems to suggest that the CCZ transformation may not offer a significant increase in security, contrary to what was initially believed.

Paperid: 2070, https://arxiv.org/pdf/2507.15377.pdf

Abstract:
Nowadays, equivalence problems are widely used in cryptography, most notably to establish cryptosystems such as digital signatures, with MEDS, LESS, PERK as the most recent ones. However, in the context of matrix codes, only the code equivalence problem has been studied, while the subcode equivalence is well-defined in the Hamming metric. In this work, we introduce two new problems: the Matrix Subcode Equivalence Problem and the Matrix Code Permuted Kernel Problem, to which we apply the MPCitH paradigm to build a signature scheme. These new problems, closely related to the Matrix Code Equivalence problem, ask to find an isometry given a code $C$ and a subcode $D$. Furthermore, we prove that the Matrix Subcode Equivalence problem reduces to the Hamming Subcode Equivalence problem, which is known to be NP-Complete, thus introducing the matrix code version of the Permuted Kernel Problem. We also adapt the combinatorial and algebraic algorithms for the Matrix Code Equivalence problem to the subcode case, and we analyze their complexities. We find with this analysis that the algorithms perform much worse than in the code equivalence case, which is the same as what happens in the Hamming metric. Finally, our analysis of the attacks allows us to take parameters much smaller than in the Matrix Code Equivalence case. Coupled with the effectiveness of \textit{Threshold-Computation-in-the-Head} or \textit{VOLE-in-the-Head}, we obtain a signature size of $\approx$ 4 800 Bytes, with a public key of $\approx$ 275 Bytes. We thus obtain a reasonable signature size, which brings diversity in the landscape of post-quantum signature schemes, by relying on a new hard problem. In particular, this new signature scheme performs better than SPHINCS+, with a smaller size of public key + signature. Our signature compares also well with other signature schemes: compared to MEDS, the signature is smaller, and we reduced the size of the sum of signature and public key by a factor close to 5. We also obtain a signature size that is almost half the size of the CROSS signature scheme.

Paperid: 2071, https://arxiv.org/pdf/2507.14822.pdf

Abstract:
Recently, low-altitude wireless networks (LAWNs) have emerged as a critical backbone for supporting the low-altitude economy, particularly with the densification of unmanned aerial vehicles (UAVs) and high-altitude platforms (HAPs). To meet growing data demands, some LAWN deployments incorporate free-space optical (FSO) links, which offer exceptional bandwidth and beam directivity. However, without strong security measures in place, both conventional radio frequency channels and FSO beams remain vulnerable to interception and spoofing and FSO in particular can suffer from turbulence, misalignment, and weather-related attenuation. To address these challenges in the quantum era, a quantum-secure architecture called Quantum Skyshield is proposed to enable reliable communication between the base transceiver station (BTS) and LAWN. The proposed design integrates BB84 quantum key distribution (QKD) with post-quantum authentication mechanisms. Simulation results confirm the reliable generation of a 128-bit symmetric key when the quantum bit error rate (QBER) remains below the threshold of 11%. Authentication is enforced using Lamport one-time signatures and hash-based message authentication codes (HMAC) to ensure message integrity. A Grover-inspired threat detection mechanism identifies anomalies with up to 89% probability in a single iteration, enabling real-time trust evaluation. Lastly, future research challenges have also been identified and discussed to guide further development in this area.

Paperid: 2072, https://arxiv.org/pdf/2507.14324.pdf

Abstract:
Identity verification is the process of confirming an individual's claimed identity, which is essential in sectors like finance, healthcare, and online services to ensure security and prevent fraud. However, current password/PIN-based identity solutions are susceptible to phishing or skimming attacks, where malicious intermediaries attempt to steal credentials using fake identification portals. Alikhani et al. [Nature, 2021] began exploring identity verification through graph coloring-based relativistic zero-knowledge proofs (RZKPs), a key cryptographic primitive that enables a prover to demonstrate knowledge of secret credentials to a verifier without disclosing any information about the secret. Our work advances this field and addresses unresolved issues: From an engineering perspective, we relax further the relativistic constraints from 60m to 30m, and significantly enhance the stability and scalability of the experimental demonstration of the 2-prover graph coloring-based RZKP protocol for near-term use cases. At the same time, for long-term security against entangled malicious provers, we propose a modified protocol with comparable computation and communication costs, we establish an upper bound on the soundness parameter for this modified protocol. On the other hand, we extend the two-prover, two-verifier setup to a three-prover configuration, demonstrating the security of such relativistic protocols against entangled malicious provers.

Paperid: 2073, https://arxiv.org/pdf/2507.13810.pdf

Abstract:
This article introduces the innovative Quantum Dining Information Brokers Problem, presenting a novel entanglement-based quantum protocol to address it. The scenario involves $n$ information brokers, all located in distinct geographical regions, engaging in a metaphorical virtual dinner. The objective is for each broker to share a unique piece of information with all others simultaneously. Unlike previous approaches, this protocol enables a fully parallel, single-step communication exchange among all brokers, regardless of their physical locations. A key feature of this protocol is its ability to ensure both the anonymity and privacy of all participants are preserved, meaning no broker can discern the identity of the sender behind any received information. At its core, the Quantum Dining Information Brokers Problem serves as a conceptual framework for achieving anonymous, untraceable, and massively parallel information exchange in a distributed system. The proposed protocol introduces three significant advancements. First, while quantum protocols for one-to-many simultaneous information transmission have been developed, this is, to the best of our knowledge, one of the first quantum protocols to facilitate many-to-many simultaneous information exchange. Second, it guarantees complete anonymity and untraceability for all senders, a critical improvement over sequential applications of one-to-many protocols, which fail to ensure such robust anonymity. Third, leveraging quantum entanglement, the protocol operates in a fully distributed manner, accommodating brokers in diverse spatial locations. This approach marks a substantial advancement in secure, scalable, and anonymous communication, with potential applications in distributed environments where privacy and parallelism are paramount.

Paperid: 2074, https://arxiv.org/pdf/2507.13508.pdf

Abstract:
The "Fake or Real" competition hosted on Kaggle (https://www.kaggle.com/competitions/fake-or-real-the-impostor-hunt ) is the second part of a series of follow-up competitions and hackathons related to the "Assurance for Space Domain AI Applications" project funded by the European Space Agency (https://assurance-ai.space-codev.org/ ). The competition idea is based on two real-life AI security threats identified within the project -- data poisoning and overreliance in Large Language Models. The task is to distinguish between the proper output from LLM and the output generated under malicious modification of the LLM. As this problem was not extensively researched, participants are required to develop new techniques to address this issue or adjust already existing ones to this problem's statement.

Paperid: 2075, https://arxiv.org/pdf/2507.11970.pdf

Abstract:
Program obfuscation aims to hide the inner workings of a program while preserving its functionality. In the quantum setting, recent works have obtained obfuscation schemes for specialized classes of quantum circuits. For instance, Bartusek, Brakerski, and Vaikuntanathan (STOC 2024) constructed a quantum state obfuscation scheme, which supports the obfuscation of quantum programs represented as quantum states for pseudo-deterministic quantum programs with classical inputs and outputs in the classical oracle model. In this work, we improve upon existing results by constructing the first quantum state obfuscation scheme for unitary (or approximately unitary) quantum programs supporting quantum inputs and outputs in the classical oracle model. At the core of our obfuscation scheme are two novel ingredients: a functional quantum authentication scheme that allows key holders to learn specific functions of the authenticated quantum state with simulation-based security, and a compiler that represents an arbitrary quantum circuit as a projective linear-plus-measurement quantum program described by a sequence of non-adaptive Clifford gates interleaved with adaptive and compatible measurements.

Paperid: 2076, https://arxiv.org/pdf/2507.11763.pdf

Abstract:
Cyber threats against space infrastructures, including satellites and systems on the ground, have not been adequately understood. Testbeds are important to deepen our understanding and validate space cybersecurity studies. The state of the art is that there are very few studies on building testbeds, and there are few characterizations of testbeds. In this paper, we propose a framework for characterizing the fidelity of space cybersecurity testbeds. The framework includes 7 attributes for characterizing the system models, threat models, and defenses that can be accommodated by a testbed. We use the framework to guide us in building and characterizing a concrete testbed we have implemented, which includes space, ground, user, and link segments. In particular, we show how the testbed can accommodate some space cyber attack scenarios that have occurred in the real world, and discuss future research directions.

Paperid: 2077, https://arxiv.org/pdf/2507.10971.pdf

Abstract:
Designing secure architectures for system-on-chip (SoC) platforms is a highly intricate and time-intensive task, often requiring months of development and meticulous verification. Even minor architectural oversights can lead to critical vulnerabilities that undermine the security of the entire chip. In response to this challenge, we introduce CITADEL, a modular security framework aimed at streamlining the creation of robust security architectures for SoCs. CITADEL offers a configurable, plug-and-play subsystem composed of custom intellectual property (IP) blocks, enabling the construction of diverse security mechanisms tailored to specific threats. As a concrete demonstration, we instantiate CITADEL to defend against supply-chain threats, illustrating how the framework adapts to one of the most pressing concerns in hardware security. This paper explores the range of obstacles encountered when building a unified security architecture capable of addressing multiple attack vectors and presents CITADEL's strategies for overcoming them. Through several real-world case studies, we showcase the practical implementation of CITADEL and present a thorough evaluation of its impact on silicon area and power consumption across various ASIC technologies. Results indicate that CITADEL introduces only minimal resource overhead, making it a practical solution for enhancing SoC security.

Paperid: 2078, https://arxiv.org/pdf/2507.10746.pdf

Abstract:
We design a debiased parametric bootstrap framework for statistical inference from differentially private data. Existing usage of the parametric bootstrap on privatized data ignored or avoided handling the effect of clamping, a technique employed by the majority of privacy mechanisms. Ignoring the impact of clamping often leads to under-coverage of confidence intervals and miscalibrated type I errors of hypothesis tests. The main reason for the failure of the existing methods is the inconsistency of the parameter estimate based on the privatized data. We propose using the indirect inference method to estimate the parameter values consistently, and we use the improved estimator in parametric bootstrap for inference. To implement the indirect estimator, we present a novel simulation-based, adaptive approach along with the theory that establishes the consistency of the corresponding parametric bootstrap estimates, confidence intervals, and hypothesis tests. In particular, we prove that our adaptive indirect estimator achieves the minimum asymptotic variance among all "well-behaved" consistent estimators based on the released summary statistic. Our simulation studies show that our framework produces confidence intervals with well-calibrated coverage and performs hypothesis testing with the correct type I error, giving state-of-the-art performance for inference on location-scale normals, simple linear regression, and logistic regression.

Paperid: 2079, https://arxiv.org/pdf/2507.09354.pdf

Abstract:
Integrated sensing and communication (ISAC) systems potentially encounter significant performance degradation in densely obstructed urban and non-line-of-sight scenarios, thus limiting their effectiveness in practical deployments. To deal with these challenges, this paper proposes a backscatter device (BD)-assisted ISAC system, which leverages passive BDs naturally distributed in underlying environments for performance enhancement. These ambient devices can enhance sensing accuracy and communication reliability by providing additional reflective signal paths. In this system, we define the Pareto boundary characterizing the trade-off between sensing mutual information (SMI) and communication rates to provide fundamental insights for its design. To derive the boundary, we formulate a performance optimization problem within an orthogonal frequency division multiplexing (OFDM) framework, by jointly optimizing time-frequency resource element (RE) allocation, transmit power management, and BD modulation decisions. To tackle the non-convexity of the problem, we decompose it into three subproblems, solved iteratively through a block coordinate descent (BCD) algorithm. Specifically, the RE subproblem is addressed using the successive convex approximation (SCA) method, the power subproblem is solved using an augmented Lagrangian combined water-filling method, and the BD modulation subproblem is tackled using semidefinite relaxation (SDR) methods. Additionally, we demonstrate the generality of the proposed system by showing its adaptability to bistatic ISAC scenarios and MIMO settings. Finally, extensive simulation results validate the effectiveness of the proposed system and its superior performance compared to existing state-of-the-art ISAC schemes.

Paperid: 2080, https://arxiv.org/pdf/2507.08202.pdf

Abstract:
Quantum neural networks (QNN) hold immense potential for the future of quantum machine learning (QML). However, QNN security and robustness remain largely unexplored. In this work, we proposed novel Trojan attacks based on the quantum computing properties in a QNN-based binary classifier. Our proposed Quantum Properties Trojans (QuPTs) are based on the unitary property of quantum gates to insert noise and Hadamard gates to enable superposition to develop Trojans and attack QNNs. We showed that the proposed QuPTs are significantly stealthier and heavily impact the quantum circuits' performance, specifically QNNs. The most impactful QuPT caused a deterioration of 23% accuracy of the compromised QNN under the experimental setup. To the best of our knowledge, this is the first work on the Trojan attack on a fully quantum neural network independent of any hybrid classical-quantum architecture.

Paperid: 2081, https://arxiv.org/pdf/2507.07341.pdf

Abstract:
With the increased deployment of large language models (LLMs), one concern is their potential misuse for generating harmful content. Our work studies the alignment challenge, with a focus on filters to prevent the generation of unsafe information. Two natural points of intervention are the filtering of the input prompt before it reaches the model, and filtering the output after generation. Our main results demonstrate computational challenges in filtering both prompts and outputs. First, we show that there exist LLMs for which there are no efficient prompt filters: adversarial prompts that elicit harmful behavior can be easily constructed, which are computationally indistinguishable from benign prompts for any efficient filter. Our second main result identifies a natural setting in which output filtering is computationally intractable. All of our separation results are under cryptographic hardness assumptions. In addition to these core findings, we also formalize and study relaxed mitigation approaches, demonstrating further computational barriers. We conclude that safety cannot be achieved by designing filters external to the LLM internals (architecture and weights); in particular, black-box access to the LLM will not suffice. Based on our technical results, we argue that an aligned AI system's intelligence cannot be separated from its judgment.

Paperid: 2082, https://arxiv.org/pdf/2507.06008.pdf

Abstract:
Information systems support the execution of business processes. The event logs of these executions generally contain sensitive information about customers, patients, and employees. The corresponding privacy challenges can be addressed by anonymizing the event logs while still retaining utility for process discovery. However, trading off utility and privacy is difficult: the higher the complexity of event log, the higher the loss of utility by anonymization. In this work, we propose a pipeline that combines anonymization and event data partitioning, where event abstraction is utilized for partitioning. By leveraging event abstraction, event logs can be segmented into multiple parts, allowing each sub-log to be anonymized separately. This pipeline preserves privacy while mitigating the loss of utility. To validate our approach, we study the impact of event partitioning on two anonymization techniques using three real-world event logs and two process discovery techniques. Our results demonstrate that event partitioning can bring improvements in process discovery utility for directly-follows-based anonymization techniques.

Paperid: 2083, https://arxiv.org/pdf/2507.05630.pdf

Abstract:
LLM-integrated applications and agents are vulnerable to prompt injection attacks, in which adversaries embed malicious instructions within seemingly benign user inputs to manipulate the LLM's intended behavior. Recent defenses based on $\textit{known-answer detection}$ (KAD) have achieved near-perfect performance by using an LLM to classify inputs as clean or contaminated. In this work, we formally characterize the KAD framework and uncover a structural vulnerability in its design that invalidates its core security premise. We design a methodical adaptive attack, $\textit{DataFlip}$, to exploit this fundamental weakness. It consistently evades KAD defenses with detection rates as low as $1.5\%$ while reliably inducing malicious behavior with success rates of up to $88\%$, without needing white-box access to the LLM or any optimization procedures.

Paperid: 2084, https://arxiv.org/pdf/2507.04077.pdf

Abstract:
Conjunctive Searchable Symmetric Encryption (CSSE) enables secure conjunctive searches over encrypted data. While leakage-abuse attacks (LAAs) against single-keyword SSE have been extensively studied, their extension to conjunctive queries faces a critical challenge: the combinatorial explosion of candidate keyword combinations, leading to enormous time and space overhead for attacks. In this paper, we reveal a fundamental vulnerability in state-of-the-art CSSE schemes: s-term leakage, where the keyword with the minimal document frequency in a query leaks distinct patterns. We propose S-Leak, the first passive attack framework that progressively recovers conjunctive queries by exploiting s-term leakage and global leakage. Our key innovation lies in a three-stage approach: identifying the s-term of queries, pruning low-probability keyword conjunctions, and reconstructing full queries. We propose novel metrics to better assess attacks in conjunctive query scenarios. Empirical evaluations on real-world datasets demonstrate that our attack is effective in diverse CSSE configurations. When considering 161,700 conjunctive keyword queries, our attack achieves a 95.15% accuracy in recovering at least one keyword, 82.57% for at least two, 58% for all three keywords, and maintains efficacy against defenses such as SEAL padding and CLRZ obfuscation. Our work exposes the underestimated risks of s-term leakage in practical SSE deployments and calls for a redesign of leakage models for multi-keyword search scenarios.

Paperid: 2085, https://arxiv.org/pdf/2507.03619.pdf

Abstract:
Today, the training of large language models (LLMs) can involve personally identifiable information and copyrighted material, incurring dataset misuse. To mitigate the problem of dataset misuse, this paper explores \textit{dataset inference}, which aims to detect if a suspect model $\mathcal{M}$ used a victim dataset $\mathcal{D}$ in training. Previous research tackles dataset inference by aggregating results of membership inference attacks (MIAs) -- methods to determine whether individual samples are a part of the training dataset. However, restricted by the low accuracy of MIAs, previous research mandates grey-box access to $\mathcal{M}$ to get intermediate outputs (probabilities, loss, perplexity, etc.) for obtaining satisfactory results. This leads to reduced practicality, as LLMs, especially those deployed for profits, have limited incentives to return the intermediate outputs. In this paper, we propose a new method of dataset inference with only black-box access to the target model (i.e., assuming only the text-based responses of the target model are available). Our method is enabled by two sets of locally built reference models, one set involving $\mathcal{D}$ in training and the other not. By measuring which set of reference model $\mathcal{M}$ is closer to, we determine if $\mathcal{M}$ used $\mathcal{D}$ for training. Evaluations of real-world LLMs in the wild show that our method offers high accuracy in all settings and presents robustness against bypassing attempts.

Paperid: 2086, https://arxiv.org/pdf/2507.02850.pdf

Abstract:
We describe a vulnerability in language models (LMs) trained with user feedback, whereby a single user can persistently alter LM knowledge and behavior given only the ability to provide prompts and upvote / downvote feedback on LM outputs. To implement the attack, the attacker prompts the LM to stochastically output either a "poisoned" or benign response, then upvotes the poisoned response or downvotes the benign one. When feedback signals are used in a subsequent preference tuning behavior, LMs exhibit increased probability of producing poisoned responses even in contexts without malicious prompts. We show that this attack can be used to (1) insert factual knowledge the model did not previously possess, (2) modify code generation patterns in ways that introduce exploitable security flaws, and (3) inject fake financial news. Our finding both identifies a new qualitative feature of language model preference tuning (showing that it even highly restricted forms of preference data can be used to exert fine-grained control over behavior), and a new attack mechanism for LMs trained with user feedback (extending work on pretraining-time data poisoning and deployment-time prompt injection).

Paperid: 2087, https://arxiv.org/pdf/2507.02737.pdf

Abstract:
Monitoring Large Language Model (LLM) outputs is crucial for mitigating risks from misuse and misalignment. However, LLMs could evade monitoring through steganography: Encoding hidden information within seemingly benign generations. In this paper, we evaluate the steganography capabilities in frontier LLMs to better understand the risk they pose. We focus on two types of steganography: passing encoded messages and performing encoded reasoning. We find that current models are unable to encode short messages in their outputs without a monitor noticing under standard affordances. They can succeed, however, if given additional affordances such as using an unmonitored scratchpad and coordinating on what encoding scheme to use. We additionally find early signs that models can perform basic encoded reasoning in a simple state-tracking problem. This includes some ability to reason with their own and pre-defined schemes, including encoding schemes such as Hexadecimal. Despite this, they can rarely hide reasoning subtly within a cover task to fool a monitor. Overall, our results indicate that current LLMs exhibit nascent steganographic capabilities. While these capabilities are likely insufficient to bypass well-designed monitors at present, this could change in the future.

Paperid: 2088, https://arxiv.org/pdf/2507.02622.pdf

Abstract:
Access control is a cornerstone of computer security that prevents unauthorised access to resources. In this paper, we study access control in quantum computer systems. We present the first explicit scenario of a security breach when a classically secure access control system is straightforwardly adapted to the quantum setting. The breach is ultimately due to that quantum mechanics allows the phenomenon of entanglement and violates Mermin inequality, a multi-party variant of the celebrated Bell inequality. This reveals a threat from quantum entanglement to access control if existing computer systems integrate with quantum computing. To protect against such threat, we propose several new models of quantum access control, and rigorously analyse their security, flexibility and efficiency.

Paperid: 2089, https://arxiv.org/pdf/2507.02332.pdf

Abstract:
This paper investigates privacy jailbreaking in LLMs via steering, focusing on whether manipulating activations can bypass LLM alignment and alter response behaviors to privacy related queries (e.g., a certain public figure's sexual orientation). We begin by identifying attention heads predictive of refusal behavior for private attributes (e.g., sexual orientation) using lightweight linear probes trained with privacy evaluator labels. Next, we steer the activations of a small subset of these attention heads guided by the trained probes to induce the model to generate non-refusal responses. Our experiments show that these steered responses often disclose sensitive attribute details, along with other private information about data subjects such as life events, relationships, and personal histories that the models would typically refuse to produce. Evaluations across four LLMs reveal jailbreaking disclosure rates of at least 95%, with more than 50% on average of these responses revealing true personal information. Our controlled study demonstrates that private information memorized in LLMs can be extracted through targeted manipulation of internal activations.

Paperid: 2090, https://arxiv.org/pdf/2507.01768.pdf

Abstract:
The prevalence of cyberattacks on Industrial Control Systems (ICS) has highlighted the necessity for robust security measures and incident response to protect critical infrastructure. This is prominent when Operational Technology (OT) systems undergo digital transformation by integrating with Information Technology (IT) systems to enhance operational efficiency, adaptability, and safety. To support analysts in staying abreast of emerging attack patterns, there is a need for ICS datasets that reflect indicators representative of contemporary cyber threats. To address this, we conduct two ICS cyberattack simulations to showcase the impact of trending ICS cyberattacks on a railway cyber range that resembles the railway infrastructure. The attack scenario is designed to blend trending attack trends with attack patterns observed from historical ICS incidents. The resulting evidence is collected as datasets, serving as an essential resource for cyberattack analysis. This captures key indicators that are relevant to the current threat landscape, augmenting the effectiveness of security systems and analysts to protect against ICS cyber threats.

Paperid: 2091, https://arxiv.org/pdf/2507.01694.pdf

Abstract:
Federated large language models (FedLLMs) enable powerful generative capabilities within wireless networks while preserving data privacy. Nonetheless, FedLLMs remain vulnerable to model poisoning attacks. This article first reviews recent advancements in model poisoning techniques and existing defense mechanisms for FedLLMs, underscoring critical limitations, especially when dealing with non-IID textual data distributions. Current defense strategies predominantly employ distance or similarity-based outlier detection mechanisms, relying on the assumption that malicious updates markedly differ from benign statistical patterns. However, this assumption becomes inadequate against adaptive adversaries targeting billion-parameter LLMs. The article further investigates graph representation-based model poisoning (GRMP), an emerging attack paradigm that exploits higher-order correlations among benign client gradients to craft malicious updates indistinguishable from legitimate ones. GRMP can effectively circumvent advanced defense systems, causing substantial degradation in model accuracy and overall performance. Moreover, the article outlines a forward-looking research roadmap that emphasizes the necessity of graph-aware secure aggregation methods, specialized vulnerability metrics tailored for FedLLMs, and evaluation frameworks to enhance the robustness of federated language model deployments.

Paperid: 2092, https://arxiv.org/pdf/2507.01581.pdf

Abstract:
Location information serves as the fundamental element for numerous Internet of Things (IoT) applications. Traditional indoor localization techniques often produce significant errors and raise privacy concerns due to centralized data collection. In response, Machine Learning (ML) techniques offer promising solutions by capturing indoor environment variations. However, they typically require central data aggregation, leading to privacy, bandwidth, and server reliability issues. To overcome these challenges, in this paper, we propose a Federated Learning (FL)-based approach for dynamic indoor localization using a Deep Neural Network (DNN) model. Experimental results show that FL has the nearby performance to Centralized Model (CL) while keeping the data privacy, bandwidth efficiency and server reliability. This research demonstrates that our proposed FL approach provides a viable solution for privacy-enhanced indoor localization, paving the way for advancements in secure and efficient indoor localization systems.

Paperid: 2093, https://arxiv.org/pdf/2507.01465.pdf

Abstract:
Resource Public Key Infrastructure (RPKI) is a critical security mechanism for BGP, but the complexity of its architecture is a growing concern as its adoption scales. Current RPKI design heavily reuses legacy PKI components, such as X.509 EE-certificates, ASN.1 encoding, and XML-based repository protocols, which introduce excessive cryptographic validation, redundant metadata, and inefficiencies in both storage and processing. We show that these design choices, although based on established standards, create significant performance bottlenecks, increase the vulnerability surface, and hinder scalability for wide-scale Internet deployment. In this paper, we perform the first systematic analysis of the root causes of complexity in RPKI's design and experimentally quantify their real-world impact. We show that over 70\% of validation time in RPKI relying parties is spent on certificate parsing and signature verification, much of it unnecessary. Building on this insight, we introduce the improved RPKI (iRPKI), a backwards-compatible redesign that preserves all security guarantees while substantially reducing protocol overhead. iRPKI eliminates EE-certificates and ROA signatures, merges revocation and integrity objects, replaces verbose encodings with Protobuf, and restructures repository metadata for more efficient access. We experimentally demonstrate that our implementation of iRPKI in the Routinator validator achieves a 20x speed-up of processing time, 18x improvement of bandwidth requirements and 8x reduction in cache memory footprint, while also eliminating classes of vulnerabilities that have led to at least 10 vulnerabilities in RPKI software. iRPKI significantly increases the feasibility of deploying RPKI at scale in the Internet, and especially in constrained environments. Our design may be deployed incrementally without impacting existing operations.

Paperid: 2094, https://arxiv.org/pdf/2512.24400.pdf

Abstract:
SourceRank is a scoring system made of 18 metrics that assess the popularity and quality of open-source packages. Despite being used in several recent studies, none has thoroughly analyzed its reliability against evasion attacks aimed at inflating the score of malicious packages, thereby masquerading them as trustworthy. To fill this gap, we first propose a threat model that identifies potential evasion approaches for each metric, including the URL confusion technique, which can affect 5 out of the 18 metrics by leveraging a URL pointing to a legitimate repository potentially unrelated to the malicious package. Furthermore, we study the reliability of SourceRank in the PyPI ecosystem by analyzing the SourceRank distributions of benign and malicious packages in the state-of-the-art MalwareBench dataset, as well as in a real-world dataset of 122,398 packages. Our analysis reveals that, while historical data suggests a clear distinction between benign and malicious packages, the real-world distributions overlap significantly, mainly due to SourceRank's failure to timely reflect package removals. As a result, SourceRank cannot be reliably used to discriminate between benign and malicious packages in real-world scenarios, nor to select benign packages among those available on PyPI. Finally, our analysis reveals that URL confusion represents an emerging attack vector, with its prevalence increasing from 4.2% in MalwareBench to 7.0% in our real-world dataset. Moreover, this technique is often used alongside other evasion techniques and can significantly inflate the SourceRank metrics of malicious packages.

Paperid: 2095, https://arxiv.org/pdf/2512.23480.pdf

Abstract:
The software supply chain attacks are becoming more and more focused on trusted development and delivery procedures, so the conventional post-build integrity mechanisms cannot be used anymore. The available frameworks like SLSA, SBOM and in toto are majorly used to offer provenance and traceability but do not have the capabilities of actively identifying and removing vulnerabilities in software production. The current paper includes an example of agentic artificial intelligence (AI) based on autonomous software supply chain security that combines large language model (LLM)-based reasoning, reinforcement learning (RL), and multi-agent coordination. The suggested system utilizes specialized security agents coordinated with the help of LangChain and LangGraph, communicates with actual CI/CD environments with the Model Context Protocol (MCP), and documents all the observations and actions in a blockchain security ledger to ensure integrity and auditing. Reinforcement learning can be used to achieve adaptive mitigation strategies that consider the balance between security effectiveness and the operational overhead, and LLMs can be used to achieve semantic vulnerability analysis, as well as explainable decisions. This framework is tested based on simulated pipelines, as well as, actual world CI/CD integrations on GitHub Actions and Jenkins, including injection attacks, insecure deserialization, access control violations, and configuration errors. Experimental outcomes indicate better detection accuracy, shorter mitigation latency and reasonable build-time overhead than rule-based, provenance only and RL only baselines. These results show that agentic AI can facilitate the transition to self defending, proactive software supply chains rather than reactive verification ones.

Paperid: 2096, https://arxiv.org/pdf/2512.21827.pdf

Abstract:
The Internet of Drones (IoD) is an emerging and crucial paradigm enabling advanced applications that require seamless, secure communication across heterogeneous and untrusted domains. In such environments, access control and the transmission of sensitive data pose significant security challenges for IoD systems, necessitating the design of lightweight mutual authentication and key exchange protocols. Existing solutions are often hampered by high computation overhead, reliance on third parties, the requirement for secret storage in resource-constrained drones, and the need for a strictly controlled enrollment environment. These limitations make them impractical for dynamic cross-domain deployment. To address these limitations, we propose a lightweight mutual authentication mechanism that integrates Radio Frequency Fingerprint (RFF) and Physical Unclonable Function (PUF) technologies for secure drone-to-drone (D2D) and drone-to-ground station server (D2G) communication. RFF-based device identification is used to achieve over-the-air (OTA) enrollment, while the PUF serves as the root of trust for establishing mutual authentication among communication parties. Additionally, the on-the-fly key generation capability of the PUF is co-designed with One-Time-Pad (OTP) encryption to realize ephemeral keying and eliminate the need for storing secrets within drones. Both informal security analysis and ProVerif-based formal security verification comprehensively demonstrate the resilience of our protocol against common security attacks. The proposed protocol also outperforms existing IoD authentication schemes in terms of security features, as well as computation, communication, and storage overhead.

Paperid: 2097, https://arxiv.org/pdf/2512.21250.pdf

Abstract:
LLM-based code agents(e.g., ChatGPT Codex) are increasingly deployed as detector for code review and security auditing tasks. Although CoT-enhanced LLM vulnerability detectors are believed to provide improved robustness against obfuscated malicious code, we find that their reasoning chains and semantic abstraction processes exhibit exploitable systematic weaknesses.This allows attackers to covertly embed malicious logic, bypass code review, and propagate backdoored components throughout real-world software supply chains.To investigate this issue, we present CoTDeceptor, the first adversarial code obfuscation framework targeting CoT-enhanced LLM detectors. CoTDeceptor autonomously constructs evolving, hard-to-reverse multi-stage obfuscation strategy chains that effectively disrupt CoT-driven detection logic.We obtained malicious code provided by security enterprise, experimental results demonstrate that CoTDeceptor achieves stable and transferable evasion performance against state-of-the-art LLMs and vulnerability detection agents. CoTDeceptor bypasses 14 out of 15 vulnerability categories, compared to only 2 bypassed by prior methods. Our findings highlight potential risks in real-world software supply chains and underscore the need for more robust and interpretable LLM-powered security analysis systems.

Paperid: 2098, https://arxiv.org/pdf/2512.20004.pdf

Abstract:
Since the Internet of Things (IoT) is widely adopted using Android applications, detecting malicious Android apps is essential. In recent years, Android graph-based deep learning research has proposed many approaches to extract relationships from applications as graphs to generate graph embeddings. First, we demonstrate the effectiveness of graph-based classification using a Graph Neural Network (GNN)-based classifier to generate API graph embeddings. The graph embeddings are combined with Permission and Intent features to train multiple machine learning and deep learning models for Android malware detection. The proposed classification approach achieves an accuracy of 98.33 percent on the CICMaldroid dataset and 98.68 percent on the Drebin dataset. However, graph-based deep learning models are vulnerable, as attackers can add fake relationships to evade detection by the classifier. Second, we propose a Generative Adversarial Network (GAN)-based attack algorithm named VGAE-MalGAN targeting graph-based GNN Android malware classifiers. The VGAE-MalGAN generator produces adversarial malware API graphs, while the VGAE-MalGAN substitute detector attempts to mimic the target detector. Experimental results show that VGAE-MalGAN can significantly reduce the detection rate of GNN-based malware classifiers. Although the model initially fails to detect adversarial malware, retraining with generated adversarial samples improves robustness and helps mitigate adversarial attacks.

Paperid: 2099, https://arxiv.org/pdf/2512.19997.pdf

Abstract:
Broken Access Control (BAC) violations, which consistently rank among the top five security risks in the OWASP API Security Top 10, refer to unauthorized access attempts arising from BAC vulnerabilities, whose successful exploitation can impose significant risks on exposed application programming interfaces (APIs). In recent years, learning-based methods have demonstrated promising prospects in detecting various types of malicious activities. However, in real-network operation and maintenance scenarios, leveraging learning-based methods for BAC detection faces two critical challenges. Firstly, under the RESTful API design principles, most systems omit recording composite traffic for performance, and together with ethical and legal bans on directly testing real-world systems, this leads to a critical shortage of training data for detecting BAC violations. Secondly, common malicious behaviors such as SQL injection typically generate individual access traffic that is inherently anomalous. In contrast, BAC is usually composed of multiple correlated access requests that appear normal when examined in isolation. To tackle these problems, we introduce \BAC, an approach for establishing a BAC violation detection model by generating and utilizing API traffic data. The \BAC consists of an API Traffic Generator and a BAC Detector. Experimental results show that \BAC outperforms current state-of-the-art invariant-based and learning-based methods with the $\text{F}_1$ and MCC improving by 21.2\% and 24.1\%.

Paperid: 2100, https://arxiv.org/pdf/2512.18495.pdf

Abstract:
Artificial intelligence techniques have achieved strong performance in classifying Windows Portable Executable (PE) malware, but their reliability often degrades under dataset shifts, leading to misclassifications with severe security consequences. To address this, we enhance an existing LightGBM (LGBM) malware detector by integrating Neural Networks (NN), PriorNet, and Neural Network Ensembles, evaluated across three benchmark datasets: EMBER, BODMAS, and UCSB. The UCSB dataset, composed mainly of packed malware, introduces a substantial distributional shift relative to EMBER and BODMAS, making it a challenging testbed for robustness. We study uncertainty-aware decision strategies, including probability thresholding, PriorNet, ensemble-derived estimates, and Inductive Conformal Evaluation (ICE). Our main contribution is the use of ensemble-based uncertainty estimates as Non-Conformity Measures within ICE, combined with a novel threshold optimisation method. On the UCSB dataset, where the shift is most severe, the state-of-the-art probability-based ICE (SOTA) yields an incorrect acceptance rate (IA%) of 22.8%. In contrast, our method reduces this to 16% a relative reduction of about 30% while maintaining competitive correct acceptance rates (CA%). These results demonstrate that integrating ensemble-based uncertainty with conformal prediction provides a more reliable safeguard against misclassifications under extreme dataset shifts, particularly in the presence of packed malware, thereby offering practical benefits for real-world security operations.

Paperid: 2101, https://arxiv.org/pdf/2512.18483.pdf

Abstract:
Insider threat detection (ITD) is challenging due to the subtle and concealed nature of malicious activities performed by trusted users. This paper proposes a post-hoc ITD framework that integrates explicit and implicit graph representations with temporal modelling to capture complex user behaviour patterns. An explicit graph is constructed using predefined organisational rules to model direct relationships among user activities. To mitigate noise and limitations in this hand-crafted structure, an implicit graph is learned from feature similarities using the Gumbel-Softmax trick, enabling the discovery of latent behavioural relationships. Separate Graph Convolutional Networks (GCNs) process the explicit and implicit graphs to generate node embeddings, which are concatenated and refined through an attention mechanism to emphasise threat-relevant features. The refined representations are then passed to a bidirectional Long Short-Term Memory (Bi-LSTM) network to capture temporal dependencies in user behaviour. Activities are flagged as anomalous when their probability scores fall below a predefined threshold. Extensive experiments on CERT r5.2 and r6.2 datasets demonstrate that the proposed framework outperforms state-of-the-art methods. On r5.2, the model achieves an AUC of 98.62, a detection rate of 100%, and a false positive rate of 0.05. On the more challenging r6.2 dataset, it attains an AUC of 88.48, a detection rate of 80.15%, and a false positive rate of 0.15, highlighting the effectiveness of combining graph-based and temporal representations for robust ITD.

Paperid: 2102, https://arxiv.org/pdf/2512.17594.pdf

Abstract:
Out of distribution (OOD) detection remains a critical challenge in malware classification due to the substantial intra family variability introduced by polymorphic and metamorphic malware variants. Most existing deep learning based malware detectors rely on closed world assumptions and fail to adequately model this intra class variation, resulting in degraded performance when confronted with previously unseen malware families. This paper presents MADOOD, a novel two stage, cluster driven deep learning framework for robust OOD malware detection and classification. In the first stage, malware family embeddings are modeled using class conditional spherical decision boundaries derived from Gaussian Discriminant Analysis (GDA), enabling statistically grounded separation of indistribution and OOD samples without requiring OOD data during training. Z score based distance analysis across multiple class centroids is employed to reliably identify anomalous samples in the latent space. In the second stage, a deep neural network integrates cluster based predictions, refined embeddings, and supervised classifier outputs to enhance final classification accuracy. Extensive evaluations on benchmark malware datasets comprising 25 known families and multiple novel OOD variants demonstrate that MADOOD significantly outperforms state of the art OOD detection methods, achieving an AUC of up to 0.911 on unseen malware families. The proposed framework provides a scalable, interpretable, and statistically principled solution for real world malware detection and anomaly identification in evolving cybersecurity environments.

Paperid: 2103, https://arxiv.org/pdf/2512.16957.pdf

Abstract:
Securing low-latency I/O in commodity systems forces a fundamental trade-off: rely on the kernel's high overhead mediated interface, or bypass it entirely, exposing sensitive hardware resources to userspace and creating new vulnerabilities. This dilemma stems from a hardware granularity mismatch: standard MMUs operate at page boundaries, making it impossible to selectively expose safe device registers without also exposing the sensitive control registers colocated on the same page. Existing solutions to driver isolation enforce an isolation model that cannot protect sub-page device resources. This paper presents CAPIO, the first architecture to leverage hardware capabilities to enforce fine-grained access control on memory-mapped I/O. Unlike prior page-based protections, CAPIO utilizes unforgeable capabilities to create precise, sub-page "slices" of device memory. This mechanism enables the kernel to delegate latency-critical hardware access to userspace applications while strictly preventing interaction with co-located privileged registers. We implement CAPIO based on CHERI on the ARM Morello platform and demonstrate a proof-of-concept safe-access driver for a commodity network card which was not originally designed for kernel bypass. We demonstrate that CAPIO achieves the latency improvements of kernel bypass while enforcing byte-level access control of privileged resources.

Paperid: 2104, https://arxiv.org/pdf/2512.16284.pdf

Abstract:
Synthetic data generation is gaining traction as a privacy enhancing technology (PET). When properly generated, synthetic data preserve the analytic utility of real data while avoiding the retention of information that would allow the identification of specific individuals. However, the concept of data privacy remains elusive, making it challenging for practitioners to evaluate and benchmark the degree of privacy protection offered by synthetic data. In this paper, we propose a framework to empirically assess the efficacy of tabular synthetic data privacy quantification methods through controlled, deliberate risk insertion. To demonstrate this framework, we survey existing approaches to synthetic data privacy quantification and the related legal theory. We then apply the framework to the main privacy quantification methods with no-box threat models on publicly available datasets.

Paperid: 2105, https://arxiv.org/pdf/2512.16280.pdf

Abstract:
Romance-baiting scams have become a major source of financial and emotional harm worldwide. These operations are run by organized crime syndicates that traffic thousands of people into forced labor, requiring them to build emotional intimacy with victims over weeks of text conversations before pressuring them into fraudulent cryptocurrency investments. Because the scams are inherently text-based, they raise urgent questions about the role of Large Language Models (LLMs) in both current and future automation. We investigate this intersection by interviewing 145 insiders and 5 scam victims, performing a blinded long-term conversation study comparing LLM scam agents to human operators, and executing an evaluation of commercial safety filters. Our findings show that LLMs are already widely deployed within scam organizations, with 87% of scam labor consisting of systematized conversational tasks readily susceptible to automation. In a week-long study, an LLM agent not only elicited greater trust from study participants (p=0.007) but also achieved higher compliance with requests than human operators (46% vs. 18% for humans). Meanwhile, popular safety filters detected 0.0% of romance baiting dialogues. Together, these results suggest that romance-baiting scams may be amenable to full-scale LLM automation, while existing defenses remain inadequate to prevent their expansion.

Paperid: 2106, https://arxiv.org/pdf/2512.15818.pdf

Abstract:
Identity-preserving models have led to notable progress in generating personalized content. Unfortunately, such models also exacerbate risks when misused, for instance, by generating threatening content targeting specific individuals. This paper introduces the \textbf{Attribute Misbinding Attack}, a novel method that poses a threat to identity-preserving models by inducing them to produce Not-Safe-For-Work (NSFW) content. The attack's core idea involves crafting benign-looking textual prompts to circumvent text-filter safeguards and leverage a key model vulnerability: flawed attribute binding that stems from its internal attention bias. This results in misattributing harmful descriptions to a target identity and generating NSFW outputs. To facilitate the study of this attack, we present the \textbf{Misbinding Prompt} evaluation set, which examines the content generation risks of current state-of-the-art identity-preserving models across four risk dimensions: pornography, violence, discrimination, and illegality. Additionally, we introduce the \textbf{Attribute Binding Safety Score (ABSS)}, a metric for concurrently assessing both content fidelity and safety compliance. Experimental results show that our Misbinding Prompt evaluation set achieves a \textbf{5.28}\% higher success rate in bypassing five leading text filters (including GPT-4o) compared to existing main-stream evaluation sets, while also demonstrating the highest proportion of NSFW content generation. The proposed ABSS metric enables a more comprehensive evaluation of identity-preserving models by concurrently assessing both content fidelity and safety compliance.

Paperid: 2107, https://arxiv.org/pdf/2512.15648.pdf

Abstract:
We present the Distributed High-Dimensional Matrix Mechanism (Distributed HDMM), a protocol for answering workloads of linear queries on distributed data that provides the accuracy of central-model HDMM without a trusted curator. Distributed HDMM leverages a secure aggregation protocol to evaluate HDMM on distributed data, and is secure in the context of a malicious aggregator and malicious clients (assuming an honest majority). Our preliminary empirical evaluation shows that Distributed HDMM can run on realistic datasets and workloads with thousands of clients in less than one minute.

Paperid: 2108, https://arxiv.org/pdf/2512.12559.pdf

Abstract:
The widespread adoption of open-source ecosystems enables developers to integrate third-party packages, but also exposes them to malicious packages crafted to execute harmful behavior via public repositories such as PyPI. Existing datasets (e.g., pypi-malregistry, DataDog, OpenSSF, MalwareBench) label packages as malicious or benign at the package level, but do not specify which statements implement malicious behavior. This coarse granularity limits research and practice: models cannot be trained to localize malicious code, detectors cannot justify alerts with code-level evidence, and analysts cannot systematically study recurring malicious indicators or attack chains. To address this gap, we construct a statement-level dataset of 370 malicious Python packages (833 files, 90,527 lines) with 2,962 labeled occurrences of malicious indicators. From these annotations, we derive a fine-grained taxonomy of 47 malicious indicators across 7 types that capture how adversarial behavior is implemented in code, and we apply sequential pattern mining to uncover recurring indicator sequences that characterize common attack workflows. Our contribution enables explainable, behavior-centric detection and supports both semantic-aware model training and practical heuristics for strengthening software supply-chain defenses.

Paperid: 2109, https://arxiv.org/pdf/2512.12086.pdf

Abstract:
Data obfuscation is a promising technique for mitigating attribute inference attacks by semi-trusted parties with access to time-series data emitted by sensors. Recent advances leverage conditional generative models together with adversarial training or mutual information-based regularization to balance data privacy and utility. However, these methods often require modifying the downstream task, struggle to achieve a satisfactory privacy-utility trade-off, or are computationally intensive, making them impractical for deployment on resource-constrained mobile IoT devices. We propose Cloak, a novel data obfuscation framework based on latent diffusion models. In contrast to prior work, we employ contrastive learning to extract disentangled representations, which guide the latent diffusion process to retain useful information while concealing private information. This approach enables users with diverse privacy needs to navigate the privacy-utility trade-off with minimal retraining. Extensive experiments on four public time-series datasets, spanning multiple sensing modalities, and a dataset of facial images demonstrate that Cloak consistently outperforms state-of-the-art obfuscation techniques and is well-suited for deployment in resource-constrained settings.

Paperid: 2110, https://arxiv.org/pdf/2512.12070.pdf

Abstract:
Radio frequency fingerprint identification (RFFI) is an emerging method for authenticating Internet of Things (IoT) devices. RFFI exploits the intrinsic and unique hardware imperfections for classifying IoT devices. Deep learning-based RFFI has shown excellent performance. However, there are still remaining research challenges, such as limited public training datasets as well as impacts of channel and receive effects. In this paper, we proposed a three-stage RFFI approach involving contrastive learning-enhanced pretraining, Siamese network-based classification network training, and inference. Specifically, we employed spectrogram as signal representation to decouple the transmitter impairments from channel effects and receiver impairments. We proposed an unsupervised contrastive learning method to pretrain a channel-robust RFF extractor. In addition, the Siamese network-based scheme is enhanced by data augmentation and contrastive loss, which is capable of jointly mitigating the effects of channel and receiver impairments. We carried out a comprehensive experimental evaluation using three public LoRa datasets and one self-collected LoRa dataset. The results demonstrated that our approach can effectively and simultaneously mitigate the effects of channel and receiver impairments. We also showed that pretraining can significantly reduce the required amount of the fine-tuning data. Our proposed approach achieved an accuracy of over 90% in dynamic non-line-of-sight (NLOS) scenarios when there are only 20 packets per device.

Paperid: 2111, https://arxiv.org/pdf/2512.12002.pdf

Abstract:
Radio frequency fingerprint identification (RFFI) is an emerging technique for the lightweight authentication of wireless Internet of things (IoT) devices. RFFI exploits deep learning models to extract hardware impairments to uniquely identify wireless devices. Recent studies show deep learning-based RFFI is vulnerable to adversarial attacks. However, effective adversarial attacks against different types of RFFI classifiers have not yet been explored. In this paper, we carried out a comprehensive investigations into different adversarial attack methods on RFFI systems using various deep learning models. Three specific algorithms, fast gradient sign method (FGSM), projected gradient descent (PGD), and universal adversarial perturbation (UAP), were analyzed. The attacks were launched to LoRa-RFFI and the experimental results showed the generated perturbations were effective against convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and gated recurrent units (GRU). We further used UAP to launch practical attacks. Special factors were considered for the wireless context, including implementing real-time attacks, the effectiveness of the attacks over a period of time, etc. Our experimental evaluation demonstrated that UAP can successfully launch adversarial attacks against the RFFI, achieving a success rate of 81.7% when the adversary almost has no prior knowledge of the victim RFFI systems.

Paperid: 2112, https://arxiv.org/pdf/2512.11878.pdf

Abstract:
Decentralized AI systems, such as federated learning, can play a critical role in further unlocking AI asset marketplaces (e.g., healthcare data marketplaces) thanks to increased asset privacy protection. Unlocking this big potential necessitates governance mechanisms that are transparent, scalable, and verifiable. However current governance approaches rely on bespoke, infrastructure-specific policies that hinder asset interoperability and trust among systems. We are proposing a Technical Policy Blueprint that encodes governance requirements as policy-as-code objects and separates asset policy verification from asset policy enforcement. In this architecture the Policy Engine verifies evidence (e.g., identities, signatures, payments, trusted-hardware attestations) and issues capability packages. Asset Guardians (e.g. data guardians, model guardians, computation guardians, etc.) enforce access or execution solely based on these capability packages. This core concept of decoupling policy processing from capabilities enables governance to evolve without reconfiguring AI infrastructure, thus creating an approach that is transparent, auditable, and resilient to change.

Paperid: 2113, https://arxiv.org/pdf/2512.11783.pdf

Abstract:
The rapid deployment of Large Language Models (LLMs) has created an urgent need for enhanced security and privacy measures in Machine Learning (ML). LLMs are increasingly being used to process untrusted text inputs and even generate executable code, often while having access to sensitive system controls. To address these security concerns, several companies have introduced guard models, which are smaller, specialized models designed to protect text generation models from adversarial or malicious inputs. In this work, we advance the study of adversarial inputs by introducing Super Suffixes, suffixes capable of overriding multiple alignment objectives across various models with different tokenization schemes. We demonstrate their effectiveness, along with our joint optimization technique, by successfully bypassing the protection mechanisms of Llama Prompt Guard 2 on five different text generation models for malicious text and code generation. To the best of our knowledge, this is the first work to reveal that Llama Prompt Guard 2 can be compromised through joint optimization. Additionally, by analyzing the changing similarity of a model's internal state to specific concept directions during token sequence processing, we propose an effective and lightweight method to detect Super Suffix attacks. We show that the cosine similarity between the residual stream and certain concept directions serves as a distinctive fingerprint of model intent. Our proposed countermeasure, DeltaGuard, significantly improves the detection of malicious prompts generated through Super Suffixes. It increases the non-benign classification rate to nearly 100%, making DeltaGuard a valuable addition to the guard model stack and enhancing robustness against adversarial prompt attacks.

Paperid: 2114, https://arxiv.org/pdf/2512.11147.pdf

Abstract:
Tool calling agents are an emerging paradigm in LLM deployment, with major platforms such as ChatGPT, Claude, and Gemini adding connectors and autonomous capabilities. However, the inherent unreliability of LLMs introduces fundamental security risks when these agents operate over sensitive user services. Prior approaches either rely on manually written policies that require security expertise, or place LLMs in the confinement loop, which lacks rigorous security guarantees. We present MiniScope, a framework that enables tool calling agents to operate on user accounts while confining potential damage from unreliable LLMs. MiniScope introduces a novel way to automatically and rigorously enforce least privilege principles by reconstructing permission hierarchies that reflect relationships among tool calls and combining them with a mobile-style permission model to balance security and ease of use. To evaluate MiniScope, we create a synthetic dataset derived from ten popular real-world applications, capturing the complexity of realistic agentic tasks beyond existing simplified benchmarks. Our evaluation shows that MiniScope incurs only 1-6% latency overhead compared to vanilla tool calling agents, while significantly outperforming the LLM based baseline in minimizing permissions as well as computational and operational costs.

Paperid: 2115, https://arxiv.org/pdf/2512.10487.pdf

Abstract:
Cyber Ranges (CRs) have emerged as prominent platforms for cybersecurity training and education, especially for Critical Infrastructure (CI) sectors that face rising cyber threats. One way to address these threats is through hands-on exercises that bridge IT and OT domains to improve defensive readiness. However, consistently evaluating whether a CR platform is suitable and effective remains a challenge. This paper proposes an evaluation framework for CRs, emphasizing mission-critical settings by using a multi-criteria decision-making approach. We define a set of evaluation criteria that capture technical fidelity, training and assessment capabilities, scalability, usability, and other relevant factors. To weight and aggregate these criteria, we employ the Analytic Hierarchy Process (AHP), supported by a simulated panel of multidisciplinary experts implemented through a Large Language Model (LLM). This LLM-assisted expert reasoning enables consistent and reproducible pairwise comparisons across criteria without requiring direct expert convening. The framework's output equals quantitative scores that facilitate objective comparison of CR platforms and highlight areas for improvement. Overall, this work lays the foundation for a standardized and explainable evaluation methodology to guide both providers and end-users of CRs.

Paperid: 2116, https://arxiv.org/pdf/2512.08623.pdf

Abstract:
We propose a new secret communication scheme over the bosonic wiretap channel. It uses readily available hardware such as lasers and direct photodetectors. The scheme is based on randomness extractors, pulse-position modulation, and Reed-Solomon codes and is therefore computationally efficient. It is secure against an eavesdropper performing coherent joint measurements on the quantum states it observes. In the low-photon-flow limit, the scheme is asymptotically optimal and achieves the same dominant term as the secrecy capacity of the same channel.

Paperid: 2117, https://arxiv.org/pdf/2512.06914.pdf

Abstract:
Large Language Models (LLMs) are rapidly evolving into autonomous agents capable of interacting with the external world, significantly expanding their capabilities through standardized interaction protocols. However, this paradigm revives the classic cybersecurity challenges of agency and authorization in a novel and volatile context. As decision-making shifts from deterministic code logic to probabilistic inference driven by natural language, traditional security mechanisms designed for deterministic behavior fail. It is fundamentally challenging to establish trust for unpredictable AI agents and to enforce the Principle of Least Privilege (PoLP) when instructions are ambiguous. Despite the escalating threat landscape, the academic community's understanding of this emerging domain remains fragmented, lacking a systematic framework to analyze its root causes. This paper provides a unifying formal lens for agent-interaction security. We observed that most security threats in this domain stem from a fundamental mismatch between trust evaluation and authorization policies. We introduce a novel risk analysis model centered on this trust-authorization gap. Using this model as a unifying lens, we survey and classify the implementation paths of existing, often seemingly isolated, attacks and defenses. This new framework not only unifies the field but also allows us to identify critical research gaps. Finally, we leverage our analysis to suggest a systematic research direction toward building robust, trusted agents and dynamic authorization mechanisms.

Paperid: 2118, https://arxiv.org/pdf/2512.05951.pdf

Abstract:
AI agents powered by large language models are increasingly deployed as cloud services that autonomously access sensitive data, invoke external tools, and interact with other agents. However, these agents run within a complex multi-party ecosystem, where untrusted components can lead to data leakage, tampering, or unintended behavior. Existing Confidential Virtual Machines (CVMs) provide only per binary protection and offer no guarantees for cross-principal trust, accelerator-level isolation, or supervised agent behavior. We present Omega, a system that enables trusted AI agents by enforcing end-to-end isolation, establishing verifiable trust across all contributing principals, and supervising every external interaction with accountable provenance. Omega builds on Confidential VMs and Confidential GPUs to create a Trusted Agent Platform that hosts many agents within a single CVM using nested isolation. It also provides efficient multi-agent orchestration with cross-principal trust establishment via differential attestation, and a policy specification and enforcement framework that governs data access, tool usage, and inter-agent communication for data protection and regulatory compliance. Implemented on AMD SEV-SNP and NVIDIA H100, Omega fully secures agent state across CVM-GPU, and achieves high performance while enabling high-density, policy-compliant multi-agent deployments at cloud scale.

Paperid: 2119, https://arxiv.org/pdf/2512.05707.pdf

Abstract:
We evaluate the effectiveness of child filtering to prevent the misuse of text-to-image (T2I) models to create child sexual abuse material (CSAM). First, we capture the complexity of preventing CSAM generation using a game-based security definition. Second, we show that current detection methods cannot remove all children from a dataset. Third, using an ethical proxy for CSAM (a child wearing glasses, hereafter, CWG), we show that even when only a small percentage of child images are left in the training dataset, there exist prompting strategies that generate CWG from a child-filtered T2I model using only a few more queries than when the model is trained on the unfiltered data. Fine-tuning the filtered model on child images further reduces the additional query overhead. We also show that reintroducing a concept is possible via fine-tuning even if filtering is perfect. Our results demonstrate that current filtering methods offer limited protection to closed-weight models and no protection to open-weight models, while reducing the generality of the model by hindering the generation of child-related concepts or changing their representation. We conclude by outlining challenges in conducting evaluations that establish robust evidence on the impact of AI safety mitigations for CSAM.

Paperid: 2120, https://arxiv.org/pdf/2512.04338.pdf

Abstract:
The rise of supply chain attacks via malicious Python packages demands robust detection solutions. Current approaches, however, overlook two critical challenges: robustness against adversarial source code transformations and adaptability to the varying false positive rate (FPR) requirements of different actors, from repository maintainers (requiring low FPR) to enterprise security teams (higher FPR tolerance). We introduce a robust detector capable of seamless integration into both public repositories like PyPI and enterprise ecosystems. To ensure robustness, we propose a novel methodology for generating adversarial packages using fine-grained code obfuscation. Combining these with adversarial training (AT) enhances detector robustness by 2.5x. We comprehensively evaluate AT effectiveness by testing our detector against 122,398 packages collected daily from PyPI over 80 days, showing that AT needs careful application: it makes the detector more robust to obfuscations and allows finding 10% more obfuscated packages, but slightly decreases performance on non-obfuscated packages. We demonstrate production adaptability of our detector via two case studies: (i) one for PyPI maintainers (tuned at 0.1% FPR) and (ii) one for enterprise teams (tuned at 10% FPR). In the former, we analyze 91,949 packages collected from PyPI over 37 days, achieving a daily detection rate of 2.48 malicious packages with only 2.18 false positives. In the latter, we analyze 1,596 packages adopted by a multinational software company, obtaining only 1.24 false positives daily. These results show that our detector can be seamlessly integrated into both public repositories like PyPI and enterprise ecosystems, ensuring a very low time budget of a few minutes to review the false positives. Overall, we uncovered 346 malicious packages, now reported to the community.

Paperid: 2121, https://arxiv.org/pdf/2512.03868.pdf

Abstract:
Open-source libraries are widely used by software developers to speed up the development of products, however, they can introduce security vulnerabilities, leading to incidents like Log4Shell. With the expanding usage of open-source libraries, it becomes even more imperative to comprehend and address these dependency vulnerabilities. The use of Software Composition Analysis (SCA) tools does greatly help here as they provide a deep insight on what dependencies are used in a project, enhancing the security and integrity in the software supply chain. In order to learn how wide spread vulnerabilities are and how quickly they are being fixed, we conducted a study on over 1k open-source software projects with about 50k releases comprising several languages such as Java, Python, Rust, Go, Ruby, PHP, and JavaScript. Our objective is to investigate the severity, persistence, and distribution of these vulnerabilities, as well as their correlation with project metrics such as team and contributors size, activity and release cycles. In order to perform such analysis, we crawled over 1k projects from github including their version history ranging from 2013 to 2023 using VODA, our SCA tool. Using our approach, we can provide information such as library versions, dependency depth, and known vulnerabilities, and how they evolved over the software development cycle. Being larger and more diverse than datasets used in earlier works and studies, ours provides better insights and generalizability of the gained results. The data collected answers several research questions about the dependency depth and the average time a vulnerability persists. Among other findings, we observed that for most programming languages, vulnerable dependencies are transitive, and a critical vulnerability persists in average for over a year before being fixed.

Paperid: 2122, https://arxiv.org/pdf/2512.03536.pdf

Abstract:
This work presents an experimental performance evaluation of a private 5G airfield network under controlled directional SDR jamming attacks targeting UAV-based UE nodes. Using a QualiPoc Android UE, mounted as a payload on a quadcopter UAV, we conducted a series of experiments to evaluate signal degradation, handover performance, and ser-vice stability in the presence of constant directional jamming. The conducted experiments aimed to examine the effects of varying travel speeds, altitudes, and moving patterns of a UAV-based UE to record and analyze the key physical-layer and network-layer metrics such as CQI, MCS, RSRP, SINR, BLER, Net PDSCH Throughput and RLF. The re-sults of this work describe the link stability and signal degradation dependencies, caused by the level of mobility of the UAV-based UE nodes during autonomous and automatic operation in private 5G Airfield networks

Paperid: 2123, https://arxiv.org/pdf/2512.03420.pdf

Abstract:
Large language model (LLM)-based techniques have achieved notable progress in generating harnesses for program fuzzing. However, applying them to arbitrary functions (especially internal functions) \textit{at scale} remains challenging due to the requirement of sophisticated contextual information, such as specification, dependencies, and usage examples. State-of-the-art methods heavily rely on static or incomplete context provisioning, causing failure of generating functional harnesses. Furthermore, LLMs tend to exploit harness validation metrics, producing plausible yet logically useless code. % Therefore, harness generation across large and diverse projects continues to face challenges in reliable compilation, robust code retrieval, and comprehensive validation. To address these challenges, we present HarnessAgent, a tool-augmented agentic framework that achieves fully automated, scalable harness construction over hundreds of OSS-Fuzz targets. HarnessAgent introduces three key innovations: 1) a rule-based strategy to identify and minimize various compilation errors; 2) a hybrid tool pool for precise and robust symbol source code retrieval; and 3) an enhanced harness validation pipeline that detects fake definitions. We evaluate HarnessAgent on 243 target functions from OSS-Fuzz projects (65 C projects and 178 C++ projects). It improves the three-shot success rate by approximately 20\% compared to state-of-the-art techniques, reaching 87\% for C and 81\% for C++. Our one-hour fuzzing results show that more than 75\% of the harnesses generated by HarnessAgent increase the target function coverage, surpassing the baselines by over 10\%. In addition, the hybrid tool-pool system of HarnessAgent achieves a response rate of over 90\% for source code retrieval, outperforming Fuzz Introspector by more than 30\%.

Paperid: 2124, https://arxiv.org/pdf/2512.02534.pdf

Abstract:
USDT, a stablecoin pegged to dollar, has become a preferred choice for money laundering due to its stability, anonymity, and ease of use. Notably, a new form of money laundering on stablecoins -- we refer to as crowdsourcing laundering -- disperses funds through recruiting a large number of ordinary individuals, and has rapidly emerged as a significant threat. However, due to the refined division of labor, crowdsourcing laundering transactions exhibit diverse patterns and a polycentric structure, posing significant challenges for detection. In this paper, we introduce transaction group as auxiliary information, and propose the Multi-Task Collaborative Crowdsourcing Laundering Detection (MCCLD) framework. MCCLD employs an end-to-end graph neural network to realize collaboration between laundering transaction detection and transaction group detection tasks, enhancing detection performance on diverse patterns within crowdsourcing laundering group. These two tasks are jointly optimized through a shared classifier, with a shared feature encoder that fuses multi-level feature embeddings to provide rich transaction semantics and potential group information. Extensive experiments on both crowdsourcing and general laundering demonstrate MCCLD's effectiveness and generalization. To the best of our knowledge, this is the first work on crowdsourcing laundering detection.

Paperid: 2125, https://arxiv.org/pdf/2512.01596.pdf

Abstract:
Securing the near-real-time (near-RT) control operations in Open Radio Access Networks (Open RAN) is increasingly critical, yet remains insufficiently addressed, as new runtime threats target the control loop while the system is operational. In this paper, we propose a multi-layer defence framework designed to enhance the security of near-RT RAN Intelligent Controller (RIC) operations. We classify operational-time threats into three categories, message-level, data-level, and control logic-level, and design and implement a dedicated detection and mitigation component for each: a signature-based E2 message inspection module performing structural and semantic validation of signalling exchanges, a telemetry poisoning detector based on temporal anomaly scoring using an LSTM network, and a runtime xApp attestation mechanism based on execution-time hash challenge-response. The framework is evaluated on an O-RAN testbed comprising FlexRIC and a commercial RAN emulator, demonstrating effective detection rates, low latency overheads, and practical integration feasibility. Results indicate that the proposed safeguards can operate within near-RT time constraints while significantly improving protection against runtime attacks, introducing less than 80 ms overhead for a network with 500 User Equipment (UEs). Overall, this work lays the foundation for deployable, layered, and policy-driven runtime security architectures for the near-RT RIC control loop in Open RAN, and provides an extensible framework into which future mitigation policies and threat-specific modules can be integrated.

Paperid: 2126, https://arxiv.org/pdf/2512.01594.pdf

Abstract:
Confidential Virtual Machines (CVMs) are increasingly adopted to protect sensitive workloads from privileged adversaries such as the hypervisor. While they provide strong isolation guarantees, existing CVM architectures lack first-class mechanisms for inter-CVM data sharing due to their disjoint memory model, making inter-CVM data exchange a performance bottleneck in compartmentalized or collaborative multi-CVM systems. Under this model, a CVM's accessible memory is either shared with the hypervisor or protected from both the hypervisor and all other CVMs. This design simplifies reasoning about memory ownership; however, it fundamentally precludes plaintext data sharing between CVMs because all inter-CVM communication must pass through hypervisor-accessible memory, requiring costly encryption and decryption to preserve confidentiality and integrity. In this paper, we introduce CAEC, a system that enables protected memory sharing between CVMs. CAEC builds on Arm Confidential Compute Architecture (CCA) and extends its firmware to support Confidential Shared Memory (CSM), a memory region securely shared between multiple CVMs while remaining inaccessible to the hypervisor and all non-participating CVMs. CAEC's design is fully compatible with CCA hardware and introduces only a modest increase (4%) in CCA firmware code size. CAEC delivers substantial performance benefits across a range of workloads. For instance, inter-CVM communication over CAEC achieves up to 209$\times$ reduction in CPU cycles compared to encryption-based mechanisms over hypervisor-accessible shared memory. By combining high performance, strong isolation guarantees, and attestable sharing semantics, CAEC provides a practical and scalable foundation for the next generation of trusted multi-CVM services across both edge and cloud environments.

Paperid: 2127, https://arxiv.org/pdf/2512.01391.pdf

Abstract:
Cybercriminals have long depended on domain names for phishing, spam, malware distribution, and botnet operation. To facilitate the malicious activities, they continually register new domain names for exploitation. Previous work revealed an abnormally high concentration of malicious registrations in a handful of domain name registrars and top-level domains (TLDs). Anecdotal evidence suggests that low registration prices attract cybercriminals, implying that higher costs may potentially discourage them. However, no existing study has systematically analyzed the factors driving abuse, leaving a critical gap in understanding how different variables influence malicious registrations. In this report, we carefully distill the inclinations and aversions of malicious actors during the registration of new phishing domain names. We compile a comprehensive list of 73 features encompassing three main latent factors: registration attributes, proactive verification, and reactive security practices. Through a GLM regression analysis, we find that each dollar reduction in registration fees corresponds to a 49% increase in malicious domains. The availability of free services, such as web hosting, drives an 88% surge in phishing activities. Conversely, stringent restrictions cut down abuse by 63%, while registrars providing API access for domain registration or account creation experience a staggering 401% rise in malicious domains. This exploration may assist intermediaries involved in domain registration to develop tailored anti-abuse practices, yet aligning them with their economic incentives.

Paperid: 2128, https://arxiv.org/pdf/2512.01115.pdf

Abstract:
We study privatization mechanism design and privacy accounting in the Pufferfish family, addressing two practical gaps of Renyi Pufferfish Privacy (RPP): high-dimensional optimal transport (OT) calibration and the absence of a general, mechanism-agnostic composition rule for iterative learning. We introduce Sliced Renyi Pufferfish Privacy (SRPP), which replaces high-dimensional comparisons by directional ones over a set of unit vectors, enabling geometry-aware and tractable guarantees. To calibrate noise without high-dimensional OT, we propose sliced Wasserstein mechanisms that compute per-direction (1-D) sensitivities, yielding closed-form, statistically stable, and anisotropic calibrations. We further define SRPP Envelope (SRPE) as computable upper bounds that are tightly implementable by these sliced Wasserstein mechanisms. For iterative deep learning algorithms, we develop a decompose-then-compose SRPP-SGD scheme with gradient clipping based on a History-Uniform Cap (HUC), a pathwise bound on one-step directional changes that is uniform over optimization history, and a mean-square variant (ms-HUC) that leverages subsampling randomness to obtain on-average SRPP guarantees with improved utility. The resulting HUC and ms-HUC accountants aggregate per-iteration, per-direction Renyi costs and integrate naturally with moments-accountant style analyses. Finally, when multiple mechanisms are trained and privatized independently under a common slicing geometry, our analysis yields graceful additive composition in both worst-case and mean-square regimes. Our experiments indicate that the proposed SRPP-based methods achieve favorable privacy-utility trade-offs in both static and iterative settings.

Paperid: 2129, https://arxiv.org/pdf/2512.00699.pdf

Abstract:
Variational quantum circuits face a critical trade-off between privacy and trainability. High expressivity required for robust privacy induces exponentially large dynamical Lie algebras. This structure inevitably leads to barren plateaus. Conversely, trainable models restricted to polynomial-sized algebras remain transparent to algebraic attacks. To resolve this impasse, DyLoC is proposed. This dual-layer architecture employs an orthogonal decoupling strategy. Trainability is anchored to a polynomial-DLA ansatz while privacy is externalized to the input and output interfaces. Specifically, Truncated Chebyshev Graph Encoding (TCGE) is employed to thwart snapshot inversion. Dynamic Local Scrambling (DLS) is utilized to obfuscate gradients. Experiments demonstrate that DyLoC maintains baseline-level convergence with a final loss of 0.186. It outperforms the baseline by increasing the gradient reconstruction error by 13 orders of magnitude. Furthermore, snapshot inversion attacks are blocked when the reconstruction mean squared error exceeds 2.0. These results confirm that DyLoC effectively establishes a verifiable pathway for secure and trainable quantum machine learning.

Paperid: 2130, https://arxiv.org/pdf/2512.00645.pdf

Abstract:
Digital forensics faces unprecedented challenges with the emergence of digital twins and metaverse technologies. This paper presents the first comparative analysis between blockchain-based and traditional database systems for managing digital twin evidence in forensic investigations. We conducted controlled experiments comparing the Ethereum blockchain with IPFS storage against traditional SQL databases for digital twin evidence management. Our findings reveal that while blockchain provides superior data integrity and immutability, crucial for forensic applications, traditional databases offer better performance consistency. The blockchain implementation showed faster average storage times but higher variability in retrieval operations. Both systems maintained forensic integrity through hash verification, though blockchain's immutable nature provides additional security guarantees essential for legal proceedings. This research contributes to the development of robust digital forensic methodologies for emerging technologies in the metaverse era.

Paperid: 2131, https://arxiv.org/pdf/2511.22700.pdf

Abstract:
In recent years, large language models (LLMs) have achieved substantial advancements and are increasingly integrated into critical applications across various domains. This growing adoption underscores the need to ensure their security and robustness. In this work, we focus on the impact of Bit Flip Attacks (BFAs) on LLMs, which exploits hardware faults to corrupt model parameters, posing a significant threat to model integrity and performance. Existing studies on BFA against LLMs adopt a progressive bit-search strategy that predominantly relies on gradient-based techniques to identify sensitive layers or weights. However, computing gradients comes with two specific challenges: First, in the context of LLMs, it increases computational and memory costs exponentially, and Second, it requires access to a sample victim dataset or knowledge of the victim domain to compute the gradient. In this work, we investigate beyond the scope of attack efficacy and aim to develop an efficient, practical Gradient-Data-free Bit-Flip Attack. The challenge lies in the core principle of adversarial attacks, which relies heavily on computing gradients from sample test/train data and manipulating model weights based on gradient information. To overcome this, we propose novel vulnerability index metrics that can identify vulnerable weight bits in LLMs independent of any gradient or data knowledge. By removing the dependency on gradient computation, our approach drastically reduces memory requirements and scales efficiently across multiple tasks with constant complexity. Experimental results demonstrate the efficiency of our method, requiring as few as a single bit flip to achieve adversarial objectives for five open-source LLMs.

Paperid: 2132, https://arxiv.org/pdf/2511.19649.pdf

Abstract:
The ever-increasing number of Android devices and the accelerated evolution of malware, reaching over 35 million samples by 2024, highlight the critical importance of effective detection methods. Attackers are now using Artificial Intelligence to create sophisticated malware variations that can easily evade traditional detection techniques. Although machine learning has shown promise in malware classification, its success relies heavily on the availability of up-to-date, high-quality datasets. The scarcity and high cost of obtaining and labeling real malware samples presents significant challenges in developing robust detection models. In this paper, we propose MalSynGen, a Malware Synthetic Data Generation methodology that uses a conditional Generative Adversarial Network (cGAN) to generate synthetic tabular data. This data preserves the statistical properties of real-world data and improves the performance of Android malware classifiers. We evaluated the effectiveness of this approach using various datasets and metrics that assess the fidelity of the generated data, its utility in classification, and the computational efficiency of the process. Our experiments demonstrate that MalSynGen can generalize across different datasets, providing a viable solution to address the issues of obsolescence and low quality data in malware detection.

Paperid: 2133, https://arxiv.org/pdf/2511.19257.pdf

Abstract:
With the rapid advancement of retrieval-augmented vision-language models, multimodal medical retrieval-augmented generation (MMed-RAG) systems are increasingly adopted in clinical decision support. These systems enhance medical applications by performing cross-modal retrieval to integrate relevant visual and textual evidence for tasks, e.g., report generation and disease diagnosis. However, their complex architecture also introduces underexplored adversarial vulnerabilities, particularly via visual input perturbations. In this paper, we propose Medusa, a novel framework for crafting cross-modal transferable adversarial attacks on MMed-RAG systems under a black-box setting. Specifically, Medusa formulates the attack as a perturbation optimization problem, leveraging a multi-positive InfoNCE loss (MPIL) to align adversarial visual embeddings with medically plausible but malicious textual targets, thereby hijacking the retrieval process. To enhance transferability, we adopt a surrogate model ensemble and design a dual-loop optimization strategy augmented with invariant risk minimization (IRM). Extensive experiments on two real-world medical tasks, including medical report generation and disease diagnosis, demonstrate that Medusa achieves over 90% average attack success rate across various generation models and retrievers under appropriate parameter configuration, while remaining robust against four mainstream defenses, outperforming state-of-the-art baselines. Our results reveal critical vulnerabilities in the MMed-RAG systems and highlight the necessity of robustness benchmarking in safety-critical medical applications. The code and data are available at https://anonymous.4open.science/r/MMed-RAG-Attack-F05A.

Paperid: 2134, https://arxiv.org/pdf/2511.17070.pdf

Abstract:
During the past few years, we have witnessed various efforts to provide confidentiality and integrity for applications running in untrusted environments such as public clouds. In most of these approaches, hardware extensions such as Intel SGX, TDX, AMD SEV, etc., are leveraged to provide encryption and integrity protection on process or VM level. Although all of these approaches increase the trust in the application at runtime, an often overlooked aspect is the integrity and confidentiality protection at build time, which is equally important as maliciously injected code during compilation can compromise the entire application and system. In this paper, we present Tical, a practical framework for trusted compilation that provides integrity protection and confidentiality in build pipelines from source code to the final executable. Our approach harnesses TEEs as runtime protection but enriches TEEs with file system shielding and an immutable audit log with version history to provide accountability. This way, we can ensure that the compiler chain can only access trusted files and intermediate output, such as object files produced by trusted processes. Our evaluation using micro- and macro-benchmarks shows that Tical can protect the confidentiality and integrity of whole CI/CD pipelines with an acceptable performance overhead.

Paperid: 2135, https://arxiv.org/pdf/2511.16940.pdf

Abstract:
Modern Vision-Language Models (VLMs) demonstrate sophisticated reasoning, escalating privacy risks beyond simple attribute perception to individual-level linkage. Current privacy benchmarks are structurally insufficient for this new threat, as they primarily evaluate privacy perception while failing to address the more critical risk of privacy reasoning: a VLM's ability to infer and link distributed information to construct individual profiles. To address this critical gap, we propose \textbf{MultiPriv}, the first benchmark designed to systematically evaluate individual-level privacy reasoning in VLMs. We introduce the \textbf{Privacy Perception and Reasoning (PPR)} framework and construct a novel, bilingual multimodal dataset to support it. The dataset uniquely features a core component of synthetic individual profiles where identifiers (e.g., faces, names) are meticulously linked to sensitive attributes. This design enables nine challenging tasks evaluating the full PPR spectrum, from attribute detection to cross-image re-identification and chained inference. We conduct a large-scale evaluation of over 50 foundational and commercial VLMs. Our analysis reveals: (1) Many VLMs possess significant, unmeasured reasoning-based privacy risks. (2) Perception-level metrics are poor predictors of these reasoning risks, revealing a critical evaluation gap. (3) Existing safety alignments are inconsistent and ineffective against such reasoning-based attacks. MultiPriv exposes systemic vulnerabilities and provides the necessary framework for developing robust, privacy-preserving VLMs.

Paperid: 2136, https://arxiv.org/pdf/2511.16765.pdf

Abstract:
Detecting kinetic vulnerabilities in Cyber-Physical Systems (CPS), vulnerabilities in control code that can precipitate hazardous physical consequences, is a critical challenge. This task is complicated by the need to analyze the intricate coupling between complex software behavior and the system's physical dynamics. Furthermore, the periodic execution of control code in CPS applications creates a combinatorial explosion of execution paths that must be analyzed over time, far exceeding the scope of traditional single-run code analysis. This paper introduces RampoNN, a novel framework that systematically identifies kinetic vulnerabilities given the control code, a physical system model, and a Signal Temporal Logic (STL) specification of safe behavior. RampoNN first analyzes the control code to map the control signals that can be generated under various execution branches. It then employs a neural network to abstract the physical system's behavior. To overcome the poor scaling and loose over-approximations of standard neural network reachability, RampoNN uniquely utilizes Deep Bernstein neural networks, which are equipped with customized reachability algorithms that yield orders of magnitude tighter bounds. This high-precision reachability analysis allows RampoNN to rapidly prune large sets of guaranteed-safe behaviors and rank the remaining traces by their potential to violate the specification. The results of this analysis are then used to effectively guide a falsification engine, focusing its search on the most promising system behaviors to find actual vulnerabilities. We evaluated our approach on a PLC-controlled water tank system and a switched PID controller for an automotive engine. The results demonstrate that RampoNN leads to acceleration of the process of finding kinetic vulnerabilities by up to 98.27% and superior scalability compared to other state-of-the-art methods.

Paperid: 2137, https://arxiv.org/pdf/2511.15730.pdf

Abstract:
This paper presents a comprehensive comparative analysis of two dominant blockchain consensus mechanisms, Proof of Work (PoW) and Proof of Stake (PoS), evaluated across seven critical metrics: energy use, security, transaction speed, scalability, centralization risk, environmental impact, and transaction fees. Utilizing recent academic research and real-world blockchain data, the study highlights that PoW offers robust, time-tested security but suffers from high energy consumption, slower throughput, and centralization through mining pools. In contrast, PoS demonstrates improved scalability and efficiency, significantly reduced environmental impact, and more stable transaction fees, however it raises concerns over validator centralization and long-term security maturity. The findings underscore the trade-offs inherent in each mechanism and suggest hybrid designs may combine PoW's security with PoS's efficiency and sustainability. The study aims to inform future blockchain infrastructure development by striking a balance between decentralization, performance, and ecological responsibility.

Paperid: 2138, https://arxiv.org/pdf/2511.15051.pdf

Abstract:
The shuffle model enhances privacy by anonymizing users' reports through random permutation. This paper presents the first systematic study of the single-message shuffle model from an information-theoretic perspective. We analyze two regimes: the shuffle-only setting, where each user directly submits its message ($Y_i=X_i$), and the shuffle-DP setting, where each user first applies a local $\varepsilon_0$-differentially private mechanism before shuffling ($Y_i=\mathcal{R}(X_i)$). Let $\boldsymbol{Z} = (Y_{σ(i)})_i$ denote the shuffled sequence produced by a uniformly random permutation $σ$, and let $K = σ^{-1}(1)$ represent the position of user 1's message after shuffling. For the shuffle-only setting, we focus on a tractable yet expressive \emph{basic configuration}, where the target user's message follows $Y_1 \sim P$ and the remaining users' messages are i.i.d.\ samples from $Q$, i.e., $Y_2,\dots,Y_n \sim Q$. We derive asymptotic expressions for the mutual information quantities $I(Y_1;\boldsymbol{Z})$ and $I(K;\boldsymbol{Z})$ as $n \to \infty$, and demonstrate how this analytical framework naturally extends to settings with heterogeneous user distributions. For the shuffle-DP setting, we establish information-theoretic upper bounds on total information leakage. When each user applies an $\varepsilon_0$-DP mechanism, the overall leakage satisfies $I(K; \boldsymbol{Z}) \le 2\varepsilon_0$ and $I(X_1; \boldsymbol{Z}\mid (X_i)_{i=2}^n) \le (e^{\varepsilon_0}-1)/(2n) + O(n^{-3/2})$. These results bridge shuffle differential privacy and mutual-information-based privacy.

Paperid: 2139, https://arxiv.org/pdf/2511.14422.pdf

Abstract:
In decentralized machine learning paradigms such as Split Federated Learning (SFL) and its variant U-shaped SFL, the server's capabilities are severely restricted. Although this enhances client-side privacy, it also leaves the server highly vulnerable to model theft by malicious clients. Ensuring intellectual property protection for such capability-limited servers presents a dual challenge: watermarking schemes that depend on client cooperation are unreliable in adversarial settings, whereas traditional server-side watermarking schemes are technically infeasible because the server lacks access to critical elements such as model parameters or labels. To address this challenge, this paper proposes Sigil, a mandatory watermarking framework designed specifically for capability-limited servers. Sigil defines the watermark as a statistical constraint on the server-visible activation space and embeds the watermark into the client model via gradient injection, without requiring any knowledge of the data. Besides, we design an adaptive gradient clipping mechanism to ensure that our watermarking process remains both mandatory and stealthy, effectively countering existing gradient anomaly detection methods and a specifically designed adaptive subspace removal attack. Extensive experiments on multiple datasets and models demonstrate Sigil's fidelity, robustness, and stealthiness.

Paperid: 2140, https://arxiv.org/pdf/2511.14074.pdf

Abstract:
Sensor data-based recognition systems are widely used in various applications, such as gait-based authentication and human activity recognition (HAR). Modern wearable and smart devices feature various built-in Inertial Measurement Unit (IMU) sensors, and such sensor-based measurements can be fed to a machine learning-based model to train and classify human activities. While deep learning-based models have proven successful in classifying human activity and gestures, they pose various security risks. In our paper, we discuss a novel dynamic trigger-generation technique for performing black-box adversarial attacks on sensor data-based IoT systems. Our empirical analysis shows that the attack is successful on various datasets and classifier models with minimal perturbation on the input data. We also provide a detailed comparative analysis of performance and stealthiness to various other poisoning techniques found in backdoor attacks. We also discuss some adversarial defense mechanisms and their impact on the effectiveness of our trigger-generation technique.

Paperid: 2141, https://arxiv.org/pdf/2511.13598.pdf

Abstract:
Split Federated Learning (SFL) is renowned for its privacy-preserving nature and low computational overhead among decentralized machine learning paradigms. In this framework, clients employ lightweight models to process private data locally and transmit intermediate outputs to a powerful server for further computation. However, SFL is a double-edged sword: while it enables edge computing and enhances privacy, it also introduces intellectual property ambiguity as both clients and the server jointly contribute to training. Existing watermarking techniques fail to protect both sides since no single participant possesses the complete model. To address this, we propose RISE, a Robust model Intellectual property protection scheme using client-Server watermark Embedding for SFL. Specifically, RISE adopts an asymmetric client-server watermarking design: the server embeds feature-based watermarks through a loss regularization term, while clients embed backdoor-based watermarks by injecting predefined trigger samples into private datasets. This co-embedding strategy enables both clients and the server to verify model ownership. Experimental results on standard datasets and multiple network architectures show that RISE achieves over $95\%$ watermark detection rate ($p-value \lt 0.03$) across most settings. It exhibits no mutual interference between client- and server-side watermarks and remains robust against common removal attacks.

Paperid: 2142, https://arxiv.org/pdf/2511.12936.pdf

Abstract:
In federated learning, multiple parties can cooperate to train the model without directly exchanging their own private data, but the gradient leakage problem still threatens the privacy security and model integrity. Although the existing scheme uses threshold cryptography to mitigate the inference attack, it can not guarantee the verifiability of the aggregation results, making the system vulnerable to the threat of poisoning attack. We construct a partial decryption verifiable threshold multi client function encryption scheme, and apply it to Federated learning to implement the federated learning verifiable threshold security aggregation protocol (VTSAFL). VTSAFL empowers clients to verify aggregation results, concurrently minimizing both computational and communication overhead. The size of the functional key and partial decryption results of the scheme are constant, which provides efficiency guarantee for large-scale deployment. The experimental results on MNIST dataset show that vtsafl can achieve the same accuracy as the existing scheme, while reducing the total training time by more than 40%, and reducing the communication overhead by up to 50%. This efficiency is critical for overcoming the resource constraints inherent in Internet of Things (IoT) devices.

Paperid: 2143, https://arxiv.org/pdf/2511.12739.pdf

Abstract:
Fingerprint recognition systems are widely deployed for authentication and forensic applications, but the security of stored fingerprint data remains a critical vulnerability. While many systems avoid storing raw fingerprint images in favor of minutiae-based templates, recent research shows that these templates can be reverse-engineered to reconstruct realistic fingerprint images, enabling physical spoofing attacks that compromise user identities with no means of remediation. We present ProxyPrints, the first practical defense that brings cancellable biometrics to existing fingerprint recognition systems without requiring modifications to proprietary matching software. ProxyPrints acts as a transparent middleware layer between the fingerprint scanner and the matching algorithm, transforming each scanned fingerprint into a consistent, unlinkable alias. This transformation allows biometric identities to be revoked and replaced in the event of a breach, without affecting authentication accuracy. Additionally, ProxyPrints provides organizations with breach detection capabilities by enabling the identification of out-of-band spoofing attempts involving compromised aliases. We evaluate ProxyPrints on standard benchmark datasets and commercial fingerprint recognition systems, demonstrating that it preserves matching performance while offering strong security and revocability. Our open-source implementation includes tools for alias generation and deployment in real-world pipelines, making ProxyPrints a drop-in, scalable solution for fingerprint data protection.

Paperid: 2144, https://arxiv.org/pdf/2511.11249.pdf

Abstract:
Data normalization is a crucial preprocessing step for enhancing model performance and training stability. In federated learning (FL), where data remains distributed across multiple parties during collaborative model training, normalization presents unique challenges due to the decentralized and often heterogeneous nature of the data. Traditional methods rely on either independent client-side processing, i.e., local normalization, or normalizing the entire dataset before distributing it to parties, i.e., pooled normalization. Local normalization can be problematic when data distributions across parties are non-IID, while the pooled normalization approach conflicts with the decentralized nature of FL. In this paper, we explore the adaptation of widely used normalization techniques to FL and define the term federated normalization. Federated normalization simulates pooled normalization by enabling the collaborative exchange of normalization parameters among parties. Thus, it achieves performance on par with pooled normalization without compromising data locality. However, sharing normalization parameters such as the mean introduces potential privacy risks, which we further mitigate through a robust privacy-preserving solution. Our contributions include: (i) We systematically evaluate the impact of various federated and local normalization techniques in heterogeneous FL scenarios, (ii) We propose a novel homomorphically encrypted $k$-th ranked element (and median) calculation tailored for the federated setting, enabling secure and efficient federated normalization, (iii) We propose privacy-preserving implementations of widely used normalization techniques for FL, leveraging multiparty fully homomorphic encryption (MHE).

Paperid: 2145, https://arxiv.org/pdf/2511.10863.pdf

Abstract:
This paper presents a secure aggregation system Armadillo that has disruptive resistance against adversarial clients, such that any coalition of malicious clients (within the tolerated threshold) can affect the aggregation result only by misreporting their private inputs in a pre-defined legitimate range. Armadillo is designed for federated learning setting, where a single powerful server interacts with many weak clients iteratively to train models on client's private data. While a few prior works consider disruption resistance under such setting, they either incur high per-client cost (Chowdhury et al. CCS '22) or require many rounds (Bell et al. USENIX Security '23). Although disruption resistance can be achieved generically with zero-knowledge proof techniques (which we also use in this paper), we realize an efficient system with two new designs: 1) a simple two-layer secure aggregation protocol that requires only simple arithmetic computation; 2) an agreement protocol that removes the effect of malicious clients from the aggregation with low round complexity. With these techniques, Armadillo completes each secure aggregation in 3 rounds while keeping the server and clients computationally lightweight.

Paperid: 2146, https://arxiv.org/pdf/2511.08637.pdf

Abstract:
The internet has become the main source of data to train modern text-to-image or vision-language models, yet it is increasingly unclear whether web-scale data collection practices for training AI systems adequately respect data owners' wishes. Ignoring the owner's indication of consent around data usage not only raises ethical concerns but also has recently been elevated into lawsuits around copyright infringement cases. In this work, we aim to reveal information about data owners' consent to AI scraping and training, and study how it's expressed in DataComp, a popular dataset of 12.8 billion text-image pairs. We examine both the sample-level information, including the copyright notice, watermarking, and metadata, and the web-domain-level information, such as a site's Terms of Service (ToS) and Robots Exclusion Protocol. We estimate at least 122M of samples exhibit some indication of copyright notice in CommonPool, and find that 60\% of the samples in the top 50 domains come from websites with ToS that prohibit scraping. Furthermore, we estimate 9-13\% with 95\% confidence interval of samples from CommonPool to contain watermarks, where existing watermark detection methods fail to capture them in high fidelity. Our holistic methods and findings show that data owners rely on various channels to convey data consent, of which current AI data collection pipelines do not entirely respect. These findings highlight the limitations of the current dataset curation/release practice and the need for a unified data consent framework taking AI purposes into consideration.

Paperid: 2147, https://arxiv.org/pdf/2511.08059.pdf

Abstract:
Since the introduction of the European General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), software developers increasingly have to make privacy-related decisions during system design and implementation. However, past research showed that they often lack legal expertise and struggle with privacy-compliant development. To shed light on how effective current information sources are in supporting them with privacy-sensitive implementation, we conducted a qualitative study with 30 developers. Participants were presented with a privacy-sensitive scenario and asked to identify privacy issues and suggest measures using their knowledge, online resources, and an AI assistant. We observed developers' decision-making in think-aloud sessions and discussed it in follow-up interviews. We found that participants struggled with all three sources: personal knowledge was insufficient, web content was often too complex, and while AI assistants provided clear and user-tailored responses, they lacked contextual relevance and failed to identify scenario-specific issues. Our study highlights major shortcomings in existing support for privacy-related development tasks. Based on our findings, we discuss the need for more accessible, understandable, and actionable privacy resources for developers.

Paperid: 2148, https://arxiv.org/pdf/2511.07548.pdf

Abstract:
The increasing deployment of the Internet of Things (IoT) edge devices in modern smart grid environments requires secure and efficient communication protocols specifically designed for resource-constrained environments. However, most existing authentication schemes either impose excessive computational overhead or lack robustness against advanced cyber threats, making them unsuitable for resource-limited smart grid deployments. To address these limitations, this paper proposes a lightweight authentication and secure key exchange protocol for smart grid (LSEG) environments. The proposed LSEG protocol utilizes a unified elliptic curve key pair, enabled by birational mapping between Ed25519 and Curve25519, for signing and key exchange. Initial keys are derived using the hash based message authentication code (HMAC) based key derivation function (HKDF), while ephemeral key pairs, generated through the Elliptic Curve Diffie Hellman Ephemeral (ECDHE), are used in each session to ensure forward secrecy. Session communication is protected using ASCON128a, a lightweight, NIST-standardized, authenticated encryption algorithm. Formal security proofs in the random oracle model validate the security properties of LSEG, including mutual authentication, forward secrecy, and resistance to impersonation, replay, and man in the middle attacks. Experimental results on both Raspberry Pi and Intel Core i9-based systems demonstrate practical efficiency, achieving execution times under 5.5 milliseconds on embedded hardware and a communication cost of only 1024 bits for the protocol's message exchanges. The results demonstrate that LSEG effectively balances security, efficiency, and compliance, making it a scalable solution for secure communication in smart grid infrastructures.

Paperid: 2149, https://arxiv.org/pdf/2511.06871.pdf

Abstract:
Steinke (2025) recently asked the following intriguing open question: Can we solve the differentially private selection problem with nearly-optimal error by only (adaptively) invoking Gaussian mechanism on low-sensitivity queries? We resolve this question positively. In particular, for a candidate set $\mathcal{Y}$, we achieve error guarantee of $\tilde{O}(\log |\mathcal{Y}|)$, which is within a factor of $(\log \log |\mathcal{Y}|)^{O(1)}$ of the exponential mechanism (McSherry and Talwar, 2007). This improves on Steinke's mechanism which achieves an error of $O(\log^{3/2} |\mathcal{Y}|)$.

Paperid: 2150, https://arxiv.org/pdf/2511.06742.pdf

Abstract:
As Federated Learning (FL) becomes more widespread, there is growing interest in its decentralized variants. Decentralized FL leverages the benefits of fast and energy-efficient device-to-device communications to obviate the need for a central server. However, this opens the door to new security vulnerabilities as well. While FL security has been a popular research topic, the role of adversarial node placement in decentralized FL remains largely unexplored. This paper addresses this gap by evaluating the impact of various coordinated adversarial node placement strategies on decentralized FL's model training performance. We adapt two threads of placement strategies to this context: maximum span-based algorithms, and network centrality-based approaches. Building on them, we propose a novel attack strategy, MaxSpAN-FL, which is a hybrid between these paradigms that adjusts node placement probabilistically based on network topology characteristics. Numerical experiments demonstrate that our attack consistently induces the largest degradation in decentralized FL models compared with baseline schemes across various network configurations and numbers of coordinating adversaries. We also provide theoretical support for why eigenvector centrality-based attacks are suboptimal in decentralized FL. Overall, our findings provide valuable insights into the vulnerabilities of decentralized FL systems, setting the stage for future research aimed at developing more secure and robust decentralized FL frameworks.

Paperid: 2151, https://arxiv.org/pdf/2511.06429.pdf

Abstract:
LockBit has evolved from an obscure Ransomware-as-a-Service newcomer in 2019 to the most prolific ransomware franchise of 2024. Leveraging a recently leaked MySQL dump of the gang's management panel, this study offers an end-to-end reconstruction of LockBit's technical, behavioral, and financial apparatus. We recall the family's version timeline and map its tactics, techniques, and procedures to MITRE ATT&CK, highlighting the incremental hardening that distinguishes LockBit 3.0 from its predecessors. We then analyze 51 negotiation chat logs using natural-language embeddings and clustering to infer a canonical interaction playbook, revealing recurrent rhetorical stages that underpin the double-extortion strategy. Finally, we trace 19 Bitcoin addresses related to ransom payment chains, revealing two distinct patterns based on different laundering phases. In both cases, a small portion of the ransom is immediately split into long-lived addresses (presumably retained by the group as profit and to finance further operations) while the remainder is ultimately aggregated into two high-volume addresses before likely being sent to the affiliate. These two collector addresses appear to belong to distinct exchanges, each processing over 200k BTC. The combined evidence portrays LockBit as a tightly integrated criminal service whose resilience rests on rapid code iteration, script-driven social engineering, and industrial-scale cash-out pipelines.

Paperid: 2152, https://arxiv.org/pdf/2511.04409.pdf

Abstract:
The use of blockchains for data certification and traceability is now well established in both the literature and practical applications. However, while blockchain-based certification of individual data is clear and straightforward, the use of blockchain to certify large amounts of data produced on a nearly continuous basis still poses some challenges. In such a case, in fact, it is first necessary to collect the data in an off-chain buffer, and then to organize it, e.g., via Merkle trees, in order to keep the size and quantity of certification data to be written to the blockchain small. In this paper, we consider a typical system for blockchain-based traceability of a production process, and propose and comparatively analyze some strategies for certifying the data of such a process on blockchain, while maintaining the possibility of verifying their certification in a decentralized way.

Paperid: 2153, https://arxiv.org/pdf/2511.03816.pdf

Abstract:
Child sexual abuse is among the most hideous crimes, yet, after the COVID-19 pandemic, there is a huge surge in the distribution of child sexual abuse material (CSAM). Traditionally, the exchange of such material is performed on the dark web, as it provides many privacy guarantees that facilitate illicit trades. However, the introduction of end-to-end encryption platforms has brought it to the deep web. In this work, we report our findings for a campaign of spreading child sexual abuse material on the clear web. The campaign utilized at least 1,026 web pages for at least 738,286 registered users. Our analysis details the operation of such a campaign, showcasing how social networks are abused and the role of bots, but also the bypasses that are used. Going a step further and exploiting operational faults in the campaign, we gain insight into the demand for such content, as well as the dynamics of the user network that supports it.

Paperid: 2154, https://arxiv.org/pdf/2511.03622.pdf

Abstract:
We consider the problem of searching for an intruder in a geometric domain by utilizing multiple search robots. The domain is a simply connected orthogonal polygon with edges parallel to the cartesian coordinate axes. Each robot has a limited sensing capability. We study the problem for both static and mobile intruders. It turns out that the problem of finding an intruder is NP-hard, even for a stationary intruder. Given this intractability, we turn our attention towards developing efficient and robust algorithms, namely methods based on space-filling curves, random search, and cooperative random search. Moreover, for each proposed algorithm, we evaluate the trade-off between the number of search robots and the time required for the robots to complete the search process while considering the geometric properties of the connected orthogonal search area.

Paperid: 2155, https://arxiv.org/pdf/2511.03486.pdf

Abstract:
Instant messaging has become one of the most used methods of communication online, which has attracted significant attention to its underlying cryptographic protocols and security guarantees. Techniques to increase privacy such as End-to-End Encryption and pseudonyms have been introduced. However, online spaces such as messaging groups still require moderation to prevent misbehaving users from participating in them, particularly in anonymous contexts.. In Anonymous Blocklisting (AB) schemes, users must prove during authentication that none of their previous pseudonyms has been blocked, preventing misbehaving users from creating new pseudonyms. In this work we propose an alternative \textit{Federated Anonymous Blocklisting} (FAB) in which the centralised Service Provider is replaced by small distributed Realms, each with its own blocklist. Realms can establish trust relationships between each other, such that when users authenticate to a realm, they must prove that they are not banned in any of its trusted realms. We provide an implementation of our proposed scheme; unlike existing AB constructions, the performance of ours does not depend on the current size of the blocklist nor requires processing new additions to the blocklist. We also demonstrate its applicability to real-world messaging groups by integrating our FAB scheme into the Messaging Layer Security protocol.

Paperid: 2156, https://arxiv.org/pdf/2511.03213.pdf

Abstract:
The shuffle model, which anonymizes data by randomly permuting user messages, has been widely adopted in both cryptography and differential privacy. In this work, we present the first systematic study of the Bayesian advantage in re-identifying a user's message under the shuffle model. We begin with a basic setting: one sample is drawn from a distribution $P$, and $n - 1$ samples are drawn from a distribution $Q$, after which all $n$ samples are randomly shuffled. We define $β_n(P, Q)$ as the success probability of a Bayes-optimal adversary in identifying the sample from $P$, and define the additive and multiplicative Bayesian advantages as $\mathsf{Adv}_n^{+}(P, Q) = β_n(P,Q) - \frac{1}{n}$ and $\mathsf{Adv}_n^{\times}(P, Q) = n \cdot β_n(P,Q)$, respectively. We derive exact analytical expressions and asymptotic characterizations of $β_n(P, Q)$, along with evaluations in several representative scenarios. Furthermore, we establish (nearly) tight mutual bounds between the additive Bayesian advantage and the total variation distance. Finally, we extend our analysis beyond the basic setting and present, for the first time, an upper bound on the success probability of Bayesian attacks in shuffle differential privacy. Specifically, when the outputs of $n$ users -- each processed through an $\varepsilon$-differentially private local randomizer -- are shuffled, the probability that an attacker successfully re-identifies any target user's message is at most $e^{\varepsilon}/n$.

Paperid: 2157, https://arxiv.org/pdf/2511.01746.pdf

Abstract:
Scam detection remains a critical challenge in cybersecurity as adversaries craft messages that evade automated filters. We propose a Hierarchical Scam Detection System (HSDS) that combines a lightweight multi-model voting front end with a fine-tuned LLaMA 3.1 8B Instruct back end to improve accuracy and robustness against adversarial attacks. An ensemble of four classifiers provides preliminary predictions through majority vote, and ambiguous cases are escalated to the fine-tuned model, which is optimized with adversarial training to reduce misclassification. Experiments show that this hierarchical design both improves adversarial scam detection and shortens inference time by routing most cases away from the LLM, outperforming traditional machine-learning baselines and proprietary LLM baselines. The findings highlight the effectiveness of a hybrid voting mechanism and adversarial fine-tuning in fortifying LLMs against evolving scam tactics, enhancing the resilience of automated scam detection systems.

Paperid: 2158, https://arxiv.org/pdf/2510.26829.pdf

Abstract:
Large language models (LLMs) continually evolve through pre-training on ever-expanding web data, but this adaptive process also exposes them to subtle forms of misinformation. While prior work has explored data poisoning during static pre-training, the effects of such manipulations under continual pre-training remain largely unexplored. Drawing inspiration from the illusory truth effect in human cognition - where repeated exposure to falsehoods increases belief in their accuracy - we ask whether LLMs exhibit a similar vulnerability. We investigate whether repeated exposure to false but confidently stated facts can shift a model's internal representation away from the truth. We introduce Layer of Truth, a framework and dataset for probing belief dynamics in continually trained LLMs. By injecting controlled amounts of poisoned data and probing intermediate representations across checkpoints, model scales, and question types, we quantify when and how factual beliefs shift. Our findings reveal that even minimal exposure can induce persistent representational drift in well-established facts, with susceptibility varying across layers and model sizes. These results highlight an overlooked vulnerability of continually updated LLMs: their capacity to internalize misinformation analogously to humans, underscoring the need for robust monitoring of factual integrity during model updates.

Paperid: 2159, https://arxiv.org/pdf/2510.25472.pdf

Abstract:
In the rapidly expanding landscape of Large Language Model (LLM) applications, real-time output streaming has become the dominant interaction paradigm. While this enhances user experience, recent research reveals that it exposes a non-trivial attack surface through network side-channels. Adversaries can exploit patterns in encrypted traffic to infer sensitive information and reconstruct private conversations. In response, LLM providers and third-party services are deploying defenses such as traffic padding and obfuscation to mitigate these vulnerabilities. This paper starts by presenting a systematic analysis of contemporary side-channel defenses in mainstream LLM applications, with a focus on services from vendors like OpenAI and DeepSeek. We identify and examine seven representative deployment scenarios, each incorporating active/passive mitigation techniques. Despite these enhanced security measures, our investigation uncovers significant residual information that remains vulnerable to leakage within the network traffic. Building on this discovery, we introduce NetEcho, a novel, LLM-based framework that comprehensively unleashes the network side-channel risks of today's LLM applications. NetEcho is designed to recover entire conversations -- including both user prompts and LLM responses -- directly from encrypted network traffic. It features a deliberate design that ensures high-fidelity text recovery, transferability across different deployment scenarios, and moderate operational cost. In our evaluations on medical and legal applications built upon leading models like DeepSeek-v3 and GPT-4o, NetEcho can recover avg $\sim$70\% information of each conversation, demonstrating a critical limitation in current defense mechanisms. We conclude by discussing the implications of our findings and proposing future directions for augmenting network traffic security.

Paperid: 2160, https://arxiv.org/pdf/2510.23927.pdf

Abstract:
Pig butchering, and similar interactive online scams, lower their victims' defenses by building trust over extended periods of conversation - sometimes weeks or months. They have become increasingly public losses (at least $75B by one recent study). However, because of their long-term conversational nature, they are extremely challenging to investigate at scale. In this paper, we describe the motivation, design, implementation, and experience with CHATTERBOX, an LLM-based system that automates long-term engagement with online scammers, making large-scale investigations of their tactics possible. We describe the techniques we have developed to attract scam attempts, the system and LLM-engineering required to convincingly engage with scammers, and the necessary capabilities required to satisfy or evade "milestones" in scammers' workflow.

Paperid: 2161, https://arxiv.org/pdf/2510.23847.pdf

Abstract:
Cryptocurrency blockchain networks safeguard digital assets using cryptographic keys, with wallets playing a critical role in generating, storing, and managing these keys. Wallets, typically categorized as hot and cold, offer varying degrees of security and convenience. However, they are generally software-based applications running on microcontrollers. Consequently, they are vulnerable to malware and side-channel attacks, allowing perpetrators to extract private keys by targeting critical algorithms, such as ECC, which processes private keys to generate public keys and authorize transactions. To address these issues, this work presents EthVault, the first hardware architecture for an Ethereum hierarchically deterministic cold wallet, featuring hardware implementations of key algorithms for secure key generation. Also, an ECC architecture resilient to side-channel and timing attacks is proposed. Moreover, an architecture of the child key derivation function, a fundamental component of cryptocurrency wallets, is proposed. The design minimizes resource usage, meeting market demand for small, portable cryptocurrency wallets. FPGA implementation results validate the feasibility of the proposed approach. The ECC architecture exhibits uniform execution behavior across varying inputs, while the complete design utilizes only 27%, 7%, and 6% of LUTs, registers, and RAM blocks, respectively, on a Xilinx Zynq UltraScale+ FPGA.

Paperid: 2162, https://arxiv.org/pdf/2510.23643.pdf

Abstract:
The globalized semiconductor supply chain has made Hardware Trojans (HT) a significant security threat to embedded systems, necessitating the design of efficient and adaptable detection mechanisms. Despite promising machine learning-based HT detection techniques in the literature, they suffer from ad hoc feature selection and the lack of adaptivity, all of which hinder their effectiveness across diverse HT attacks. In this paper, we propose SAND, a selfsupervised and adaptive NAS-driven framework for efficient HT detection. Specifically, this paper makes three key contributions. (1) We leverage self-supervised learning (SSL) to enable automated feature extraction, eliminating the dependency on manually engineered features. (2) SAND integrates neural architecture search (NAS) to dynamically optimize the downstream classifier, allowing for seamless adaptation to unseen benchmarks with minimal fine-tuning. (3) Experimental results show that SAND achieves a significant improvement in detection accuracy (up to 18.3%) over state-of-the-art methods, exhibits high resilience against evasive Trojans, and demonstrates strong generalization.

Paperid: 2163, https://arxiv.org/pdf/2510.23034.pdf

Abstract:
Binarized Neural Networks (BNNs) are a class of deep neural networks designed to utilize minimal computational resources, which drives their popularity across various applications. Recent studies highlight the potential of mapping BNN model parameters onto emerging non-volatile memory technologies, specifically using crossbar architectures, resulting in improved inference performance compared to traditional CMOS implementations. However, the common practice of protecting model parameters from theft attacks by storing them in an encrypted format and decrypting them at runtime introduces significant computational overhead, thus undermining the core principles of in-memory computing, which aim to integrate computation and storage. This paper presents a robust strategy for protecting BNN model parameters, particularly within in-memory computing frameworks. Our method utilizes a secret key derived from a physical unclonable function to transform model parameters prior to storage in the crossbar. Subsequently, the inference operations are performed on the encrypted weights, achieving a very special case of Fully Homomorphic Encryption (FHE) with minimal runtime overhead. Our analysis reveals that inference conducted without the secret key results in drastically diminished performance, with accuracy falling below 15%. These results validate the effectiveness of our protection strategy in securing BNNs within in-memory computing architectures while preserving computational efficiency.

Paperid: 2164, https://arxiv.org/pdf/2510.22149.pdf

Abstract:
Federated learning (FL) has emerged as a promising paradigm for decentralized model training, enabling multiple clients to collaboratively learn a shared model without exchanging their local data. However, the decentralized nature of FL also introduces vulnerabilities, as malicious clients can compromise or manipulate the training process. In this work, we introduce dictator clients, a novel, well-defined, and analytically tractable class of malicious participants capable of entirely erasing the contributions of all other clients from the server model, while preserving their own. We propose concrete attack strategies that empower such clients and systematically analyze their effects on the learning process. Furthermore, we explore complex scenarios involving multiple dictator clients, including cases where they collaborate, act independently, or form an alliance in order to ultimately betray one another. For each of these settings, we provide a theoretical analysis of their impact on the global model's convergence. Our theoretical algorithms and findings about the complex scenarios including multiple dictator clients are further supported by empirical evaluations on both computer vision and natural language processing benchmarks.

Paperid: 2165, https://arxiv.org/pdf/2510.21957.pdf

Abstract:
Ransomware has become a critical threat to cybersecurity due to its rapid evolution, the necessity for early detection, and growing diversity, posing significant challenges to traditional detection methods. While AI-based approaches had been proposed by prior works to assist ransomware detection, existing methods suffer from three major limitations, ad-hoc feature dependencies, delayed response, and limited adaptability to unseen variants. In this paper, we propose a framework that integrates self-supervised contrastive learning with neural architecture search (NAS) to address these challenges. Specifically, this paper offers three important contributions. (1) We design a contrastive learning framework that incorporates hardware performance counters (HPC) to analyze the runtime behavior of target ransomware. (2) We introduce a customized loss function that encourages early-stage detection of malicious activity, and significantly reduces the detection latency. (3) We deploy a neural architecture search (NAS) framework to automatically construct adaptive model architectures, allowing the detector to flexibly align with unseen ransomware variants. Experimental results show that our proposed method achieves significant improvements in both detection accuracy (up to 16.1%) and response time (up to 6x) compared to existing approaches while maintaining robustness under evasive attacks.

Paperid: 2166, https://arxiv.org/pdf/2510.21842.pdf

Abstract:
We present modal aphasia, a systematic dissociation in which current unified multimodal models accurately memorize concepts visually but fail to articulate them in writing, despite being trained on images and text simultaneously. For one, we show that leading frontier models can generate near-perfect reproductions of iconic movie artwork, but confuse crucial details when asked for textual descriptions. We corroborate those findings through controlled experiments on synthetic datasets in multiple architectures. Our experiments confirm that modal aphasia reliably emerges as a fundamental property of current unified multimodal models, not just as a training artifact. In practice, modal aphasia can introduce vulnerabilities in AI safety frameworks, as safeguards applied to one modality may leave harmful concepts accessible in other modalities. We demonstrate this risk by showing how a model aligned solely on text remains capable of generating unsafe images.

Paperid: 2167, https://arxiv.org/pdf/2510.21601.pdf

Abstract:
Previous studies on PTA have focused on analyzing privacy threats based on the potential areas of occurrence and their likelihood of occurrence. However, an in-depth understanding of the threat actors involved, their actions, and the intentions that result in privacy threats is essential. In this paper, we present a novel Privacy Threat Model Framework (PTMF) that analyzes privacy threats through different phases. The PTMF development is motivated through the selected tactics from the MITRE ATT\&CK framework and techniques from the LINDDUN privacy threat model, making PTMF a privacy-centered framework. The proposed PTMF can be employed in various ways, including analyzing the activities of threat actors during privacy threats and assessing privacy risks in IoT systems, among others. In this paper, we conducted a user study on 12 privacy threats associated with IoT by developing a questionnaire based on PTMF and recruited experts from both industry and academia in the fields of security and privacy to gather their opinions. The collected data were analyzed and mapped to identify the threat actors involved in the identification of IoT users (IU) and the remaining 11 privacy threats. Our observation revealed the top three threat actors and the critical paths they used during the IU privacy threat, as well as the remaining 11 privacy threats. This study could provide a solid foundation for understanding how and where privacy measures can be proactively and effectively deployed in IoT systems to mitigate privacy threats based on the activities and intentions of threat actors within these systems.

Paperid: 2168, https://arxiv.org/pdf/2510.21246.pdf

Abstract:
Self-hosted cloud storage platforms like Nextcloud are gaining popularity among individuals and organizations seeking greater control over their data. However, this shift introduces new challenges for digital forensic investigations, particularly in systematically analyzing both client and server components. Despite Nextcloud's widespread use, it has received limited attention in forensic research. In this work, we critically examine existing cloud storage forensic frameworks and highlight their limitations. To address the gaps, we propose an extended forensic framework that incorporates device monitoring and leverages cloud APIs for structured, repeatable evidence acquisition. Using Nextcloud as a case study, we demonstrate how its native APIs can be used to reliably access forensic artifacts, and we introduce an open-source acquisition tool that implements this approach. Our framework equips investigators with a more flexible method for analyzing self-hosted cloud storage systems, and offers a foundation for further development in this evolving area of digital forensics.

Paperid: 2169, https://arxiv.org/pdf/2510.19393.pdf

Abstract:
On average, 71% of the code in typical Java projects comes from open-source software (OSS) dependencies, making OSS dependencies the dominant component of modern software code bases. This high degree of OSS reliance comes with a considerable security risk of adding known security vulnerabilities to a code base. To remedy this risk, researchers and companies have developed various dependency scanners, which try to identify inclusions of known-to-be-vulnerable OSS dependencies. However, there are still challenges that modern dependency scanners do not overcome, especially when it comes to dependency modifications, such as re-compilations, re-bundlings or re-packagings, which are common in the Java ecosystem. To overcome these challenges, we present Jaralyzer, a bytecode-centric dependency scanner for Java. Jaralyzer does not rely on the metadata or the source code of the included OSS dependencies being available but directly analyzes a dependency's bytecode. Our evaluation across 56 popular OSS components demonstrates that Jaralyzer outperforms other popular dependency scanners in detecting vulnerabilities within modified dependencies. It is the only scanner capable of identifying vulnerabilities across all the above mentioned types of modifications. But even when applied to unmodified dependencies, Jaralyzer outperforms the current state-of-the-art code-centric scanner Eclipse Steady by detecting 28 more true vulnerabilities and yielding 29 fewer false warnings.

Paperid: 2170, https://arxiv.org/pdf/2510.18493.pdf

Abstract:
Phone scams remain a pervasive threat to both personal safety and financial security worldwide. Recent advances in large language models (LLMs) have demonstrated strong potential in detecting fraudulent behavior by analyzing transcribed phone conversations. However, these capabilities introduce notable privacy risks, as such conversations frequently contain sensitive personal information that may be exposed to third-party service providers during processing. In this work, we explore how to harness LLMs for phone scam detection while preserving user privacy. We propose MASK (Modular Adaptive Sanitization Kit), a trainable and extensible framework that enables dynamic privacy adjustment based on individual preferences. MASK provides a pluggable architecture that accommodates diverse sanitization methods - from traditional keyword-based techniques for high-privacy users to sophisticated neural approaches for those prioritizing accuracy. We also discuss potential modeling approaches and loss function designs for future development, enabling the creation of truly personalized, privacy-aware LLM-based detection systems that balance user trust and detection effectiveness, even beyond phone scam context.

Paperid: 2171, https://arxiv.org/pdf/2510.18160.pdf

Abstract:
The impending adoption of Open Radio Access Network (O-RAN) is fueling innovation in the RAN towards data-driven operation. Unlike traditional RAN where the RAN data and its usage is restricted within proprietary and monolithic RAN equipment, the O-RAN architecture opens up access to RAN data via RAN intelligent controllers (RICs), to third-party machine learning (ML) powered applications - rApps and xApps - to optimize RAN operations. Consequently, a major focus has been placed on leveraging RAN data to unlock greater efficiency gains. However, there is an increasing recognition that RAN data access to apps could become a source of vulnerability and be exploited by malicious actors. Motivated by this, we carry out a comprehensive investigation of data vulnerabilities on both xApps and rApps, respectively hosted in Near- and Non-real-time (RT) RIC components of O-RAN. We qualitatively analyse the O-RAN security mechanisms and limitations for xApps and rApps, and consider a threat model informed by this analysis. We design a viable and effective black-box evasion attack strategy targeting O-RAN RIC Apps while accounting for the stringent timing constraints and attack effectiveness. The strategy employs four key techniques: the model cloning algorithm, input-specific perturbations, universal adversarial perturbations (UAPs), and targeted UAPs. This strategy targets ML models used by both xApps and rApps within the O-RAN system, aiming to degrade network performance. We validate the effectiveness of the designed evasion attack strategy and quantify the scale of performance degradation using a real-world O-RAN testbed and emulation environments. Evaluation is conducted using the Interference Classification xApp and the Power Saving rApp as representatives for near-RT and non-RT RICs. We also show that the attack strategy is effective against prominent defense techniques for adversarial ML.

Paperid: 2172, https://arxiv.org/pdf/2510.17884.pdf

Abstract:
The remarkable capabilities of Large Language Models (LLMs) in natural language understanding and generation have sparked interest in their potential for cybersecurity applications, including password guessing. In this study, we conduct an empirical investigation into the efficacy of pre-trained LLMs for password cracking using synthetic user profiles. Specifically, we evaluate the performance of state-of-the-art open-source LLMs such as TinyLLaMA, Falcon-RW-1B, and Flan-T5 by prompting them to generate plausible passwords based on structured user attributes (e.g., name, birthdate, hobbies). Our results, measured using Hit@1, Hit@5, and Hit@10 metrics under both plaintext and SHA-256 hash comparisons, reveal consistently poor performance, with all models achieving less than 1.5% accuracy at Hit@10. In contrast, traditional rule-based and combinator-based cracking methods demonstrate significantly higher success rates. Through detailed analysis and visualization, we identify key limitations in the generative reasoning of LLMs when applied to the domain-specific task of password guessing. Our findings suggest that, despite their linguistic prowess, current LLMs lack the domain adaptation and memorization capabilities required for effective password inference, especially in the absence of supervised fine-tuning on leaked password datasets. This study provides critical insights into the limitations of LLMs in adversarial contexts and lays the groundwork for future efforts in secure, privacy-preserving, and robust password modeling.

Paperid: 2173, https://arxiv.org/pdf/2510.17883.pdf

Abstract:
Large Language Models (LLMs) can reason over natural-language inputs, but their role in intrusion detection without fine-tuning remains uncertain. This study evaluates a prompt-only approach on UNSW-NB15 by converting each network flow to a compact textual record and augmenting it with lightweight, domain-inspired boolean flags (asymmetry, burst rate, TTL irregularities, timer anomalies, rare service/state, short bursts). To reduce output drift and support measurement, the model is constrained to produce structured, grammar-valid responses, and a single decision threshold is calibrated on a small development split. We compare zero-shot, instruction-guided, and few-shot prompting to strong tabular and neural baselines under identical splits, reporting accuracy, precision, recall, F1, and macro scores. Empirically, unguided prompting is unreliable, while instructions plus flags substantially improve detection quality; adding calibrated scoring further stabilizes results. On a balanced subset of two hundred flows, a 7B instruction-tuned model with flags reaches macro-F1 near 0.78; a lighter 3B model with few-shot cues and calibration attains F1 near 0.68 on one thousand examples. As the evaluation set grows to two thousand flows, decision quality decreases, revealing sensitivity to coverage and prompting. Tabular baselines remain more stable and faster, yet the prompt-only pipeline requires no gradient training, produces readable artifacts, and adapts easily through instructions and flags. Contributions include a flow-to-text protocol with interpretable cues, a calibration method for thresholding, a systematic baseline comparison, and a reproducibility bundle with prompts, grammar, metrics, and figures.

Paperid: 2174, https://arxiv.org/pdf/2510.17687.pdf

Abstract:
Multimodal Large Language Models (MLLMs) achieve strong reasoning and perception capabilities but are increasingly vulnerable to jailbreak attacks. While existing work focuses on explicit attacks, where malicious content resides in a single modality, recent studies reveal implicit attacks, in which benign text and image inputs jointly express unsafe intent. Such joint-modal threats are difficult to detect and remain underexplored, largely due to the scarcity of high-quality implicit data. We propose ImpForge, an automated red-teaming pipeline that leverages reinforcement learning with tailored reward modules to generate diverse implicit samples across 14 domains. Building on this dataset, we further develop CrossGuard, an intent-aware safeguard providing robust and comprehensive defense against both explicit and implicit threats. Extensive experiments across safe and unsafe benchmarks, implicit and explicit attacks, and multiple out-of-domain settings demonstrate that CrossGuard significantly outperforms existing defenses, including advanced MLLMs and guardrails, achieving stronger security while maintaining high utility. This offers a balanced and practical solution for enhancing MLLM robustness against real-world multimodal threats.

Paperid: 2175, https://arxiv.org/pdf/2510.16873.pdf

Abstract:
Over time, cryptographically deniable systems have come to be associated in computer-science literature with the idea of "denying" evidence in court - specifically, with the ability to convincingly forge evidence in courtroom scenarios and an inability to authenticate evidence in such contexts. Evidentiary processes in courts, however, have been developed over centuries to account for the reality that evidence has always been forgeable, and relies on factors outside of cryptographic models to seek the truth "as well as possible" while acknowledging that all evidence is imperfect. We argue that deniability does not and need not change this paradigm. Our analysis highlights a gap between technical deniability notions and their application to the real world. There will always be factors outside a cryptographic model that influence perceptions of a message's authenticity, in realistic situations. We propose the broader concept of credibility to capture these factors. The credibility of a system is determined by (1) a threshold of quality that a forgery must pass to be "believable" as an original communication, which varies based on sociotechnical context and threat model, (2) the ease of creating a forgery that passes this threshold, which is also context- and threat-model-dependent, and (3) default system retention policy and retention settings. All three aspects are important for designing secure communication systems for real-world threat models, and some aspects of (2) and (3) may be incorporated directly into technical system design. We hope that our model of credibility will facilitate system design and deployment that addresses threats that are not and cannot be captured by purely technical definitions and existing cryptographic models, and support more nuanced discourse on the strengths and limitations of cryptographic guarantees within specific legal and sociotechnical contexts.

Paperid: 2176, https://arxiv.org/pdf/2510.16830.pdf

Abstract:
Large language models are often adapted through parameter efficient fine tuning, but current release practices provide weak assurances about what data were used and how updates were computed. We present Verifiable Fine Tuning, a protocol and system that produces succinct zero knowledge proofs that a released model was obtained from a public initialization under a declared training program and an auditable dataset commitment. The approach combines five elements. First, commitments that bind data sources, preprocessing, licenses, and per epoch quota counters to a manifest. Second, a verifiable sampler that supports public replayable and private index hiding batch selection. Third, update circuits restricted to parameter efficient fine tuning that enforce AdamW style optimizer semantics and proof friendly approximations with explicit error budgets. Fourth, recursive aggregation that folds per step proofs into per epoch and end to end certificates with millisecond verification. Fifth, provenance binding and optional trusted execution property cards that attest code identity and constants. On English and bilingual instruction mixtures, the method maintains utility within tight budgets while achieving practical proof performance. Policy quotas are enforced with zero violations, and private sampling windows show no measurable index leakage. Federated experiments demonstrate that the system composes with probabilistic audits and bandwidth constraints. These results indicate that end to end verifiable fine tuning is feasible today for real parameter efficient pipelines, closing a critical trust gap for regulated and decentralized deployments.

Paperid: 2177, https://arxiv.org/pdf/2510.16331.pdf

Abstract:
Striking a balance between protecting data privacy and enabling collaborative computation is a critical challenge for distributed machine learning. While privacy-preserving techniques for federated learning have been extensively developed, methods for scenarios involving bitwise operations, such as tree-based vertical federated learning (VFL), are still underexplored. Traditional mechanisms, including Shamir's secret sharing and multi-party computation (MPC), are not optimized for bitwise operations over binary data, particularly in settings where each participant holds a different part of the binary vector. This paper addresses the limitations of existing methods by proposing a novel binary multi-party computation (BiMPC) framework. The BiMPC mechanism facilitates privacy-preserving bitwise operations, with a particular focus on dot product computations of binary vectors, ensuring the privacy of each individual bit. The core of BiMPC is a novel approach called Dot Product via Modular Addition (DoMA), which uses regular and modular additions for efficient binary dot product calculation. To ensure privacy, BiMPC uses random masking in a higher field for linear computations and a three-party oblivious transfer (triot) protocol for non-linear binary operations. The privacy guarantees of the BiMPC framework are rigorously analyzed, demonstrating its efficiency and scalability in distributed settings.

Paperid: 2178, https://arxiv.org/pdf/2510.16024.pdf

Abstract:
Billions of dollars are lost every year in DeFi platforms by transactions exploiting business logic or accounting vulnerabilities. Existing defenses focus on static code analysis, public mempool screening, attacker contract detection, or trusted off-chain monitors, none of which prevents exploits submitted through private relays or malicious contracts that execute within the same block. We present the first decentralized, fully on-chain learning framework that: (i) performs gas-prohibitive computation on Layer-2 to reduce cost, (ii) propagates verified model updates to Layer-1, and (iii) enables gas-bounded, low-latency inference inside smart contracts. A novel Proof-of-Improvement (PoIm) protocol governs the training process and verifies each decentralized micro update as a self-verifying training transaction. Updates are accepted by \textit{PoIm} only if they demonstrably improve at least one core metric (e.g., accuracy, F1-score, precision, or recall) on a public benchmark without degrading any of the other core metrics, while adversarial proposals get financially penalized through an adaptable test set for evolving threats. We develop quantization and loop-unrolling techniques that enable inference for logistic regression, SVM, MLPs, CNNs, and gated RNNs (with support for formally verified decision tree inference) within the Ethereum block gas limit, while remaining bit-exact to their off-chain counterparts, formally proven in Z3. We curate 298 unique real-world exploits (2020 - 2025) with 402 exploit transactions across eight EVM chains, collectively responsible for \$3.74 B in losses.

Paperid: 2179, https://arxiv.org/pdf/2510.15612.pdf

Abstract:
Decentralized prediction markets (DePMs) allow open participation in event-based wagering without fully relying on centralized intermediaries. We review the history of DePMs which date back to 2011 and includes hundreds of proposals. Perhaps surprising, modern DePMs like Polymarket deviate materially from earlier designs like Truthcoin and Augur v1. We use our review to present a modular workflow comprising seven stages: underlying infrastructure, market topic, share structure and pricing, trading, market resolution, settlement, and archiving. For each module, we enumerate the design variants, analyzing trade-offs around decentralization, expressiveness, and manipulation resistance. We also identify open problems for researchers interested in this ecosystem.

Paperid: 2180, https://arxiv.org/pdf/2510.15017.pdf

Abstract:
Large language models (LLMs) are increasingly vulnerable to multi-turn jailbreak attacks, where adversaries iteratively elicit harmful behaviors that bypass single-turn safety filters. Existing defenses predominantly rely on passive rejection, which either fails against adaptive attackers or overly restricts benign users. We propose a honeypot-based proactive guardrail system that transforms risk avoidance into risk utilization. Our framework fine-tunes a bait model to generate ambiguous, non-actionable but semantically relevant responses, which serve as lures to probe user intent. Combined with the protected LLM's safe reply, the system inserts proactive bait questions that gradually expose malicious intent through multi-turn interactions. We further introduce the Honeypot Utility Score (HUS), measuring both the attractiveness and feasibility of bait responses, and use a Defense Efficacy Rate (DER) for balancing safety and usability. Initial experiment on MHJ Datasets with recent attack method across GPT-4o show that our system significantly disrupts jailbreak success while preserving benign user experience.

Paperid: 2181, https://arxiv.org/pdf/2510.13661.pdf

Abstract:
This paper introduces a methodology based on Euclidean information theory to investigate local properties of secure communication over discrete memoryless wiretap channels. We formulate a constrained optimization problem that maximizes a legitimate user's information rate while imposing explicit upper bounds on both the information leakage to an eavesdropper and the informational cost of encoding the secret message. By leveraging local geometric approximations, this inherently non-convex problem is transformed into a tractable quadratic programming structure. It is demonstrated that the optimal Lagrange multipliers governing this approximated problem can be found by solving a linear program. The constraints of this linear program are derived from Karush-Kuhn-Tucker conditions and are expressed in terms of the generalized eigenvalues of channel-derived matrices. This framework facilitates the derivation of an analytical formula for an approximate local secrecy capacity. Furthermore, we define and analyze a new class of secret local contraction coefficients. These coefficients, characterized as the largest generalized eigenvalues of a matrix pencil, quantify the maximum achievable ratio of approximate utility to approximate leakage, thus measuring the intrinsic local leakage efficiency of the channel. We establish bounds connecting these local coefficients to their global counterparts defined over true mutual information measures. The efficacy of the proposed framework is demonstrated through detailed analysis and numerical illustrations for both general multi-mode channels and the canonical binary symmetric wiretap channel.

Paperid: 2182, https://arxiv.org/pdf/2510.12153.pdf

Abstract:
Cross chain interoperability in blockchain systems exposes a fundamental tension between user privacy and regulatory accountability. Existing solutions enforce an all or nothing choice between full anonymity and mandatory identity disclosure, which limits adoption in regulated financial settings. We present VeilAudit, a cross chain auditing framework that introduces Auditor Only Linkability, which allows auditors to link transaction behaviors that originate from the same anonymous entity without learning its identity. VeilAudit achieves this with a user generated Linkable Audit Tag that embeds a zero knowledge proof to attest to its validity without exposing the user master wallet address, and with a special ciphertext that only designated auditors can test for linkage. To balance privacy and compliance, VeilAudit also supports threshold gated identity revelation under due process. VeilAudit further provides a mechanism for building reputation in pseudonymous environments, which enables applications such as cross chain credit scoring based on verifiable behavioral history. We formalize the security guarantees and develop a prototype that spans multiple EVM chains. Our evaluation shows that the framework is practical for today multichain environments.

Paperid: 2183, https://arxiv.org/pdf/2510.10271.pdf

Abstract:
Unlike regular tokens derived from existing text corpora, special tokens are artificially created to annotate structured conversations during the fine-tuning process of Large Language Models (LLMs). Serving as metadata of training data, these tokens play a crucial role in instructing LLMs to generate coherent and context-aware responses. We demonstrate that special tokens can be exploited to construct four attack primitives, with which malicious users can reliably bypass the internal safety alignment of online LLM services and circumvent state-of-the-art (SOTA) external content moderation systems simultaneously. Moreover, we found that addressing this threat is challenging, as aggressive defense mechanisms-such as input sanitization by removing special tokens entirely, as suggested in academia-are less effective than anticipated. This is because such defense can be evaded when the special tokens are replaced by regular ones with high semantic similarity within the tokenizer's embedding space. We systemically evaluated our method, named MetaBreak, on both lab environment and commercial LLM platforms. Our approach achieves jailbreak rates comparable to SOTA prompt-engineering-based solutions when no content moderation is deployed. However, when there is content moderation, MetaBreak outperforms SOTA solutions PAP and GPTFuzzer by 11.6% and 34.8%, respectively. Finally, since MetaBreak employs a fundamentally different strategy from prompt engineering, the two approaches can work synergistically. Notably, empowering MetaBreak on PAP and GPTFuzzer boosts jailbreak rates by 24.3% and 20.2%, respectively.

Paperid: 2184, https://arxiv.org/pdf/2510.10140.pdf

Abstract:
Deep learning based weather forecasting (DLWF) models leverage past weather observations to generate future forecasts, supporting a wide range of downstream tasks, including tropical cyclone (TC) trajectory prediction. In this paper, we investigate their vulnerability to adversarial attacks, where subtle perturbations to the upstream weather forecasts can alter the downstream TC trajectory predictions. Although research on adversarial attacks in DLWF models has grown recently, generating perturbed upstream forecasts that reliably steer downstream output toward attacker-specified trajectories remains a challenge. First, conventional TC detection systems are opaque, non-differentiable black boxes, making standard gradient-based attacks infeasible. Second, the extreme rarity of TC events leads to severe class imbalance problem, making it difficult to develop efficient attack methods that will produce the attacker's target trajectories. Furthermore, maintaining physical consistency in adversarially generated forecasts presents another significant challenge. To overcome these limitations, we propose Cyc-Attack, a novel method that perturbs the upstream forecasts of DLWF models to generate adversarial trajectories. First, we pre-train a differentiable surrogate model to approximate the TC detector's output, enabling the construction of gradient-based attacks. Cyc-Attack also employs skewness-aware loss function with kernel dilation strategy to address the imbalance problem. Finally, a distance-based gradient weighting scheme and regularization are used to constrain the perturbations and eliminate spurious trajectories to ensure the adversarial forecasts are realistic and not easily detectable.

Paperid: 2185, https://arxiv.org/pdf/2510.09775.pdf

Abstract:
Fingerprinting Radio Frequency (RF) emitters typically involves finding unique emitter characteristics that are featured in their transmitted signals. These fingerprints are nuanced but sufficiently detailed, motivating the pursuit of methods that can successfully extract them. The most granular downstream task is known as Specific Emitter Identification (SEI), which requires a well informed RF fingerprinting (RFF) approach for it to be successful. RFF and SEI have a long history, with numerous application areas in defence and civilian contexts such as signal intelligence, electronic surveillance, physical-layer authentication of wireless communication devices, to name a few. RFF methods also support many other downstream tasks such as Emitter Data Association (EDA) and RF Emitter Clustering (RFEC) and are applicable to a range of transmission types. In recent years, data-driven approaches have become popular in the RFF domain due to their ability to automatically learn intricate fingerprints from raw data. These methods generally deliver superior performance when compared to traditional techniques. The more traditional approaches are often labour-intensive, inflexible and only applicable to a particular emitter type or transmission scheme. Therefore, we consider data-driven Machine Learning (ML)-enabled RFF. In particular, we propose a generic framework for ML-enabled RFF which is inclusive of several popular downstream tasks such as SEI, EDA and RFEC. Each task is formulated as a RF fingerprint-dependent task. A variety of use cases using real RF datasets are presented here to demonstrate the framework for a range of tasks and application areas, such as spaceborne surveillance, signal intelligence and countering drones.

Paperid: 2186, https://arxiv.org/pdf/2510.09706.pdf

Abstract:
The open radio access network (O-RAN) enables modular, intelligent, and programmable 5G network architectures through the adoption of software-defined networking, network function virtualization, and implementation of standardized open interfaces. However, one of the security concerns for O-RAN, which can severely undermine network performance, is jamming attacks. This paper presents SAJD- a self-adaptive jammer detection framework that autonomously detects jamming attacks in AI/ML framework-integrated ORAN environments without human intervention. The SAJD framework forms a closed-loop system that includes near-realtime inference of radio signal jamming via our developed ML-based xApp, as well as continuous monitoring and retraining pipelines through rApps. In this demonstration, we will show how SAJD outperforms state-of-the-art jamming detection xApp (offline trained with manual labels) in terms of accuracy and adaptability under various dynamic and previously unseen interference scenarios in the O-RAN-compliant testbed.

Paperid: 2187, https://arxiv.org/pdf/2510.09673.pdf

Abstract:
A rapidly growing number of cybersecurity threats and incidents demands that Swedish organisations increase their efforts to improve their cybersecurity capacities. This paper presents results from interviews and a prior survey with key representatives from enterprises and public sector organisations in the Swedish region of Värmland in Inner Scandinavia, examining their cybersecurity readiness and needs for education and competence development. We discuss the generalizability of our findings and the extent to which they may be specific to Sweden and Värmland, and we conclude by proposing efforts to strengthen cybersecurity competences in Inner Scandinavia.

Paperid: 2188, https://arxiv.org/pdf/2510.09663.pdf

Abstract:
Radio Frequency Fingerprinting (RFF) has evolved as an effective solution for authenticating devices by leveraging the unique imperfections in hardware components involved in the signal generation process. In this work, we propose a Convolutional Neural Network (CNN) based framework for detecting rogue devices and identifying genuine ones using softmax probability thresholding. We emulate an attack scenario in which adversaries attempt to mimic the RF characteristics of genuine devices by training a Generative Adversarial Network (GAN) using In-phase and Quadrature (IQ) samples from genuine devices. The proposed approach is verified using IQ samples collected from ten different ADALM-PLUTO Software Defined Radios (SDRs), with seven devices considered genuine, two as rogue, and one used for validation to determine the threshold.

Paperid: 2189, https://arxiv.org/pdf/2510.09655.pdf

Abstract:
We address the problem of auditing whether sensitive or copyrighted texts were used to fine-tune large language models (LLMs) under black-box access. Prior signals-verbatim regurgitation and membership inference-are unreliable at the level of individual documents or require altering the visible text. We introduce a text-preserving watermarking framework that embeds sequences of invisible Unicode characters into documents. Each watermark is split into a cue (embedded in odd chunks) and a reply (embedded in even chunks). At audit time, we submit prompts that contain only the cue; the presence of the corresponding reply in the model's output provides evidence of memorization consistent with training on the marked text. To obtain sound decisions, we compare the score of the published watermark against a held-out set of counterfactual watermarks and apply a ranking test with a provable false-positive-rate bound. The design is (i) minimally invasive (no visible text changes), (ii) scalable to many users and documents via a large watermark space and multi-watermark attribution, and (iii) robust to common passive transformations. We evaluate on open-weight LLMs and multiple text domains, analyzing regurgitation dynamics, sensitivity to training set size, and interference under multiple concurrent watermarks. Our results demonstrate reliable post-hoc provenance signals with bounded FPR under black-box access. We experimentally observe a failure rate of less than 0.1\% when detecting a reply after fine-tuning with 50 marked documents. Conversely, no spurious reply was recovered in over 18,000 challenges, corresponding to a 100\%TPR@0\% FPR. Moreover, detection rates remain relatively stable as the dataset size increases, maintaining a per-document detection rate above 45\% even when the marked collection accounts for less than 0.33\% of the fine-tuning data.

Paperid: 2190, https://arxiv.org/pdf/2510.09618.pdf

Abstract:
It is well documented that criminals use IoT devices to facilitate crimes. The review process follows a systematic approach with a clear search strategy, and study selection strategy. The review included a total of 543 articles and the findings from these articles were synthesised through thematic analysis. Identified security attacks targeting consumer IoT devices include man-in-the-middle (MiTM) attacks, synchronisation attacks, Denial-of-Service (DoS), DNS poisoning and malware, alongside device-specific vulnerabilities. Besides security attacks, this review discusses mitigations. Furthermore, the literature also covers crime threat scenarios arising from these attacks, such as, fraud, identity theft, crypto jacking and domestic abuse.

Paperid: 2191, https://arxiv.org/pdf/2510.09271.pdf

Abstract:
The advent of quantum computing threatens the security of traditional encryption algorithms, motivating the development of post-quantum cryptography (PQC). In 2024, the National Institute of Standards and Technology (NIST) standardized several PQC algorithms, marking an important milestone in the transition toward quantum-resistant security. Blockchain systems fundamentally rely on cryptographic primitives to guarantee data integrity and transaction authenticity. However, widely used algorithms such as ECDSA, employed in Bitcoin, Ethereum, and other networks, are vulnerable to quantum attacks. Although adopting PQC is essential for long-term security, its computational overhead in blockchain environments remains largely unexplored. In this work, we propose a methodology for benchmarking both PQC and traditional cryptographic algorithms in blockchain contexts. We measure signature generation and verification times across diverse computational environments and simulate their impact at scale. Our evaluation focuses on PQC digital signature schemes (ML-DSA, Dilithium, Falcon, Mayo, SLH-DSA, SPHINCS+, and Cross) across security levels 1 to 5, comparing them to ECDSA, the current standard in Bitcoin and Ethereum. Our results indicate that PQC algorithms introduce only minor performance overhead at security level 1, while in some scenarios they significantly outperform ECDSA at higher security levels. For instance, ML-DSA achieves a verification time of 0.14 ms on an ARM-based laptop at level 5, compared to 0.88 ms for ECDSA. We also provide an open-source implementation to ensure reproducibility and encourage further research.

Paperid: 2192, https://arxiv.org/pdf/2510.08432.pdf

Abstract:
"Pebble games," an abstraction from classical reversible computing, have found use in the design of quantum circuits for inherently sequential tasks. Gidney showed that allowing Hadamard basis measurements during pebble games can dramatically improve costs -- an extension termed "spooky pebble games" because the measurements leave temporary phase errors called ghosts. In this work, we define and study parallel spooky pebble games. Previous work by Blocki, Holman, and Lee (TCC 2022) and Gidney studied the benefits offered by either parallelism or spookiness individually; here we show that these resources can yield impressive gains when used together. First, we show by construction that a line graph of length $\ell$ can be pebbled in depth $2\ell$ (which is exactly optimal) using space $\leq 2.47\log \ell$. Then, to explore pebbling schemes using even less space, we use a highly optimized $A^*$ search implemented in Julia to find the lowest-depth parallel spooky pebbling possible for a range of concrete line graph lengths $\ell$ given a constant number of pebbles $s$. We show that these techniques can be applied to Regev's factoring algorithm (Journal of the ACM 2025) to significantly reduce the cost of its arithmetic. For example, we find that 4096-bit integers $N$ can be factored in multiplication depth 193, which outperforms the 680 required of previous variants of Regev and the 444 reported by Ekerå and Gärtner for Shor's algorithm (IACR Communications in Cryptology 2025). While space-optimized implementations of Shor's algorithm remain likely the best candidates for first quantum factorization of large integers, our results show that Regev's algorithm may have practical importance in the future, especially given the possibility of further optimization. Finally, we believe our pebbling techniques will find applications in quantum cryptanalysis beyond integer factorization.

Paperid: 2193, https://arxiv.org/pdf/2510.08355.pdf

Abstract:
User authentication is one of the most important aspects for secure communication between services and end-users over the Internet. Service providers leverage Single-Sign On (SSO) to make it easier for their users to authenticate themselves. However, standardized systems for SSO, such as OIDC, do not guarantee user privacy as identity providers can track user activities. We propose a zero-knowledge-based mechanism that integrates with OIDC to let users authenticate through SSO without revealing information about the service provider. Our system leverages Groth's zk-SNARK to prove membership of subscribed service providers without revealing their identity. We adopt a decentralized and verifiable approach to set up the prerequisites of our construction that further secures and establishes trust in the system. We set up high security targets and achieve them with minimal storage and latency cost, proving that our research can be adopted for production.

Paperid: 2194, https://arxiv.org/pdf/2510.07219.pdf

Abstract:
Current generative steganography research mainly pursues computationally expensive mappings to perfect Gaussian priors within single diffusion model architectures. This work introduces an efficient framework based on approximate Gaussian mapping governed by a scale factor calibrated through capacity-aware adaptive optimization. Using this framework as a unified analytical tool, systematic comparative analysis of steganography in pixel-space models versus VAE-based latent-space systems is conducted. The investigation reveals a pronounced architecture dependent security-robustness trade-off: pixel-space models achieve high security against steganalysis but exhibit fragility to channel distortions, while VAE-based systems like Stable Diffusion offer substantial robustness at the cost of security vulnerabilities. Further analysis indicates that the VAE component drives this behavior through opposing mechanisms where the encoder confers robustness via manifold regularization while the decoder introduces vulnerabilities by amplifying latent perturbations into detectable artifacts. These findings characterize the conflicting architectural roles in generative steganography and establish a foundation for future research.

Paperid: 2195, https://arxiv.org/pdf/2510.06629.pdf

Abstract:
Spiking Neural Networks (SNNs) have gained increasing attention for their superior energy efficiency compared to Artificial Neural Networks (ANNs). However, their security aspects, particularly under backdoor attacks, have received limited attention. Existing defense methods developed for ANNs perform poorly or can be easily bypassed in SNNs due to their event-driven and temporal dependencies. This paper identifies the key blockers that hinder traditional backdoor defenses in SNNs and proposes an unsupervised post-training detection framework, Temporal Membrane Potential Backdoor Detection (TMPBD), to overcome these challenges. TMPBD leverages the maximum margin statistics of temporal membrane potential (TMP) in the final spiking layer to detect target labels without any attack knowledge or data access. We further introduce a robust mitigation mechanism, Neural Dendrites Suppression Backdoor Mitigation (NDSBM), which clamps dendritic connections between early convolutional layers to suppress malicious neurons while preserving benign behaviors, guided by TMP extracted from a small, clean, unlabeled dataset. Extensive experiments on multiple neuromorphic benchmarks and state-of-the-art input-aware dynamic trigger attacks demonstrate that TMPBD achieves 100% detection accuracy, while NDSBM reduces the attack success rate from 100% to 8.44%, and to 2.81% when combined with detection, without degrading clean accuracy.

Paperid: 2196, https://arxiv.org/pdf/2510.06432.pdf

Abstract:
A central challenge in data security is not just preventing theft, but detecting whether it has occurred. Classically, this is impossible because a perfect copy leaves no evidence. Quantum mechanics, on the other hand, forbids general duplication, opening up new possibilities. We introduce Proofs of No Intrusion, which enable a classical client to remotely test whether a quantum server has been hacked and the client's data stolen. Crucially, the test does not destroy the data being tested, avoiding the need to store a backup elsewhere. We define and construct proofs of no intrusion for ciphertexts assuming fully homomorphic encryption. Additionally, we show how to equip several constructions of unclonable primitives with proofs of non-intrusion, such as unclonable decryption keys and signature tokens. Conceptually, proofs of non-intrusion can be defined for essentially any unclonable primitive. At the heart of our techniques is a new method for non-destructively testing coset states with classical communication. It can be viewed as a non-destructive proof of knowledge of a measurement result of the coset state.

Paperid: 2197, https://arxiv.org/pdf/2510.06421.pdf

Abstract:
The Precision Time Protocol (PTP), standardized as IEEE 1588, provides sub-microsecond synchronization across distributed systems and underpins critical infrastructure in telecommunications, finance, power systems, and industrial automation. While prior work has extensively analyzed PTP's vulnerability to network-based attacks, prompting the development of cryptographic protections and anomaly detectors, these defenses presume an uncompromised host. In this paper, we identify and exploit a critical blind spot in current threat models: kernel-level adversaries operating from within the host running the PTP stack. We present the first systematic study of kernel-rooted attacks on PTP, demonstrating how privileged attackers can manipulate system time by corrupting key interfaces without altering PTP network traffic. We implement three attack primitives, constant offset, progressive skew, and random jitter, using in-kernel payloads, and evaluate their impact on the widely used ptp4l and phc2sys daemons. Our experiments reveal that these attacks can silently destabilize clock synchronization, bypassing existing PTP security extensions. These findings highlight the urgent need to reconsider host-level trust assumptions and integrate kernel integrity into the design of secure time synchronization systems.

Paperid: 2198, https://arxiv.org/pdf/2510.06015.pdf

Abstract:
Mobile healthcare (mHealth) applications promise convenient, continuous patient-provider interaction but also introduce severe and often underexamined security and privacy risks. We present an end-to-end audit of 272 Android mHealth apps from Google Play, combining permission forensics, static vulnerability analysis, and user review mining. Our multi-tool assessment with MobSF, RiskInDroid, and OWASP Mobile Audit revealed systemic weaknesses: 26.1% request fine-grained location without disclosure, 18.3% initiate calls silently, and 73 send SMS without notice. Nearly half (49.3%) still use deprecated SHA-1 encryption, 42 transmit unencrypted data, and 6 remain vulnerable to StrandHogg 2.0. Analysis of 2.56 million user reviews found 28.5% negative or neutral sentiment, with over 553,000 explicitly citing privacy intrusions, data misuse, or operational instability. These findings demonstrate the urgent need for enforceable permission transparency, automated pre-market security vetting, and systematic adoption of secure-by-design practices to protect Protected Health Information (PHI).

Paperid: 2199, https://arxiv.org/pdf/2510.05848.pdf

Abstract:
We classify small binary bibraces, using the correspondence with alternating algebras over the field F2, up to dimension eight, also determining their isomorphism classes. These finite-dimensional algebras, defined by an alternating bilinear multiplication and nilpotency of class two, can be represented by subspaces of skew-symmetric matrices, with classification corresponding to GL(m, F_2)-orbits under congruence. Our approach combines theoretical invariants, such as rank sequences and the identification of primitive algebras, with computational methods implemented in Magma. These results also count the number of possible alternative operations that can be used in differential cryptanalysis.

Paperid: 2200, https://arxiv.org/pdf/2510.05807.pdf

Abstract:
Decentralized applications (dApps) in Decentralized Finance (DeFi) face a fundamental tension between regulatory compliance requirements like Know Your Customer (KYC) and maintaining decentralization and privacy. Existing permissioned DeFi solutions often fail to adequately protect private attributes of dApp users and introduce implicit trust assumptions, undermining the blockchain's decentralization. Addressing these limitations, this paper presents a novel synthesis of Self-Sovereign Identity (SSI), Zero-Knowledge Proofs (ZKPs), and Attribute-Based Access Control to enable privacy-preserving on-chain permissioning based on decentralized policy decisions. We provide a comprehensive framework for permissioned dApps that aligns decentralized trust, privacy, and transparency, harmonizing blockchain principles with regulatory compliance. Our framework supports multiple proof types (equality, range, membership, and time-dependent) with efficient proof generation through a commit-and-prove scheme that moves credential authenticity verification outside the ZKP circuit. Experimental evaluation of our KYC-compliant DeFi implementation shows considerable performance improvement for different proof types compared to baseline approaches. We advance the state-of-the-art through a holistic approach, flexible proof mechanisms addressing diverse real-world requirements, and optimized proof generation enabling practical deployment.

Paperid: 2201, https://arxiv.org/pdf/2510.05777.pdf

Abstract:
Single nucleotide polymorphism (SNP) datasets are fundamental to genetic studies but pose significant privacy risks when shared. The correlation of SNPs with each other makes strong adversarial attacks such as masked-value reconstruction, kin, and membership inference attacks possible. Existing privacy-preserving approaches either apply differential privacy to statistical summaries of these datasets or offer complex methods that require post-processing and the usage of a publicly available dataset to suppress or selectively share SNPs. In this study, we introduce an innovative framework for generating synthetic SNP sequence datasets using samples derived from time-inhomogeneous hidden Markov models (TIHMMs). To preserve the privacy of the training data, we ensure that each SNP sequence contributes only a bounded influence during training, enabling strong differential privacy guarantees. Crucially, by operating on full SNP sequences and bounding their gradient contributions, our method directly addresses the privacy risks introduced by their inherent correlations. Through experiments conducted on the real-world 1000 Genomes dataset, we demonstrate the efficacy of our method using privacy budgets of $\varepsilon \in [1, 10]$ at $δ=10^{-4}$. Notably, by allowing the transition models of the HMM to be dependent on the location in the sequence, we significantly enhance performance, enabling the synthetic datasets to closely replicate the statistical properties of non-private datasets. This framework facilitates the private sharing of genomic data while offering researchers exceptional flexibility and utility.

Paperid: 2202, https://arxiv.org/pdf/2510.05605.pdf

Abstract:
Penetration testing and vulnerability assessment are essential industry practices for safeguarding computer systems. As cyber threats grow in scale and complexity, the demand for pentesting has surged, surpassing the capacity of human professionals to meet it effectively. With advances in AI, particularly Large Language Models (LLMs), there have been attempts to automate the pentesting process. However, existing tools such as PentestGPT are still semi-manual, requiring significant professional human interaction to conduct pentests. To this end, we propose a novel LLM agent-based framework, AutoPentester, which automates the pentesting process. Given a target IP, AutoPentester automatically conducts pentesting steps using common security tools in an iterative process. It can dynamically generate attack strategies based on the tool outputs from the previous iteration, mimicking the human pentester approach. We evaluate AutoPentester using Hack The Box and custom-made VMs, comparing the results with the state-of-the-art PentestGPT. Results show that AutoPentester achieves a 27.0% better subtask completion rate and 39.5% more vulnerability coverage with fewer steps. Most importantly, it requires significantly fewer human interactions and interventions compared to PentestGPT. Furthermore, we recruit a group of security industry professional volunteers for a user survey and perform a qualitative analysis to evaluate AutoPentester against industry practices and compare it with PentestGPT. On average, AutoPentester received a score of 3.93 out of 5 based on user reviews, which was 19.8% higher than PentestGPT.

Paperid: 2203, https://arxiv.org/pdf/2510.04153.pdf

Abstract:
Diffusion Models have gained significant popularity due to their remarkable capabilities in image generation, albeit at the cost of intensive computation requirement. Meanwhile, despite their widespread deployment in inference services such as Midjourney, concerns about the potential leakage of sensitive information in uploaded user prompts have arisen. Existing solutions either lack rigorous privacy guarantees or fail to strike an effective balance between utility and efficiency. To bridge this gap, we propose ObCLIP, a plug-and-play safeguard that enables oblivious cloud-device hybrid generation. By oblivious, each input prompt is transformed into a set of semantically similar candidate prompts that differ only in sensitive attributes (e.g., gender, ethnicity). The cloud server processes all candidate prompts without knowing which one is the real one, thus preventing any prompt leakage. To mitigate server cost, only a small portion of denoising steps is performed upon the large cloud model. The intermediate latents are then sent back to the client, which selects the targeted latent and completes the remaining denoising using a small device model. Additionally, we analyze and incorporate several cache-based accelerations that leverage temporal and batch redundancy, effectively reducing computation cost with minimal utility degradation. Extensive experiments across multiple datasets demonstrate that ObCLIP provides rigorous privacy and comparable utility to cloud models with slightly increased server cost.

Paperid: 2204, https://arxiv.org/pdf/2510.03218.pdf

Abstract:
Coin-flipping is a fundamental task in two-party cryptography where two remote mistrustful parties wish to generate a shared uniformly random bit. While quantum protocols promising near-perfect security exist for weak coin-flipping -- when the parties want opposing outcomes -- it has been shown that they must be inefficient in terms of their round complexity, and it is an open question of how space efficient they can be. In this work, we consider a variant called cheat-penalised weak coin-flipping in which if a party gets caught cheating, they lose $Λ$ points (compared to $0$ in the standard definition). We find that already for a small cheating penalty, the landscape of coin-flipping changes dramatically. For example, with $Λ=0.01$, we exhibit a protocol where neither Alice nor Bob can bias the result in their favour beyond $1/2 + 10^{-8}$, which uses $24$ qubits and $10^{16}$ rounds of communication (provably $10^{7}$ times better than any weak coin-flipping protocol with matching security). For the same space requirements, we demonstrate how one can choose between lowering how much a malicious party can bias the result (down to $1/2 + 10^{-10}$) and reducing the rounds of communication (down to $25,180$), depending on what is preferred. To find these protocols, we make two technical contributions. First, we extend the point game-protocol correspondence introduced by Kitaev and Mochon, to incorporate: (i) approximate point games, (ii) the cheat-penalised setting, and (iii) round and space complexity. Second, we give the first (to the best of our knowledge) numerical algorithm for constructing (approximate) point games that correspond to high security and low complexity. Our results open up the possibility of having secure and practical quantum protocols for multiparty computation.

Paperid: 2205, https://arxiv.org/pdf/2510.02773.pdf

Abstract:
OpenID Connect has revolutionized online authentication based on single sign-on (SSO) by providing a secure and convenient method for accessing multiple services with a single set of credentials. Despite its widespread adoption, critical security bugs in OpenID Connect have resulted in significant financial losses and security breaches, highlighting the need for robust mitigation strategies. Automated program repair presents a promising solution for generating candidate patches for OpenID implementations. However, challenges such as domain-specific complexities and the necessity for precise fault localization and patch verification must be addressed. We propose AuthFix, a counterexample-guided repair engine leveraging LLMs for automated OpenID bug fixing. AuthFix integrates three key components: fault localization, patch synthesis, and patch verification. By employing a novel Petri-net-based model checker, AuthFix ensures the correctness of patches by effectively modeling interactions. Our evaluation on a dataset of OpenID bugs demonstrates that AuthFix successfully generated correct patches for 17 out of 23 bugs (74%), with a high proportion of patches semantically equivalent to developer-written fixes.

Paperid: 2206, https://arxiv.org/pdf/2510.02280.pdf

Abstract:
In this paper, we provide details on the proofs of the quantum polynomial time algorithm of Biasse and Song (SODA 16) for computing the $S$-unit group of a number field. This algorithm directly implies polynomial time methods to calculate class groups, S-class groups, relative class group and the unit group, ray class groups, solve the principal ideal problem, solve certain norm equations, and decompose ideal classes in the ideal class group. Additionally, combined with a result of Cramer, Ducas, Peikert and Regev (Eurocrypt 2016), the resolution of the principal ideal problem allows one to find short generators of a principal ideal. Likewise, methods due to Cramer, Ducas and Wesolowski (Eurocrypt 2017) use the resolution of the principal ideal problem and the decomposition of ideal classes to find so-called ``mildly short vectors'' in ideal lattices of cyclotomic fields.

Paperid: 2207, https://arxiv.org/pdf/2510.02251.pdf

Abstract:
Reproducible builds are a set of software development practices that establish an independently verifiable path from source code to binary artifacts, helping to detect and mitigate certain classes of supply chain attacks. Although quantum computing is a rapidly evolving field of research, it can already benefit from adopting reproducible builds. This paper aims to bridge the gap between the quantum computing and reproducible builds communities. We propose a generalization of the definition of reproducible builds in the quantum setting, motivated by two threat models: one targeting the confidentiality of end users' data during circuit preparation and submission to a quantum computer, and another compromising the integrity of quantum computation results. This work presents three examples that show how classical information can be hidden in transpiled quantum circuits, and two cases illustrating how even minimal modifications to these circuits can lead to incorrect quantum computation results. Our work provides initial steps towards a framework for reproducibility in quantum software toolchains.

Paperid: 2208, https://arxiv.org/pdf/2510.01393.pdf

Abstract:
In this paper we show E-FuzzEdge, a novel fuzzing architecture targeted towards improving the throughput of fuzzing campaigns in contexts where scalability is unavailable. E-FuzzEdge addresses the inefficiencies of hardware-in-the-loop fuzzing for microcontrollers by optimizing execution speed. We evaluated our system against state-of-the-art benchmarks, demonstrating significant performance improvements. A key advantage of E-FuzzEdgearchitecture is its compatibility with other embedded fuzzing techniques that perform on device testing instead of firmware emulation. This means that the broader embedded fuzzing community can integrate E-FuzzEdge into their workflows to enhance overall testing efficiency.

Paperid: 2209, https://arxiv.org/pdf/2510.00350.pdf

Abstract:
We conduct the first comprehensive security analysis of Tile, the second most popular crowd-sourced location-tracking service behind Apple's AirTags. We identify several exploitable vulnerabilities and design flaws, disproving many of the platform's claimed security and privacy guarantees: Tile's servers can persistently learn the location of all users and tags, unprivileged adversaries can track users through Bluetooth advertisements emitted by Tile's devices, and Tile's anti-theft mode is easily subverted. Despite its wide deployment -- millions of users, devices, and purpose-built hardware tags -- Tile provides no formal description of its protocol or threat model. Worse, Tile intentionally weakens its antistalking features to support an antitheft use-case and relies on a novel "accountability" mechanism to punish those abusing the system to stalk victims. We examine Tile's accountability mechanism, a unique feature of independent interest; no other provider attempts to guarantee accountability. While an ideal accountability mechanism may disincentivize abuse in crowd-sourced location tracking protocols, we show that Tile's implementation is subvertible and introduces new exploitable vulnerabilities. We conclude with a discussion on the need for new, formal definitions of accountability in this setting.

Paperid: 2210, https://arxiv.org/pdf/2509.24807.pdf

Abstract:
Keystroke dynamics is a promising modality for active user authentication, but its effectiveness under varying LLM-assisted typing and cognitive conditions remains understudied. Using data from 50 users and cognitive labels from Bloom's Taxonomy, we evaluate keystroke-based authentication in Korean across three realistic typing scenarios: bona fide composition, LLM content paraphrasing, and transcription. Our pipeline incorporates continuity-aware segmentation, feature extraction, and classification via SVM, MLP, and XGB. Results show that the system maintains reliable performance across varying LLM usages and cognitive contexts, with Equal Error Rates ranging from 5.1% to 10.4%. These findings demonstrate the feasibility of behavioral authentication under modern writing conditions and offer insights into designing more context-resilient models.

Paperid: 2211, https://arxiv.org/pdf/2509.24515.pdf

Abstract:
While LLM-based specification generation is gaining traction, existing tools primarily focus on mainstream programming languages like C, Java, and even Solidity, leaving emerging and yet verification-oriented languages like Move underexplored. In this paper, we introduce MSG, an automated specification generation tool designed for Move smart contracts. MSG aims to highlight key insights that uniquely present when applying LLM-based specification generation to a new ecosystem. Specifically, MSG demonstrates that LLMs exhibit robust code comprehension and generation capabilities even for non-mainstream languages. MSG successfully generates verifiable specifications for 84% of tested Move functions and even identifies clauses previously overlooked by experts. Additionally, MSG shows that explicitly leveraging specification language features through an agentic, modular design improves specification quality substantially (generating 57% more verifiable clauses than conventional designs). Incorporating feedback from the verification toolchain further enhances the effectiveness of MSG, leading to a 30% increase in generated verifiable specifications.

Paperid: 2212, https://arxiv.org/pdf/2509.24408.pdf

Abstract:
Autonomous driving systems increasingly rely on multi-agent architectures powered by large language models (LLMs), where specialized agents collaborate to perceive, reason, and plan. A key component of these systems is the shared function library, a collection of software tools that agents use to process sensor data and navigate complex driving environments. Despite its critical role in agent decision-making, the function library remains an under-explored vulnerability. In this paper, we introduce FuncPoison, a novel poisoning-based attack targeting the function library to manipulate the behavior of LLM-driven multi-agent autonomous systems. FuncPoison exploits two key weaknesses in how agents access the function library: (1) agents rely on text-based instructions to select tools; and (2) these tools are activated using standardized command formats that attackers can replicate. By injecting malicious tools with deceptive instructions, FuncPoison manipulates one agent s decisions--such as misinterpreting road conditions--triggering cascading errors that mislead other agents in the system. We experimentally evaluate FuncPoison on two representative multi-agent autonomous driving systems, demonstrating its ability to significantly degrade trajectory accuracy, flexibly target specific agents to induce coordinated misbehavior, and evade diverse defense mechanisms. Our results reveal that the function library, often considered a simple toolset, can serve as a critical attack surface in LLM-based autonomous driving systems, raising elevated concerns on their reliability.

Paperid: 2213, https://arxiv.org/pdf/2509.23621.pdf

Abstract:
Automated machine learning (AutoML) has emerged as a promising paradigm for automating machine learning (ML) pipeline design, broadening AI adoption. Yet its reliability in complex domains such as cybersecurity remains underexplored. This paper systematically evaluates eight open-source AutoML frameworks across 11 publicly available cybersecurity datasets, spanning intrusion detection, malware classification, phishing, fraud detection, and spam filtering. Results show substantial performance variability across tools and datasets, with no single solution consistently superior. A paradigm shift is observed: the challenge has moved from selecting individual ML models to identifying the most suitable AutoML framework, complicated by differences in runtime efficiency, automation capabilities, and supported features. AutoML tools frequently favor tree-based models, which perform well but risk overfitting and limit interpretability. Key challenges identified include adversarial vulnerability, model drift, and inadequate feature engineering. We conclude with best practices and research directions to strengthen robustness, interpretability, and trust in AutoML for high-stakes cybersecurity applications.

Paperid: 2214, https://arxiv.org/pdf/2509.23418.pdf

Abstract:
YouTube has emerged as a dominant platform for both information dissemination and entertainment. However, its vast accessibility has also made it a target for scammers, who frequently upload deceptive or malicious content. Prior research has documented a range of scam types, and detection approaches rely primarily on textual or statistical metadata. Although effective to some extent, these signals are easy to evade and potentially overlook other modalities, such as visual cues. In this study, we present the first systematic investigation of multimodal approaches for YouTube scam detection. Our dataset consolidates established scam categories and augments them with full length video content and policy grounded reasoning annotations. Our experimental evaluation demonstrates that a text-only model using video titles and descriptions (fine-tuned BERT) achieves moderate effectiveness (76.61% F1), with modest improvements when incorporating audio transcripts (77.98% F1). In contrast, visual analysis using a fine-tuned LLaVA-Video model yields stronger results (79.61% F1). Finally, a multimodal framework that integrates titles, descriptions, and video frames achieves the highest performance (80.53% F1). Beyond improving detection accuracy, our multimodal framework produces interpretable reasoning grounded in YouTube content policies, thereby enhancing transparency and supporting potential applications in automated moderation. Moreover, we validate our approach on in-the-wild YouTube data by analyzing 6,374 videos, thereby contributing a valuable resource for future research on scam detection.

Paperid: 2215, https://arxiv.org/pdf/2509.22762.pdf

Abstract:
Modern IoT and embedded platforms must start execution from a known trusted state to thwart malware, ensure secure firmware updates, and protect critical infrastructure. Current approaches to establish a root of trust depend on secret keys and/or specialized secure hardware, which drives up costs, may involve third parties, adds operational complexity, and relies on assumptions about an attacker's computational power. In contrast, TRUSTCHECKPOINTS is the first system to establish an unconditional software root of trust based on a formal model without relying on secrets or trusted hardware. Developers capture a full-system checkpoint and later roll back to it and prove this to an external verifier. The verifier issues timing-constrained, randomized k-independent polynomial challenges (via Horner's rule) that repeatedly scan the fast on-chip memory in randomized passes. When malicious code attempts to persist, it must swap into slower, unchecked off-chip storage, causing a detectable timing delay. Our prototype for a commodity ARM Cortex-A53-based platform validates 192 KB of SRAM in approximately 10 s using 500 passes, sufficient to detect single-instruction persistent malware. The prototype then seamlessly extends trust to DRAM. Two modes (fast SRAM-bootstrap and comprehensive full-memory scan) allow trade-offs between speed and coverage, demonstrating reliable malware detection on unmodified hardware.

Paperid: 2216, https://arxiv.org/pdf/2509.22082.pdf

Abstract:
Federated Learning (FL) preserves privacy by keeping raw data local, yet Gradient Inversion Attacks (GIAs) pose significant threats. In FedAVG multi-step scenarios, attackers observe only aggregated gradients, making data reconstruction challenging. Existing surrogate model methods like SME assume linear parameter trajectories, but we demonstrate this severely underestimates SGD's nonlinear complexity, fundamentally limiting attack effectiveness. We propose Non-Linear Surrogate Model Extension (NL-SME), the first method to introduce nonlinear parametric trajectory modeling for GIAs. Our approach replaces linear interpolation with learnable quadratic BÃ©zier curves that capture SGD's curved characteristics through control points, combined with regularization and dvec scaling mechanisms for enhanced expressiveness. Extensive experiments on CIFAR-100 and FEMNIST datasets show NL-SME significantly outperforms baselines across all metrics, achieving order-of-magnitude improvements in cosine similarity loss while maintaining computational efficiency.This work exposes heightened privacy vulnerabilities in FL's multi-step update paradigm and offers novel perspectives for developing robust defense strategies.

Paperid: 2217, https://arxiv.org/pdf/2509.21843.pdf

Abstract:
Model integrity of Large language models (LLMs) has become a pressing security concern with their massive online deployment. Prior Bit-Flip Attacks (BFAs) -- a class of popular AI weight memory fault-injection techniques -- can severely compromise Deep Neural Networks (DNNs): as few as tens of bit flips can degrade accuracy toward random guessing. Recent studies extend BFAs to LLMs and reveal that, despite the intuition of better robustness from modularity and redundancy, only a handful of adversarial bit flips can also cause LLMs' catastrophic accuracy degradation. However, existing BFA methods typically focus on either integer or floating-point models separately, limiting attack flexibility. Moreover, in floating-point models, random bit flips often cause perturbed parameters to extreme values (e.g., flipping in exponent bit), making it not stealthy and leading to numerical runtime error (e.g., invalid tensor values (NaN/Inf)). In this work, for the first time, we propose SBFA (Sneaky Bit-Flip Attack), which collapses LLM performance with only one single bit flip while keeping perturbed values within benign layer-wise weight distribution. It is achieved through iterative searching and ranking through our defined parameter sensitivity metric, ImpactScore, which combines gradient sensitivity and perturbation range constrained by the benign layer-wise weight distribution. A novel lightweight SKIP searching algorithm is also proposed to greatly reduce searching complexity, which leads to successful SBFA searching taking only tens of minutes for SOTA LLMs. Across Qwen, LLaMA, and Gemma models, with only one single bit flip, SBFA successfully degrades accuracy to below random levels on MMLU and SST-2 in both BF16 and INT8 data formats. Remarkably, flipping a single bit out of billions of parameters reveals a severe security concern of SOTA LLM models.

Paperid: 2218, https://arxiv.org/pdf/2509.21831.pdf

Abstract:
The rise of Web3 and Decentralized Finance (DeFi) has enabled borderless access to financial services empowered by smart contracts and blockchain technology. However, the ecosystem's trustless, permissionless, and borderless nature presents substantial regulatory challenges. The absence of centralized oversight and the technical complexity create fertile ground for financial crimes. Among these, money laundering is particularly concerning, as in the event of successful scams, code exploits, and market manipulations, it facilitates covert movement of illicit gains. Beyond this, there is a growing concern that cryptocurrencies can be leveraged to launder proceeds from drug trafficking, or to transfer funds linked to terrorism financing. This survey aims to outline a taxonomy of high-level strategies and underlying mechanisms exploited to facilitate money laundering in Web3. We examine how criminals leverage the pseudonymous nature of Web3, alongside weak regulatory frameworks, to obscure illicit financial activities. Our study seeks to bridge existing knowledge gaps on laundering schemes, identify open challenges in the detection and prevention of such activities, and propose future research directions to foster a more transparent Web3 financial ecosystem -- offering valuable insights for researchers, policymakers, and industry practitioners.

Paperid: 2219, https://arxiv.org/pdf/2509.21590.pdf

Abstract:
Google Chrome is the most popular Web browser. Users can customize it with extensions that enhance their browsing experience. The most well-known marketplace of such extensions is the Chrome Web Store (CWS). Developers can upload their extensions on the CWS, but such extensions are made available to users only after a vetting process carried out by Google itself. Unfortunately, some malicious extensions bypass such checks, putting the security and privacy of downstream browser extension users at risk. Here, we scrutinize the extent to which automated mechanisms reliant on supervised machine learning (ML) can be used to detect malicious extensions on the CWS. To this end, we first collect 7,140 malicious extensions published in 2017--2023. We combine this dataset with 63,598 benign extensions published or updated on the CWS before 2023, and we develop three supervised-ML-based classifiers. We show that, in a "lab setting", our classifiers work well (e.g., 98% accuracy). Then, we collect a more recent set of 35,462 extensions from the CWS, published or last updated in 2023, with unknown ground truth. We were eventually able to identify 68 malicious extensions that bypassed the vetting process of the CWS. However, our classifiers also reported >1k likely malicious extensions. Based on this finding (further supported with empirical evidence), we elucidate, for the first time, a strong concept drift effect on browser extensions. We also show that commercial detectors (e.g., VirusTotal) work poorly to detect known malicious extensions. Altogether, our results highlight that detecting malicious browser extensions is a fundamentally hard problem. This requires additional work both by the research community and by Google itself -- potentially by revising their approaches. In the meantime, we informed Google of our discoveries, and we release our artifacts.

Paperid: 2220, https://arxiv.org/pdf/2509.21147.pdf

Abstract:
Federated Learning (FL) facilitates collaborative model training while keeping raw data decentralized, making it a conduit for leveraging the power of IoT devices while maintaining privacy of the locally collected data. However, existing privacy- preserving techniques present notable hurdles. Methods such as Multi-Party Computation (MPC), Homomorphic Encryption (HE), and Differential Privacy (DP) often incur high compu- tational costs and suffer from limited scalability. This survey examines emerging approaches that hold promise for enhancing both privacy and efficiency in FL, including Trusted Execution Environments (TEEs), Physical Unclonable Functions (PUFs), Quantum Computing (QC), Chaos-Based Encryption (CBE), Neuromorphic Computing (NC), and Swarm Intelligence (SI). For each paradigm, we assess its relevance to the FL pipeline, outlining its strengths, limitations, and practical considerations. We conclude by highlighting open challenges and prospective research avenues, offering a detailed roadmap for advancing secure and scalable FL systems.

Paperid: 2221, https://arxiv.org/pdf/2509.20796.pdf

Abstract:
Efficient and secure revocable attribute-based encryption (RABE) is vital for ensuring flexible and fine-grained access control and data sharing in cloud storage and outsourced data environments within the Internet of Things (IoT). However, current RABE schemes often struggle to achieve an optimal balance between efficiency, security, dynamic scalability, and other important features, which hampers their practical application. To overcome these limitations, we propose a fast RABE scheme with data integrity for IoT that achieves adaptive security with multiple challenge ciphertexts. Our scheme supports the revocation of authorized users and transfers the computationally heavy revocation processes to the cloud, thereby easing the computational burden on IoT devices. Moreover, it consistently guarantees the integrity and correctness of data. We have demonstrated its adaptive security within the defined security model with multiple challenge ciphertexts and optimized its performance. Experimental results indicate that our scheme provides better performance than existing solutions. Under the same access policy, our scheme reduces computational consumption by 7 to 9 times compared to previous schemes.

Paperid: 2222, https://arxiv.org/pdf/2509.20686.pdf

Abstract:
Fully Homomorphic Encryption (FHE) represents a paradigm shift in cryptography, enabling computation directly on encrypted data and unlocking privacy-critical computation. Despite being increasingly deployed in real platforms, the reliability aspects of FHE systems, especially how they respond to faults, have been mostly neglected. This paper aims to better understand of how FHE computation behaves in the presence of memory faults, both in terms of individual operations as well as at the level of applications, for different FHE schemes. Finally, we investigate how effective traditional and FHE-specific fault mitigation techniques are.

Paperid: 2223, https://arxiv.org/pdf/2509.20405.pdf

Abstract:
Speech deepfake detectors are often evaluated on clean, benchmark-style conditions, but deployment occurs in an open world of shifting devices, sampling rates, codecs, environments, and attack families. This creates a ``coverage debt" for AI-based detectors: every new condition multiplies with existing ones, producing data blind spots that grow faster than data can be collected. Because attackers can target these uncovered regions, worst-case performance (not average benchmark scores) determines security. To demonstrate the impact of the coverage debt problem, we analyze results from a recent cross-testing framework. Grouping performance by bona fide domain and spoof release year, two patterns emerge: newer synthesizers erase the legacy artifacts detectors rely on, and conversational speech domains (teleconferencing, interviews, social media) are consistently the hardest to secure. These findings show that detection alone should not be relied upon for high-stakes decisions. Detectors should be treated as auxiliary signals within layered defenses that include provenance, personhood credentials, and policy safeguards.

Paperid: 2224, https://arxiv.org/pdf/2509.20394.pdf

Abstract:
This paper introduces the Hazard-Aware System Card (HASC), a novel framework designed to enhance transparency and accountability in the development and deployment of AI systems. The HASC builds upon existing model card and system card concepts by integrating a comprehensive, dynamic record of an AI system's security and safety posture. The framework proposes a standardized system of identifiers, including a novel AI Safety Hazard (ASH) ID, to complement existing security identifiers like CVEs, allowing for clear and consistent communication of fixed flaws. By providing a single, accessible source of truth, the HASC empowers developers and stakeholders to make more informed decisions about AI system safety throughout its lifecycle. Ultimately, we also compare our proposed AI system cards with the ISO/IEC 42001:2023 standard and discuss how they can be used to complement each other, providing greater transparency and accountability for AI systems.

Paperid: 2225, https://arxiv.org/pdf/2509.20283.pdf

Abstract:
Auditing differential privacy has emerged as an important area of research that supports the design of privacy-preserving mechanisms. Privacy audits help to obtain empirical estimates of the privacy parameter, to expose flawed implementations of algorithms and to compare practical with theoretical privacy guarantees. In this work, we investigate an unexplored facet of privacy auditing: the sustained auditing of a mechanism that can go through changes during its development or deployment. Monitoring the privacy of algorithms over time comes with specific challenges. Running state-of-the-art (static) auditors repeatedly requires excessive sampling efforts, while the reliability of such methods deteriorates over time without proper adjustments. To overcome these obstacles, we present a new monitoring procedure that extracts information from the entire deployment history of the algorithm. This allows us to reduce sampling efforts, while sustaining reliable outcomes of our auditor. We derive formal guarantees with regard to the soundness of our methods and evaluate their performance for important mechanisms from the literature. Our theoretical findings and experiments demonstrate the efficacy of our approach.

Paperid: 2226, https://arxiv.org/pdf/2509.20190.pdf

Abstract:
In modern automotive development, security testing is critical for safeguarding systems against increasingly advanced threats. Attack trees are widely used to systematically represent potential attack vectors, but generating comprehensive test cases from these trees remains a labor-intensive, error-prone task that has seen limited automation in the context of testing vehicular systems. This paper introduces STAF (Security Test Automation Framework), a novel approach to automating security test case generation. Leveraging Large Language Models (LLMs) and a four-step self-corrective Retrieval-Augmented Generation (RAG) framework, STAF automates the generation of executable security test cases from attack trees, providing an end-to-end solution that encompasses the entire attack surface. We particularly show the elements and processes needed to provide an LLM to actually produce sensible and executable automotive security test suites, along with the integration with an automated testing framework. We further compare our tailored approach with general purpose (vanilla) LLMs and the performance of different LLMs (namely GPT-4.1 and DeepSeek) using our approach. We also demonstrate the method of our operation step-by-step in a concrete case study. Our results show significant improvements in efficiency, accuracy, scalability, and easy integration in any workflow, marking a substantial advancement in automating automotive security testing methodologies. Using TARAs as an input for verfication tests, we create synergies by connecting two vital elements of a secure automotive development process.

Paperid: 2227, https://arxiv.org/pdf/2509.19533.pdf

Abstract:
Security vulnerabilities in Internet-of-Things devices, mobile platforms, and autonomous systems remain critical. Traditional mutation-based fuzzers -- while effectively explore code paths -- primarily perform byte- or bit-level edits without semantic reasoning. Coverage-guided tools such as AFL++ use dictionaries, grammars, and splicing heuristics to impose shallow structural constraints, leaving deeper protocol logic, inter-field dependencies, and domain-specific semantics unaddressed. Conversely, reasoning-capable large language models (LLMs) can leverage pretraining knowledge to understand input formats, respect complex constraints, and propose targeted mutations, much like an experienced reverse engineer or testing expert. However, lacking ground truth for "correct" mutation reasoning makes supervised fine-tuning impractical, motivating explorations of off-the-shelf LLMs via prompt-based few-shot learning. To bridge this gap, we present an open-source microservices framework that integrates reasoning LLMs with AFL++ on Google's FuzzBench, tackling asynchronous execution and divergent hardware demands (GPU- vs. CPU-intensive) of LLMs and fuzzers. We evaluate four research questions: (R1) How can reasoning LLMs be integrated into the fuzzing mutation loop? (R2) Do few-shot prompts yield higher-quality mutations than zero-shot? (R3) Can prompt engineering with off-the-shelf models improve fuzzing directly? and (R4) Which open-source reasoning LLMs perform best under prompt-only conditions? Experiments with Llama3.3, Deepseek-r1-Distill-Llama-70B, QwQ-32B, and Gemma3 highlight Deepseek as the most promising. Mutation effectiveness depends more on prompt complexity and model choice than shot count. Response latency and throughput bottlenecks remain key obstacles, offering directions for future work.

Paperid: 2228, https://arxiv.org/pdf/2509.19485.pdf

Abstract:
With the rapid growth of smart home IoT devices, users are increasingly exposed to various security risks, as evident from recent studies. While seeking answers to know more on those security concerns, users are mostly left with their own discretion while going through various sources, such as online blogs and technical manuals, which may render higher complexity to regular users trying to extract the necessary information. This requirement does not go along with the common mindsets of smart home users and hence threatens the security of smart homes furthermore. In this paper, we aim to identify and address the major user-level security concerns in smart homes. Specifically, we develop a novel dataset of Q&A from public forums, capturing practical security challenges faced by smart home users. We extract major security concerns in smart homes from our dataset by leveraging the Latent Dirichlet Allocation (LDA). We fine-tune relatively "smaller" transformer models, such as T5 and Flan-T5, on this dataset to build a QA system tailored for smart home security. Unlike larger models like GPT and Gemini, which are powerful but often resource hungry and require data sharing, smaller models are more feasible for deployment in resource-constrained or privacy-sensitive environments like smart homes. The dataset is manually curated and supplemented with synthetic data to explore its potential impact on model performance. This approach significantly improves the system's ability to deliver accurate and relevant answers, helping users address common security concerns with smart home IoT devices. Our experiments on real-world user concerns show that our work improves the performance of the base models.

Paperid: 2229, https://arxiv.org/pdf/2509.18366.pdf

Abstract:
The central security issue of outsourced 3D printing (aka AM: Additive Manufacturing), an industry that is expected to dominate manufacturing, is the protection of the digital design (containing the designers' model, which is their intellectual property) shared with the manufacturer. Here, we show, for the first time, that side-channel attacks are, in fact, a concrete serious threat to existing industrial grade 3D printers, enabling the reconstruction of the model printed (regardless of employing ways to directly conceal the design, e.g. by encrypting it in transit and before loading it into the printer). Previously, such attacks were demonstrated only on fairly simple FDM desktop 3D printers, which play a negligible role in manufacturing of valuable designs. We focus on the Powder Bed Fusion (PBF) AM process, which is popular for manufacturing net-shaped parts with both polymers and metals. We demonstrate how its individual actuators can be instrumented for the collection of power side-channel information during the printing process. We then present our approach to reconstruct the 3D printed model solely from the collected power side-channel data. Further, inspired by Differential Power Analysis, we developed a method to improve the quality of the reconstruction based on multiple traces. We tested our approach on two design models with different degrees of complexity. For different models, we achieved as high as 90.29~\% of True Positives and as low as 7.02~\% and 9.71~\% of False Positives and False Negatives by voxel-based volumetric comparison between reconstructed and original designs. The lesson learned from our attack is that the security of design files cannot solely rely on protecting the files themselves in an industrial environment, but must instead also rely on assuring no leakage of power, noise and similar signals to potential eavesdroppers in the printer's vicinity.

Paperid: 2230, https://arxiv.org/pdf/2509.17836.pdf

Abstract:
Machine Learning (ML) techniques have shown strong potential for network traffic analysis; however, their effectiveness depends on access to representative, up-to-date datasets, which is limited in cybersecurity due to privacy and data-sharing restrictions. To address this challenge, Federated Learning (FL) has recently emerged as a novel paradigm that enables collaborative training of ML models across multiple clients while ensuring that sensitive data remains local. Nevertheless, Federated Averaging (FedAvg), the canonical FL algorithm, has proven poor convergence in heterogeneous environments where data distributions are non-independent and identically distributed (i.i.d.) and client datasets are unbalanced, conditions frequently observed in cybersecurity contexts. To overcome these challenges, several alternative FL strategies have been developed, yet their applicability to network intrusion detection remains insufficiently explored. This study systematically reviews and evaluates a range of FL methods in the context of intrusion detection for DDoS attacks. Using a dataset of network attacks within a Kubernetes-based testbed, we assess convergence efficiency, computational overhead, bandwidth consumption, and model accuracy. To the best of our knowledge, this is the first comparative analysis of FL algorithms for intrusion detection under realistic non-i.i.d. and unbalanced settings, providing new insights for the design of robust, privacypreserving network security solutions.

Paperid: 2231, https://arxiv.org/pdf/2509.17595.pdf

Abstract:
This paper introduces mathematical optimization as a new method for proving impossibility results in the field of card-based cryptography. While previous impossibility proofs were often limited to cases involving a small number of cards, this new approach establishes results that hold for a large number of cards. The research focuses on single-cut full-open (SCFO) protocols, which consist of performing one random cut and then revealing all cards. The main contribution is that for any three-variable Boolean function, no new SCFO protocols exist beyond those already known, under the condition that all additional cards have the same color. The significance of this work is that it provides a new framework for proving impossibility results and delivers a proof that is valid for any number of cards, as long as all additional cards have the same color.

Paperid: 2232, https://arxiv.org/pdf/2509.17371.pdf

Abstract:
The rapid adoption of large language models (LLMs) in critical domains has spurred extensive research into their security issues. While input manipulation attacks (e.g., prompt injection) have been well studied, Bit-Flip Attacks (BFAs) -- which exploit hardware vulnerabilities to corrupt model parameters and cause severe performance degradation -- have received far less attention. Existing BFA methods suffer from key limitations: they fail to balance performance degradation and output naturalness, making them prone to discovery. In this paper, we introduce SilentStriker, the first stealthy bit-flip attack against LLMs that effectively degrades task performance while maintaining output naturalness. Our core contribution lies in addressing the challenge of designing effective loss functions for LLMs with variable output length and the vast output space. Unlike prior approaches that rely on output perplexity for attack loss formulation, which inevitably degrade output naturalness, we reformulate the attack objective by leveraging key output tokens as targets for suppression, enabling effective joint optimization of attack effectiveness and stealthiness. Additionally, we employ an iterative, progressive search strategy to maximize attack efficacy. Experiments show that SilentStriker significantly outperforms existing baselines, achieving successful attacks without compromising the naturalness of generated text.

Paperid: 2233, https://arxiv.org/pdf/2509.17263.pdf

Abstract:
In an increasingly interconnected world, cybersecurity professionals play a pivotal role in safeguarding organizations from cyber threats. To secure their cyberspace, organizations are forced to adopt a cybersecurity framework such as the NIST National Initiative for Cybersecurity Education Workforce Framework for Cybersecurity (NICE Framework). Although these frameworks are a good starting point for businesses and offer critical information to identify, prevent, and respond to cyber incidents, they can be difficult to navigate and implement, particularly for small-medium businesses (SMB). To help overcome this issue, this paper identifies the most frequent attack vectors to SMBs (Objective 1) and proposes a practical model of both technical and non-technical tasks, knowledge, skills, abilities (TKSA) from the NICE Framework for those attacks (Objective 2). The research develops a scenario-based curriculum. By immersing learners in realistic cyber threat scenarios, their practical understanding and preparedness in responding to cybersecurity incidents is enhanced (Objective 3). Finally, this work integrates practical experience and real-life skill development into the curriculum (Objective 4). SMBs can use the model as a guide to evaluate, equip their existing workforce, or assist in hiring new employees. In addition, educational institutions can use the model to develop scenario-based learning modules to adequately equip the emerging cybersecurity workforce for SMBs. Trainees will have the opportunity to practice both technical and legal issues in a simulated environment, thereby strengthening their ability to identify, mitigate, and respond to cyber threats effectively.

Paperid: 2234, https://arxiv.org/pdf/2509.17185.pdf

Abstract:
The long-term success of cryptocurrencies largely depends on the incentive compatibility provided to the validators. Bribery attacks, facilitated trustlessly via smart contracts, threaten this foundation. This work introduces, implements, and evaluates three novel and efficient bribery contracts targeting Ethereum validators. The first bribery contract enables a briber to fork the blockchain by buying votes on their proposed blocks. The second contract incentivizes validators to voluntarily exit the consensus protocol, thus increasing the adversary's relative staking power. The third contract builds a trustless bribery market that enables the briber to auction off their manipulative power over the RANDAO, Ethereum's distributed randomness beacon. Finally, we provide an initial game-theoretical analysis of one of the described bribery markets.

Paperid: 2235, https://arxiv.org/pdf/2509.16987.pdf

Abstract:
Traditional air gaps in industrial systems are disappearing as IT technologies permeate the OT domain, accelerating the integration of wireless solutions like Wi-Fi. Next-generation Wi-Fi standards (IEEE 802.11ax/be) meet performance demands for industrial use cases, yet their introduction raises significant security concerns. A critical knowledge gap exists regarding the empirical prevalence and security configuration of Wi-Fi in real-world industrial settings. This work addresses this by mining the global crowdsourced WiGLE database to provide a data-driven understanding. We create the first publicly available dataset of 1,087 high-confidence industrial Wi-Fi networks, examining key attributes such as SSID patterns, encryption methods, vendor types, and global distribution. Our findings reveal a growing adoption of Wi-Fi across industrial sectors but underscore alarming security deficiencies, including the continued use of weak or outdated security configurations that directly expose critical infrastructure. This research serves as a pivotal reference point, offering both a unique dataset and practical insights to guide future investigations into wireless security within industrial environments.

Paperid: 2236, https://arxiv.org/pdf/2509.16655.pdf

Abstract:
Bug bounty programs have contributed significantly to security in technology firms in the last decade, but little is known about the role of reward incentives in producing useful outcomes. We analyze incentives and outcomes in Google's Vulnerability Rewards Program (VRP), one of the world's largest bug bounty programs. We analyze the responsiveness of the quality and quantity of bugs received to changes in payments, focusing on a change in Google's reward amounts posted in July, 2024, in which reward amounts increased by up to 200% for the highest impact tier. Our empirical results show an increase in the volume of high-value bugs received after the reward increase, for which we also compute elasticities. We further break down the sources of this increase between veteran researchers and new researchers, showing that the reward increase both redirected the attention of veteran researchers and attracted new top security researchers into the program.

Paperid: 2237, https://arxiv.org/pdf/2509.15547.pdf

Abstract:
This paper investigates physical-layer key generation (PLKG) in multi-antenna base station systems, by leveraging a fluid antenna system (FAS) to dynamically customize radio environments. Without requiring additional nodes or extensive radio frequency chains, the FAS effectively enables adaptive antenna port selection by exploiting channel spatial correlation to enhance the key generation rate (KGR) at legitimate nodes. To comprehensively evaluate the efficiency of the FAS in PLKG, we propose an FAS-assisted PLKG model that integrates transmit beamforming and sparse port selection under independent and identically distributed and spatially correlated channel models, respectively. Specifically, the PLKG utilizes reciprocal channel probing to derive a closed-form KGR expression based on the mutual information between legitimate channel estimates. Nonconvex optimization problems for these scenarios are formulated to maximize the KGR subject to transmit power constraints and sparse port activation. We propose an iterative algorithm by capitalizing on successive convex approximation and Cauchy-Schwarz inequality to obtain a locally optimal solution. A reweighted $\ell_1$-norm-based algorithm is applied to advocate for the sparse port activation of FAS-assisted PLKG. Furthermore, a low-complexity sliding window-based port selection is proposed to substitute reweighted $\ell_1$-norm method based on Rayleigh-quotient analysis. Simulation results demonstrate that the FAS-PLKG scheme significantly outperforms the FA-PLKG scheme in both independent and spatially correlated environments. The sliding window-based port selection method introduced in this paper has been shown to yield superior KGR, compared to the reweighted $\ell_1$-norm method. It is shown that the FAS achieves higher KGR with fewer RF chains through dynamic sparse port selection.

Paperid: 2238, https://arxiv.org/pdf/2509.15433.pdf

Abstract:
This report examines the synergy between Large Language Models (LLMs) and Static Application Security Testing (SAST) to improve vulnerability discovery. Traditional SAST tools, while effective for proactive security, are limited by high false-positive rates and a lack of contextual understanding. Conversely, LLMs excel at code analysis and pattern recognition but can be prone to inconsistencies and hallucinations. By integrating these two technologies, a more intelligent and efficient system is created. This combination moves beyond mere vulnerability detection optimization, transforming security into a deeply integrated, contextual process that provides tangible benefits like improved triage, dynamic bug descriptions, bug validation via exploit generation and enhanced analysis of complex codebases. The result is a more effective security approach that leverages the strengths of both technologies while mitigating their weaknesses. SAST-Genius reduced false positives by about 91 % (225 to 20) compared to Semgrep alone.

Paperid: 2239, https://arxiv.org/pdf/2509.14789.pdf

Abstract:
Replay speech attacks pose a significant threat to voice-controlled systems, especially in smart environments where voice assistants are widely deployed. While multi-channel audio offers spatial cues that can enhance replay detection robustness, existing datasets and methods predominantly rely on single-channel recordings. In this work, we introduce an acoustic simulation framework designed to simulate multi-channel replay speech configurations using publicly available resources. Our setup models both genuine and spoofed speech across varied environments, including realistic microphone and loudspeaker impulse responses, room acoustics, and noise conditions. The framework employs measured loudspeaker directionalities during the replay attack to improve the realism of the simulation. We define two spoofing settings, which simulate whether a reverberant or an anechoic speech is used in the replay scenario, and evaluate the impact of omnidirectional and diffuse noise on detection performance. Using the state-of-the-art M-ALRAD model for replay speech detection, we demonstrate that synthetic data can support the generalization capabilities of the detector across unseen enclosures.

Paperid: 2240, https://arxiv.org/pdf/2509.13509.pdf

Abstract:
Differential privacy (DP) -- a principled approach to producing statistical data products with strong, mathematically provable privacy guarantees for the individuals in the underlying dataset -- has seen substantial adoption in practice over the past decade. Applying DP requires making several implementation decisions, each with significant impacts on data privacy and/or utility. Hence, to promote shared learning and accountability around DP deployments, Dwork, Kohli, and Mulligan (2019) proposed a public-facing repository ("registry") of DP deployments. The DP community has recently started to work toward realizing this vision. We contribute to this effort by (1) developing a holistic, hierarchical schema to describe any given DP deployment and (2) designing and implementing an interactive interface to act as a registry where practitioners can access information about past DP deployments. We (3) populate our interface with 21 real-world DP deployments and (4) conduct an exploratory user study with DP practitioners ($n=16$) to understand how they would use the registry, as well as what challenges and opportunities they foresee around its adoption. We find that participants were enthusiastic about the registry as a valuable resource for evaluating prior deployments and making future deployments. They also identified several opportunities for the registry, including that it can become a "hub" for the community and support broader communication around DP (e.g., to legal teams). At the same time, they identified challenges around the registry gaining adoption, including the effort and risk involved with making implementation choices public and moderating the quality of entries. Based on our findings, we offer recommendations for encouraging adoption and increasing the registry's value not only to DP practitioners, but also to policymakers, data users, and data subjects.

Paperid: 2241, https://arxiv.org/pdf/2509.13186.pdf

Abstract:
In 2024, the Anti-Phishing Work Group identified over one million phishing pages. Phishers achieve this scale by using phishing kits -- ready-to-deploy phishing websites -- to rapidly deploy phishing campaigns with specific data exfiltration, evasion, or mimicry techniques. In contrast, researchers and defenders continue to fight phishing on a page-by-page basis and rely on manual analysis to recognize static features for kit identification. This paper aims to aid researchers and analysts by automatically differentiating groups of phishing pages based on the underlying kit, automating a previously manual process, and enabling us to measure how popular different client-side techniques are across these groups. For kit detection, our system has an accuracy of 97% on a ground-truth dataset of 548 kit families deployed across 4,562 phishing URLs. On an unlabeled dataset, we leverage the complexity of 434,050 phishing pages' JavaScript logic to group them into 11,377 clusters, annotating the clusters with what phishing techniques they employ. We find that UI interactivity and basic fingerprinting are universal techniques, present in 90% and 80% of the clusters, respectively. On the other hand, mouse detection via the browser's mouse API is among the rarest behaviors, despite being used in a deployment of a 7-year-old open-source phishing kit. Our methods and findings provide new ways for researchers and analysts to tackle the volume of phishing pages.

Paperid: 2242, https://arxiv.org/pdf/2509.12582.pdf

Abstract:
The STIR/SHAKEN (S/S) attestation Framework mandated by the United States, Canada, and France to combat pervasive telephone abuse has not achieved its goals, partly because legacy non-VoIP infrastructure could not participate. The industry solution to extend S/S broadcasts sensitive metadata of every non-VoIP call in plaintext to every third party required to facilitate the system. It has no mechanism to determine whether a provider's request for call data is appropriate, nor can it ensure that every copy of that call data is unavailable after its specified expiration. It threatens subscriber privacy and provider confidentiality. In this paper, we present Sidecar, a distributed, privacy-preserving system with tunable decentralization that securely extends S/S across all telephone network technologies. We introduce the notion of secure out-of-band signaling for telephony and formalize its system and security requirements. We then design novel, scalable protocols that realize these requirements and prove their security within the Universal Composability framework. Finally, we demonstrate Sidecar's efficiency with our open-sourced reference implementation. Compared to the current solution, Sidecar 1) protects the confidentiality of subscriber identity and provider trade secrets, 2) guarantees record expiration as long as a single node handling a record is honest, 3) reduces resource requirements while providing virtually identical call-setup times and equivalent or better uptimes, and 4) enables secure pay-per-use billing and integrates mechanisms to mitigate and detect misbehavior. Moreover, Sidecar can be extended to provide the same security guarantees for arbitrary call metadata. Not only is Sidecar a superior approach, it is also a transformative tool to retrofit fragmented global telephony and enable future improvements, such as stronger call authentication and Branded Calling.

Paperid: 2243, https://arxiv.org/pdf/2509.12462.pdf

Abstract:
Website Fingerprinting (WFP) uses deep learning models to classify encrypted network traffic to infer visited websites. While historically effective, prior methods fail to generalize to modern web environments. Single-page applications (SPAs) eliminate the paradigm of websites as sets of discrete pages, undermining page-based classification, and traffic from scripted browsers lacks the behavioral richness seen in real user sessions. Our study reveals that users exhibit highly diverse behaviors even on the same website, producing traffic patterns that vary significantly across individuals. This behavioral entropy makes WFP a harder problem than previously assumed and highlights the need for larger, more diverse, and representative datasets to achieve robust performance. To address this, we propose a new paradigm: we drop session-boundaries in favor of contiguous traffic segments and develop a scalable data generation pipeline using large language models (LLM) agents. These multi-agent systems coordinate decision-making and browser interaction to simulate realistic, persona-driven browsing behavior at 3--5x lower cost than human collection. We evaluate nine state-of-the-art WFP models on traffic from 20 modern websites browsed by 30 real users, and compare training performance across human, scripted, and LLM-generated datasets. All models achieve under 10\% accuracy when trained on scripted traffic and tested on human data. In contrast, LLM-generated traffic boosts accuracy into the 80\% range, demonstrating strong generalization to real-world traces. Our findings indicate that for modern WFP, model performance is increasingly bottlenecked by data quality, and that scalable, semantically grounded synthetic traffic is essential for capturing the complexity of real user behavior.

Paperid: 2244, https://arxiv.org/pdf/2509.11407.pdf

Abstract:
Hardware crosstalk in multi-tenant superconducting quantum computers constitutes a significant security threat, enabling adversaries to inject targeted errors across tenant boundaries. We present the first end-to-end framework for mapping physical pulse-level attacks to interpretable logical error channels, integrating density-matrix simulation, quantum process tomography (QPT), and a novel isometry-based circuit extraction method. Our pipeline reconstructs the complete induced error channel and fits an effective logical circuit model, revealing a fundamentally asymmetric attack mechanism: one adversarial qubit acts as a driver to set the induced logical rotation, while a second, the catalyst, refines the attack's coherence. Demonstrated on a linear three-qubit system, our approach shows that such attacks can significantly disrupt diverse quantum protocols, sometimes reducing accuracy to random guessing, while remaining effective and stealthy even under realistic hardware parameter variations. We further propose a protocol-level detection strategy based on observable attack signatures, showing that stealthy attacks can be exposed through targeted monitoring and providing a foundation for future defense-in-depth in quantum cloud platforms.

Paperid: 2245, https://arxiv.org/pdf/2509.10703.pdf

Abstract:
Over the past decade, AR/VR devices have drastically changed how we interact with the digital world. Users often share sensitive information, such as their location, browsing history, and even financial data, within third-party apps installed on these devices, assuming a secure environment protected from malicious actors. Recent research has revealed that malicious apps can exploit such capabilities and monitor benign apps to track user activities, leveraging fine-grained profiling tools, such as performance counter APIs. However, app-to-app monitoring is not feasible on all AR/VR devices (e.g., Meta Quest), as a concurrent standalone app execution is disabled. In this paper, we present OVRWatcher, a novel side-channel primitive for AR/VR devices that infers user activities by monitoring low-resolution (1Hz) GPU usage via a background script, unlike prior work that relies on high-resolution profiling. OVRWatcher captures correlations between GPU metrics and 3D object interactions under varying speeds, distances, and rendering scenarios, without requiring concurrent app execution, access to application data, or additional SDK installations. We demonstrate the efficacy of OVRWatcher in fingerprinting both standalone AR/VR and WebXR applications. OVRWatcher also distinguishes virtual objects, such as products in immersive shopping apps selected by real users and the number of participants in virtual meetings, thereby revealing users' product preferences and potentially exposing confidential information from those meetings. OVRWatcher achieves over 99% accuracy in app fingerprinting and over 98% accuracy in object-level inference.

Paperid: 2246, https://arxiv.org/pdf/2509.10682.pdf

Abstract:
The success and wide adoption of generative AI (GenAI), particularly large language models (LLMs), has attracted the attention of cybercriminals seeking to abuse models, steal sensitive data, or disrupt services. Moreover, providing security to LLM-based systems is a great challenge, as both traditional threats to software applications and threats targeting LLMs and their integration must be mitigated. In this survey, we shed light on security and privacy concerns of such LLM-based systems by performing a systematic review and comprehensive categorization of threats and defensive strategies considering the entire software and LLM life cycles. We analyze real-world scenarios with distinct characteristics of LLM usage, spanning from development to operation. In addition, threats are classified according to their severity level and to which scenarios they pertain, facilitating the identification of the most relevant threats. Recommended defense strategies are systematically categorized and mapped to the corresponding life cycle phase and possible attack strategies they attenuate. This work paves the way for consumers and vendors to understand and efficiently mitigate risks during integration of LLMs in their respective solutions or organizations. It also enables the research community to benefit from the discussion of open challenges and edge cases that may hinder the secure and privacy-preserving adoption of LLM-based systems.

Paperid: 2247, https://arxiv.org/pdf/2509.10224.pdf

Abstract:
Software-based memory-erasure protocols are two-party communication protocols where a verifier instructs a computational device to erase its memory and send a proof of erasure. They aim at guaranteeing that low-cost IoT devices are free of malware by putting them back into a safe state without requiring secure hardware or physical manipulation of the device. Several software-based memory-erasure protocols have been introduced and theoretically analysed. Yet, many of them have not been tested for their feasibility, performance and security on real devices, which hinders their industry adoption. This article reports on the first empirical analysis of software-based memory-erasure protocols with respect to their security, erasure guarantees, and performance. The experimental setup consists of 3 modern IoT devices with different computational capabilities, 7 protocols, 6 hash-function implementations, and various performance and security criteria. Our results indicate that existing software-based memory-erasure protocols are feasible, although slow devices may take several seconds to erase their memory and generate a proof of erasure. We found that no protocol dominates across all empirical settings, defined by the computational power and memory size of the device, the network speed, and the required level of security. Interestingly, network speed and hidden constants within the protocol specification played a more prominent role in the performance of these protocols than anticipated based on the related literature. We provide an evaluation framework that, given a desired level of security, determines which protocols offer the best trade-off between performance and erasure guarantees.

Paperid: 2248, https://arxiv.org/pdf/2509.09638.pdf

Abstract:
With the widespread adoption of cryptocurrencies, cryptojacking has become a significant security threat to crypto wallet users. This paper presents a front-end prototype of an AI-powered security dashboard, namely, CryptoGuard. Developed through a user-centered design process, the prototype was constructed as a high-fidelity, click-through model from Figma mockups to simulate key user interactions. It is designed to assist users in monitoring their login and transaction activity, identifying any suspicious behavior, and enabling them to take action directly within the wallet interface. The dashboard is designed for a general audience, prioritizing an intuitive user experience for non-technical individuals. Although its AI functionality is conceptual, the prototype demonstrates features like visual alerts and reporting. This work is positioned explicitly as a design concept, bridging cryptojacking detection research with human-centered interface design. This paper also demonstrates how usability heuristics can directly inform a tool's ability to support rapid and confident decision-making under real-world threats. This paper argues that practical security tools require not only robust backend functionality but also a user-centric design that communicates risk and empowers users to take meaningful action.

Paperid: 2249, https://arxiv.org/pdf/2509.09424.pdf

Abstract:
Secure inference enables privacy-preserving machine learning by leveraging cryptographic protocols that support computations on sensitive user data without exposing it. However, integrating cryptographic protocols with large language models (LLMs) presents significant challenges, as the inherent complexity of these protocols, together with LLMs' massive parameter scale and sophisticated architectures, severely limits practical usability. In this work, we propose ENSI, a novel non-interactive secure inference framework for LLMs, based on the principle of co-designing the cryptographic protocols and LLM architecture. ENSI employs an optimized encoding strategy that seamlessly integrates CKKS scheme with a lightweight LLM variant, BitNet, significantly reducing the computational complexity of encrypted matrix multiplications. In response to the prohibitive computational demands of softmax under homomorphic encryption (HE), we pioneer the integration of the sigmoid attention mechanism with HE as a seamless, retraining-free alternative. Furthermore, by embedding the Bootstrapping operation within the RMSNorm process, we efficiently refresh ciphertexts while markedly decreasing the frequency of costly bootstrapping invocations. Experimental evaluations demonstrate that ENSI achieves approximately an 8x acceleration in matrix multiplications and a 2.6x speedup in softmax inference on CPU compared to state-of-the-art method, with the proportion of bootstrapping is reduced to just 1%.

Paperid: 2250, https://arxiv.org/pdf/2509.09081.pdf

Abstract:
Users around the world face escalating network interference such as censorship, throttling, and interception, largely driven by the commoditization and growing availability of Deep Packet Inspection (DPI) devices. Once reserved for a few well-resourced nation-state actors, the ability to interfere with traffic at scale is now within reach of nearly any network operator. Despite this proliferation, our understanding of DPIs and their deployments on the Internet remains limited -- being network intermediary leaves DPI unresponsive to conventional host-based scanning tools, and DPI vendors actively obscuring their products further complicates measurement efforts. In this work, we present a remote measurement framework, dMAP (DPI Mapper), that derives behavioral fingerprints for DPIs to differentiate and cluster these otherwise indistinguishable middleboxes at scale, as a first step toward active reconnaissance of DPIs on the Internet. Our key insight is that parsing and interpreting traffic as network intermediaries inherently involves ambiguities -- from under-specified protocol behaviors to differing RFC interpretations -- forcing DPI vendors into independent implementation choices that create measurable variance among DPIs. Based on differential fuzzing, dMAP systematically discovers, selects, and deploys specialized probes that translate DPI internal parsing behaviors into externally observable fingerprints. Applying dMAP to DPI deployments globally, we demonstrate its practical feasibility, showing that even a modest set of 20-40 discriminative probes reliably differentiates a wide range of DPI implementations, including major nation-state censorship infrastructures and commercial DPI products. We discuss how our fingerprinting methodology generalizes beyond censorship to other forms of targeted interference.

Paperid: 2251, https://arxiv.org/pdf/2509.08746.pdf

Abstract:
Federated Learning (FL) is a distributed learning paradigm designed to address privacy concerns. However, FL is vulnerable to poisoning attacks, where Byzantine clients compromise the integrity of the global model by submitting malicious updates. Robust aggregation methods have been widely adopted to mitigate such threats, relying on the core assumption that malicious updates are inherently out-of-distribution and can therefore be identified and excluded before aggregating client updates. In this paper, we challenge this underlying assumption by showing that a model can be poisoned while keeping malicious updates within the main distribution. We propose Chameleon Poisoning (CHAMP), an adaptive and evasive poisoning strategy that exploits side-channel feedback from the aggregation process to guide the attack. Specifically, the adversary continuously infers whether its malicious contribution has been incorporated into the global model and adapts accordingly. This enables a dynamic adjustment of the local loss function, balancing a malicious component with a camouflaging component, thereby increasing the effectiveness of the poisoning while evading robust aggregation defenses. CHAMP enables more effective and evasive poisoning, highlighting a fundamental limitation of existing robust aggregation defenses and underscoring the need for new strategies to secure federated learning against sophisticated adversaries. Our approach is evaluated in two datasets reaching an average increase of 47.07% in attack success rate against nine robust aggregation defenses.

Paperid: 2252, https://arxiv.org/pdf/2509.08720.pdf

Abstract:
Metric Differential Privacy (mDP) extends the local differential privacy (LDP) framework to metric spaces, enabling more nuanced privacy protection for data such as geo-locations. However, existing mDP optimization methods, particularly those based on linear programming (LP), face scalability challenges due to the quadratic growth in decision variables. In this paper, we propose Perturbation via Anchor-based Distributed Approximation (PAnDA), a scalable two-phase framework for optimizing metric differential privacy (mDP). To reduce computational overhead, PAnDA allows each user to select a small set of anchor records, enabling the server to solve a compact linear program over a reduced domain. We introduce three anchor selection strategies, exponential decay (PAnDA-e), power-law decay (PAnDA-p), and logistic decay (PAnDA-l), and establish theoretical guarantees under a relaxed privacy notion called probabilistic mDP (PmDP). Experiments on real-world geo-location datasets demonstrate that PAnDA scales to secret domains with up to 5,000 records, two times larger than prior LP-based methods, while providing theoretical guarantees for both privacy and utility.

Paperid: 2253, https://arxiv.org/pdf/2509.08646.pdf

Abstract:
As Large Language Model (LLM) agents become increasingly capable of automating complex, multi-step tasks, the need for robust, secure, and predictable architectural patterns is paramount. This paper provides a comprehensive guide to the ``Plan-then-Execute'' (P-t-E) pattern, an agentic design that separates strategic planning from tactical execution. We explore the foundational principles of P-t-E, detailing its core components - the Planner and the Executor - and its architectural advantages in predictability, cost-efficiency, and reasoning quality over reactive patterns like ReAct (Reason + Act). A central focus is placed on the security implications of this design, particularly its inherent resilience to indirect prompt injection attacks by establishing control-flow integrity. We argue that while P-t-E provides a strong foundation, a defense-in-depth strategy is necessary, and we detail essential complementary controls such as the Principle of Least Privilege, task-scoped tool access, and sandboxed code execution. To make these principles actionable, this guide provides detailed implementation blueprints and working code references for three leading agentic frameworks: LangChain (via LangGraph), CrewAI, and AutoGen. Each framework's approach to implementing the P-t-E pattern is analyzed, highlighting unique features like LangGraph's stateful graphs for re-planning, CrewAI's declarative tool scoping for security, and AutoGen's built-in Docker sandboxing. Finally, we discuss advanced patterns, including dynamic re-planning loops, parallel execution with Directed Acyclic Graphs (DAGs), and the critical role of Human-in-the-Loop (HITL) verification, to offer a complete strategic blueprint for architects, developers, and security engineers aiming to build production-grade, resilient, and trustworthy LLM agents.

Paperid: 2254, https://arxiv.org/pdf/2509.08083.pdf

Abstract:
Software supply chain attacks have increased exponentially since 2020. The primary attack vectors for supply chain attacks are through: (1) software components; (2) the build infrastructure; and (3) humans (a.k.a software practitioners). Software supply chain risk management frameworks provide a list of tasks that an organization can adopt to reduce software supply chain risk. Exhaustively adopting all the tasks of these frameworks is infeasible, necessitating the prioritized adoption of tasks. Software organizations can benefit from being guided in this prioritization by learning what tasks other teams have adopted. The goal of this study is to aid software development organizations in understanding the adoption of security tasks that reduce software supply chain risk through an interview study of software practitioners engaged in software supply chain risk management efforts. An interview study was conducted with 61 practitioners at nine software development organizations that have focused efforts on reducing software supply chain risk. The results of the interviews indicate that organizations had implemented the most adopted software tasks before the focus on software supply chain security. Therefore, their implementation in organizations is more mature. The tasks that mitigate the novel attack vectors through software components and the build infrastructure are in the early stages of adoption. Adoption of these tasks should be prioritized.

Paperid: 2255, https://arxiv.org/pdf/2509.07504.pdf

Abstract:
Backdoor (trojan) attacks embed hidden, controllable behaviors into machine-learning models so that models behave normally on benign inputs but produce attacker-chosen outputs when a trigger is present. This survey reviews the rapidly growing literature on backdoor attacks and defenses in the computer-vision domain. We introduce a multi-dimensional taxonomy that organizes attacks and defenses by injection stage (dataset poisoning, model/parameter modification, inference-time injection), trigger type (patch, blended/frequency, semantic, transformation), labeling strategy (dirty-label vs. clean-label / feature-collision), representation stage (instance-specific, manifold/class-level, neuron/parameter hijacking, distributed encodings), and target task (classification, detection, segmentation, video, multimodal). For each axis we summarize representative methods, highlight evaluation practices, and discuss where defenses succeed or fail. For example, many classical sanitization and reverse-engineering tools are effective against reusable patch attacks but struggle with input-aware, sample-specific, or parameter-space backdoors and with transfer via compromised pre-trained encoders or hardware bit-flips. We synthesize trends, identify persistent gaps (supply-chain and hardware threats, certifiable defenses, cross-task benchmarks), and propose practical guidelines for threat-aware evaluation and layered defenses. This survey aims to orient researchers and practitioners to the current threat landscape and pressing research directions in secure computer vision.

Paperid: 2256, https://arxiv.org/pdf/2509.07457.pdf

Abstract:
An advanced persistent threat (APT) refers to a covert, long-term cyberattack, typically conducted by state-sponsored actors, targeting critical sectors and often remaining undetected for long periods. In response, collective intelligence from around the globe collaborates to identify and trace surreptitious activities, generating substantial documentation on APT campaigns publicly available on the web. While prior works predominantly focus on specific aspects of APT cases, such as detection, evaluation, cyber threat intelligence, and dataset creation, limited attention has been devoted to revisiting and investigating these scattered dossiers in a longitudinal manner. The objective of our study is to fill the gap by offering a macro perspective, connecting key insights and global trends in past APT attacks. We systematically analyze six reliable sources-three focused on technical reports and another three on threat actors-examining 1,509 APT dossiers (24,215 pages) spanning 2014-2023, and identifying 603 unique APT groups worldwide. To efficiently unearth relevant information, we employ a hybrid methodology that combines rule-based information retrieval with large-language-model-based search techniques. Our longitudinal analysis reveals shifts in threat actor activities, global attack vectors, changes in targeted sectors, and relationships between cyberattacks and significant events such as elections or wars, which provide insights into historical patterns in APT evolution. Over the past decade, 154 countries have been affected, primarily using malicious documents and spear phishing as dominant initial infiltration vectors, with a noticeable decline in zero-day exploitation since 2016. Furthermore, we present our findings through interactive visualization tools, such as an APT map or flow diagram, to facilitate intuitive understanding of global patterns and trends in APT activities.

Paperid: 2257, https://arxiv.org/pdf/2509.07323.pdf

Abstract:
The rapid advancement of AI has enabled highly realistic speech synthesis and voice cloning, posing serious risks to voice authentication, smart assistants, and telecom security. While most prior work frames spoof detection as a binary task, real-world attacks often involve hybrid utterances that mix genuine and synthetic speech, making detection substantially more challenging. To address this gap, we introduce the Hybrid Spoofed Audio Dataset (HSAD), a benchmark containing 1,248 clean and 41,044 degraded utterances across four classes: human, cloned, zero-shot AI-generated, and hybrid audio. Each sample is annotated with spoofing method, speaker identity, and degradation metadata to enable fine-grained analysis. We evaluate six transformer-based models, including spectrogram encoders (MIT-AST, MattyB95-AST) and self-supervised waveform models (Wav2Vec2, HuBERT). Results reveal critical lessons: pretrained models overgeneralize and collapse under hybrid conditions; spoof-specific fine-tuning improves separability but struggles with unseen compositions; and dataset-specific adaptation on HSAD yields large performance gains (AST greater than 97 percent and F1 score is approximately 99 percent), though residual errors persist for complex hybrids. These findings demonstrate that fine-tuning alone is not sufficient-robust hybrid-aware benchmarks like HSAD are essential to expose calibration failures, model biases, and factors affecting spoof detection in adversarial environments. HSAD thus provides both a dataset and an analytic framework for building resilient and trustworthy voice authentication systems.

Paperid: 2258, https://arxiv.org/pdf/2509.07276.pdf

Abstract:
In this work, we propose a new way to (non-interactively, verifiably) demonstrate quantum advantage by solving the average-case $\mathsf{NP}$ search problem of finding a solution to a system of (underdetermined) constant degree multivariate equations over the finite field $\mathbb{F}_2$ drawn from a specified distribution. In particular, for any $d \geq 2$, we design a distribution of degree up to $d$ polynomials $\{p_i(x_1,\ldots,x_n)\}_{i\in [m]}$ for $m 2$, it is classically hard to find one based on a thorough review of existing classical cryptanalysis. Our work thus posits that degree three functions are enough to instantiate the random oracle to obtain non-relativized quantum advantage. Our approach begins with the breakthrough Yamakawa-Zhandry (FOCS 2022) quantum algorithmic framework. In our work, we demonstrate that this quantum algorithmic framework extends to the setting of multivariate polynomial systems. Our key technical contribution is a new analysis on the Fourier spectra of distributions induced by a general family of distributions over $\mathbb{F}_2$ multivariate polynomials -- those that satisfy $2$-wise independence and shift-invariance. This family of distributions includes the distribution of uniform random degree at most $d$ polynomials for any constant $d \geq 2$. Our analysis opens up potentially new directions for quantum cryptanalysis of other multivariate systems.

Paperid: 2259, https://arxiv.org/pdf/2509.07055.pdf

Abstract:
We propose a practical sequential test for auditing differential privacy guarantees of black-box mechanisms. The test processes streams of mechanisms' outputs providing anytime-valid inference while controlling Type I error, overcoming the fixed sample size limitation of previous batch auditing methods. Experiments show this test detects violations with sample sizes that are orders of magnitude smaller than existing methods, reducing this number from 50K to a few hundred examples, across diverse realistic mechanisms. Notably, it identifies DP-SGD privacy violations in \textit{under} one training run, unlike prior methods needing full model training.

Paperid: 2260, https://arxiv.org/pdf/2509.06572.pdf

Abstract:
Large language models (LLMs) are increasingly integrated with external systems through the Model Context Protocol (MCP), which standardizes tool invocation and has rapidly become a backbone for LLM-powered applications. While this paradigm enhances functionality, it also introduces a fundamental security shift: LLMs transition from passive information processors to autonomous orchestrators of task-oriented toolchains, expanding the attack surface, elevating adversarial goals from manipulating single outputs to hijacking entire execution flows. In this paper, we reveal a new class of attacks, Parasitic Toolchain Attacks, instantiated as MCP Unintended Privacy Disclosure (MCP-UPD). These attacks require no direct victim interaction; instead, adversaries embed malicious instructions into external data sources that LLMs access during legitimate tasks. The malicious logic infiltrates the toolchain and unfolds in three phases: Parasitic Ingestion, Privacy Collection, and Privacy Disclosure, culminating in stealthy exfiltration of private data. Our root cause analysis reveals that MCP lacks both context-tool isolation and least-privilege enforcement, enabling adversarial instructions to propagate unchecked into sensitive tool invocations. To assess the severity, we design MCP-SEC and conduct the first large-scale security census of the MCP ecosystem, analyzing 12,230 tools across 1,360 servers. Our findings show that the MCP ecosystem is rife with exploitable gadgets and diverse attack methods, underscoring systemic risks in MCP platforms and the urgent need for defense mechanisms in LLM-integrated environments.

Paperid: 2261, https://arxiv.org/pdf/2509.05471.pdf

Abstract:
Large Language Models (LLMs) are increasingly vulnerable to a sophisticated form of adversarial prompting known as camouflaged jailbreaking. This method embeds malicious intent within seemingly benign language to evade existing safety mechanisms. Unlike overt attacks, these subtle prompts exploit contextual ambiguity and the flexible nature of language, posing significant challenges to current defense systems. This paper investigates the construction and impact of camouflaged jailbreak prompts, emphasizing their deceptive characteristics and the limitations of traditional keyword-based detection methods. We introduce a novel benchmark dataset, Camouflaged Jailbreak Prompts, containing 500 curated examples (400 harmful and 100 benign prompts) designed to rigorously stress-test LLM safety protocols. In addition, we propose a multi-faceted evaluation framework that measures harmfulness across seven dimensions: Safety Awareness, Technical Feasibility, Implementation Safeguards, Harmful Potential, Educational Value, Content Quality, and Compliance Score. Our findings reveal a stark contrast in LLM behavior: while models demonstrate high safety and content quality with benign inputs, they exhibit a significant decline in performance and safety when confronted with camouflaged jailbreak attempts. This disparity underscores a pervasive vulnerability, highlighting the urgent need for more nuanced and adaptive security strategies to ensure the responsible and robust deployment of LLMs in real-world applications.

Paperid: 2262, https://arxiv.org/pdf/2509.05382.pdf

Abstract:
Hundreds of millions of people now regularly interact with large language models via chatbots. Model developers are eager to acquire new sources of high-quality training data as they race to improve model capabilities and win market share. This paper analyzes the privacy policies of six U.S. frontier AI developers to understand how they use their users' chats to train models. Drawing primarily on the California Consumer Privacy Act, we develop a novel qualitative coding schema that we apply to each developer's relevant privacy policies to compare data collection and use practices across the six companies. We find that all six developers appear to employ their users' chat data to train and improve their models by default, and that some retain this data indefinitely. Developers may collect and train on personal information disclosed in chats, including sensitive information such as biometric and health data, as well as files uploaded by users. Four of the six companies we examined appear to include children's chat data for model training, as well as customer data from other products. On the whole, developers' privacy policies often lack essential information about their practices, highlighting the need for greater transparency and accountability. We address the implications of users' lack of consent for the use of their chat data for model training, data security issues arising from indefinite chat data retention, and training on children's chat data. We conclude by providing recommendations to policymakers and developers to address the data privacy challenges posed by LLM-powered chatbots.

Paperid: 2263, https://arxiv.org/pdf/2509.05161.pdf

Abstract:
The Y1 interface in O-RAN enables the sharing of RAN Analytics Information (RAI) between the near-RT RIC and authorized Y1 consumers, which may be internal applications within the operator's trusted domain or external systems accessing data through a secure exposure function. While this visibility enhances network optimization and enables advanced services, it also introduces a potential security risk -- a malicious or compromised Y1 consumer could misuse analytics to facilitate targeted interference. In this work, we demonstrate how an adversary can exploit the Y1 interface to launch selective jamming attacks by passively monitoring downlink metrics. We propose and evaluate two Y1-aided jamming strategies: a clustering-based jammer leveraging DBSCAN for traffic profiling and a threshold-based jammer. These are compared against two baselines strategies -- always-on jammer and random jammer -- on an over-the-air LTE/5G O-RAN testbed. Experimental results show that in unconstrained jamming budget scenarios, the threshold-based jammer can closely replicate the disruption caused by always-on jamming while reducing transmission time by 27\%. Under constrained jamming budgets, the clustering-based jammer proves most effective, causing up to an 18.1\% bitrate drop while remaining active only 25\% of the time. These findings reveal a critical trade-off between jamming stealthiness and efficiency, and illustrate how exposure of RAN analytics via the Y1 interface can enable highly targeted, low-overhead attacks, raising important security considerations for both civilian and mission-critical O-RAN deployments.

Paperid: 2264, https://arxiv.org/pdf/2509.05149.pdf

Abstract:
In the era of Construction 4.0, the industry is embracing a new paradigm of labor elasticity, driven by smart and flexible outsourcing and subcontracting strategies. The increased reliance on specialized subcontractors enables companies to scale labor dynamically based on project demands. This adaptable workforce model presents challenges in managing hierarchical integration and coordinating inter-site collaboration. Our design introduces a subsystem integrated into the Odoo ERP framework, employing a modular architecture to streamline labor management, task tracking, and approval workflows. The system adopts a three-pronged approach to ensure synchronized data exchange between general contractors and subcontractors, while maintaining both security and operational independence. The system features hybrid access control, third-party integration for cross-domain communication, and role-based mapping algorithm across sites. The system supports varying degrees of customization through a unified and consolidated attribute mapping center. This center leverages a tree-like index structure and Lagrange interpolation method to enhance the efficiency of role mapping. Demonstrations highlight practical application in outsourcing, integration, and scalability scenarios, confirming the system's robustness under high user volumes and in offline conditions. Experimental results further show improvements in database performance and workflow adaptability to support a scalable, enterprise-level solution that aligns with the evolving demands of smart construction management.

Paperid: 2265, https://arxiv.org/pdf/2509.05104.pdf

Abstract:
Since 2022, hacktivist groups have escalated their tactics, expanding from distributed denial-of-service attacks and document leaks to include targeting operational technology (OT). By 2024, attacks on the OT of critical national infrastructure (CNI) had been linked to partisan hacktivist efforts in ongoing geopolitical conflicts, demonstrating a shift from protest to something more resembling cyber warfare. This escalation raises critical questions about the classification of these groups and the appropriate state response to their growing role in destabilizing international security. This paper examines the strategic motivations behind escalatory hacktivism, highlighting how states may tolerate, encourage, or leverage hacktivist groups as proxies in conflicts that blur the lines between activism, cybercrime, and state-sponsored operations. We introduce a novel method for interpreting hacktivists based on the impact of their actions, alignment to state ideology, and host state involvement, offering a structured approach to understanding the phenomenon. Finally, we assess policy and security implications, particularly for host and victim states, and propose strategies to address this evolving threat. By doing so, this paper contributes to international discussions on cyber security policy, governance, and the increasing intersection between non-state cyber actors and state interests.

Paperid: 2266, https://arxiv.org/pdf/2509.04097.pdf

Abstract:
Whilst many key exchange and digital signature systems still rely on NIST P-256 (secp256r1) and secp256k1, offering around 128-bit security, there is an increasing demand for transparent and reproducible curves at the 256-bit security level. Standard higher-security options include NIST P-521, Curve448, and Brainpool-P512. This paper presents ECCFROG522PP ("Presunto Powered"), a 522-bit prime-field elliptic curve that delivers security in the same classical approx 260-bit ballpark as NIST P-521, but with a fundamentally different design philosophy. All of the curve parameters are deterministically derived from a fixed public seed via BLAKE3, with zero hidden choices. The curve has prime order (cofactor = 1), a verified twist with a proven approx 505-bit prime factor, safe embedding degree (greater than or equal to 14), and passes anti-MOV checks up to k less than or equal to 200 and CM discriminant sanity up to 100k. Unlike prior opaque or ad-hoc constructions, ECCFROG522PP is fully reproducible: anyone can regenerate and verify it byte-for-byte using the published scripts. The intent is not to outperform NIST P-521 in raw speed, but to maximise trust, verifiability, and long-term auditability in a practical curve of equivalent security level

Paperid: 2267, https://arxiv.org/pdf/2509.03821.pdf

Abstract:
Existing tamper-evident logging systems suffer from high overhead and severe data loss in high-load settings, yet only provide coarse-grained tamper detection. Moreover, installing such systems requires recompiling kernel code. To address these challenges, we present Nitro, a high-performance, tamper-evident audit logging system that supports fine-grained detection of log tampering. Even better, our system avoids kernel recompilation by using the eBPF technology. To formally justify the security of Nitro, we provide a new definitional framework for logging systems, and give a practical cryptographic construction meeting this new goal. Unlike prior work that focus only on the cryptographic processing, we codesign the cryptographic part with the pre- and post-processing of the logs to exploit all system-level optimizations. Our evaluations demonstrate Nitro's superior performance, achieving 10X-25X improvements in high-stress conditions and 2X-10X in real-world scenarios while maintaining near-zero data loss. We also provide an advanced variant, Nitro-R that introduces in-kernel log reduction techniques to reduce runtime overhead even further.

Paperid: 2268, https://arxiv.org/pdf/2509.03098.pdf

Abstract:
Many signature applications-such as root certificates, secure software updates, and authentication protocols-involve long-lived public keys that are transferred or installed once and then used for many verifications. This key longevity makes post-quantum signature schemes with conservative assumptions (e.g., structure-free lattices) attractive for long-term security. But many such schemes, especially those with short signatures, suffer from extremely large public keys. Even in scenarios where bandwidth is not a major concern, large keys increase storage costs and slow down verification. We address this with a method to replace large public keys in GPV-style signatures with smaller, private verification keys. This significantly reduces verifier storage and runtime while preserving security. Applied to the conservative, short-signature schemes Wave and Squirrels, our method compresses Squirrels-I keys from 665 kB to 20.7 kB and Wave822 keys from 3.5 MB to 207.97 kB.

Paperid: 2269, https://arxiv.org/pdf/2509.02076.pdf

Abstract:
This paper forecasts future Distributed Denial of Service (DDoS) attacks using deep learning models. Although several studies address forecasting DDoS attacks, they remain relatively limited compared to detection-focused research. By studying the current trends and forecasting based on newer and updated datasets, mitigation plans against the attacks can be planned and formulated. The methodology used in this research work conforms to the Cross Industry Standard Process for Data Mining (CRISP-DM) model.

Paperid: 2270, https://arxiv.org/pdf/2509.02004.pdf

Abstract:
Shuffle DP (Differential Privacy) protocols provide high accuracy and privacy by introducing a shuffler who randomly shuffles data in a distributed system. However, most shuffle DP protocols are vulnerable to two attacks: collusion attacks by the data collector and users and data poisoning attacks. A recent study addresses this issue by introducing an augmented shuffle DP protocol, where users do not add noise and the shuffler performs random sampling and dummy data addition. However, it focuses on frequency estimation over categorical data with a small domain and cannot be applied to a large domain due to prohibitively high communication and computational costs. In this paper, we fill this gap by introducing a novel augmented shuffle DP protocol called the FME (Filtering-with-Multiple-Encryption) protocol. Our FME protocol uses a hash function to filter out unpopular items and then accurately calculates frequencies for popular items. To perform this within one round of interaction between users and the shuffler, our protocol carefully communicates within a system using multiple encryption. We also apply our FME protocol to more advanced KV (Key-Value) statistics estimation with an additional technique to reduce bias. For both categorical and KV data, we prove that our protocol provides computational DP, high robustness to the above two attacks, accuracy, and efficiency. We show the effectiveness of our proposals through comparisons with twelve existing protocols.

Paperid: 2271, https://arxiv.org/pdf/2509.01701.pdf

Abstract:
AmphiKey, a dual-mode post-quantum/traditional (PQ/T) hybrid authenticated key exchange mechanism (AKEM) has been designed to secure smart grid communications against both classical and quantum threats. AmphiKey offers two distinct operational modes within a single framework: an Authenticated Mode and a Deniable Mode. The Authenticated Mode employs a blackbox approach, combining ephemeral ML-KEM-768 and X25519 with long-term Raccoon DSA keys to provide forward secrecy and strong, non-repudiable authenticity. This design achieves "OR" confidentiality, where security holds if either of the KEMs is unbroken, and robust "AND" authenticity. For the signature operation, it leverages the 'masking-friendly' Raccoon digital signature (DSA), which is specifically designed for side-channel attack resistance, though this protection is localized to the signing key and does not provide deniability. In contrast, Deniable Mode provides deniable authentication, preserving privacy. The protocol used ML-KEM-768 (AKEM-1), Ephemeral X25519 (AKEM-2), Raccoon-based DSA (Rac) (compared performance to ML-DSA-65), and the Ascon cipher to deliver its security guarantees. Key contributions include providing a flexible protocol with enhanced security, optional deniability, and efficiency adapted to the diverse needs of the smart grid infrastructure. We present a comprehensive performance evaluation on a heterogeneous testbed featuring a powerful server and client (AMD Ryzen 5) and a resource-constrained client (Raspberry Pi). In efficient Deniable mode, the full handshake completes in 0.15 ms on the server and 0.41 ms on the Raspberry Pi client. In contrast, the Authenticated Mode is bottlenecked by the client-side signature generation; the handshake takes 4.8 ms for the Raspberry Pi client to initiate and 0.84 ms for the server to verify.

Paperid: 2272, https://arxiv.org/pdf/2509.01597.pdf

Abstract:
Confidentiality for business data is an understudied area of disclosure avoidance, where legacy methods struggle to provide acceptable results. Modern formal privacy techniques designed for person-level data do not provide suitable confidentiality/utility trade-offs due to the highly skewed nature of business data and because extreme outlier records are often important contributors to query answers. In this paper, inspired by Gaussian Differential Privacy, we propose a novel confidentiality framework for business data with a focus on interpretability for policy makers. We propose two query-answering mechanisms and analyze new challenges that arise when noisy query answers are converted into confidentiality-preserving microdata. We evaluate our mechanisms on confidential Quarterly Census of Employment and Wages (QCEW) microdata and a public substitute dataset.

Paperid: 2273, https://arxiv.org/pdf/2509.00662.pdf

Abstract:
Virtual Reality (VR) technology is rapidly growing in recent years. VR devices such as Meta Quest 3 utilize numerous sensors to collect users' data to provide an immersive experience. Due to the extensive data collection and the immersive nature, the security of VR devices is paramount. Leading VR devices often adopt and customize Android systems, which makes them susceptible to both Android-based vulnerabilities and new issues introduced by VR-specific customizations (e.g., system services to support continuous head and hand tracking). While prior work has extensively examined the security properties of the Android software stack, how these security properties hold for VR systems remains unexplored. In this paper, we present the first comprehensive security analysis of VR firmware. We collect over 300 versions of VR firmware from two major vendors, Quest and Pico, and perform a longitudinal analysis across the kernel layer, the system binary and library layer, and the application layer. We have identified several security issues in these VR firmware, including missing kernel-level security features, insufficient binary hardening, inconsistent permission enforcement, and inadequate SELinux policy enforcement. Based on our findings, we synthesize recommendations for VR vendors to improve security and trust for VR devices. This paper will act as an important security resource for VR developers, users, and vendors, and will also direct future advancements in secure VR ecosystem.

Paperid: 2274, https://arxiv.org/pdf/2509.00088.pdf

Abstract:
Prompt injection attacks pose a significant challenge to the safe deployment of Large Language Models (LLMs) in real-world applications. While prompt-based detection offers a lightweight and interpretable defense strategy, its effectiveness has been hindered by the need for manual prompt engineering. To address this issue, we propose AEGIS , an Automated co-Evolutionary framework for Guarding prompt Injections Schema. Both attack and defense prompts are iteratively optimized against each other using a gradient-like natural language prompt optimization technique. This framework enables both attackers and defenders to autonomously evolve via a Textual Gradient Optimization (TGO) module, leveraging feedback from an LLM-guided evaluation loop. We evaluate our system on a real-world assignment grading dataset of prompt injection attacks and demonstrate that our method consistently outperforms existing baselines, achieving superior robustness in both attack success and detection. Specifically, the attack success rate (ASR) reaches 1.0, representing an improvement of 0.26 over the baseline. For detection, the true positive rate (TPR) improves by 0.23 compared to the previous best work, reaching 0.84, and the true negative rate (TNR) remains comparable at 0.89. Ablation studies confirm the importance of co-evolution, gradient buffering, and multi-objective optimization. We also confirm that this framework is effective in different LLMs. Our results highlight the promise of adversarial training as a scalable and effective approach for guarding prompt injections.

Paperid: 2275, https://arxiv.org/pdf/2508.21669.pdf

Abstract:
We demonstrate how AI-powered cybersecurity tools can be turned against themselves through prompt injection attacks. Prompt injection is reminiscent of cross-site scripting (XSS): malicious text is hidden within seemingly trusted content, and when the system processes it, that text is transformed into unintended instructions. When AI agents designed to find and exploit vulnerabilities interact with malicious web servers, carefully crafted reponses can hijack their execution flow, potentially granting attackers system access. We present proof-of-concept exploits against the Cybersecurity AI (CAI) framework and its CLI tool, and detail our mitigations against such attacks in a multi-layered defense implementation. Our findings indicate that prompt injection is a recurring and systemic issue in LLM-based architectures, one that will require dedicated work to address, much as the security community has had to do with XSS in traditional web applications.

Paperid: 2276, https://arxiv.org/pdf/2508.21579.pdf

Abstract:
Existing Android vulnerability detection tools overwhelm teams with thousands of low-signal warnings yet uncover few true positives. Analysts spend days triaging these results, creating a bottleneck in the security pipeline. Meanwhile, genuinely exploitable vulnerabilities often slip through, leaving opportunities open to malicious counterparts. We introduce A2, a system that mirrors how security experts analyze and validate Android vulnerabilities through two complementary phases: (i) Agentic Vulnerability Discovery, which reasons about application security by combining semantic understanding with traditional security tools; and (ii) Agentic Vulnerability Validation, which systematically validates vulnerabilities across Android's multi-modal attack surface-UI interactions, inter-component communication, file system operations, and cryptographic computations. On the Ghera benchmark (n=60), A2 achieves 78.3% coverage, surpassing state-of-the-art analyzers (e.g., APKHunt 30.0%). Rather than overwhelming analysts with thousands of warnings, A2 distills results into 82 speculative vulnerability findings, including 47 Ghera cases and 28 additional true positives. Crucially, A2 then generates working Proof-of-Concepts (PoCs) for 51 of these speculative findings, transforming them into validated vulnerability findings that provide direct, self-confirming evidence of exploitability. In real-world evaluation on 169 production APKs, A2 uncovers 104 true-positive zero-day vulnerabilities. Among these, 57 (54.8%) are self-validated with automatically generated PoCs, including a medium-severity vulnerability in a widely used application with over 10 million installs.

Paperid: 2277, https://arxiv.org/pdf/2508.21386.pdf

Abstract:
The European Union (EU) has long favored a risk-based approach to regulation. Such an approach is also used in recent cyber security legislation enacted in the EU. Risks are also inherently related to compliance with the new legislation. Objective: The paper investigates how risks are framed in the EU's five core cyber security legislative acts, whether the framings indicate convergence or divergence between the acts and their risk concepts, and what qualifying words and terms are used when describing the legal notions of risks. Method : The paper's methodology is based on qualitative legal interpretation and taxonomy-building. Results: The five acts have an encompassing coverage of different cyber security risks, including but not limited to risks related to technical, organizational, and human security as well as those not originating from man-made actions. Both technical aspects and assets are used to frame the legal risk notions in many of the legislative acts. A threat-centric viewpoint is also present in one of the acts. Notable gaps are related to acceptable risks, non-probabilistic risks, and residual risks. Conclusion: The EU's new cyber security legislation has significantly extended the risk-based approach to regulations. At the same time, complexity and compliance burden have increased. With this point in mind, the paper concludes with a few practical takeaways about means to deal with compliance and research it.

Paperid: 2278, https://arxiv.org/pdf/2508.21075.pdf

Abstract:
Digital payments play a pivotal role in the burgeoning digital economy. Moving forward, the enhancement of digital payment systems necessitates programmability, going beyond just efficiency and convenience, to meet the evolving needs and complexities. Smart contract platforms like Central Bank Digital Currency (CBDC) networks and blockchains support programmable digital payments. However, the prevailing paradigm of programming payment logics involves coding smart contracts with programming languages, leading to high costs and significant security challenges. A novel and versatile method for payment programming on DLTs was presented in this paper - transforming digital currencies into token streams, then pipelining smart contracts to authorize, aggregate, lock, direct, and dispatch these streams efficiently from source to target accounts. By utilizing a small set of configurable templates, a few specialized smart contracts could be generated, and support most of payment logics through configuring and composing them. This approach could substantially reduce the cost of payment programming and enhance security, self-enforcement, adaptability, and controllability, thus hold the potential to become an essential component in the infrastructure of digital economy.

Paperid: 2279, https://arxiv.org/pdf/2508.19819.pdf

Abstract:
Gradient inversion attacks have garnered attention for their ability to compromise privacy in federated learning. However, many studies consider attacks with the model in inference mode, where training-time behaviors like dropout are disabled and batch normalization relies on fixed statistics. In this work, we systematically analyze how architecture and training behavior affect vulnerability, including the first in-depth study of inference-mode clients, which we show dramatically simplifies inversion. To assess attack feasibility under more realistic conditions, we turn to clients operating in standard training mode. In this setting, we find that successful attacks are only possible when several architectural conditions are met simultaneously: models must be shallow and wide, use skip connections, and, critically, employ pre-activation normalization. We introduce two novel attacks against models in training-mode with varying attacker knowledge, achieving state-of-the-art performance under realistic training conditions. We extend these efforts by presenting the first attack on a production-grade object-detection model. Here, to enable any visibly identifiable leakage, we revert to the lenient inference mode setting and make multiple architectural modifications to increase model vulnerability, with the extent of required changes highlighting the strong inherent robustness of such architectures. We conclude this work by offering the first comprehensive mapping of settings, clarifying which combinations of architectural choices and operational modes meaningfully impact privacy. Our analysis provides actionable insight into when models are likely vulnerable, when they appear robust, and where subtle leakage may persist. Together, these findings reframe how gradient inversion risk should be assessed in future research and deployment scenarios.

Paperid: 2280, https://arxiv.org/pdf/2508.19458.pdf

Abstract:
A membership-inference attack gets the output of a learning algorithm, and a target individual, and tries to determine whether this individual is a member of the training data or an independent sample from the same distribution. A successful membership-inference attack typically requires the attacker to have some knowledge about the distribution that the training data was sampled from, and this knowledge is often captured through a set of independent reference samples from that distribution. In this work we study how much information the attacker needs for membership inference by investigating the sample complexity-the minimum number of reference samples required-for a successful attack. We study this question in the fundamental setting of Gaussian mean estimation where the learning algorithm is given $n$ samples from a Gaussian distribution $\mathcal{N}(Î¼,Î£)$ in $d$ dimensions, and tries to estimate $\hatÎ¼$ up to some error $\mathbb{E}[\|\hat Î¼- Î¼\|^2_Î£]\leq Ï^2 d$. Our result shows that for membership inference in this setting, $Î©(n + n^2 Ï^2)$ samples can be necessary to carry out any attack that competes with a fully informed attacker. Our result is the first to show that the attacker sometimes needs many more samples than the training algorithm uses to train the model. This result has significant implications for practice, as all attacks used in practice have a restricted form that uses $O(n)$ samples and cannot benefit from $Ï(n)$ samples. Thus, these attacks may be underestimating the possibility of membership inference, and better attacks may be possible when information about the distribution is easy to obtain.

Paperid: 2281, https://arxiv.org/pdf/2508.19430.pdf

Abstract:
Formal verification is crucial for ensuring the robustness of security protocols against adversarial attacks. The Needham-Schroeder protocol, a foundational authentication mechanism, has been extensively studied, including its integration with Physical Layer Security (PLS) techniques such as watermarking and jamming. Recent research has used ProVerif to verify these mechanisms in terms of secrecy. However, the ProVerif-based approach limits the ability to improve understanding of security beyond verification results. To overcome these limitations, we re-model the same protocol using an Isabelle formalism that generates sound animation, enabling interactive and automated formal verification of security protocols. Our modelling and verification framework is generic and highly configurable, supporting both cryptography and PLS. For the same protocol, we have conducted a comprehensive analysis (secrecy and authenticity in four different eavesdropper locations under both passive and active attacks) using our new web interface. Our findings not only successfully reproduce and reinforce previous results on secrecy but also reveal an uncommon but expected outcome: authenticity is preserved across all examined scenarios, even in cases where secrecy is compromised. We have proposed a PLS-based Diffie-Hellman protocol that integrates watermarking and jamming, and our analysis shows that it is secure for deriving a session key with required authentication. These highlight the advantages of our novel approach, demonstrating its robustness in formally verifying security properties beyond conventional methods.

Paperid: 2282, https://arxiv.org/pdf/2508.18439.pdf

Abstract:
Vulnerability databases, such as the National Vulnerability Database (NVD), offer detailed descriptions of Common Vulnerabilities and Exposures (CVEs), but often lack information on their real-world impact, such as the tactics, techniques, and procedures (TTPs) that adversaries may use to exploit the vulnerability. However, manually linking CVEs to their corresponding TTPs is a challenging and time-consuming task, and the high volume of new vulnerabilities published annually makes automated support desirable. This paper introduces TRIAGE, a two-pronged automated approach that uses Large Language Models (LLMs) to map CVEs to relevant techniques from the ATT&CK knowledge base. We first prompt an LLM with instructions based on MITRE's CVE Mapping Methodology to predict an initial list of techniques. This list is then combined with the results from a second LLM-based module that uses in-context learning to map a CVE to relevant techniques. This hybrid approach strategically combines rule-based reasoning with data-driven inference. Our evaluation reveals that in-context learning outperforms the individual mapping methods, and the hybrid approach improves recall of exploitation techniques. We also find that GPT-4o-mini performs better than Llama3.3-70B on this task. Overall, our results show that LLMs can be used to automatically predict the impact of cybersecurity vulnerabilities and TRIAGE makes the process of mapping CVEs to ATT&CK more efficient. Keywords: vulnerability impact, CVE, ATT&CK techniques, large language models, automated mapping.

Paperid: 2283, https://arxiv.org/pdf/2508.18155.pdf

Abstract:
The rapid integration of Internet of Things (IoT) and interconnected systems in modern vehicles not only introduced a new era of convenience, automation, and connected vehicles but also elevated their exposure to sophisticated cyber threats. This is especially evident in US and Canada, where cyber-enabled auto theft has surged in recent years, revealing the limitations of existing security measures for connected vehicles. In response, this paper proposes $AutoGuardX$, a comprehensive cybersecurity framework designed specifically for connected vehicles. $AutoGuardX$ combines key elements from existing recognized standards for vehicle security, such as ISO/SAE 21434 and ISO 26262, with advanced technologies, including machine learning-based anomaly detection, IoT security protocols, and encrypted communication channels. The framework addresses major attack vectors like relay attacks, controller area network (CAN) bus intrusions, and vulnerabilities introduced by emerging technologies such as 5G and quantum computing. $AutoGuardX$ is extensively evaluated through security simulations across a mix of Sedans and SUVs from four major vehicle brands manufactured between 2019 and 2023. The results demonstrate the framework's adaptability, scalability, and practical effectiveness against existing and emerging threats.

Paperid: 2284, https://arxiv.org/pdf/2508.17405.pdf

Abstract:
The widespread adoption of machine learning (ML) systems increased attention to their security and emergence of adversarial machine learning (AML) techniques that exploit fundamental vulnerabilities in ML systems, creating an urgent need for comprehensive risk assessment for ML-based systems. While traditional risk assessment frameworks evaluate conventional cybersecurity risks, they lack ability to address unique challenges posed by AML threats. Existing AML threat evaluation approaches focus primarily on technical attack robustness, overlooking crucial real-world factors like deployment environments, system dependencies, and attack feasibility. Attempts at comprehensive AML risk assessment have been limited to domain-specific solutions, preventing application across diverse systems. Addressing these limitations, we present FRAME, the first comprehensive and automated framework for assessing AML risks across diverse ML-based systems. FRAME includes a novel risk assessment method that quantifies AML risks by systematically evaluating three key dimensions: target system's deployment environment, characteristics of diverse AML techniques, and empirical insights from prior research. FRAME incorporates a feasibility scoring mechanism and LLM-based customization for system-specific assessments. Additionally, we developed a comprehensive structured dataset of AML attacks enabling context-aware risk assessment. From an engineering application perspective, FRAME delivers actionable results designed for direct use by system owners with only technical knowledge of their systems, without expertise in AML. We validated it across six diverse real-world applications. Our evaluation demonstrated exceptional accuracy and strong alignment with analysis by AML experts. FRAME enables organizations to prioritize AML risks, supporting secure AI deployment in real-world environments.

Paperid: 2285, https://arxiv.org/pdf/2508.17222.pdf

Abstract:
Retrieval-Augmented Generation (RAG) is a powerful technique for enhancing Large Language Models (LLMs) with external, up-to-date knowledge. Graph RAG has emerged as an advanced paradigm that leverages graph-based knowledge structures to provide more coherent and contextually rich answers. However, the move from plain document retrieval to structured graph traversal introduces new, under-explored privacy risks. This paper investigates the data extraction vulnerabilities of the Graph RAG systems. We design and execute tailored data extraction attacks to probe their susceptibility to leaking both raw text and structured data, such as entities and their relationships. Our findings reveal a critical trade-off: while Graph RAG systems may reduce raw text leakage, they are significantly more vulnerable to the extraction of structured entity and relationship information. We also explore potential defense mechanisms to mitigate these novel attack surfaces. This work provides a foundational analysis of the unique privacy challenges in Graph RAG and offers insights for building more secure systems.

Paperid: 2286, https://arxiv.org/pdf/2508.15606.pdf

Abstract:
Mobile application marketplaces are responsible for vetting apps to identify and mitigate security risks. Current vetting processes are labor-intensive, relying on manual analysis by security professionals aided by semi-automated tools. To address this inefficiency, we propose Mars, a system that leverages Large Language Models (LLMs) for automated risk identification and profiling. Mars is designed to concurrently analyze multiple applications across diverse risk categories with minimal human intervention. To enhance analytical precision and operational efficiency, Mars leverages a pre-constructed risk identification tree to extract relevant indicators from high-dimensional application features. This initial step filters the data, reducing the input volume for the LLM and mitigating the potential for model hallucination induced by irrelevant features. The extracted indicators are then subjected to LLM analysis for final risk determination. Furthermore, Mars automatically generates a comprehensive evidence chain for each assessment, documenting the analytical process to provide transparent justification. These chains are designed to facilitate subsequent manual review and to inform enforcement decisions, such as application delisting. The performance of Mars was evaluated on a real-world dataset from a partner Android marketplace. The results demonstrate that Mars attained an F1-score of 0.838 in risk identification and an F1-score of 0.934 in evidence retrieval. To assess its practical applicability, a user study involving 20 expert analysts was conducted, which indicated that Mars yielded a substantial efficiency gain, ranging from 60% to 90%, over conventional manual analysis.

Paperid: 2287, https://arxiv.org/pdf/2508.15386.pdf

Abstract:
Memory corruption vulnerabilities remain one of the most severe threats to software security. They often allow attackers to achieve arbitrary code execution by redirecting a vulnerable program's control flow. While Control Flow Integrity (CFI) has gained traction to mitigate this exploitation path, developers are not provided with any direction on how to apply CFI to real-world software. In this work, we establish a taxonomy mapping LLVM's forward-edge CFI variants to memory corruption vulnerability classes, offering actionable guidance for developers seeking to deploy CFI incrementally in existing codebases. Based on the Top 10 Known Exploited Vulnerabilities (KEV) list, we identify four high-impact vulnerability categories and select one representative CVE for each. We evaluate LLVM's CFI against each CVE and explain why CFI blocks exploitation in two cases while failing in the other two, illustrating its potential and current limitations. Our findings support informed deployment decisions and provide a foundation for improving the practical use of CFI in production systems.

Paperid: 2288, https://arxiv.org/pdf/2508.14925.pdf

Abstract:
By providing a standardized interface for LLM agents to interact with external tools, the Model Context Protocol (MCP) is quickly becoming a cornerstone of the modern autonomous agent ecosystem. However, it creates novel attack surfaces due to untrusted external tools. While prior work has focused on attacks injected through external tool outputs, we investigate a more fundamental vulnerability: Tool Poisoning, where malicious instructions are embedded within a tool's metadata without execution. To date, this threat has been primarily demonstrated through isolated cases, lacking a systematic, large-scale evaluation. We introduce MCPTox, the first benchmark to systematically evaluate agent robustness against Tool Poisoning in realistic MCP settings. MCPTox is constructed upon 45 live, real-world MCP servers and 353 authentic tools. To achieve this, we design three distinct attack templates to generate a comprehensive suite of 1312 malicious test cases by few-shot learning, covering 10 categories of potential risks. Our evaluation on 20 prominent LLM agents setting reveals a widespread vulnerability to Tool Poisoning, with o1-mini, achieving an attack success rate of 72.8\%. We find that more capable models are often more susceptible, as the attack exploits their superior instruction-following abilities. Finally, the failure case analysis reveals that agents rarely refuse these attacks, with the highest refused rate (Claude-3.7-Sonnet) less than 3\%, demonstrating that existing safety alignment is ineffective against malicious actions that use legitimate tools for unauthorized operation. Our findings create a crucial empirical baseline for understanding and mitigating this widespread threat, and we release MCPTox for the development of verifiably safer AI agents. Our dataset is available at an anonymized repository: \textit{https://anonymous.4open.science/r/AAAI26-7C02}.

Paperid: 2289, https://arxiv.org/pdf/2508.14524.pdf

Abstract:
Payment channel networks (PCNs) are a promising technology that alleviates blockchain scalability by shifting the transaction load from the blockchain to the PCN. Nevertheless, the network topology has to be carefully designed to maximise the transaction throughput in PCNs. Additionally, users in PCNs also have to make optimal decisions on which transactions to forward and which to reject to prolong the lifetime of their channels. In this work, we consider an input sequence of transactions over $p$ parties. Each transaction consists of a transaction size, source, and target, and can be either accepted or rejected (entailing a cost). The goal is to design a PCN topology among the $p$ cooperating parties, along with the channel capacities, and then output a decision for each transaction in the sequence to minimise the cost of creating and augmenting channels, as well as the cost of rejecting transactions. Our main contribution is an $\mathcal{O}(p)$ approximation algorithm for the problem with $p$ parties. We further show that with some assumptions on the distribution of transactions, we can reduce the approximation ratio to $\mathcal{O}(\sqrt{p})$. We complement our theoretical analysis with an empirical study of our assumptions and approach in the context of the Lightning Network.

Paperid: 2290, https://arxiv.org/pdf/2508.13965.pdf

Abstract:
The main goal of design obfuscation schemes is to protect sensitive design details from untrusted parties in the VLSI supply chain, including but not limited to off-shore foundries and untrusted end users. In this work, we provide a systematic red teaming approach to evaluate the security of design obfuscation approaches. Specifically, we propose security metrics and evaluation methodology for the scenarios where the adversary does not have access to a working chip. A case study on the RIPPER tool developed by the University of Florida indicates that more information is leaked about the structure of the original design than commonly considered.

Paperid: 2291, https://arxiv.org/pdf/2508.13750.pdf

Abstract:
The software supply chain is an increasingly common attack vector for malicious actors. The Node.js ecosystem has been subject to a wide array of attacks, likely due to its size and prevalence. To counter such attacks, the research community and practitioners have proposed a range of static and dynamic mechanisms, including process- and language-level sandboxing, permission systems, and taint tracking. Drawing on valuable insight from these works, this paper studies a runtime protection mechanism for (the supply chain of) Node.js applications with the ambitious goals of compatibility, automation, minimal overhead, and policy conciseness. Specifically, we design, implement and evaluate NodeShield, a protection mechanism for Node.js that enforces an application's dependency hierarchy and controls access to system resources at runtime. We leverage the up-and-coming SBOM standard as the source of truth for the dependency hierarchy of the application, thus preventing components from stealthily abusing undeclared components. We propose to enhance the SBOM with a notion of capabilities that represents a set of related system resources a component may access. Our proposed SBOM extension, the Capability Bill of Materials or CBOM, records the required capabilities of each component, providing valuable insight into the potential privileged behavior. NodeShield enforces the SBOM and CBOM at runtime via code outlining (as opposed to inlining) with no modifications to the original code or Node.js runtime, thus preventing unexpected, potentially malicious behavior. Our evaluation shows that NodeShield can prevent over 98% out of 67 known supply chain attacks while incurring minimal overhead on servers at less than 1ms per request. We achieve this while maintaining broad compatibility with vanilla Node.js and a concise policy language that consists of at most 7 entries per dependency.

Paperid: 2292, https://arxiv.org/pdf/2508.13730.pdf

Abstract:
Federated Learning (FL) is an emerging distributed machine learning paradigm enabling multiple clients to train a global model collaboratively without sharing their raw data. While FL enhances data privacy by design, it remains vulnerable to various security and privacy threats. This survey provides a comprehensive overview of more than 200 papers regarding the state-of-the-art attacks and defense mechanisms developed to address these challenges, categorizing them into security-enhancing and privacy-preserving techniques. Security-enhancing methods aim to improve FL robustness against malicious behaviors such as byzantine attacks, poisoning, and Sybil attacks. At the same time, privacy-preserving techniques focus on protecting sensitive data through cryptographic approaches, differential privacy, and secure aggregation. We critically analyze the strengths and limitations of existing methods, highlight the trade-offs between privacy, security, and model performance, and discuss the implications of non-IID data distributions on the effectiveness of these defenses. Furthermore, we identify open research challenges and future directions, including the need for scalable, adaptive, and energy-efficient solutions operating in dynamic and heterogeneous FL environments. Our survey aims to guide researchers and practitioners in developing robust and privacy-preserving FL systems, fostering advancements safeguarding collaborative learning frameworks' integrity and confidentiality.

Paperid: 2293, https://arxiv.org/pdf/2508.13588.pdf

Abstract:
This work introduces CAI Fluency, an an educational platform of the Cybersecurity AI (CAI) framework dedicated to democratizing the knowledge and application of cybersecurity AI tools in the global security community. The main objective of the CAI framework is to accelerate the widespread adoption and effective use of artificial intelligence-based cybersecurity solutions, pathing the way to vibe-hacking, the cybersecurity analogon to vibe-coding. CAI Fluency builds upon the Framework for AI Fluency, adapting its three modalities of human-AI interaction and four core competencies specifically for cybersecurity applications. This theoretical foundation ensures that practitioners develop not just technical skills, but also the critical thinking and ethical awareness necessary for responsible AI use in security contexts. This technical report serves as a white-paper, as well as detailed educational and practical guide that helps users understand the principles behind the CAI framework, and educates them how to apply this knowledge in their projects and real-world security contexts.

Paperid: 2294, https://arxiv.org/pdf/2508.12859.pdf

Abstract:
Butson matrices are complex Hadamard matrices with entries in the complex roots of unity of given order. There is an interesting code in phase space related to this matrix (Armario et al. 2023). We study the covering radius of Butson Hadamard codes for the homogeneous metric, a metric defined uniquely, up to scaling, for a commutative ring alphabet that is Quasi Frobenius. An upper bound is derived by an orthogonal array argument. A lower bound relies on the existence of bent sequences in the sense of (Shi et al. 2022). This latter bound generalizes a bound of (Armario et al. 2025) for the Hamming metric.

Paperid: 2295, https://arxiv.org/pdf/2508.12622.pdf

Abstract:
The advancement of AI technologies, particularly Large Language Models (LLMs), has transformed computing while introducing new security and privacy risks. Prior research shows that cybercriminals are increasingly leveraging uncensored LLMs (ULLMs) as backends for malicious services. Understanding these ULLMs has been hindered by the challenge of identifying them among the vast number of open-source LLMs hosted on platforms like Hugging Face. In this paper, we present the first systematic study of ULLMs, overcoming this challenge by modeling relationships among open-source LLMs and between them and related data, such as fine-tuning, merging, compressing models, and using or generating datasets with harmful content. Representing these connections as a knowledge graph, we applied graph-based deep learning to discover over 11,000 ULLMs from a small set of labeled examples and uncensored datasets. A closer analysis of these ULLMs reveals their alarming scale and usage. Some have been downloaded over a million times, with one over 19 million installs. These models -- created through fine-tuning, merging, or compression of other models -- are capable of generating harmful content, including hate speech, violence, erotic material, and malicious code. Evidence shows their integration into hundreds of malicious applications offering services like erotic role-play, child pornography, malicious code generation, and more. In addition, underground forums reveal criminals sharing techniques and scripts to build cheap alternatives to commercial malicious LLMs. These findings highlight the widespread abuse of LLM technology and the urgent need for effective countermeasures against this growing threat.

Paperid: 2296, https://arxiv.org/pdf/2508.12571.pdf

Abstract:
Brain-computer interfaces (BCIs) show enormous potential for advancing personalized medicine. However, BCIs also introduce new avenues for cyber-attacks or security compromises. In this article, we analyze the problem and make recommendations for device manufacturers to better secure devices and to help regulators understand where more guidance is needed to protect patient safety and data confidentiality. Device manufacturers should implement the prior suggestions in their BCI products. These recommendations help protect BCI users from undue risks, including compromised personal health and genetic information, unintended BCI-mediated movement, and many other cybersecurity breaches. Regulators should mandate non-surgical device update methods, strong authentication and authorization schemes for BCI software modifications, encryption of data moving to and from the brain, and minimize network connectivity where possible. We also design a hypothetical, average-case threat model that identifies possible cybersecurity threats to BCI patients and predicts the likeliness of risk for each category of threat. BCIs are at less risk of physical compromise or attack, but are vulnerable to remote attack; we focus on possible threats via network paths to BCIs and suggest technical controls to limit network connections.

Paperid: 2297, https://arxiv.org/pdf/2508.11711.pdf

Abstract:
GraphQL's flexibility, while beneficial for efficient data fetching, introduces unique security vulnerabilities that traditional API security mechanisms often fail to address. Malicious GraphQL queries can exploit the language's dynamic nature, leading to denial-of-service attacks, data exfiltration through injection, and other exploits. Existing solutions, such as static analysis, rate limiting, and general-purpose Web Application Firewalls, offer limited protection against sophisticated, context-aware attacks. This paper presents a novel, AI-driven approach for real-time detection of malicious GraphQL queries. Our method combines static analysis with machine learning techniques, including Large Language Models (LLMs) for dynamic schema-based configuration, Sentence Transformers (SBERT and Doc2Vec) for contextual embedding of query payloads, and Convolutional Neural Networks (CNNs), Random Forests, and Multilayer Perceptrons for classification. We detail the system architecture, implementation strategies optimized for production environments (including ONNX Runtime optimization and parallel processing), and evaluate the performance of our detection models and the overall system under load. Results demonstrate high accuracy in detecting various threats, including SQL injection, OS command injection, and XSS exploits, alongside effective mitigation of DoS and SSRF attempts. This research contributes a robust and adaptable solution for enhancing GraphQL API security.

Paperid: 2298, https://arxiv.org/pdf/2508.10327.pdf

Abstract:
Intrusion detection systems (IDS) are widely used to maintain the stability of network environments, but still face restrictions in generalizability due to the heterogeneity of network traffics. In this work, we propose BERTector, a new framework of joint-dataset learning for IDS based on BERT. BERTector integrates three key components: NSS-Tokenizer for traffic-aware semantic tokenization, supervised fine-tuning with a hybrid dataset, and low-rank adaptation for efficient fine-tuning. Experiments show that BERTector achieves state-of-the-art detection accuracy, strong generalizability, and excellent robustness. BERTector achieves the highest accuracy of 99.28% on NSL-KDD and reaches the average 80% detection success rate against four perturbations. These results establish a unified and efficient solution for modern IDS in complex and dynamic network environments.

Paperid: 2299, https://arxiv.org/pdf/2508.10038.pdf

Abstract:
Malware analysis involves analyzing suspicious software to detect malicious payloads. Static malware analysis, which does not require software execution, relies increasingly on machine learning techniques to achieve scalability. Although such techniques obtain very high detection accuracy, they can be easily evaded with adversarial examples where a few modifications of the sample can dupe the detector without modifying the behavior of the software. Unlike other domains, such as computer vision, creating an adversarial example of malware without altering its functionality requires specific transformations. We propose a new model architecture for certifiably robust malware detection by design. In addition, we show that every robust detector can be decomposed into a specific structure, which can be applied to learn empirically robust malware detectors, even on fragile features. Our framework ERDALT is based on this structure. We compare and validate these approaches with machine-learning-based malware detection methods, allowing for robust detection with limited reduction of detection performance.

Paperid: 2300, https://arxiv.org/pdf/2508.09815.pdf

Abstract:
We propose an extension to the OWASP Multi-Agentic System (MAS) Threat Modeling Guide, translating recent anticipatory research in multi-agent security (MASEC) into practical guidance for addressing challenges unique to large language model (LLM)-driven multi-agent architectures. Although OWASP's existing taxonomy covers many attack vectors, our analysis identifies gaps in modeling failures, including, but not limited to: reasoning collapse across planner-executor chains, metric overfitting, unsafe delegation escalation, emergent covert coordination, and heterogeneous multi-agent exploits. We introduce additional threat classes and scenarios grounded in practical MAS deployments, highlighting risks from benign goal drift, cross-agent hallucination propagation, affective prompt framing, and multi-agent backdoors. We also outline evaluation strategies, including robustness testing, coordination assessment, safety enforcement, and emergent behavior monitoring, to ensure complete coverage. This work complements the framework of OWASP by expanding its applicability to increasingly complex, autonomous, and adaptive multi-agent systems, with the goal of improving security posture and resilience in real world deployments.

Paperid: 2301, https://arxiv.org/pdf/2508.09422.pdf

Abstract:
A recent work of Schmidhuber et al (QIP, SODA, & Phys. Rev. X 2025) exhibited a quantum algorithm for the noisy planted $k$XOR problem running quartically faster than all known classical algorithms. In this work, we design a new classical algorithm that is quadratically faster than the best previous one, in the case of large constant $k$. Thus for such $k$, the quantum speedup of Schmidhuber et al. becomes only quadratic (though it retains a space advantage). Our algorithm, which also works in the semirandom case, combines tools from sublinear-time algorithms (essentially, the birthday paradox) and polynomial anticoncentration.

Paperid: 2302, https://arxiv.org/pdf/2508.08068.pdf

Abstract:
Proof-of-work allows Bitcoin to boast security amidst arbitrary fluctuations in participation of miners throughout time, so long as, at any point in time, a majority of hash power is honest. In recent years, however, the pendulum has shifted in favor of proof-of-stake-based consensus protocols. There, the sleepy model is the most prominent model for handling fluctuating participation of nodes. However, to date, no protocol in the sleepy model rivals Bitcoin in its robustness to drastic fluctuations in participation levels, with state-of-the-art protocols making various restrictive assumptions. In this work, we present a new adversary model, called external adversary. Intuitively, in our model, corrupt nodes do not divulge information about their secret keys. In this model, we show that protocols in the sleepy model can meaningfully claim to remain secure against fully fluctuating participation, without compromising efficiency or corruption resilience. Our adversary model is quite natural, and arguably naturally captures the process via which malicious behavior arises in protocols, as opposed to traditional worst-case modeling. On top of which, the model is also theoretically appealing, circumventing a barrier established in a recent work of Malkhi, Momose, and Ren.

Paperid: 2303, https://arxiv.org/pdf/2508.04669.pdf

Abstract:
Practical implementations of Quantum Key Distribution (QKD) often deviate from the theoretical protocols, exposing the implementations to various attacks even when the underlying (ideal) protocol is proven secure. We present new analysis tools and methodologies for quantum cybersecurity, adapting the concepts of vulnerabilities, attack surfaces, and exploits from classical cybersecurity to QKD implementation attacks. We also present three additional concepts, derived from the connection between classical and quantum cybersecurity: "Quantum Fuzzing", which is the first tool for black-box vulnerability research on QKD implementations; "Reversed-Space Attacks", which are a generic exploit method using the attack surface of imperfect receivers; and concrete quantum-mechanical definitions of "Quantum Side-Channel Attacks" and "Quantum State-Channel Attacks", meaningfully distinguishing them from each other and from other attacks. Using our tools, we analyze multiple existing QKD attacks and show that the "Bright Illumination" attack could have been found even with minimal knowledge of the device implementation. This work begins to bridge the gap between current analysis methods for experimental attacks on QKD implementations and the decades-long research in the field of classical cybersecurity, improving the practical security of QKD products and enhancing their usefulness in real-world systems.

Paperid: 2304, https://arxiv.org/pdf/2508.03474.pdf

Abstract:
Polymarket is a prediction market platform where users can speculate on future events by trading shares tied to specific outcomes, known as conditions. Each market is associated with a set of one or more such conditions. To ensure proper market resolution, the condition set must be exhaustive -- collectively accounting for all possible outcomes -- and mutually exclusive -- only one condition may resolve as true. Thus, the collective prices of all related outcomes should be \$1, representing a combined probability of 1 of any outcome. Despite this design, Polymarket exhibits cases where dependent assets are mispriced, allowing for purchasing (or selling) a certain outcome for less than (or more than) \$1, guaranteeing profit. This phenomenon, known as arbitrage, could enable sophisticated participants to exploit such inconsistencies. In this paper, we conduct an empirical arbitrage analysis on Polymarket data to answer three key questions: (Q1) What conditions give rise to arbitrage (Q2) Does arbitrage actually occur on Polymarket and (Q3) Has anyone exploited these opportunities. A major challenge in analyzing arbitrage between related markets lies in the scalability of comparisons across a large number of markets and conditions, with a naive analysis requiring $O(2^{n+m})$ comparisons. To overcome this, we employ a heuristic-driven reduction strategy based on timeliness, topical similarity, and combinatorial relationships, further validated by expert input. Our study reveals two distinct forms of arbitrage on Polymarket: Market Rebalancing Arbitrage, which occurs within a single market or condition, and Combinatorial Arbitrage, which spans across multiple markets. We use on-chain historical order book data to analyze when these types of arbitrage opportunities have existed, and when they have been executed by users. We find a realized estimate of 40 million USD of profit extracted.

Paperid: 2305, https://arxiv.org/pdf/2508.02961.pdf

Abstract:
This paper introduces a novel self-consciousness defense mechanism for Large Language Models (LLMs) to combat prompt injection attacks. Unlike traditional approaches that rely on external classifiers, our method leverages the LLM's inherent reasoning capabilities to perform self-protection. We propose a framework that incorporates Meta-Cognitive and Arbitration Modules, enabling LLMs to evaluate and regulate their own outputs autonomously. Our approach is evaluated on seven state-of-the-art LLMs using two datasets: AdvBench and Prompt-Injection-Mixed-Techniques-2024. Experiment results demonstrate significant improvements in defense success rates across models and datasets, with some achieving perfect and near-perfect defense in Enhanced Mode. We also analyze the trade-off between defense success rate improvement and computational overhead. This self-consciousness method offers a lightweight, cost-effective solution for enhancing LLM ethics, particularly beneficial for GenAI use cases across various platforms.

Paperid: 2306, https://arxiv.org/pdf/2508.00897.pdf

Abstract:
Despite recent progress in splicing detection, deep learning-based forensic tools remain difficult to deploy in practice due to their high sensitivity to training conditions. Even mild post-processing applied to evaluation images can significantly degrade detector performance, raising concerns about their reliability in operational contexts. In this work, we show that the same deep architecture can react very differently to unseen post-processing depending on the learned weights, despite achieving similar accuracy on in-distribution test data. This variability stems from differences in the latent spaces induced by training, which affect how samples are separated internally. Our experiments reveal a strong correlation between the distribution of latent margins and a detector's ability to generalize to post-processed images. Based on this observation, we propose a practical strategy for building more robust detectors: train several variants of the same model under different conditions, and select the one that maximizes latent margins.

Paperid: 2307, https://arxiv.org/pdf/2508.00620.pdf

Abstract:
Face Recognition Systems that operate in unconstrained environments capture images under varying conditions,such as inconsistent lighting, or diverse face poses. These challenges require including a Face Detection module that regresses bounding boxes and landmark coordinates for proper Face Alignment. This paper shows the effectiveness of Object Generation Attacks on Face Detection, dubbed Face Generation Attacks, and demonstrates for the first time a Landmark Shift Attack that backdoors the coordinate regression task performed by face detectors. We then offer mitigations against these vulnerabilities.

Paperid: 2308, https://arxiv.org/pdf/2507.23229.pdf

Abstract:
Retrieval-augmented generation (RAG) systems enhance large language models (LLMs) by integrating external knowledge bases, but this advancement introduces significant privacy risks. Existing privacy attacks on RAG systems can trigger data leakage but often fail to accurately isolate knowledge-base-derived sentences within mixed responses. They also lack robustness when applied across multiple domains. This paper addresses these challenges by presenting a novel black-box attack framework that exploits knowledge asymmetry between RAG and standard LLMs to achieve fine-grained privacy extraction across heterogeneous knowledge landscapes. We propose a chain-of-thought reasoning strategy that creates adaptive prompts to steer RAG systems away from sensitive content. Specifically, we first decompose adversarial queries to maximize information disparity and then apply a semantic relationship scoring to resolve lexical and syntactic ambiguities. We finally train a neural network on these feature scores to precisely identify sentences containing private information. Unlike prior work, our framework generalizes to unseen domains through iterative refinement without pre-defined knowledge. Experimental results show that we achieve over 91% privacy extraction rate in single-domain and 83% in multi-domain scenarios, reducing sensitive sentence exposure by over 65% in case studies. This work bridges the gap between attack and defense in RAG systems, enabling precise extraction of private information while providing a foundation for adaptive mitigation.

Paperid: 2309, https://arxiv.org/pdf/2507.22133.pdf

Abstract:
Applications that use Large Language Models (LLMs) are becoming widespread, making the identification of system vulnerabilities increasingly important. Automated Red Teaming accelerates this effort by using an LLM to generate and execute attacks against target systems. Attack generators are evaluated using the Attack Success Rate (ASR) the sample mean calculated over the judgment of success for each attack. In this paper, we introduce a method for optimizing attack generator prompts that applies ASR to individual attacks. By repeating each attack multiple times against a randomly seeded target, we measure an attack's discoverability the expectation of the individual attack success. This approach reveals exploitable patterns that inform prompt optimization, ultimately enabling more robust evaluation and refinement of generators.

Paperid: 2310, https://arxiv.org/pdf/2507.21538.pdf

Abstract:
Anticipating emerging attack methodologies is crucial for proactive cybersecurity. Recent advances in Large Language Models (LLMs) have enabled the automated generation of phishing messages and accelerated research into potential attack techniques. However, predicting future threats remains challenging due to reliance on existing training data. To address this limitation, we propose a novel framework that integrates LLM-based phishing attack simulations with a genetic algorithm in a psychological context, enabling phishing strategies to evolve dynamically through adversarial interactions with simulated victims. Through simulations using Llama 3.1, we demonstrate that (1) self-evolving phishing strategies employ increasingly sophisticated psychological manipulation techniques, surpassing naive LLM-generated attacks, (2) variations in a victim's prior knowledge significantly influence the evolution of attack strategies, and (3) adversarial interactions between evolving attacks and adaptive defenses create a cat-and-mouse dynamic, revealing an inherent asymmetry in cybersecurity -- attackers continuously refine their methods, whereas defenders struggle to comprehensively counter all evolving threats. Our approach provides a scalable, cost-effective method for analyzing the evolution of phishing strategies and defenses, offering insights into future social engineering threats and underscoring the necessity of proactive cybersecurity measures.

Paperid: 2311, https://arxiv.org/pdf/2507.21195.pdf

Abstract:
The great success of the diffusion model in image synthesis led to the release of gigantic commercial models, raising the issue of copyright protection and inappropriate content generation. Training-free diffusion watermarking provides a low-cost solution for these issues. However, the prior works remain vulnerable to rotation, scaling, and translation (RST) attacks. Although some methods employ meticulously designed patterns to mitigate this issue, they often reduce watermark capacity, which can result in identity (ID) collusion. To address these problems, we propose MaXsive, a training-free diffusion model generative watermarking technique that has high capacity and robustness. MaXsive best utilizes the initial noise to watermark the diffusion model. Moreover, instead of using a meticulously repetitive ring pattern, we propose injecting the X-shape template to recover the RST distortions. This design significantly increases robustness without losing any capacity, making ID collusion less likely to happen. The effectiveness of MaXsive has been verified on two well-known watermarking benchmarks under the scenarios of verification and identification.

Paperid: 2312, https://arxiv.org/pdf/2507.21182.pdf

Abstract:
Open-source Large Language Models (LLMs) often employ safety alignment methods to resist harmful instructions. However, recent research shows that maliciously fine-tuning these LLMs on harmful data can easily bypass these safeguards. To counter this, we theoretically uncover why malicious fine-tuning succeeds and identify potential defense strategies. Building on the theoretical analysis, we introduce the Self-Degraded Defense (SDD) framework. SDD encourages LLMs to produce high-quality but irrelevant responses to harmful prompts. When attackers attempt malicious fine-tuning, the general capability of the LLM aligned by SDD will significantly decrease, rendering it incapable of following harmful instructions. Our experimental results confirm SDD's effectiveness against such attacks.

Paperid: 2313, https://arxiv.org/pdf/2507.21181.pdf

Abstract:
Social media platforms offer numerous benefits and allow people to come together for various causes. Many communities, academia, government agencies, institutions, healthcare, entertainment, and businesses are on social media platforms. They are intuitive and free for users. It has become unimaginable to live without social media. Their architecture and data handling are geared towards scalability, uninterrupted availability, and both personal and collaborative revenue generation. Primarily, artificial intelligence algorithms are employed on stored user data for optimization and feeds. This has the potential to impact user safety, privacy, and security, even when metadata is used. A new decentralized data arrangement framework based on the Fractal-tree and L-Systems algorithm is proposed to mitigate some of the impacts of social media platforms. Future work will focus on demonstrating the effectiveness of the new decentralized framework by comparing its results against state-of-the-art security methods currently used in databases. A cryptographic algorithm could also be implemented for the framework, employing a new key generation for each branch. This will strengthen database security; for example, if a user key is leaked, regenerating the key for each branch will keep the data secure by applying defense mechanisms in the proposed L-System-based tree framework.

Paperid: 2314, https://arxiv.org/pdf/2507.21170.pdf

Abstract:
The rise of Large Language Models has created a general excitement about the great potential for a myriad of applications. While LLMs offer many possibilities, questions about safety, privacy, and ethics have emerged, and all the key actors are working to address these issues with protective measures for their own models and standalone solutions. The constantly evolving nature of LLMs makes it extremely challenging to universally shield users against their potential risks, and one-size-fits-all solutions are unfeasible. In this work, we propose OneShield, our stand-alone, model-agnostic and customizable solution to safeguard LLMs. OneShield aims to provide facilities for defining risk factors, expressing and declaring contextual safety and compliance policies, and mitigating LLM risks, with a focus on each specific customer. We describe the implementation of the framework, discuss scalability considerations, and provide usage statistics of OneShield since its initial deployment.

Paperid: 2315, https://arxiv.org/pdf/2507.20891.pdf

Abstract:
Homomorphic Encryption (HE) enables computation on encrypted data without decryption, making it a cornerstone of privacy-preserving computation in untrusted environments. As HE sees growing adoption in sensitive applications such as secure machine learning and confidential data analysis ensuring its robustness against errors becomes critical. Faults (e.g., transmission errors, hardware malfunctions, or synchronization failures) can corrupt encrypted data and compromise the integrity of HE operations. However, the impact of soft errors (such as bit flips) on modern HE schemes remains unexplored. Specifically, the CKKS scheme-one of the most widely used HE schemes for approximate arithmetic-lacks a systematic study of how such errors propagate across its pipeline, particularly under optimizations like the Residue Number System (RNS) and Number Theoretic Transform (NTT). This work bridges that gap by presenting a theoretical and empirical analysis of CKKS's fault tolerance under single bit-flip errors. We focus on client-side operations (encoding, encryption, decryption, and decoding) and demonstrate that while the vanilla CKKS scheme exhibits some resilience, performance optimizations (RNS/NTT) introduce significant fragility, amplifying error sensitivity. By characterizing these failure modes, we lay the groundwork for error-resilient HE designs, ensuring both performance and integrity in privacy-critical applications.

Paperid: 2316, https://arxiv.org/pdf/2507.19905.pdf

Abstract:
Despite significant advancements in computer vision, semantic segmentation models may be susceptible to backdoor attacks. These attacks, involving hidden triggers, aim to cause the models to misclassify instances of the victim class as the target class when triggers are present, posing serious threats to the reliability of these models. To further explore the field of backdoor attacks against semantic segmentation, in this paper, we propose a simple yet effective backdoor attack called Contextual Segmentation Backdoor Attack (ConSeg). ConSeg leverages the contextual information inherent in semantic segmentation models to enhance backdoor performance. Our method is motivated by an intriguing observation, i.e., when the target class is set as the `co-occurring' class of the victim class, the victim class can be more easily `mis-segmented'. Building upon this insight, ConSeg mimics the contextual information of the target class and rebuilds it in the victim region to establish the contextual relationship between the target class and the victim class, making the attack easier. Our experiments reveal that ConSeg achieves improvements in Attack Success Rate (ASR) with increases of 15.55\%, compared to existing methods, while exhibiting resilience against state-of-the-art backdoor defenses.

Paperid: 2317, https://arxiv.org/pdf/2507.19609.pdf

Abstract:
The Internet of Medical Things (IoMT) has the potential to radically improve healthcare by enabling real-time monitoring, remote diagnostics, and AI-driven decision making. However, the connectivity, embedded intelligence, and inclusion of a wide variety of novel sensors expose medical devices to severe cybersecurity threats, compromising patient safety and data privacy. In addition, many devices also have direct capacity - individually or in conjunction with other IoMT devices - to perform actions on the patient, such as delivering an electrical stimulus, administering a drug, or activating a motor, which can potentially be life-threatening. We provide a taxonomy of potential attacks targeting IoMT, presenting attack surfaces, vulnerabilities, and mitigation strategies across all layers of the IoMT architecture. It answers key questions such as: What makes IoMT security different from traditional IT security? What are the cybersecurity threats to medical devices? How can engineers design secure IoMT systems and protect hospital networks from cyberattacks? By analyzing historical cyber incidents, we highlight critical security gaps and propose practical security guidelines for medical device engineers and security professionals. This work bridges the gap between research and implementation, equipping healthcare stakeholders with actionable insights to build resilient and privacy-preserving IoMT ecosystems. Finally, we present the latest standardization and compliance frameworks, that IoMT security designers should be aware of.

Paperid: 2318, https://arxiv.org/pdf/2507.19032.pdf

Abstract:
A quantum copy-protection scheme (Aaronson, CCC 2009) encodes a functionality into a quantum state such that given this state, no efficient adversary can create two (possibly entangled) quantum states that are both capable of running the functionality. There has been a recent line of works on constructing provably-secure copy-protection schemes for general classes of schemes in the plain model, and most recently the recent work of Ãakan and Goyal (IACR Eprint, 2025) showed how to copy-protect all cryptographically puncturable schemes with pseudorandom puncturing points. In this work, we show how to copy-protect even a larger class of schemes. We define a class of cryptographic schemes called malleable-puncturable schemes where the only requirement is that one can create a circuit that is capable of answering inputs at points that are unrelated to the challenge in the security game but does not help the adversary answer inputs related to the challenge. This is a flexible generalization of puncturable schemes, and can capture a wide range of primitives that was not known how to copy-protect prior to our work. Going further, we show that our scheme is secure against arbitrary high min-entropy challenge distributions whereas previous work has only considered schemes that are punctured at pseudorandom points.

Paperid: 2319, https://arxiv.org/pdf/2507.18037.pdf

Abstract:
The MITRE Adversarial Tactics, Techniques and Common Knowledge (MITRE ATT&CK) Attack Technique to Proactive Software Supply Chain Risk Management Framework (P-SSCRM) Task mapping described in this document helps software organizations to determine how different tasks mitigate the attack techniques of software supply chain attacks. The mapping was created through four independent strategies to find agreed-upon mappings. Because each P-SSCRM task is mapped to one or more tasks from the 10 frameworks, the mapping we provide is also a mapping between MITRE ATT&CK and other prominent government and industry frameworks.

Paperid: 2320, https://arxiv.org/pdf/2507.17895.pdf

Abstract:
The most effective differentially private machine learning algorithms in practice rely on an additional source of purportedly public data. This paradigm is most interesting when the two sources combine to be more than the sum of their parts. However, there are settings such as mean estimation where we have strong lower bounds, showing that when the two data sources have the same distribution, there is no complementary value to combining the two data sources. In this work we extend the known lower bounds for public-private learning to setting where the two data sources exhibit significant distribution shift. Our results apply to both Gaussian mean estimation where the two distributions have different means, and to Gaussian linear regression where the two distributions exhibit parameter shift. We find that when the shift is small (relative to the desired accuracy), either public or private data must be sufficiently abundant to estimate the private parameter. Conversely, when the shift is large, public data provides no benefit.

Paperid: 2321, https://arxiv.org/pdf/2507.17589.pdf

Abstract:
With the rapid advancement of quantum computing, quantum compilation has become a crucial layer connecting high-level algorithms with physical hardware. In quantum cloud computing, compilation is performed on the cloud platforms, which expose user circuits to potential risks such as structural leakage and output predictability. To address these issues, we propose the encrypted-state quantum compilation scheme based on quantum circuit obfuscation (ECQCO), the first secure compilation scheme tailored for the co-location of compilers and quantum hardware for quantum cloud platforms. It applies quantum homomorphic encryption to conceal output states and instantiates a structure obfuscation mechanism based on quantum indistinguishability obfuscation, effectively protecting both functionality and topology of the circuit. Additionally, an adaptive decoupling obfuscation algorithm is designed to suppress potential idle errors while inserting pulse operations. The proposed scheme achieves information-theoretic security and guarantees computational indistinguishability under the quantum random oracle model. Experimental results on benchmark datasets demonstrate that ECQCO achieves a total variation distance (TVD) of up to 0.7 and a normalized graph edit distance (GED) of 0.88, enhancing compilation-stage security. Moreover, it introduces only a slight increase in circuit depth, while keeping the average fidelity change within 1\%, thus achieving a practical balance between security and efficiency.

Paperid: 2322, https://arxiv.org/pdf/2507.16181.pdf

Abstract:
Hardware crosstalk in multi-tenant superconducting quantum computers poses a severe security threat, allowing adversaries to induce targeted errors across tenant boundaries by injecting carefully engineered pulses. We present a simulation-based study of active crosstalk attacks at the pulse level, analyzing how adversarial control of pulse timing, shape, amplitude, and coupling can disrupt a victim's computation. Our framework models the time-dependent dynamics of a three-qubit system in the rotating frame, capturing both always-on couplings and injected drive pulses. We examine two attack strategies: attacker-first (pulse before victim operation) and victim-first (pulse after), and systematically identify the pulse and coupling configurations that cause the largest logical errors. Protocol-level experiments on quantum coin flip and XOR classification circuits show that some protocols are highly vulnerable to these attacks, while others remain robust. Based on these findings, we discuss practical methods for detection and mitigation to improve security in quantum cloud platforms.

Paperid: 2323, https://arxiv.org/pdf/2507.15101.pdf

Abstract:
Detecting partial deepfake speech is essential due to its potential for subtle misinformation. However, existing methods depend on costly frame-level annotations during training, limiting real-world scalability. Also, they focus on detecting transition artifacts between bonafide and deepfake segments. As deepfake generation techniques increasingly smooth these transitions, detection has become more challenging. To address this, our work introduces a new perspective by analyzing frame-level temporal differences and reveals that deepfake speech exhibits erratic directional changes and unnatural local transitions compared to bonafide speech. Based on this finding, we propose a Temporal Difference Attention Module (TDAM) that redefines partial deepfake detection as identifying unnatural temporal variations, without relying on explicit boundary annotations. A dual-level hierarchical difference representation captures temporal irregularities at both fine and coarse scales, while adaptive average pooling preserves essential patterns across variable-length inputs to minimize information loss. Our TDAM-AvgPool model achieves state-of-the-art performance, with an EER of 0.59% on the PartialSpoof dataset and 0.03% on the HAD dataset, which significantly outperforms the existing methods without requiring frame-level supervision.

Paperid: 2324, https://arxiv.org/pdf/2507.10836.pdf

Abstract:
Graph Neural Network (GNN)-based network intrusion detection systems (NIDS) are often evaluated on single datasets, limiting their ability to generalize under distribution drift. Furthermore, their adversarial robustness is typically assessed using synthetic perturbations that lack realism. This measurement gap leads to an overestimation of GNN-based NIDS resilience. To address the limitations, we propose \textbf{REAL-IoT}, a comprehensive framework for robustness evaluation of GNN-based NIDS in IoT environments. Our framework presents a methodology that creates a unified dataset from canonical datasets to assess generalization under drift. In addition, it features a novel intrusion dataset collected from a physical IoT testbed, which captures network traffic and attack scenarios under real-world settings. Furthermore, using REAL-IoT, we explore the usage of Large Language Models (LLMs) to analyze network data and mitigate the impact of adversarial examples by filtering suspicious flows. Our evaluations using REAL-IoT reveal performance drops in GNN models compared to results from standard benchmarks, quantifying their susceptibility to drift and realistic attacks. We also demonstrate the potential of LLM-based filtering to enhance robustness. These findings emphasize the necessity of realistic threat modeling and rigorous measurement practices for developing resilient IoT intrusion detection systems.

Paperid: 2325, https://arxiv.org/pdf/2507.10808.pdf

Abstract:
In the era of the Fourth Industrial Revolution, cybersecurity and intrusion detection systems are vital for the secure and reliable operation of IoT and IIoT environments. A key challenge in this domain is the scarcity of labeled cyberattack data, as most industrial systems operate under normal conditions. This data imbalance, combined with the high cost of annotation, hinders the effective training of machine learning models. Moreover, the rapid detection of attacks is essential, especially in critical infrastructure, to prevent large-scale disruptions. To address these challenges, we propose a real-time intrusion detection system based on a semi-supervised contrastive learning framework using the Kolmogorov-Arnold Network (KAN). Our method leverages abundant unlabeled data to effectively distinguish between normal and attack behaviors. We validate our approach on three benchmark datasets, UNSW-NB15, BoT-IoT, and Gas Pipeline, using only 2.20%, 1.28%, and 8% of labeled samples, respectively, to simulate real-world conditions. Experimental results show that our method outperforms existing contrastive learning-based approaches. We further compare KAN with a traditional multilayer perceptron (MLP), demonstrating KAN's superior performance in both detection accuracy and robustness under limited supervision. KAN's ability to model complex relationships, along with its learnable activation functions, is also explored and visualized, offering interpretability and the potential for rule extraction. The method supports multi-class classification and proves effective in safety, critical environments where reliability is paramount.

Paperid: 2326, https://arxiv.org/pdf/2507.10578.pdf

Abstract:
Poisoning attacks pose significant challenges to the robustness of diffusion models (DMs). In this paper, we systematically analyze when and where poisoning attacks textual inversion (TI), a widely used personalization technique for DMs. We first introduce Semantic Sensitivity Maps, a novel method for visualizing the influence of poisoning on text embeddings. Second, we identify and experimentally verify that DMs exhibit non-uniform learning behavior across timesteps, focusing on lower-noise samples. Poisoning attacks inherit this bias and inject adversarial signals predominantly at lower timesteps. Lastly, we observe that adversarial signals distract learning away from relevant concept regions within training data, corrupting the TI process. Based on these insights, we propose Safe-Zone Training (SZT), a novel defense mechanism comprised of 3 key components: (1) JPEG compression to weaken high-frequency poison signals, (2) restriction to high timesteps during TI training to avoid adversarial signals at lower timesteps, and (3) loss masking to constrain learning to relevant regions. Extensive experiments across multiple poisoning methods demonstrate that SZT greatly enhances the robustness of TI against all poisoning attacks, improving generative quality beyond prior published defenses. Code: www.github.com/JStyborski/Diff_Lab Data: www.github.com/JStyborski/NC10

Paperid: 2327, https://arxiv.org/pdf/2507.09857.pdf

Abstract:
Adversarial attacks on robotic grasping provide valuable insights into evaluating and improving the robustness of these systems. Unlike studies that focus solely on neural network predictions while overlooking the physical principles of grasping, this paper introduces AdvGrasp, a framework for adversarial attacks on robotic grasping from a physical perspective. Specifically, AdvGrasp targets two core aspects: lift capability, which evaluates the ability to lift objects against gravity, and grasp stability, which assesses resistance to external disturbances. By deforming the object's shape to increase gravitational torque and reduce stability margin in the wrench space, our method systematically degrades these two key grasping metrics, generating adversarial objects that compromise grasp performance. Extensive experiments across diverse scenarios validate the effectiveness of AdvGrasp, while real-world validations demonstrate its robustness and practical applicability

Paperid: 2328, https://arxiv.org/pdf/2507.09288.pdf

Abstract:
Quantum Key Distribution (QKD) offers information-theoretic security against quantum computing threats, but integrating QKD into existing security protocols remains an unsolved challenge due to fundamental mismatches between pre-distributed quantum keys and computational key exchange paradigms. This paper presents the first systematic comparison of sequential versus parallel hybrid QKD-PQC key establishment strategies for IPsec, revealing fundamental protocol design principles that extend beyond specific implementations. We introduce two novel approaches for incorporating QKD into Internet Key Exchange version 2 (IKEv2) with support for both ETSI GS QKD 004 stateful and ETSI GS QKD 014 stateless API specifications: (1) a pure QKD approach that replaces computational key derivation with identifier-based quantum key coordination, and (2) a unified QKD-KEM abstraction that enables parallel composition of quantum and post-quantum cryptographic methods within existing protocol frameworks. Our key insight is that parallel hybrid approaches eliminate the multiplicative latency penalties inherent in sequential methods mandated by RFC 9370, achieving significant performance improvements under realistic network conditions. Performance evaluation using a Docker-based testing framework with IDQuantique QKD hardware demonstrates that the parallel hybrid approach significantly outperforms sequential methods under network latency conditions, while pure QKD achieves minimal bandwidth overhead through identifier-based key coordination. Our implementations provide practical quantum-enhanced IPsec solutions suitable for critical infrastructure deployments requiring defense-in-depth security.

Paperid: 2329, https://arxiv.org/pdf/2507.08904.pdf

Abstract:
Beam alignment (BA) is a crucial process in millimeter-wave (mmWave) communications, enabling precise directional transmission and efficient link establishment. However, due to characteristics like omnidirectional exposure and the broadcast nature of the BA phase, it is particularly vulnerable to eavesdropping and identity impersonation attacks. To this end, this paper proposes a novel secure framework named CovertAuth, designed to enhance the security of the BA phase against such attacks. In particular, to combat eavesdropping attacks, the closed-form expressions of successful BA probability and covert transmission rate are first derived. Then, a covert communication problem aimed at jointly optimizing beam training budget and transmission power is formulated to maximize covert communication rate, subject to the covertness requirement. An alternating optimization algorithm combined with successive convex approximation is employed to iteratively achieve optimal results. To combat impersonation attacks, the mutual coupling effect of antenna array impairments is explored as a device feature to design a weighted-sum energy detector based physical layer authentication scheme. Moreover, theoretical models for authentication metrics like detection and false alarm probabilities are also provided to conduct performance analysis. Based on these models, an optimization problem is constructed to determine the optimal weight value that maximizes authentication accuracy. Finally, simulation results demonstrate that CovertAuth presents improved detection accuracy under the same covertness requirement compared to existing works.

Paperid: 2330, https://arxiv.org/pdf/2507.07732.pdf

Abstract:
This paper presents RADAR, a tracking algorithm for vehicles participating in Cooperative Intelligent Transportation Systems (C-ITS) that exploits multiple radio signals emitted by a modern vehicle to break privacy-preserving pseudonym schemes deployed in VANETs. This study shows that by combining Dedicated Short Range Communication (DSRC) and Wi-Fi probe request messages broadcast by the vehicle, it is possible to improve tracking over standard de-anonymization approaches that only leverage DSRC, especially in realistic scenarios where the attacker does not have full coverage of the entire vehicle path. The experimental evaluation compares three different metrics for pseudonym and Wi-Fi probe identifier association (Count, Statistical RSSI, and Pearson RSSI), demonstrating that the Pearson RSSI metric is better at tracking vehicles under pseudonym-changing schemes in all scenarios and against previous works. As an additional contribution to the state-of-the-art, we publicly release all implementations and simulation scenarios used in this work.

Paperid: 2331, https://arxiv.org/pdf/2507.07244.pdf

Abstract:
In the ever-evolving landscape of cybersecurity, the rapid identification and mitigation of Advanced Persistent Threats (APTs) is crucial. Security practitioners rely on detailed threat reports to understand the tactics, techniques, and procedures (TTPs) employed by attackers. However, manually extracting attack testflows from these reports requires elusive knowledge and is time-consuming and prone to errors. This paper proposes FLOWGUARDIAN, a novel solution leveraging language models (i.e., BERT) and Natural Language Processing (NLP) techniques to automate the extraction of attack testflows from unstructured threat reports. FLOWGUARDIAN systematically analyzes and contextualizes security events, reconstructs attack sequences, and then generates comprehensive testflows. This automated approach not only saves time and reduces human error but also ensures comprehensive coverage and robustness in cybersecurity testing. Empirical validation using public threat reports demonstrates FLOWGUARDIAN's accuracy and efficiency, significantly enhancing the capabilities of security teams in proactive threat hunting and incident response.

Paperid: 2332, https://arxiv.org/pdf/2507.06742.pdf

Abstract:
Ethical hacking today relies on highly skilled practitioners executing complex sequences of commands, which is inherently time-consuming, difficult to scale, and prone to human error. To help mitigate these limitations, we previously introduced 'PenTest++', an AI-augmented system combining automation with generative AI supporting ethical hacking workflows. However, a key limitation of PenTest++ was its lack of support for privilege escalation, a crucial element of ethical hacking. In this paper we present 'PenTest2.0', a substantial evolution of PenTest++ supporting automated privilege escalation driven entirely by Large Language Model reasoning. It also incorporates several significant enhancements: 'Retrieval-Augmented Generation', including both one-line and offline modes; 'Chain-of-Thought' prompting for intermediate reasoning; persistent 'PenTest Task Trees' to track goal progression across turns; and the optional integration of human-authored hints. We describe how it operates, present a proof-of-concept prototype, and discuss its benefits and limitations. We also describe application of the system to a controlled Linux target, showing it can carry out multi-turn, adaptive privilege escalation. We explain the rationale behind its core design choices, and provide comprehensive testing results and cost analysis. Our findings indicate that 'PenTest2.0' represents a meaningful step toward practical, scalable, AI-automated penetration testing, whilst highlighting the shortcomings of generative AI systems, particularly their sensitivity to prompt structure, execution context, and semantic drift, reinforcing the need for further research and refinement in this emerging space. Keywords: AI, Ethical Hacking, Privilege Escalation, GenAI, ChatGPT, LLM (Large Language Model), HITL (Human-in-the-Loop)

Paperid: 2333, https://arxiv.org/pdf/2507.06490.pdf

Abstract:
We introduce the new AXU hash function decBRWHash, which is parameterised by the positive integer $c$ and is based on Bernstein-Rabin-Winograd (BRW) polynomials. Choosing $c>1$ gives a hash function which can be implemented using $c$-way single instruction multiple data (SIMD) instructions. We report a set of very comprehensive hand optimised assembly implementations of 4-decBRWHash using avx2 SIMD instructions available on modern Intel processors. For comparison, we also report similar carefully optimised avx2 assembly implementations of polyHash, an AXU hash function based on usual polynomials. Our implementations are over prime order fields, specifically the primes $2^{127}-1$ and $2^{130}-5$. For the prime $2^{130}-5$, for avx2 implementations, compared to the famous Poly1305 hash function, 4-decBRWHash is faster for messages which are a few hundred bytes long and achieves a speed-up of about 16% for message lengths in a few kilobytes range and improves to a speed-up of about 23% for message lengths in a few megabytes range.

Paperid: 2334, https://arxiv.org/pdf/2507.05415.pdf

Abstract:
Privacy policies are often complex. An exception is the two-page standardized notice that U.S. financial institutions must provide under the Gramm-Leach-Bliley Act (GLBA). However, banks now operate websites, mobile apps, and other services that involve complex data sharing practices that require additional privacy notices and do-not-sell opt-outs. We conducted a large-scale analysis of how U.S. banks implement privacy policies and controls in response to GLBA; other federal privacy policy requirements; and the California Consumer Privacy Act (CCPA), a key example for U.S. state privacy laws. We focused on the disclosure and control of a set of especially privacy-invasive practices: third-party data sharing for marketing-related purposes. We collected privacy policies for the 2,067 largest U.S. banks, 45.3\% of which provided multiple policies. Across disclosures and controls within the \textit{same} bank, we identified frequent, concerning inconsistencies -- such as banks indicating in GLBA notices that they do not share with third parties but disclosing sharing elsewhere, or using third-party marketing/advertising cookies without disclosure. This multiplicity of policies, with the inconsistencies it causes, may create consumer confusion and undermine the transparency goals of the very laws that require them. Our findings call into question whether current policy requirements, such as the GLBA notice, are achieving their intended goals in today's online banking landscape. We discuss potential avenues for reforming and harmonizing privacy policies and control requirements across federal and state laws.

Paperid: 2335, https://arxiv.org/pdf/2507.04775.pdf

Abstract:
Word-wise Fully Homomorphic Encryption (FHE) schemes, such as CKKS, are gaining significant traction due to their ability to provide post-quantum-resistant, privacy-preserving approximate computing; an especially desirable feature in Machine-Learning-as-a-Service (MLaaS) cloud-computing paradigms. OpenFHE is a leading CPU-based FHE library with robust CKKS operations, but its server-side performance is not yet sufficient for practical cloud deployment. As GPU computing becomes more common in data centers, many FHE libraries are adding GPU support. However, integrating an efficient GPU backend into OpenFHE is challenging. While OpenFHE uses a Hardware Abstraction Layer (HAL), its flexible architecture sacrifices performance due to the abstraction layers required for multi-scheme and multi-backend compatibility. In this work, we introduce FIDESlib, the first open-source server-side CKKS GPU library that is fully interoperable with well-established client-side OpenFHE operations. Unlike other existing open-source GPU libraries, FIDESlib provides the first implementation featuring heavily optimized GPU kernels for all CKKS primitives, including bootstrapping. Our library also integrates robust benchmarking and testing, ensuring it remains adaptable to further optimization. Furthermore, its software architecture is designed to support extensions to a multi-GPU backend for enhanced acceleration. Our experiments across various GPU systems and the leading open-source CKKS library to date, Phantom, show that FIDESlib offers superior performance and scalability. For bootstrapping, FIDESlib achieves no less than 70x speedup over the AVX-optimized OpenFHE implementation.

Paperid: 2336, https://arxiv.org/pdf/2507.03993.pdf

Abstract:
This paper addresses the critical need for high-quality malware datasets that support advanced analysis techniques, particularly machine learning and agentic AI frameworks. Existing datasets often lack diversity, comprehensive labelling, and the complexity necessary for effective machine learning and agent-based AI training. To fill this gap, we developed a systematic approach for generating a dataset that combines automated malware execution in controlled virtual environments with dynamic monitoring tools. The resulting dataset comprises clean and infected memory snapshots across multiple malware families and operating systems, capturing detailed behavioural and environmental features. Key design decisions include applying ethical and legal compliance, thorough validation using both automated and manual methods, and comprehensive documentation to ensure replicability and integrity. The dataset's distinctive features enable modelling system states and transitions, facilitating RL-based malware detection and response strategies. This resource is significant for advancing adaptive cybersecurity defences and digital forensic research. Its scope supports diverse malware scenarios and offers potential for broader applications in incident response and automated threat mitigation.

Paperid: 2337, https://arxiv.org/pdf/2507.03344.pdf

Abstract:
The Rust programming language enforces three basic Rust principles, namely ownership, borrowing, and AXM (Aliasing Xor Mutability) to prevent security bugs such as memory safety violations and data races. However, Rust projects often have mixed code, i.e., code that also uses unsafe Rust, FFI (Foreign Function Interfaces), and inline assembly for low-level control. The Rust compiler is unable to statically enforce Rust principles in mixed Rust code which can lead to many security vulnerabilities. In this paper, we propose CapsLock, a security enforcement mechanism that can run at the level of machine code and detect Rust principle violations at run-time in mixed code. CapsLock is kept simple enough to be implemented into recent capability-based hardware abstractions that provide low-cost spatial memory safety. CapsLock introduces a novel revoke-on-use abstraction for capability-based designs, wherein accessing a memory object via a capability implicitly invalidates certain other capabilities pointing to it, thereby also providing temporal memory safety automatically, without requiring software to explicitly specify such invalidation. Thus, CapsLock is the first mechanism capable of providing cross-language enforcement of Rust principles. We implemented a prototype of CapsLock on QEMU. Evaluation results show that CapsLock is highly compatible with existing Rust code (passing 99.7% of the built-in test cases of the 100 most popular crates) and flags Rust principle violations in real-world Rust projects that use FFI or inline assembly. We discovered 8 previously unknown bugs in such crates in our experiments.

Paperid: 2338, https://arxiv.org/pdf/2507.02959.pdf

Abstract:
Active learning for classification seeks to reduce the cost of labeling samples by finding unlabeled examples about which the current model is least certain and sending them to an annotator/expert to label. Bayesian theory can provide a probabilistic view of deep neural network models by asserting a prior distribution over model parameters and estimating the uncertainties by posterior distribution over these parameters. This paper proposes two novel active learning approaches to label one million malware examples belonging to different unknown modern malware families. The first model is Inception-V4+PCA combined with several support vector machine (SVM) algorithms (UTSVM, PSVM, SVM-GSU, TBSVM). The second model is Vision Transformer based Bayesian Neural Networks ViT-BNN. Our proposed ViT-BNN is a state-of-the-art active learning approach that differs from current methods and can apply to any particular task. The experiments demonstrate that the ViT-BNN is more stable and robust in handling uncertainty.

Paperid: 2339, https://arxiv.org/pdf/2507.02181.pdf

Abstract:
This paper introduces {\em truncated inner $c$-differential cryptanalysis}, a novel technique that for the first time enables the practical application of $c$-differential uniformity to block ciphers. While Ellingsen et al. (IEEE Trans. Inf. Theory, 2020) established the notion of $c$-differential uniformity using $(F(x\oplus a), cF(x))$, a key challenge remained: multiplication by $c$ disrupts the structural properties essential for block cipher analysis, particularly key addition. We resolve this challenge by developing an \emph{inner} $c$-differential approach where multiplication by $c$ affects the input: $(F(cx\oplus a), F(x))$. We prove that the inner $c$-differential uniformity of a function $F$ equals the outer $c$-differential uniformity of $F^{-1}$, establishing a fundamental duality. This modification preserves cipher structure while enabling practical cryptanalytic applications. Our main contribution is a comprehensive multi-faceted statistical-computational framework, implementing truncated $c$-differential analysis against the full 9-round Kuznyechik cipher (the inner $c$-differentials are immune to the key whitening at the backend). Through extensive computational analysis involving millions of differential pairs, we demonstrate statistically significant non-randomness across all tested round counts. For the full 9-round cipher, we identify multiple configurations triggering critical security alerts, with bias ratios reaching $1.7\times$ and corrected p-values as low as $1.85 \times 10^{-3}$, suggesting insufficient security margin against this new attack vector. This represents the first practical distinguisher against the full 9-round Kuznyechik.

Paperid: 2340, https://arxiv.org/pdf/2507.01607.pdf

Abstract:
The widespread deployment of Deep Learning-based Face Recognition Systems raises multiple security concerns. While prior research has identified backdoor vulnerabilities on isolated components, Backdoor Attacks on real-world, unconstrained pipelines remain underexplored. This paper presents the first comprehensive system-level analysis of Backdoor Attacks targeting Face Recognition Systems and provides three contributions. We first show that face feature extractors trained with large margin metric learning losses are susceptible to Backdoor Attacks. By analyzing 20 pipeline configurations and 15 attack scenarios, we then reveal that a single backdoor can compromise an entire Face Recognition System. Finally, we propose effective best practices and countermeasures for stakeholders.

Paperid: 2341, https://arxiv.org/pdf/2507.00230.pdf

Abstract:
Reconstructing high-quality images from low-resolution inputs using Residual Dense Spatial Networks (RDSNs) is crucial yet challenging, particularly in collaborative scenarios where centralized training poses significant privacy risks, including data leakage and inference attacks, as well as high computational costs. We propose a novel Privacy-Preserving Federated Learning-based RDSN (PPFL-RDSN) framework specifically tailored for lossy image reconstruction. PPFL-RDSN integrates Federated Learning (FL), local differential privacy, and robust model watermarking techniques, ensuring data remains secure on local devices, safeguarding sensitive information, and maintaining model authenticity without revealing underlying data. Empirical evaluations show that PPFL-RDSN achieves comparable performance to the state-of-the-art centralized methods while reducing computational burdens, and effectively mitigates security and privacy vulnerabilities, making it a practical solution for secure and privacy-preserving collaborative computer vision applications.

Paperid: 2342, https://arxiv.org/pdf/2507.00015.pdf

Abstract:
The remarkable success of transformers across various fields such as natural language processing and computer vision has paved the way for their applications in automatic modulation classification, a critical component in the communication systems of Internet of Things (IoT) devices. However, it has been observed that transformer-based classification of radio signals is susceptible to subtle yet sophisticated adversarial attacks. To address this issue, we have developed a defensive strategy for transformer-based modulation classification systems to counter such adversarial attacks. In this paper, we propose a novel vision transformer (ViT) architecture by introducing a new concept known as adversarial indicator (AdvI) token to detect adversarial attacks. To the best of our knowledge, this is the first work to propose an AdvI token in ViT to defend against adversarial attacks. Integrating an adversarial training method with a detection mechanism using AdvI token, we combine a training time defense and running time defense in a unified neural network model, which reduces architectural complexity of the system compared to detecting adversarial perturbations using separate models. We investigate into the operational principles of our method by examining the attention mechanism. We show the proposed AdvI token acts as a crucial element within the ViT, influencing attention weights and thereby highlighting regions or features in the input data that are potentially suspicious or anomalous. Through experimental results, we demonstrate that our approach surpasses several competitive methods in handling white-box attack scenarios, including those utilizing the fast gradient method, projected gradient descent attacks and basic iterative method.

Paperid: 2343, https://arxiv.org/pdf/2512.24499.pdf

Abstract:
The rapid expansion of generative AI has normalized large-scale synthetic media creation, enabling new forms of covert communication. Recent generative steganography methods, particularly those based on diffusion models, can embed high-capacity payloads without fine-tuning or auxiliary decoders, creating significant challenges for detection and remediation. Coverless diffusion-based techniques are difficult to counter because they generate image carriers directly from secret data, enabling attackers to deliver stegomalware for command-and-control, payload staging, and data exfiltration while bypassing detectors that rely on cover-stego discrepancies. This work introduces Adversarial Diffusion Sanitization (ADS), a training-free defense for security gateways that neutralizes hidden payloads rather than detecting them. ADS employs an off-the-shelf pretrained denoiser as a differentiable proxy for diffusion-based decoders and incorporates a color-aware, quaternion-coupled update rule to reduce artifacts under strict distortion limits. Under a practical threat model and in evaluation against the state-of-the-art diffusion steganography method Pulsar, ADS drives decoder success rates to near zero with minimal perceptual impact. Results demonstrate that ADS provides a favorable security-utility trade-off compared to standard content transformations, offering an effective mitigation strategy against diffusion-driven steganography.

Paperid: 2344, https://arxiv.org/pdf/2512.23779.pdf

Abstract:
Large language models (LLMs) can be driven into over-generation, emitting thousands of tokens before producing an end-of-sequence (EOS) token. This degrades answer quality, inflates latency and cost, and can be weaponized as a denial-of-service (DoS) attack. Recent work has begun to study DoS-style prompt attacks, but typically focuses on a single attack algorithm or assumes white-box access, without an attack-side benchmark that compares prompt-based attackers in a black-box, query-only regime with a known tokenizer. We introduce such a benchmark and study two prompt-only attackers. The first is Evolutionary Over-Generation Prompt Search (EOGen), which searches the token space for prefixes that suppress EOS and induce long continuations. The second is a goal-conditioned reinforcement learning attacker (RL-GOAL) that trains a network to generate prefixes conditioned on a target length. To characterize behavior, we introduce Over-Generation Factor (OGF), the ratio of produced tokens to a model's context window, along with stall and latency summaries. Our evolutionary attacker achieves mean OGF = 1.38 +/- 1.15 and Success@OGF >= 2 of 24.5 percent on Phi-3. RL-GOAL is stronger: across victims it achieves higher mean OGF (up to 2.81 +/- 1.38).

Paperid: 2345, https://arxiv.org/pdf/2512.23774.pdf

Abstract:
API gateways serve as critical enforcement points for security, governance, and traffic management in cloud-native systems. As organizations increasingly adopt multi-cluster and hybrid cloud deployments, maintaining consistent policy enforcement, predictable performance, and operational stability across heterogeneous gateway environments becomes challenging. Existing approaches typically manage security, governance, and performance as loosely coupled concerns, leading to configuration drift, delayed policy propagation, and unstable runtime behavior under dynamic workloads. This paper presents a governance-aware, intent-driven architecture for coordinated API gateway management in multi-cluster cloud environments. The proposed approach expresses security, governance, and performance objectives as high-level declarative intents, which are systematically translated into enforceable gateway configurations and continuously validated through policy verification and telemetry-driven feedback. By decoupling intent specification from enforcement while enabling bounded, policy-compliant adaptation, the architecture supports heterogeneous gateway implementations without compromising governance guarantees or service-level objectives. A prototype implementation across multiple Kubernetes clusters demonstrates the effectiveness of the proposed design. Experimental results show up to a 42% reduction in policy drift, a 31% improvement in configuration propagation time, and sustained p95 latency overhead below 6% under variable workloads, compared to manual and declarative baseline approaches. These results indicate that governance-aware, intent-driven gateway orchestration provides a scalable and reliable foundation for secure, consistent, and performance-predictable cloud-native platforms.

Paperid: 2346, https://arxiv.org/pdf/2512.23760.pdf

Abstract:
Reinforcement learning is increasingly used to transform large language models into agentic systems that act over long horizons, invoke tools, and manage memory under partial observability. While recent work has demonstrated performance gains through tool learning, verifiable rewards, and continual training, deployed self-improving agents raise unresolved security and governance challenges: optimization pressure can incentivize reward hacking, behavioral drift is difficult to audit or reproduce, and improvements are often entangled in opaque parameter updates rather than reusable, verifiable artifacts. This paper proposes Audited Skill-Graph Self-Improvement (ASG-SI), a framework that treats self-improvement as iterative compilation of an agent into a growing, auditable skill graph. Each candidate improvement is extracted from successful trajectories, normalized into a skill with an explicit interface, and promoted only after passing verifier-backed replay and contract checks. Rewards are decomposed into reconstructible components derived from replayable evidence, enabling independent audit of promotion decisions and learning signals. ASG-SI further integrates experience synthesis for scalable stress testing and continual memory control to preserve long-horizon performance under bounded context. We present a complete system architecture, threat model, and security analysis, and provide a fully runnable reference implementation that demonstrates verifier-backed reward construction, skill compilation, audit logging, and measurable improvement under continual task streams. ASG-SI reframes agentic self-improvement as accumulation of verifiable, reusable capabilities, offering a practical path toward reproducible evaluation and operational governance of self-improving AI agents.

Paperid: 2347, https://arxiv.org/pdf/2512.22759.pdf

Abstract:
Identifying social bots has become a critical challenge due to their significant influence on social media ecosystems. Despite advancements in detection methods, most topology-based approaches insufficiently account for the heterogeneity of neighborhood preferences and lack a systematic theoretical foundation, relying instead on intuition and experience. Here, we propose a theoretical framework for detecting social bots utilizing heterogeneous motifs based on the Naïve Bayes model. Specifically, we refine homogeneous motifs into heterogeneous ones by incorporating node-label information, effectively capturing the heterogeneity of neighborhood preferences. Additionally, we systematically evaluate the contribution of different node pairs within heterogeneous motifs to the likelihood of a node being identified as a social bot. Furthermore, we mathematically quantify the maximum capability of each heterogeneous motif, enabling the estimation of its potential benefits. Comprehensive evaluations on four large, publicly available benchmarks confirm that our method surpasses state-of-the-art techniques, achieving superior performance across five evaluation metrics. Moreover, our results reveal that selecting motifs with the highest capability achieves detection performance comparable to using all heterogeneous motifs. Overall, our framework offers an effective and theoretically grounded solution for social bot detection, significantly enhancing cybersecurity measures in social networks.

Paperid: 2348, https://arxiv.org/pdf/2512.22223.pdf

Abstract:
Modern networks generate vast, heterogeneous traffic that must be continuously analyzed for security and performance. Traditional network traffic analysis systems, whether rule-based or machine learning-driven, often suffer from high false positives and lack interpretability, limiting analyst trust. In this paper, we present ReGAIN, a multi-stage framework that combines traffic summarization, retrieval-augmented generation (RAG), and Large Language Model (LLM) reasoning for transparent and accurate network traffic analysis. ReGAIN creates natural-language summaries from network traffic, embeds them into a multi-collection vector database, and utilizes a hierarchical retrieval pipeline to ground LLM responses with evidence citations. The pipeline features metadata-based filtering, MMR sampling, a two-stage cross-encoder reranking mechanism, and an abstention mechanism to reduce hallucinations and ensure grounded reasoning. Evaluated on ICMP ping flood and TCP SYN flood traces from the real-world traffic dataset, it demonstrates robust performance, achieving accuracy between 95.95% and 98.82% across different attack types and evaluation benchmarks. These results are validated against two complementary sources: dataset ground truth and human expert assessments. ReGAIN also outperforms rule-based, classical ML, and deep learning baselines while providing unique explainability through trustworthy, verifiable responses.

Paperid: 2349, https://arxiv.org/pdf/2512.20964.pdf

Abstract:
Autonomous Vehicles (AVs) refer to systems capable of perceiving their states and moving without human intervention. Among the factors required for autonomous decision-making in mobility, positional awareness of the vehicle itself is the most critical. Accordingly, extensive research has been conducted on defense mechanisms against GPS spoofing attacks, which threaten AVs by disrupting position recognition. Among these, detection methods based on internal IMU sensors are regarded as some of the most effective. In this paper, we propose a spoofing attack system designed to neutralize IMU sensor-based detection. First, we present an attack modeling approach for bypassing such detection. Then, based on EKF sensor fusion, we experimentally analyze both the impact of GPS spoofing values on the internal target system and how our proposed methodology reduces anomaly detection within the target system. To this end, this paper proposes an attack model that performs GPS spoofing by stealing internal dynamic state information using an external IMU sensor, and the experimental results demonstrate that attack values can be injected without being detected.

Paperid: 2350, https://arxiv.org/pdf/2512.20705.pdf

Abstract:
Detecting business logic vulnerabilities is a critical challenge in software security. These flaws come from mistakes in an application's design or implementation and allow attackers to trigger unintended application behavior. Traditional fuzzing sanitizers for dynamic analysis excel at finding vulnerabilities related to memory safety violations but largely fail to detect business logic vulnerabilities, as these flaws require understanding application-specific semantic context. Recent attempts to infer this context, due to their reliance on heuristics and non-portable language features, are inherently brittle and incomplete. As business logic vulnerabilities constitute a majority (27/40) of the most dangerous software weaknesses in practice, this is a worrying blind spot of existing tools. In this paper, we tackle this challenge with ANOTA, a novel human-in-the-loop sanitizer framework. ANOTA introduces a lightweight, user-friendly annotation system that enables users to directly encode their domain-specific knowledge as lightweight annotations that define an application's intended behavior. A runtime execution monitor then observes program behavior, comparing it against the policies defined by the annotations, thereby identifying deviations that indicate vulnerabilities. To evaluate the effectiveness of ANOTA, we combine ANOTA with a state-of-the-art fuzzer and compare it against other popular bug finding methods compatible with the same targets. The results show that ANOTA+FUZZER outperforms them in terms of effectiveness. More specifically, ANOTA+FUZZER can successfully reproduce 43 known vulnerabilities, and discovered 22 previously unknown vulnerabilities (17 CVEs assigned) during the evaluation. These results demonstrate that ANOTA provides a practical and effective approach for uncovering complex business logic flaws often missed by traditional security testing techniques.

Paperid: 2351, https://arxiv.org/pdf/2512.18714.pdf

Abstract:
The increasing cyber threats to critical infrastructure highlight the importance of private companies and government agencies in detecting and sharing information about threat activities. Although the need for improved threat information sharing is widely recognized, various technical and organizational challenges persist, hindering effective collaboration. In this study, we review the challenges that disturb the sharing of usable threat information to critical infrastructure operators within the ICS domain. We analyze three major incidents: Stuxnet, Industroyer, and Triton. In addition, we perform a systematic analysis of 196 procedure examples across 79 MITRE ATT&CK techniques from 22 ICS-related malware families, utilizing automated natural language processing techniques to systematically extract and categorize threat observables. Additionally, we investigated nine recent ICS vulnerability advisories from the CISA Known Exploitable Vulnerability catalog. Our analysis identified four important limitations in the ICS threat information sharing ecosystem: (i) the lack of coherent representation of artifacts related to ICS adversarial techniques in information sharing language standards (e.g., STIX); (ii) the dependence on undocumented proprietary technologies; (iii) limited technical details provided in vulnerability and threat incident reports; and (iv) the accessibility of technical details for observed adversarial techniques. This study aims to guide the development of future information-sharing standards, including the enhancement of the cyber-observable objects schema in STIX, to ensure accurate representation of artifacts specific to ICS environments.

Paperid: 2352, https://arxiv.org/pdf/2512.18616.pdf

Abstract:
We present DASH (Deception-Augmented Shared mental model for Human-machine teaming), a novel framework that enhances mission resilience by embedding proactive deception into Shared Mental Models (SMM). Designed for mission-critical applications such as surveillance and rescue, DASH introduces "bait tasks" to detect insider threats, e.g., compromised Unmanned Ground Vehicles (UGVs), AI agents, or human analysts, before they degrade team performance. Upon detection, tailored recovery mechanisms are activated, including UGV system reinstallation, AI model retraining, or human analyst replacement. In contrast to existing SMM approaches that neglect insider risks, DASH improves both coordination and security. Empirical evaluations across four schemes (DASH, SMM-only, no-SMM, and baseline) show that DASH sustains approximately 80% mission success under high attack rates, eight times higher than the baseline. This work contributes a practical human-AI teaming framework grounded in shared mental models, a deception-based strategy for insider threat detection, and empirical evidence of enhanced robustness under adversarial conditions. DASH establishes a foundation for secure, adaptive human-machine teaming in contested environments.

Paperid: 2353, https://arxiv.org/pdf/2512.18132.pdf

Abstract:
Edge AI inference is becoming prevalent thanks to the emergence of small yet high-performance microprocessors. This shift from cloud to edge processing brings several benefits in terms of energy savings, improved latency, and increased privacy. On the downside, bringing computation to the edge makes them more vulnerable to physical side-channel attacks (SCA), which aim to extract the confidentiality of neural network models, e.g., architecture and weight. To address this growing threat, we propose PermuteV, a performant side-channel resistant RISC-V core designed to secure neural network inference. PermuteV employs a hardware-accelerated defense mechanism that randomly permutes the execution order of loop iterations, thereby obfuscating the electromagnetic (EM) signature associated with sensitive operations. We implement PermuteV on FPGA and perform evaluations in terms of side-channel security, hardware area, and runtime overhead. The experimental results demonstrate that PermuteV can effectively defend against EM SCA with minimal area and runtime overhead.

Paperid: 2354, https://arxiv.org/pdf/2512.16123.pdf

Abstract:
Deep learning-based object detection models play a critical role in real-world applications such as autonomous driving and security surveillance systems, yet they remain vulnerable to adversarial examples. In this work, we propose an autoencoder-based denoising defense to recover object detection performance degraded by adversarial perturbations. We conduct adversarial attacks using Perlin noise on vehicle-related images from the COCO dataset, apply a single-layer convolutional autoencoder to remove the perturbations, and evaluate detection performance using YOLOv5. Our experiments demonstrate that adversarial attacks reduce bbox mAP from 0.2890 to 0.1640, representing a 43.3% performance degradation. After applying the proposed autoencoder defense, bbox mAP improves to 0.1700 (3.7% recovery) and bbox mAP@50 increases from 0.2780 to 0.3080 (10.8% improvement). These results indicate that autoencoder-based denoising can provide partial defense against adversarial attacks without requiring model retraining.

Paperid: 2355, https://arxiv.org/pdf/2512.15815.pdf

Abstract:
Data sharing in large consortia, such as research collaborations or industry partnerships, requires addressing both organizational and technical challenges. A common platform is essential to promote collaboration, facilitate exchange of findings, and ensure secure access to sensitive data. Key technical challenges include creating a scalable architecture, a user-friendly interface, and robust security and access control. The BIG-MAP Archive is a cloud-based, disciplinary, private repository designed to address these challenges. Built on InvenioRDM, it leverages platform functionalities to meet consortium-specific needs, providing a tailored solution compared to general repositories. Access can be restricted to members of specific communities or open to the entire consortium, such as the BATTERY 2030+, a consortium accelerating advanced battery technologies. Uploaded data and metadata are controlled via fine grained permissions, allowing access to individual project members or the full initiative. The formalized upload process ensures data are formatted and ready for publication in open repositories when needed. This paper reviews the repository's key features, showing how the BIG-MAP Archive enables secure, controlled data sharing within large consortia. It ensures data confidentiality while supporting flexible, permissions-based access and can be easily redeployed for other consortia, including MaterialsCommons4.eu and RAISE (Resource for AI Science in Europe).

Paperid: 2356, https://arxiv.org/pdf/2512.15515.pdf

Abstract:
Homomorphic Encryption (HE) enables secure computation on encrypted data, addressing privacy concerns in cloud computing. However, the high computational cost of HE operations, particularly matrix multiplication (MM), remains a major barrier to its practical deployment. Accelerating homomorphic encrypted MM (HE MM) is therefore crucial for applications such as privacy-preserving machine learning. In this paper, we present a bandwidth-efficient FPGA implementation of HE MM. We first develop a cost model to evaluate the on-chip memory requirements for a given set of HE parameters and input matrix sizes. Our analysis shows that optimizing on-chip memory usage is critical for scalable and efficient HE MM. To this end, we design a novel datapath for Homomorphic Linear Transformation (HLT), the primary bottleneck in HE MM. The proposed datapath significantly reduces off-chip memory traffic and on-chip memory demand by enabling fine-grained data reuse. Leveraging this datapath, we introduce FAME, the first FPGA-based accelerator specifically tailored for HE MM. FAME supports arbitrary matrix shapes and is configurable across a wide range of HE parameter sets. We implement FAME on an Alveo U280 FPGA and evaluate its performance across diverse matrix sizes and shapes. Experimental results show that FAME achieves an average speedup of 221x over state-of-the-art CPU-based implementations, demonstrating its scalability and practicality for large-scale consecutive HE MM and real-world workloads.

Paperid: 2357, https://arxiv.org/pdf/2512.14748.pdf

Abstract:
Safety and security are critical to the reliable operation of connected and automated vehicles (CAVs). While existing research has identified correlations between the two domains, a theoretical framework to analyze their interaction mechanisms and guide co-design remains lacking. To address this gap, this paper proposes a copula-based joint safety-security analysis method to quantify their coupling effects. First, we formulate time-varying cyberattacks using dynamic risk functions derived from survival analysis, while modeling random hardware failures with the Weibull distribution, as per the automotive industry standard ISO 26262. Second, to capture the dependence between functional safety failures and cyber threats, we introduce a joint failure model based on copula theory, employing both elliptical (e.g., Gaussian) and Archimedean (e.g., Frank) copula families to construct a system-level failure function. Furthermore, we provide formal theoretical analysis of the dependence structure in the safety-security coupling, yielding three key insights: (1) a monotonic relationship between joint failure probability and dependence parameters, (2) the mechanisms of defensive response mechanisms (such as patch deployment) in mitigating joint failures, and (3) quantifying the dynamic coupling strength between safety and security under dependence structures. Through comprehensive simulations, we evaluate the sensitivity of the joint failure behavior to three critical factors: copula dependence parameters, security patch deployment timing, and Weibull distribution parameters. Our dynamic failure model further illustrates how cyberattacks affect safety failures and, conversely, how functional faults affect security failures under dependencies structures. This study provides a quantifiable theoretical foundation for the co-design of safety and security in CAVs.

Paperid: 2358, https://arxiv.org/pdf/2512.14242.pdf

Abstract:
The growing sophistication, frequency, and diversity of cyberattacks increasingly exceed the capacity of individual entities to fully understand and counter them. While existing solutions, such as Security Information and Event Management (SIEM) systems, Security Orchestration, Automation, and Response (SOAR) platforms, and Security Operation Center (SOC), play a vital role in mitigating known threats, they often struggle to effectively address emerging and unforeseen attacks. To increase the effectiveness of cyber defense, it is essential to foster greater information sharing between entities; however, this requires addressing the challenge of exchanging sensitive data without compromising confidentiality or operational security. To address the challenges of secure and confidential Cyber Threat Intelligence (CTI) sharing, we propose a novel architecture that federates Intrusion Tolerant Systems (ITSs) and leverages concepts from Malware Information Sharing Platform (MISP) to empower SOCs. This framework enables controlled collaboration and data privacy while enhancing collective defenses. As a proof of concept, we evaluate one module by applying Differential Privacy (DP) to Federated Learning (FL), observing a manageable accuracy drop from 98.42% to 85.98% (average loss 12.44%) while maintaining reliable detection of compromised messages. These results highlight the viability of secure data sharing and establishes a foundation for the future full-scale implementation of LegionITS.

Paperid: 2359, https://arxiv.org/pdf/2512.12313.pdf

Abstract:
The increasing sophistication of malware attacks in the npm ecosystem, characterized by obfuscation and complex logic, necessitates advanced detection methods. Recently, researchers have turned their attention from traditional detection approaches to Large Language Models (LLMs) due to their strong capabilities in semantic code understanding. However, while LLMs offer superior semantic reasoning for code analysis, their practical application is constrained by limited context windows and high computational cost. This paper addresses this challenge by introducing a novel framework that leverages code slicing techniques for an LLM-based malicious package detection task. We propose a specialized taintbased slicing technique for npm packages, augmented by a heuristic backtracking mechanism to accurately capture malicious data flows across asynchronous, event-driven patterns (e.g., callbacks and Promises) that elude traditional analysis. An evaluation on a dataset of more than 5000 malicious and benign npm packages demonstrates that our approach isolates security-relevant code, reducing input volume by over 99% while preserving critical behavioral semantics. Using the DeepSeek-Coder-6.7B model as the classification engine, our approach achieves a detection accuracy of 87.04%, substantially outperforming a naive token-splitting baseline (75.41%) and a traditional static-analysis-based approach. These results indicate that semantically optimized input representation via code slicing not only mitigates the LLM context-window bottleneck but also significantly enhances reasoning precision for security tasks, providing an efficient and effective defense against evolving malicious open-source packages.

Paperid: 2360, https://arxiv.org/pdf/2512.11699.pdf

Abstract:
This paper systematizes knowledge on the performance of Multi-Party Computation (MPC) protocols. Despite strong privacy and correctness guarantees, MPC adoption in real-world applications remains limited by high costs (especially in the malicious setting) and lack of guidance on choosing suitable protocols for concrete workloads. We identify the theoretical and practical parameters that shape MPC efficiency and conduct an extensive experimental study across diverse benchmarks. Our analysis discusses the trade-offs between protocols, and highlights which techniques align best with different application scenarios and needs. By providing actionable guidance for developers and outlining open challenges for researchers, this work seeks to narrow the gap between MPC theory and practice.

Paperid: 2361, https://arxiv.org/pdf/2512.09953.pdf

Abstract:
Machine unlearning aims to remove the influence of specific data points from a trained model to satisfy privacy, copyright, and safety requirements. In real deployments, providers distribute a global model to many edge devices, where each client personalizes the model using private data. When a deletion request is issued, clients may ignore it or falsely claim compliance, and providers cannot check their parameters or data. This makes verification difficult, especially because personalized models must forget the targeted samples while preserving local utility, and verification must remain lightweight on edge devices. We introduce ZK APEX, a zero-shot personalized unlearning method that operates directly on the personalized model without retraining. ZK APEX combines sparse masking on the provider side with a small Group OBS compensation step on the client side, using a blockwise empirical Fisher matrix to create a curvature-aware update designed for low overhead. Paired with Halo2 zero-knowledge proofs, it enables the provider to verify that the correct unlearning transformation was applied without revealing any private data or personalized parameters. On Vision Transformer classification tasks, ZK APEX recovers nearly all personalization accuracy while effectively removing the targeted information. Applied to the OPT125M generative model trained on code data, it recovers around seventy percent of the original accuracy. Proof generation for the ViT case completes in about two hours, more than ten million times faster than retraining-based checks, with less than one gigabyte of memory use and proof sizes around four hundred megabytes. These results show the first practical framework for verifiable personalized unlearning on edge devices.

Paperid: 2362, https://arxiv.org/pdf/2512.09883.pdf

Abstract:
Research has proven that end-to-end malware detectors are vulnerable to adversarial attacks. In response, the research community has proposed defenses based on randomized and (de)randomized smoothing. However, these techniques remain susceptible to attacks that insert large adversarial payloads. To address these limitations, we propose a novel defense mechanism designed to harden end-to-end malware detectors by leveraging masking at the byte level. This mechanism operates by generating multiple masked versions of the input file, independently classifying each version, and then applying a threshold-based voting mechanism to produce the final classification. Key to this defense is a deterministic masking strategy that systematically strides a mask across the entire input file. Unlike randomized smoothing defenses, which randomly mask or delete bytes, this structured approach ensures coverage of the file over successive versions. In the best-case scenario, this strategy fully occludes the adversarial payload, effectively neutralizing its influence on the model's decision. In the worst-case scenario, it partially occludes the adversarial payload, reducing its impact on the model's predictions. By occluding the adversarial payload in one or more masked versions, this defense ensures that some input versions remain representative of the file's original intent, allowing the voting mechanism to suppress the influence of the adversarial payload. Results achieved on the EMBER and BODMAS datasets demonstrate the suitability of our defense, outperforming randomized and (de)randomized smoothing defenses against adversarial examples generated with a wide range of functionality-preserving manipulations while maintaining high accuracy on clean examples.

Paperid: 2363, https://arxiv.org/pdf/2512.09539.pdf

Abstract:
With the adoption of multiple digital devices in everyday life, the cyber-attack surface has increased. Adversaries are continuously exploring new avenues to exploit them and deploy malware. On the other hand, detection approaches typically employ hashing-based algorithms such as SSDeep, TLSH, and IMPHash to capture structural and behavioural similarities among binaries. This work focuses on the analysis and evaluation of these techniques for clustering malware samples using the K-means algorithm. More specifically, we experimented with established malware families and traits and found that TLSH and IMPHash produce more distinct, semantically meaningful clusters, whereas SSDeep is more efficient for broader classification tasks. The findings of this work can guide the development of more robust threat-detection mechanisms and adaptive security mechanisms.

Paperid: 2364, https://arxiv.org/pdf/2512.06555.pdf

Abstract:
Cybercrime increasingly exploits human cognitive biases in addition to technical vulnerabilities, yet most existing analytical frameworks focus primarily on operational aspects and overlook psychological manipulation. This paper proposes BEACON, a unified dual-dimension framework that integrates behavioral psychology with the tactical lifecycle of cybercrime to enable structured, interpretable, and scalable analysis of cybercrime. We formalize six psychologically grounded manipulation categories derived from Prospect Theory and Cialdini's principles of persuasion, alongside a fourteen-stage cybercrime tactical lifecycle spanning reconnaissance to final impact. A single large language model is fine-tuned using parameter-efficient learning to perform joint multi-label classification across both psychological and tactical dimensions while simultaneously generating human-interpretable explanations. Experiments conducted on a curated dataset of real-world and synthetically augmented cybercrime narratives demonstrate a 20 percent improvement in overall classification accuracy over the base model, along with substantial gains in reasoning quality measured using ROUGE and BERTScore. The proposed system enables automated decomposition of unstructured victim narratives into structured behavioral and operational intelligence, supporting improved cybercrime investigation, case linkage, and proactive scam detection.

Paperid: 2365, https://arxiv.org/pdf/2512.06304.pdf

Abstract:
Identity, accent, style, and emotions are essential components of human speech. Voice conversion (VC) techniques process the speech signals of two input speakers and other modalities of auxiliary information such as prompts and emotion tags. It changes para-linguistic features from one to another, while maintaining linguistic contents. Recently, VC models have made rapid advancements in both generation quality and personalization capabilities. These developments have attracted considerable attention for diverse applications, including privacy preservation, voice-print reproduction for the deceased, and dysarthric speech recovery. However, these models only learn non-robust features due to the clean training data. Subsequently, it results in unsatisfactory performances when dealing with degraded input speech in real-world scenarios, including additional noise, reverberation, adversarial attacks, or even minor perturbation. Hence, it demands robust deployments, especially in real-world settings. Although latest researches attempt to find potential attacks and countermeasures for VC systems, there remains a significant gap in the comprehensive understanding of how robust the VC model is under input manipulation. here also raises many questions: For instance, to what extent do different forms of input degradation attacks alter the expected output of VC models? Is there potential for optimizing these attack and defense strategies? To answer these questions, we classify existing attack and defense methods from the perspective of input manipulation and evaluate the impact of degraded input speech across four dimensions, including intelligibility, naturalness, timbre similarity, and subjective perception. Finally, we outline open issues and future directions.

Paperid: 2366, https://arxiv.org/pdf/2512.04675.pdf

Abstract:
Gleeok is a family of low latency keyed pseudorandom functions (PRFs) consisting of three parallel SPN based permutations whose outputs are XORed to form the final value. Both Gleeok-128 and Gleeok-256 use a 256 bit key, with block sizes of 128 and 256 bits, respectively. Owing to its multi branch structure, evaluating security margins and mounting effective key recovery attacks present nontrivial challenges. This paper provides the first comprehensive third party cryptanalysis of Gleeok-128. We introduce a two stage MILP based framework for constructing branch wise and full cipher differential linear (DL) distinguishers, together with an integral based key recovery framework tailored to multi branch designs. Our DL analysis yields 7, 7, 8, and 4 round distinguishers for Branch 1, Branch 2, Branch 3, and Gleeok-128, respectively, with squared correlations approximately 2 to the power minus 88.12, 2 to the power minus 88.12, 2 to the power minus 38.73, and 2 to the power minus 49.04, outperforming those in the design document except for the full PRF case. By tightening algebraic degree bounds, we further derive 9, 9, and 7 round integral distinguishers for the three branches and a 7 round distinguisher for the full PRF, extending the designers results by 3, 3, and 2 rounds and by 2 rounds, respectively. These integral properties enable 7 round and 8 round key recovery attacks in the non full codebook and full codebook settings. In addition, we identify a flaw in the original linear security evaluation of Branch 3, showing that it can be distinguished over all 12 rounds with data complexity about 2 to the power 48. We also propose optimized linear layer parameters that significantly improve linear resistance without sacrificing diffusion. Our results advance the understanding of Gleeok-128 and provide general methods for analyzing multi branch symmetric designs.

Paperid: 2367, https://arxiv.org/pdf/2512.01891.pdf

Abstract:
Large-scale vulnerability notifications (VNs) can help hosting provider organizations (HPOs) identify and remediate security vulnerabilities that attackers can exploit in data breaches or phishing campaigns. Previous VN studies have primarily focused on factors under the control of reporters, such as sender reputation, email formatting, and communication channels. Despite these efforts, remediation rates for vulnerability notifications continue to remain consistently low. This paper presents the first in-depth study of how HPOs process vulnerability notifications internally and what organizational and operational factors influence VN effectiveness. We examine the problem from a different perspective to provide the first detailed understanding of the reasons behind persistently low remediation rates. Instead of manipulating parameters of VN campaigns, we interview hosting providers directly, investigating how they handle vulnerability notifications and what factors may influence VN effectiveness, such as VN awareness and reachability, HPOs' service models, and perceived security risks. We conducted semi-structured interviews with 24 HPOs across shared hosting and web development services, representing varied company sizes and operator roles. Our findings reveal practical insights on VN processing and abuse workflows. While some providers remain hard to reach due to complex infrastructures, most report routinely handling VNs. However, limited remediation often stems from strict responsibility boundaries, where web application issues are seen as the customer's domain. Low hosting fees and high volumes of daily compromises further discourage both proactive and reactive measures. Our findings show that HPOs blame negligent website owners, and prior works on website owners confirms they often undervalue their sites or lack security know-how.

Paperid: 2368, https://arxiv.org/pdf/2512.00734.pdf

Abstract:
Differential privacy is increasingly formalized through the lens of hypothesis testing via the robust and interpretable $f$-DP framework, where privacy guarantees are encoded by a baseline Blackwell trade-off function $f_{\infty} = T(P_{\infty}, Q_{\infty})$ involving a pair of distributions $(P_{\infty}, Q_{\infty})$. The problem of choosing the right privacy metric in practice leads to a central question: what is a statistically appropriate baseline $f_{\infty}$ given some prior modeling assumptions? The special case of Gaussian differential privacy (GDP) showed that, under compositions of nearly perfect mechanisms, these trade-off functions exhibit a central limit behavior with a Gaussian limit experiment. Inspired by Le Cam's theory of limits of statistical experiments, we answer this question in full generality in an infinitely divisible setting. We show that suitable composition experiments $(P_n^{\otimes n}, Q_n^{\otimes n})$ converge to a binary limit experiment $(P_{\infty}, Q_{\infty})$ whose log-likelihood ratio $L = \log(dQ_{\infty} / dP_{\infty})$ is infinitely divisible under $P_{\infty}$. Thus any limiting trade-off function $f_{\infty}$ is determined by an infinitely divisible law $P_{\infty}$, characterized by its Levy--Khintchine triplet, and its Esscher tilt defined by $dQ_{\infty}(x) = e^{x} dP_{\infty}(x)$. This characterizes all limiting baseline trade-off functions $f_{\infty}$ arising from compositions of nearly perfect differentially private mechanisms. Our framework recovers GDP as the purely Gaussian case and yields explicit non-Gaussian limits, including Poisson examples. It also positively resolves the empirical $s^2 = 2k$ phenomenon observed in the GDP paper and provides an optimal mechanism for count statistics achieving asymmetric Poisson differential privacy.

Paperid: 2369, https://arxiv.org/pdf/2512.00713.pdf

Abstract:
Vision-Language Models (VLMs) have achieved impressive progress in multimodal text generation, yet their rapid adoption raises increasing concerns about security vulnerabilities. Existing backdoor attacks against VLMs primarily rely on explicit pixel-level triggers or imperceptible perturbations injected into images. While effective, these approaches reduce stealthiness and remain vulnerable to image-based defenses. We introduce concept-guided backdoor attacks, a new paradigm that operates at the semantic concept level rather than on raw pixels. We propose two different attacks. The first, Concept-Thresholding Poisoning (CTP), uses explicit concepts in natural images as triggers: only samples containing the target concept are poisoned, causing the model to behave normally in all other cases but consistently inject malicious outputs whenever the concept appears. The second, CBL-Guided Unseen Backdoor (CGUB), leverages a Concept Bottleneck Model (CBM) during training to intervene on internal concept activations, while discarding the CBM branch at inference time to keep the VLM unchanged. This design enables systematic replacement of a targeted label in generated text (for example, replacing "cat" with "dog"), even when the replacement behavior never appears in the training data. Experiments across multiple VLM architectures and datasets show that both CTP and CGUB achieve high attack success rates while maintaining moderate impact on clean-task performance. These findings highlight concept-level vulnerabilities as a critical new attack surface for VLMs.

Paperid: 2370, https://arxiv.org/pdf/2511.22415.pdf

Abstract:
Reinforcement learning (RL) has achieved remarkable success across diverse domains, enabling autonomous systems to learn and adapt to dynamic environments by optimizing a reward function. However, this reliance on reward signals creates a significant security vulnerability. In this paper, we study a stealthy backdoor attack that manipulates an agent's policy by poisoning its reward signals. The effectiveness of this attack highlights a critical threat to the integrity of deployed RL systems and calls for urgent defenses against training-time manipulation. We evaluate the attack across classic control and MuJoCo environments. The backdoored agent remains highly stealthy in Hopper and Walker2D, with minimal performance drops of only 2.18 % and 4.59 % under non-triggered scenarios, while achieving strong attack efficacy with up to 82.31% and 71.27% declines under trigger conditions.

Paperid: 2371, https://arxiv.org/pdf/2511.22017.pdf

Abstract:
Access control is a security mechanism designed to ensure that only authorized users can access specific resources. Cross-domain access control involves access to resources across different organizations, institutions, or applications. Traditional access control, however, which handles authentication and authorization separately in centralized environments, faces challenges in identity dispersion, privacy leakage, and diversified permission requirements, failing to adapt to cross-domain scenarios. Thus, there is an urgent need for a new access control mechanism that empowers autonomous control over user identity and resources, addressing the demands for privacy-preserving authentication and flexible authorization in cross-domain scenarios. To address cross-domain access control challenges, we propose POLARIS, a unified and extensible architecture that enables policy-based, verifiable and privacy-preserving access control across different domains. POLARIS features a structured commitment mechanism for reliable, fine-grained, policy-based identity disclosure. It further introduces VPPL, a lightweight policy language that supports issuer-bound evaluation of selectively revealed attributes. A dedicated session-level security mechanism ensures binding between authentication and access, enhancing confidentiality and resilience to replay attacks. We implement a working prototype and conduct comprehensive experiments, demonstrating that POLARIS effectively provides scalable, privacy-preserving, and interoperable access control across heterogeneous domains. Our results highlight the practical viability of POLARIS for enabling secure and privacy-preserving access control in decentralized, cross-domain environments.

Paperid: 2372, https://arxiv.org/pdf/2511.20555.pdf

Abstract:
Command-line interface (CLI) fuzzing tests programs by mutating both command-line options and input file contents, thus enabling discovery of vulnerabilities that only manifest under specific option-input combinations. Prior works of CLI fuzzing face the challenges of generating semantics-rich option strings and input files, which cannot reach deeply embedded target functions. This often leads to a misdetection of such a deep vulnerability using existing CLI fuzzing techniques. In this paper, we design a novel Path-guided, Iterative LLM-Orchestrated Testing framework, called PILOT, to fuzz CLI applications. The key insight is to provide potential call paths to target functions as context to LLM so that it can better generate CLI option strings and input files. Then, PILOT iteratively repeats the process, and provides reached functions as additional context so that target functions are reached. Our evaluation on real-world CLI applications demonstrates that PILOT achieves higher coverage than state-of-the-art fuzzing approaches and discovers 51 zero-day vulnerabilities. We responsibly disclosed all the vulnerabilities to their developers and so far 41 have been confirmed by their developers with 33 being fixed and three assigned CVE identifiers.

Paperid: 2373, https://arxiv.org/pdf/2511.18608.pdf

Abstract:
Bug bounty platforms (e.g., HackerOne, BugCrowd) leverage crowd-sourced vulnerability discovery to improve continuous coverage, reduce the cost of discovery, and serve as an integral complement to internal red teams. With the rise of AI-generated bug reports, little work exists to help bug hunters understand why these reports are labeled as invalid. To improve report quality and reduce reviewers' burden, it is critical to predict invalid reports and interpret invalid reasons. In this work, we conduct an empirical study with the purpose of helping bug hunters understand the validity of reports. We collect a dataset of 9,942 disclosed bug bounty reports, including 1,400 invalid reports, and evaluate whether state-of-the-art large language models can identify invalid reports. While models such as GPT-5, DeepSeek, and a fine-tuned RoBERTa achieve strong overall accuracy, they consistently struggle to detect invalid cases, showing a tendency to over-accept reports. To improve invalidity detection, we build a taxonomy of rejection reasons for Information Disclosure vulnerabilities and incorporate it into a retrieval-augmented generation (RAG) framework. This approach substantially improves classification consistency and reduces bias. We also examine whether reviewer decisions may be influenced by factors beyond the content of the report. Our analysis shows that reporters with higher reputations tend to receive more favorable outcomes in borderline cases, suggesting that perceived expertise can influence review judgments. Overall, our findings highlight the challenges of invalid report identification and show that combining LLMs with structured reviewer knowledge can support more transparent and consistent vulnerability report review.

Paperid: 2374, https://arxiv.org/pdf/2511.16604.pdf

Abstract:
In the era of digital communication, steganography allows covert embedding of data within media files. Adaptive Pixel Value Differencing (APVD) is a steganographic method valued for its high embedding capacity and invisibility, posing challenges for traditional steganalysis. This paper proposes a deep learning-based approach for detecting APVD steganography and performing reverse steganalysis, which reconstructs the hidden payload. We present a Convolutional Neural Network (CNN) with an attention mechanism and two output heads for simultaneous stego detection and payload recovery. Trained and validated on 10,000 images from the BOSSbase and UCID datasets, our model achieves a detection accuracy of 96.2 percent. It also reconstructs embedded payloads with up to 93.6 percent recovery at lower embedding densities. Results indicate a strong inverse relationship between payload size and recovery accuracy. This study reveals a vulnerability in adaptive steganography and provides a tool for digital forensic analysis, while encouraging reassessment of data security in the age of AI-driven techniques.

Paperid: 2375, https://arxiv.org/pdf/2511.15033.pdf

Abstract:
Recently, the number of malicious open-source packages in package repositories has been increasing dramatically. While major security scanners focus on identifying known Common Vulnerabilities and Exposures (CVEs) in open-source packages, there are very few studies on detecting malicious packages. Malicious open-source package detection typically requires static, dynamic analysis, or both. Dynamic analysis is more effective as it can expose a package's behaviors at runtime. However, current dynamic analysis tools (e.g., ossf's package-analysis) lack an automatic method to differentiate malicious packages from benign packages. In this paper, we propose an approach to extract the features from dynamic analysis (e.g., executed commands) and leverage machine learning techniques to automatically classify packages as benign or malicious. Our evaluation of nearly 2000 packages on npm shows that the machine learning classifier achieves an AUC of 0.91 with a false positive rate of nearly 0%.

Paperid: 2376, https://arxiv.org/pdf/2511.14406.pdf

Abstract:
Large models adaptation through Federated Learning (FL) addresses a wide range of use cases and is enabled by Parameter-Efficient Fine-Tuning techniques such as Low-Rank Adaptation (LoRA). However, this distributed learning paradigm faces several security threats, particularly to its integrity, such as backdoor attacks that aim to inject malicious behavior during the local training steps of certain clients. We present the first analysis of the influence of LoRA on state-of-the-art backdoor attacks targeting model adaptation in FL. Specifically, we focus on backdoor lifespan, a critical characteristic in FL, that can vary depending on the attack scenario and the attacker's ability to effectively inject the backdoor. A key finding in our experiments is that for an optimally injected backdoor, the backdoor persistence after the attack is longer when the LoRA's rank is lower. Importantly, our work highlights evaluation issues of backdoor attacks against FL and contributes to the development of more robust and fair evaluations of backdoor attacks, enhancing the reliability of risk assessments for critical FL systems. Our code is publicly available.

Paperid: 2377, https://arxiv.org/pdf/2511.12497.pdf

Abstract:
We present SGuard-v1, a lightweight safety guardrail for Large Language Models (LLMs), which comprises two specialized models to detect harmful content and screen adversarial prompts in human-AI conversational settings. The first component, ContentFilter, is trained to identify safety risks in LLM prompts and responses in accordance with the MLCommons hazard taxonomy, a comprehensive framework for trust and safety assessment of AI. The second component, JailbreakFilter, is trained with a carefully designed curriculum over integrated datasets and findings from prior work on adversarial prompting, covering 60 major attack types while mitigating false-unsafe classification. SGuard-v1 is built on the 2B-parameter Granite-3.3-2B-Instruct model that supports 12 languages. We curate approximately 1.4 million training instances from both collected and synthesized data and perform instruction tuning on the base model, distributing the curated data across the two component according to their designated functions. Through extensive evaluation on public and proprietary safety benchmarks, SGuard-v1 achieves state-of-the-art safety performance while remaining lightweight, thereby reducing deployment overhead. SGuard-v1 also improves interpretability for downstream use by providing multi-class safety predictions and their binary confidence scores. We release the SGuard-v1 under the Apache-2.0 License to enable further research and practical deployment in AI safety.

Paperid: 2378, https://arxiv.org/pdf/2511.12385.pdf

Abstract:
In recent years, Infrastructure as Code (IaC) has emerged as a critical approach for managing and provisioning IT infrastructure through code and automation. IaC enables organizations to create scalable and consistent environments, effectively managing servers and development settings. However, the growing complexity of cloud infrastructures has led to an increased risk of misconfigurations and security vulnerabilities in IaC scripts. To address this problem, this paper investigates the potential of Large Language Models (LLMs) in generating security-aware IaC code, avoiding misconfigurations introduced by developers and administrators. While LLMs have made significant progress in natural language processing and code generation, their ability to generate secure IaC scripts remains unclear. This paper addresses two major problems: 1) the lack of understanding of security weaknesses in IaC scripts generated by LLMs, and 2) the absence of techniques for enhancing security in generating IaC code with LLMs. To assess the extent to which LLMs contain security knowledge, we first conduct a comprehensive evaluation of base LLMs in recognizing major IaC security weaknesses during the generation and inspection of IaC code. Then, we propose GenSIaC, an instruction fine-tuning dataset designed to improve LLMs' ability to recognize potential security weaknesses. Leveraging GenSIaC, we fine-tune LLMs and instruct models to generate security-aware IaC code. Our evaluation demonstrates that our models achieve substantially improved performance in recognizing and preventing IaC security misconfigurations, e.g., boosting the F1-score from 0.303 to 0.858. Additionally, we perform ablation studies and explore GenSIaC's generalizability to other LLMs and its cross-language capabilities.

Paperid: 2379, https://arxiv.org/pdf/2511.12043.pdf

Abstract:
Retrieval-Augmented Generation (RAG) enhances large language models by integrating external knowledge, but reliance on proprietary or sensitive corpora poses various data risks, including privacy leakage and unauthorized data usage. Membership inference attacks (MIAs) are a common technique to assess such risks, yet existing approaches underperform in RAG due to black-box constraints and the absence of strong membership signals. In this paper, we identify a previously unexplored side channel in RAG systems: the generation budget, which controls the maximum number of tokens allowed in a generated response. Varying this budget reveals observable behavioral patterns between member and non-member queries, as members gain quality more rapidly with larger budgets. Building on this insight, we propose BudgetLeak, a novel membership inference attack that probes responses under different budgets and analyzes metric evolution via sequence modeling or clustering. Extensive experiments across four datasets, three LLM generators, and two retrievers demonstrate that BudgetLeak consistently outperforms existing baselines, while maintaining high efficiency and practical viability. Our findings reveal a previously overlooked data risk in RAG systems and highlight the need for new defenses.

Paperid: 2380, https://arxiv.org/pdf/2511.11385.pdf

Abstract:
We extract the first formal model of WhatsApp from its implementation by combining binary-level analysis (via CryptoBap) with reverse engineering (via Ghidra) to handle this large closed-source application. Using this model, we prove forward secrecy, identify a known clone-attack against post-compromise security and discover functional gaps between WhatsApp's implementation and its specification. We further introduce a methodology to analyze cryptographic protocol implementations for their resilience to side-channel attacks. This is achieved by extending the CryptoBap framework to integrate hardware leakage contracts into the protocol model, which we then pass to the state-of-the-art protocol prover, DeepSec. This enables a detailed security analysis against both functional bugs and microarchitectural side-channel attacks. Using this methodology, we identify a privacy attack in WhatsApp that allows a side-channel attacker to learn the victim's contacts and confirm a known unlinkability attack on the BAC protocol used in electronic passports. Key contributions include (1) the first formal model of WhatsApp, extracted from its binary, (2) a framework to integrate side-channel leakage contracts into protocol models for the first time, and (3) revealing critical vulnerabilities invisible to specification-based methods.

Paperid: 2381, https://arxiv.org/pdf/2511.11381.pdf

Abstract:
Wi-Fi Channel State Information (CSI) has been repeatedly proposed as a biometric modality, often with reports of high accuracy and operational feasibility. However, the field lacks a consolidated understanding of its security properties, adversarial resilience, and methodological consistency. This Systematization of Knowledge (SoK) examines CSI-based biometric authentication through a security perspective, analyzing how existing work differs across sensing infrastructure, signal representations, feature pipelines, learning models, and evaluation methodologies. Our synthesis reveals systemic inconsistencies: reliance on aggregate accuracy metrics, limited reporting of FAR/FRR/EER, absence of per-user risk analysis, and scarce consideration of threat models or adversarial feasibility. We construct a unified evaluation framework to empirically expose these issues and demonstrate how security-relevant metrics, such as per-class EER, FCS, and the Gini Coefficient, uncover risk concentration that remains hidden under traditional reporting practices. Our analysis highlights concrete attack surfaces and shows how methodological choices materially influence vulnerability profiles, which include replay, geometric mimicry, and environmental perturbation. Based on these findings, we articulate the security boundaries of current CSI biometrics and provide guidelines for rigorous evaluation, reproducible experimentation, and future research directions. This SoK offers the security community a structured, evidence-driven reassessment of Wi-Fi CSI biometrics and their suitability as an authentication primitive.

Paperid: 2382, https://arxiv.org/pdf/2511.09957.pdf

Abstract:
The increasingly sophisticated environment in which attackers operate makes software security an even greater challenge in open-source projects, where malicious packages are prevalent. Static analysis tools, such as Malcontent, are highly useful but are often incapable of dealing with obfuscated malware. Such situations lead to an unreasonably high rate of false positives. This paper highlights that dynamic analysis, rather than static analysis, provides greater insight but is also more resource-intensive for understanding software behaviour during execution. In this study, we enhance a dynamic analysis tool, package-analysis, to capture key runtime behaviours, including commands executed, files accessed, and network communications. This modification enables the use of container sandboxing technologies, such as gVisor, to analyse potentially malicious packages without significantly compromising the host system.

Paperid: 2383, https://arxiv.org/pdf/2511.09846.pdf

Abstract:
This study examines the effectiveness of the real-time privacy-preserving techniques through an offline gaze-based interaction simulation framework. Those techniques aim to reduce the amount of identity-related information in eye-tracking data while improving the efficacy of the gaze-based interaction. Although some real-time gaze privatization methods were previously explored, their validation on the large dataset was not conducted. We propose a functional framework that allows to study the efficacy of real-time gaze privatization on an already collected offline dataset. The key metric used to assess the reduction of identity-related information is the identification rate, while improvements in gaze-based interactions are evaluated through signal quality during interaction. Our additional contribution is the employment of an extremely lightweight Kalman filter framework that reduces the amount of identity-related information in the gaze signal and improves gaze-based interaction performance.

Paperid: 2384, https://arxiv.org/pdf/2511.09775.pdf

Abstract:
The widespread integration of Artificial Intelligence of Things (AIoT) in smart home environments has amplified the demand for transparent and interpretable machine learning models. To foster user trust and comply with emerging regulatory frameworks, the Explainable AI (XAI) methods, particularly post-hoc techniques such as SHapley Additive exPlanations (SHAP), and Local Interpretable Model-Agnostic Explanations (LIME), are widely employed to elucidate model behavior. However, recent studies have shown that these explanation methods can inadvertently expose sensitive user attributes and behavioral patterns, thereby introducing new privacy risks. To address these concerns, we propose a novel privacy-preserving approach based on SHAP entropy regularization to mitigate privacy leakage in explainable AIoT applications. Our method incorporates an entropy-based regularization objective that penalizes low-entropy SHAP attribution distributions during training, promoting a more uniform spread of feature contributions. To evaluate the effectiveness of our approach, we developed a suite of SHAP-based privacy attacks that strategically leverage model explanation outputs to infer sensitive information. We validate our method through comparative evaluations using these attacks alongside utility metrics on benchmark smart home energy consumption datasets. Experimental results demonstrate that SHAP entropy regularization substantially reduces privacy leakage compared to baseline models, while maintaining high predictive accuracy and faithful explanation fidelity. This work contributes to the development of privacy-preserving explainable AI techniques for secure and trustworthy AIoT applications.

Paperid: 2385, https://arxiv.org/pdf/2511.07997.pdf

Abstract:
We revisit the problem of generating synthetic data under differential privacy. To address the core limitations of marginal-based methods, we propose the Private Adaptive Generative Adversarial Network with Bayes Network Structure (PrAda-GAN), which integrates the strengths of both GAN-based and marginal-based approaches. Our method adopts a sequential generator architecture to capture complex dependencies among variables, while adaptively regularizing the learned structure to promote sparsity in the underlying Bayes network. Theoretically, we establish diminishing bounds on the parameter distance, variable selection error, and Wasserstein distance. Our analysis shows that leveraging dependency sparsity leads to significant improvements in convergence rates. Empirically, experiments on both synthetic and real-world datasets demonstrate that PrAda-GAN outperforms existing tabular data synthesis methods in terms of the privacy-utility trade-off.

Paperid: 2386, https://arxiv.org/pdf/2511.07637.pdf

Abstract:
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving documents from an external corpus at inference time. When this corpus contains sensitive information, however, unprotected RAG systems are at risk of leaking private information. Prior work has introduced differential privacy (DP) guarantees for RAG, but only in single-query settings, which fall short of realistic usage. In this paper, we study the more practical multi-query setting and propose two DP-RAG algorithms. The first, MURAG, leverages an individual privacy filter so that the accumulated privacy loss only depends on how frequently each document is retrieved rather than the total number of queries. The second, MURAG-ADA, further improves utility by privately releasing query-specific thresholds, enabling more precise selection of relevant documents. Our experiments across multiple LLMs and datasets demonstrate that the proposed methods scale to hundreds of queries within a practical DP budget ($\varepsilon\approx10$), while preserving meaningful utility.

Paperid: 2387, https://arxiv.org/pdf/2511.06512.pdf

Abstract:
Small language models (SLMs) are increasingly deployed on edge devices, making their safety alignment crucial yet challenging. Current shallow alignment methods that rely on direct refusal of malicious queries fail to provide robust protection, particularly against adversarial jailbreaks. While deliberative safety reasoning alignment offers deeper alignment for defending against sophisticated attacks, effectively implanting such reasoning capability in SLMs with limited capabilities remains an open challenge. Moreover, safety reasoning incurs significant computational overhead as models apply reasoning to nearly all queries, making it impractical for resource-constrained edge deployment scenarios that demand rapid responses. We propose EASE, a novel framework that enables practical and Efficient safety Alignment for Small languagE models. Our approach first identifies the optimal safety reasoning teacher that can effectively distill safety reasoning capabilities to SLMs. We then align models to selectively activate safety reasoning for dangerous adversarial jailbreak queries while providing direct responses to straightforward malicious queries and general helpful tasks. This selective mechanism enables small models to maintain robust safety guarantees against sophisticated attacks while preserving computational efficiency for benign interactions. Experimental results demonstrate that EASE reduces jailbreak attack success rates by up to 17% compared to shallow alignment methods while reducing inference overhead by up to 90% compared to deliberative safety reasoning alignment, making it practical for SLMs real-world edge deployments.

Paperid: 2388, https://arxiv.org/pdf/2511.06336.pdf

Abstract:
At CRYPTO 2019, Gohr pioneered neural cryptanalysis by introducing differential-based neural distinguishers to attack Speck32/64, establishing a novel paradigm combining deep learning with differential cryptanalysis.Since then, constructing neural distinguishers has become a significant approach to achieving the deep learning-based cryptanalysis for block ciphers.This paper advances rotational-XOR (RX) attacks through neural networks, focusing on optimizing distinguishers and presenting key-recovery attacks for the lightweight block ciphers Simon32/64 and Simeck32/64.In particular, we first construct the fundamental data formats specially designed for training RX-neural distinguishers by refining the existing data formats for differential-neural distinguishers. Based on these data formats, we systematically identify optimal RX-differences with Hamming weights 1 and 2 that develop high-accuracy RX-neural distinguishers. Then, through innovative application of the bit sensitivity test, we achieve significant compression of data format without sacrificing the distinguisher accuracy. This optimization enables us to add more multi-ciphertext pairs into the data formats, further strengthening the performance of RX-neural distinguishers. As an application, we obtain 14- and 17-round RX-neural distinguishers for Simon32/64 and Simeck32/64, which improves the previous ones by 3 and 2 rounds, respectively.In addition, we propose two novel techniques, key bit sensitivity test and the joint wrong key response, to tackle the challenge of applying Bayesian's key-recovery strategy to the target cipher that adopts nonlinear key schedule in the related-key setting without considering of weak-key space. By this, we can straightforwardly mount a 17-round key-recovery attack on Simeck32/64 based on the improved 16-round RX-nerual distinguisher. To the best of our knowledge, the presented RX-neural......

Paperid: 2389, https://arxiv.org/pdf/2511.06197.pdf

Abstract:
The rapid proliferation of Internet of Things (IoT) devices has transformed numerous industries by enabling seamless connectivity and data-driven automation. However, this expansion has also exposed IoT networks to increasingly sophisticated security threats, including adversarial attacks targeting artificial intelligence (AI) and machine learning (ML)-based intrusion detection systems (IDS) to deliberately evade detection, induce misclassification, and systematically undermine the reliability and integrity of security defenses. To address these challenges, we propose a novel adversarial detection model that enhances the robustness of IoT IDS against adversarial attacks through SHapley Additive exPlanations (SHAP)-based fingerprinting. Using SHAP's DeepExplainer, we extract attribution fingerprints from network traffic features, enabling the IDS to reliably distinguish between clean and adversarially perturbed inputs. By capturing subtle attribution patterns, the model becomes more resilient to evasion attempts and adversarial manipulations. We evaluated the model on a standard IoT benchmark dataset, where it significantly outperformed a state-of-the-art method in detecting adversarial attacks. In addition to enhanced robustness, this approach improves model transparency and interpretability, thereby increasing trust in the IDS through explainable AI.

Paperid: 2390, https://arxiv.org/pdf/2511.04946.pdf

Abstract:
Fully Homomorphic Encryption (FHE) allows computations to be performed on encrypted data, significantly enhancing user privacy. However, the I/O challenges associated with deploying FHE applications remains understudied. We analyze the impact of storage I/O on the performance of FHE applications and summarize key lessons from the status quo. Key results include that storage I/O can degrade the performance of ASICs by as much as 357$\times$ and reduce GPUs performance by up to 22$\times$.

Paperid: 2391, https://arxiv.org/pdf/2511.04332.pdf

Abstract:
Differentially private in-context learning (DP-ICL) has recently become an active research topic due to the inherent privacy risks of in-context learning. However, existing approaches overlook a critical component of modern large language model (LLM) pipelines: the similarity search used to retrieve relevant context data. In this work, we introduce a DP framework for in-context learning that integrates nearest neighbor search of relevant examples in a privacy-aware manner. Our method outperforms existing baselines by a substantial margin across all evaluated benchmarks, achieving more favorable privacy-utility trade-offs. To achieve this, we employ nearest neighbor retrieval from a database of context data, combined with a privacy filter that tracks the cumulative privacy cost of selected samples to ensure adherence to a central differential privacy budget. Experimental results on text classification and document question answering show a clear advantage of the proposed method over existing baselines.

Paperid: 2392, https://arxiv.org/pdf/2511.03229.pdf

Abstract:
Due to the openness of the wireless medium, smartphone users are susceptible to user privacy attacks, where user privacy information is inferred from encrypted Wi-Fi wireless traffic. Existing attacks are limited to recognizing mobile apps and their actions and cannot infer the smartphone user identity, a fundamental part of user privacy. To overcome this limitation, we propose U-Print, a novel attack system that can passively recognize smartphone apps, actions, and users from over-the-air MAC-layer frames. We observe that smartphone users usually prefer different add-on apps and in-app actions, yielding different changing patterns in Wi-Fi traffic. U-Print first extracts multi-level traffic features and exploits customized temporal convolutional networks to recognize smartphone apps and actions, thus producing users' behavior sequences. Then, it leverages the silhouette coefficient method to determine the number of users and applies the k-means clustering to profile and identify smartphone users. We implement U-Print using a laptop with a Kali dual-band wireless network card and evaluate it in three real-world environments. U-Print achieves an overall accuracy of 98.4% and an F1 score of 0.983 for user inference. Moreover, it can correctly recognize up to 96% of apps and actions in the closed world and more than 86% in the open world.

Paperid: 2393, https://arxiv.org/pdf/2511.02841.pdf

Abstract:
LLM-based AI agents still lack the technical means to automatically build nuanced and differentiated trust in other agents at the beginning of an agent-to-agent dialogue. But autonomous and interoperable trust establishing becomes a fundamental prerequisite once agents start to operate beyond isolated environments and engage in dialogues across individual or organizational boundaries. A promising way to fill this gap in Agentic AI is to equip agents with long-lived digital identities and introduce tamper-proof and flexible identity-bound attestations of agents, provisioned by commonly trusted third parties and designed for cross-domain verifiability. This article presents a conceptual framework and a prototypical multi-agent system, where each agent is endowed with a self-sovereign digital identity. It combines a unique and ledger-anchored Decentralized Identifier (DID) of an agent with a set of third-party issued Verifiable Credentials (VCs). This enables agents at the start of a dialog to prove ownership of their self-controlled DIDs for authentication purposes and to establish various cross-domain trust relationships through the spontaneous exchange of their self-hosted DID-bound VCs. A comprehensive evaluation of the prototypical implementation demonstrates technical feasibility but also reveals limitations once an agent's LLM is in sole charge to control the respective security procedures.

Paperid: 2394, https://arxiv.org/pdf/2511.02185.pdf

Abstract:
Graph neural networks (GNNs) are powerful tools for analyzing and learning from graph-structured (GS) data, facilitating a wide range of services. Deploying such services in privacy-critical cloud environments necessitates the development of secure inference (SI) protocols that safeguard sensitive GS data. However, existing SI solutions largely focus on convolutional models for image and text data, leaving the challenge of securing GNNs and GS data relatively underexplored. In this work, we design, implement, and evaluate $\sysname$, a lightweight cryptographic scheme for graph-centric inference in the cloud. By hybridizing additive and function secret sharings within secure two-party computation (2PC), $\sysname$ is carefully designed based on a series of novel 2PC interactive protocols that achieve $1.5\times \sim 1.7\times$ speedups for linear layers and $2\times \sim 15\times$ for non-linear layers over state-of-the-art (SotA) solutions. A thorough theoretical analysis is provided to prove $\sysname$'s correctness, security, and lightweight nature. Extensive experiments across four datasets demonstrate $\sysname$'s superior efficiency with $1.3\times \sim 4.7\times$ faster secure predictions while maintaining accuracy comparable to plaintext graph property inference.

Paperid: 2395, https://arxiv.org/pdf/2510.26833.pdf

Abstract:
We present VISAT, a novel open dataset and benchmarking suite for evaluating model robustness in the task of traffic sign recognition with the presence of visual attributes. Built upon the Mapillary Traffic Sign Dataset (MTSD), our dataset introduces two benchmarks that respectively emphasize robustness against adversarial attacks and distribution shifts. For our adversarial attack benchmark, we employ the state-of-the-art Projected Gradient Descent (PGD) method to generate adversarial inputs and evaluate their impact on popular models. Additionally, we investigate the effect of adversarial attacks on attribute-specific multi-task learning (MTL) networks, revealing spurious correlations among MTL tasks. The MTL networks leverage visual attributes (color, shape, symbol, and text) that we have created for each traffic sign in our dataset. For our distribution shift benchmark, we utilize ImageNet-C's realistic data corruption and natural variation techniques to perform evaluations on the robustness of both base and MTL models. Moreover, we further explore spurious correlations among MTL tasks through synthetic alterations of traffic sign colors using color quantization techniques. Our experiments focus on two major backbones, ResNet-152 and ViT-B/32, and compare the performance between base and MTL models. The VISAT dataset and benchmarking framework contribute to the understanding of model robustness for traffic sign recognition, shedding light on the challenges posed by adversarial attacks and distribution shifts. We believe this work will facilitate advancements in developing more robust models for real-world applications in autonomous driving and cyber-physical systems.

Paperid: 2396, https://arxiv.org/pdf/2510.25810.pdf

Abstract:
To date, traffic obfuscation techniques have been widely adopted to protect network data privacy and security by obscuring the true patterns of traffic. Nevertheless, as the pre-trained models emerge, especially transformer-based classifiers, existing traffic obfuscation methods become increasingly vulnerable, as witnessed by current studies reporting the traffic classification accuracy up to 99\% or higher. To counter such high-performance transformer-based classification models, we in this paper propose a novel and effective \underline{adv}ersarial \underline{traffic}-generating approach (AdvTraffic\footnote{The code and data are available at: http://xxx}). Our approach has two key innovations: (i) a pre-padding strategy is proposed to modify packets, which effectively overcomes the limitations of existing research against transformer-based models for network traffic classification; and (ii) a reinforcement learning model is employed to optimize network traffic perturbations, aiming to maximize adversarial effectiveness against transformer-based classification models. To the best of our knowledge, this is the first attempt to apply adversarial perturbation techniques to defend against transformer-based traffic classifiers. Furthermore, our method can be easily deployed into practical network environments. Finally, multi-faceted experiments are conducted across several real-world datasets, and the experimental results demonstrate that our proposed method can effectively undermine transformer-based classifiers, significantly reducing classification accuracy from 99\% to as low as 25.68\%.

Paperid: 2397, https://arxiv.org/pdf/2510.25670.pdf

Abstract:
A central challenge in machine learning is to understand how noise or measurement errors affect low-rank approximations, particularly in the spectral norm. This question is especially important in differentially private low-rank approximation, where one aims to preserve the top-$p$ structure of a data-derived matrix while ensuring privacy. Prior work often analyzes Frobenius norm error or changes in reconstruction quality, but these metrics can over- or under-estimate true subspace distortion. The spectral norm, by contrast, captures worst-case directional error and provides the strongest utility guarantees. We establish new high-probability spectral-norm perturbation bounds for symmetric matrices that refine the classical Eckart--Young--Mirsky theorem and explicitly capture interactions between a matrix $A \in \mathbb{R}^{n \times n}$ and an arbitrary symmetric perturbation $E$. Under mild eigengap and norm conditions, our bounds yield sharp estimates for $\|(A + E)_p - A_p\|$, where $A_p$ is the best rank-$p$ approximation of $A$, with improvements of up to a factor of $\sqrt{n}$. As an application, we derive improved utility guarantees for differentially private PCA, resolving an open problem in the literature. Our analysis relies on a novel contour bootstrapping method from complex analysis and extends it to a broad class of spectral functionals, including polynomials and matrix exponentials. Empirical results on real-world datasets confirm that our bounds closely track the actual spectral error under diverse perturbation regimes.

Paperid: 2398, https://arxiv.org/pdf/2510.25477.pdf

Abstract:
Traditional centralized scholarship evaluation processes typically require students to submit detailed academic records and qualification information, which exposes them to risks of data leakage and misuse, making it difficult to simultaneously ensure privacy protection and transparent auditability. To address these challenges, this paper proposes a scholarship evaluation system based on Decentralized Identity (DID) and Zero-Knowledge Proofs (ZKP). The system aggregates multidimensional ZKPs off-chain, and smart contracts verify compliance with evaluation criteria without revealing raw scores or computational details. Experimental results demonstrate that the proposed solution not only automates the evaluation efficiently but also maximally preserves student privacy and data integrity, offering a practical and trustworthy technical paradigm for higher education scholarship programs.

Paperid: 2399, https://arxiv.org/pdf/2510.24999.pdf

Abstract:
Large Language Models (LLMs) represent valuable intellectual property (IP), reflecting significant investments in training data, compute, and expertise. Deploying these models on partially trusted or insecure devices introduces substantial risk of model theft, making it essential to design inference protocols with provable security guarantees. We present the formal framework and security foundations of SLIP, a hybrid inference protocol that splits model computation between a trusted and an untrusted resource. We define and analyze the key notions of model decomposition and hybrid inference protocols, and introduce formal properties including safety, correctness, efficiency, and t-soundness. We construct secure inference protocols based on additive decompositions of weight matrices, combined with masking and probabilistic verification techniques. We prove that these protocols achieve information-theoretic security against honest-but-curious adversaries, and provide robustness against malicious adversaries with negligible soundness error. This paper focuses on the theoretical underpinnings of SLIP: precise definitions, formal protocols, and proofs of security. Empirical validation and decomposition heuristics appear in the companion SLIP paper. Together, the two works provide a complete account of securing LLM IP via hybrid inference, bridging both practice and theory.

Paperid: 2400, https://arxiv.org/pdf/2510.24534.pdf

Abstract:
Quantum networks rely on both quantum and classical channels for coordinated operation. Current architectures employ entanglement distribution and key exchange over quantum channels but often assume that classical communication is sufficiently secure. In practice, classical channels protected by traditional cryptography remain vulnerable to quantum adversaries, since large-scale quantum computers could break widely used public-key schemes and reduce the effective security of symmetric cryptography. This perspective presents a quantum-resistant network architecture that secures classical communication with post-quantum cryptographic techniques while supporting entanglement-based communication over quantum channels. Beyond cryptographic protection, the framework incorporates continuous monitoring of both quantum and classical layers, together with orchestration across heterogeneous infrastructures, to ensure end-to-end security. Collectively, these mechanisms provide a pathway toward scalable, robust, and secure quantum networks that remain dependable against both classical and quantum-era threats.

Paperid: 2401, https://arxiv.org/pdf/2510.24101.pdf

Abstract:
Traceable signatures (Kiayas et al., EUROCRYPT 2004) is an anonymous digital signature system that extends the tracing power of the opening authority in group signatures. There are many known constructions of traceable signatures, but all are based on number-theoretic/pairing assumptions. For such reason, they may not be secure in the presence of quantum computers. This work revisits the notion of traceable signatures and presents a lattice-based construction provably secure in the quantum random oracle model (QROM).

Paperid: 2402, https://arxiv.org/pdf/2510.23619.pdf

Abstract:
This report presents a comprehensive analysis of an unsupervised multi-expert machine learning framework for detecting short ticketing fraud in railway systems. The study introduces an A/B/C/D station classification system that successfully identifies suspicious patterns across 30 high-risk stations. The framework employs four complementary algorithms: Isolation Forest, Local Outlier Factor, One-Class SVM, and Mahalanobis Distance. Key findings include the identification of five distinct short ticketing patterns and potential for short ticketing recovery in transportation systems.

Paperid: 2403, https://arxiv.org/pdf/2510.22628.pdf

Abstract:
This paper presents a real-time modular defense system named Sentra-Guard. The system detects and mitigates jailbreak and prompt injection attacks targeting large language models (LLMs). The framework uses a hybrid architecture with FAISS-indexed SBERT embedding representations that capture the semantic meaning of prompts, combined with fine-tuned transformer classifiers, which are machine learning models specialized for distinguishing between benign and adversarial language inputs. It identifies adversarial prompts in both direct and obfuscated attack vectors. A core innovation is the classifier-retriever fusion module, which dynamically computes context-aware risk scores that estimate how likely a prompt is to be adversarial based on its content and context. The framework ensures multilingual resilience with a language-agnostic preprocessing layer. This component automatically translates non-English prompts into English for semantic evaluation, enabling consistent detection across over 100 languages. The system includes a HITL feedback loop, where decisions made by the automated system are reviewed by human experts for continual learning and rapid adaptation under adversarial pressure. Sentra-Guard maintains an evolving dual-labeled knowledge base of benign and malicious prompts, enhancing detection reliability and reducing false positives. Evaluation results show a 99.96% detection rate (AUC = 1.00, F1 = 1.00) and an attack success rate (ASR) of only 0.004%. This outperforms leading baselines such as LlamaGuard-2 (1.3%) and OpenAI Moderation (3.7%). Unlike black-box approaches, Sentra-Guard is transparent, fine-tunable, and compatible with diverse LLM backends. Its modular design supports scalable deployment in both commercial and open-source environments. The system establishes a new state-of-the-art in adversarial LLM defense.

Paperid: 2404, https://arxiv.org/pdf/2510.22400.pdf

Abstract:
Provenance analysis (PA) has recently emerged as an important solution for cyber attack investigation. PA leverages system monitoring to monitor system activities as a series of system audit events and organizes these events as a provenance graph to show the dependencies among system activities, which can reveal steps of cyber attacks. Despite their potential, existing PA techniques face two critical challenges: (1) they are inflexible and non-extensible, making it difficult to incorporate analyst expertise, and (2) they are memory inefficient, often requiring>100GB of RAM to hold entire event streams, which fundamentally limits scalability and deployment in real-world environments. To address these limitations, we propose the PROGQL framework, which provides a domain-specific graph search language with a well-engineered query engine, allowing PA over system audit events and expert knowledge to be jointly expressed as a graph search query and thereby facilitating the investigation of complex cyberattacks. In particular, to support dependency searches from a starting edge required in PA, PROGQL introduces new language constructs for constrained graph traversal, edge weight computation, value propagation along weighted edges, and graph merging to integrate multiple searches. Moreover, the PROGQL query engine is optimized for efficient incremental graph search across heterogeneous database backends, eliminating the need for full in-memory materialization and reducing memory overhead. Our evaluations on real attacks demonstrate the effectiveness of the PROGQL language in expressing a diverse set of complex attacks compared with the state-of-the-art graph query language Cypher, and the comparison with the SOTA PA technique DEPIMPACT further demonstrates the significant improvement of the scalability brought by our PROGQL framework's design.

Paperid: 2405, https://arxiv.org/pdf/2510.22133.pdf

Abstract:
Wi-Fi Channel State Information (CSI) has been extensively studied for sensing activities. However, its practical application in user authentication still needs to be explored. This study presents a novel approach to biometric authentication using Wi-Fi Channel State Information (CSI) data for palm recognition. The research delves into utilizing a Raspberry Pi encased in a custom-built box with antenna power reduced to 1dBm, which was used to capture CSI data from the right hands of 20 participants (10 men and 10 women). The dataset was normalized using MinMax scaling to ensure uniformity and accuracy. By focusing on biophysical aspects such as hand size, shape, angular spread between fingers, and finger phalanx lengths, among other characteristics, the study explores how these features affect electromagnetic signals, which are then reflected in Wi-Fi CSI, allowing for precise user identification. Five classification algorithms were evaluated, with the Random Forest classifier achieving an average F1-Score of 99.82\% using 10-fold cross-validation. Amplitude and Phase data were used, with each capture session recording approximately 1000 packets per second in five 5-second intervals for each User. This high accuracy highlights the potential of Wi-Fi CSI in developing robust and reliable user authentication systems based on palm biometric data.

Paperid: 2406, https://arxiv.org/pdf/2510.21858.pdf

Abstract:
Decision-making for multi-energy system (MES) dispatch depends on accurate load forecasting. Traditionally, load forecasting and decision-making for MES are implemented separately. Forecasting models are typically trained to minimize forecasting errors, overlooking their impact on downstream decision-making. To address this, decision-focused learning (DFL) has been studied to minimize decision-making costs instead. However, practical adoption of DFL in MES faces significant challenges: the process requires sharing sensitive load data and model parameters across multiple sectors, raising serious privacy issues. To this end, we propose a privacy-preserving DFL framework tailored for MES. Our approach introduces information masking to safeguard private data while enabling recovery of decision variables and gradients required for model training. To further enhance security for DFL, we design a safety protocol combining matrix decomposition and homomorphic encryption, effectively preventing collusion and unauthorized data access. Additionally, we developed a privacy-preserving load pattern recognition algorithm, enabling the training of specialized DFL models for heterogeneous load patterns. Theoretical analysis and comprehensive case studies, including real-world MES data, demonstrate that our framework not only protects privacy but also consistently achieves lower average daily dispatch costs compared to existing methods.

Paperid: 2407, https://arxiv.org/pdf/2510.21401.pdf

Abstract:
Smart contract vulnerabilities cost billions of dollars annually, yet existing automated analysis tools fail to generate deployable defenses. We present FLAMES, a novel automated approach that synthesizes executable runtime guards as Solidity "require" statements to harden smart contracts against exploits. Unlike prior work that relies on vulnerability labels, symbolic analysis, or natural language specifications, FLAMES employs domain-adapted large language models trained through fill-in-the-middle supervised fine-tuning on real-world invariants extracted from 514,506 verified contracts. Our extensive evaluation across three dimensions demonstrates FLAMES's effectiveness: (1) Compilation: FLAMES achieves 96.7% compilability for synthesized invariant (2) Semantic Quality: on a curated test set of 5,000 challenging invariants, FLAMES produces exact or semantically equivalent matches to ground truth in 44.5% of cases; (3) Exploit Mitigation: FLAMES prevents 22 out of 108 real exploits (20.4%) while preserving contract functionality, and (4) FLAMES successfully blocks the real-world APEMAGA incident by synthesizing a pre-condition that mitigates the attack. FLAMES establishes that domain-adapted LLMs can automatically generate production-ready security defenses for smart contracts without requiring vulnerability detection, formal specifications, or human intervention. We release our code, model weights, datasets, and evaluation infrastructure to enable reproducible research in this critical domain.

Paperid: 2408, https://arxiv.org/pdf/2510.20075.pdf

Abstract:
A meaningful text can be hidden inside another, completely different yet still coherent and plausible, text of the same length. For example, a tweet containing a harsh political critique could be embedded in a tweet that celebrates the same political leader, or an ordinary product review could conceal a secret manuscript. This uncanny state of affairs is now possible thanks to Large Language Models, and in this paper we present a simple and efficient protocol to achieve it. We show that even modest 8-billion-parameter open-source LLMs are sufficient to obtain high-quality results, and a message as long as this abstract can be encoded and decoded locally on a laptop in seconds. The existence of such a protocol demonstrates a radical decoupling of text from authorial intent, further eroding trust in written communication, already shaken by the rise of LLM chatbots. We illustrate this with a concrete scenario: a company could covertly deploy an unfiltered LLM by encoding its answers within the compliant responses of a safe model. This possibility raises urgent questions for AI safety and challenges our understanding of what it means for a Large Language Model to know something.

Paperid: 2409, https://arxiv.org/pdf/2510.19982.pdf

Abstract:
Quantum computing is reshaping the security landscape of modern telecommunications. The cryptographic foundations that secure todays 5G systems, including RSA, Elliptic Curve Cryptography (ECC), and Diffie-Hellman (DH), are all susceptible to attacks enabled by Shors algorithm. Protecting 5G networks against future quantum adversaries has therefore become an urgent engineering and research priority. In this paper we introduce QORE, a quantum-secure 5G and Beyond 5G (B5G) Core framework that provides a clear pathway for transitioning both the 5G Core Network Functions and User Equipment (UE) to Post-Quantum Cryptography (PQC). The framework uses the NIST-standardized lattice-based algorithms Module-Lattice Key Encapsulation Mechanism (ML-KEM) and Module-Lattice Digital Signature Algorithm (ML-DSA) and applies them across the 5G Service-Based Architecture (SBA). A Hybrid PQC (HPQC) configuration is also proposed, combining classical and quantum-safe primitives to maintain interoperability during migration. Experimental validation shows that ML-KEM achieves quantum security with minor performance overhead, meeting the low-latency and high-throughput requirements of carrier-grade 5G systems. The proposed roadmap aligns with ongoing 3GPP SA3 and SA5 study activities on the security and management of post-quantum networks as well as with NIST PQC standardization efforts, providing practical guidance for mitigating quantum-era risks while safeguarding long-term confidentiality and integrity of network data.

Paperid: 2410, https://arxiv.org/pdf/2510.19968.pdf

Abstract:
The telecommunications industry faces a dual transformation: the architectural shift toward Open Radio Access Networks (O-RAN) and the emerging threat from quantum computing. O-RAN disaggregated, multi-vendor architecture creates a larger attack surface vulnerable to crypt-analytically relevant quantum computers(CRQCs) that will break current public key cryptography. The Harvest Now, Decrypt Later (HNDL) attack strategy makes this threat immediate, as adversaries can intercept encrypted data today for future decryption. This paper presents Q-RAN, a comprehensive quantum-resistant security framework for O-RAN networks using NIST-standardized Post-Quantum Cryptography (PQC). We detail the implementation of ML-KEM (FIPS 203) and ML-DSA (FIPS 204), integrated with Quantum Random Number Generators (QRNG) for cryptographic entropy. The solution deploys PQ-IPsec, PQ-DTLS, and PQ-mTLS protocols across all O-RAN interfaces, anchored by a centralized Post-Quantum Certificate Authority (PQ-CA) within the SMO framework. This work provides a complete roadmap for securing disaggregated O-RAN ecosystems against quantum adversaries.

Paperid: 2411, https://arxiv.org/pdf/2510.19390.pdf

Abstract:
The closest vector problem (CVP) is a fundamental optimization problem in lattice-based cryptography and its conjectured hardness underpins the security of lattice-based cryptosystems. Furthermore, Schnorr's lattice-based factoring algorithm reduces integer factoring (the foundation of current cryptosystems, including RSA) to the CVP. Recent work has investigated the inclusion of a heuristic CVP approximation `refinement' step in the lattice-based factoring algorithm, using quantum variational algorithms to perform the heuristic optimization. This coincides with the emergence of probabilistic computing as a hardware accelerator for randomized algorithms including tasks in combinatorial optimization. In this work we investigate the application of probabilistic computing to the heuristic optimization task of CVP approximation refinement in lattice-based factoring. We present the design of a probabilistic computing algorithm for this task, a discussion of `prime lattice' parameters, and experimental results showing the efficacy of probabilistic computing for solving the CVP as well as its efficacy as a subroutine for lattice-based factoring. The main results found that (a) this approach is capable of finding the maximal available CVP approximation refinement in time linear in problem size and (b) probabilistic computing used in conjunction with the lattice parameters presented can find the composite prime factors of a semiprime number using up to 100x fewer lattice instances than similar quantum and classical methods.

Paperid: 2412, https://arxiv.org/pdf/2510.19352.pdf

Abstract:
Data-driven inertial sequence learning has revolutionized navigation in GPS-denied environments, offering superior odometric resolution compared to traditional Bayesian methods. However, deep learning-based inertial tracking systems remain vulnerable to privacy breaches that can expose sensitive training data. \hl{Existing differential privacy solutions often compromise model performance by introducing excessive noise, particularly in high-frequency inertial measurements.} In this article, we propose ConvXformer, a hybrid architecture that fuses ConvNeXt blocks with Transformer encoders in a hierarchical structure for robust inertial navigation. We propose an efficient differential privacy mechanism incorporating adaptive gradient clipping and gradient-aligned noise injection (GANI) to protect sensitive information while ensuring model performance. Our framework leverages truncated singular value decomposition for gradient processing, enabling precise control over the privacy-utility trade-off. Comprehensive performance evaluations on benchmark datasets (OxIOD, RIDI, RoNIN) demonstrate that ConvXformer surpasses state-of-the-art methods, achieving more than 40% improvement in positioning accuracy while ensuring $(ε,δ)$-differential privacy guarantees. To validate real-world performance, we introduce the Mech-IO dataset, collected from the mechanical engineering building at KAIST, where intense magnetic fields from industrial equipment induce significant sensor perturbations. This demonstrated robustness under severe environmental distortions makes our framework well-suited for secure and intelligent navigation in cyber-physical systems.

Paperid: 2413, https://arxiv.org/pdf/2510.18438.pdf

Abstract:
Phishing attacks in Web3 ecosystems are increasingly sophisticated, exploiting deceptive contract logic, malicious frontend scripts, and token approval patterns. We present DeepTx, a real-time transaction analysis system that detects such threats before user confirmation. DeepTx simulates pending transactions, extracts behavior, context, and UI features, and uses multiple large language models (LLMs) to reason about transaction intent. A consensus mechanism with self-reflection ensures robust and explainable decisions. Evaluated on our phishing dataset, DeepTx achieves high precision and recall (demo video: https://youtu.be/4OfK9KCEXUM).

Paperid: 2414, https://arxiv.org/pdf/2510.17403.pdf

Abstract:
This paper presents the development of a process automation architecture leveraging Radio Frequency Identification (RFID) technology for secure, transparent and efficient voting systems. The proposed architecture automates the voting workflow through RFID-enabled voter identification, encrypted vote casting, and secure data transmission. Each eligible voter receives a smart RFID card containing a uniquely encrypted identifier, which is verified using an RC522 reader interfaced with a microcontroller. Upon successful verification, the voter interacts with a touchscreen interface to cast a vote, which is then encrypted using AES-128 and securely stored on a local SD card or transmitted via GSM to a central server. A tamper-proof monitoring mechanism records each session with time-stamped digital signatures, ensuring auditability and data integrity. The architecture is designed to function in both online and offline modes, with an automated batch synchronization mechanism that updates vote records once network connectivity is restored. System testing in simulated environments confirmed 100% voter authentication accuracy, minimized latency (average voting time of 11.5 seconds), and robustness against cloning, double voting, and data interception. The integration of real-time monitoring and secure process control modules enables electoral authorities to automate data logging, detect anomalies, and validate system integrity dynamically. This work demonstrates a scalable, automation-driven solution for voting infrastructure, offering enhanced transparency, resilience, and deployment flexibility, especially in environments where digital transformation of electoral processes is critically needed.

Paperid: 2415, https://arxiv.org/pdf/2510.17308.pdf

Abstract:
A card-based secure computation protocol is a method for $n$ parties to compute a function $f$ on their private inputs $(x_1,\ldots,x_n)$ using physical playing cards, in such a way that the suits of revealed cards leak no information beyond the value of $f(x_1,\ldots,x_n)$. A \textit{single-shuffle full-open} protocol is a minimal model of card-based secure computation in which, after the parties place face-down cards representing their inputs, a single shuffle operation is performed and then all cards are opened to derive the output. Despite the simplicity of this model, the class of functions known to admit single-shuffle full-open protocols has been limited to a few small examples. In this work, we prove for the first time that every function admits a single-shuffle full-open protocol. We present two constructions that offer a trade-off between the number of cards and the complexity of the shuffle operation. These feasibility results are derived from a novel connection between single-shuffle full-open protocols and a cryptographic primitive known as \textit{Private Simultaneous Messages} protocols, which has rarely been studied in the context of card-based cryptography. We also present variants of single-shuffle protocols in which only a subset of cards are revealed. These protocols reduce the complexity of the shuffle operation compared to existing protocols in the same setting.

Paperid: 2416, https://arxiv.org/pdf/2510.16637.pdf

Abstract:
Existing adversarial attacks often neglect perturbation sparsity, limiting their ability to model structural changes and to explain how deep neural networks (DNNs) process meaningful input patterns. We propose ATOS (Attack Through Overlapping Sparsity), a differentiable optimization framework that generates structured, sparse adversarial perturbations in element-wise, pixel-wise, and group-wise forms. For white-box attacks on image classifiers, we introduce the Overlapping Smoothed L0 (OSL0) function, which promotes convergence to a stationary point while encouraging sparse, structured perturbations. By grouping channels and adjacent pixels, ATOS improves interpretability and helps identify robust versus non-robust features. We approximate the L-infinity gradient using the logarithm of the sum of exponential absolute values to tightly control perturbation magnitude. On CIFAR-10 and ImageNet, ATOS achieves a 100% attack success rate while producing significantly sparser and more structurally coherent perturbations than prior methods. The structured group-wise attack highlights critical regions from the network's perspective, providing counterfactual explanations by replacing class-defining regions with robust features from the target class.

Paperid: 2417, https://arxiv.org/pdf/2510.15567.pdf

Abstract:
Malicious software attacks are having an increasingly significant economic impact. Commercial malware detection software can be costly, and tools that attribute malware to the specific software vulnerabilities it exploits are largely lacking. Understanding the connection between malware and the vulnerabilities it targets is crucial for analyzing past threats and proactively defending against current ones. In this study, we propose an approach that leverages large language models (LLMs) to detect binary malware, specifically within JAR files, and utilizes the capabilities of LLMs combined with retrieval-augmented generation (RAG) to identify Common Vulnerabilities and Exposures (CVEs) that malware may exploit. We developed a proof-of-concept tool called MalCVE, which integrates binary code decompilation, deobfuscation, LLM-based code summarization, semantic similarity search, and CVE classification using LLMs. We evaluated MalCVE using a benchmark dataset of 3,839 JAR executables. MalCVE achieved a mean malware detection accuracy of 97%, at a fraction of the cost of commercial solutions. It is also the first tool to associate CVEs with binary malware, achieving a recall@10 of 65%, which is comparable to studies that perform similar analyses on source code.

Paperid: 2418, https://arxiv.org/pdf/2510.15413.pdf

Abstract:
FHE-SQL is a privacy-preserving database system that enables secure query processing on encrypted data using Fully Homomorphic Encryption (FHE), providing privacy guaranties where an untrusted server can execute encrypted queries without learning either the query contents or the underlying data. Unlike property-preserving encryption-based systems such as CryptDB, which rely on deterministic or order-preserving encryption and are vulnerable to frequency, order, and equality-pattern inference attacks, FHE-SQL performs computations entirely under encryption, eliminating these leakage channels. Compared to trusted-hardware approaches such as TrustedDB, which depend on a hardware security module and thus inherit its trust and side-channel limitations, our design achieves end-to-end cryptographic protection without requiring trusted execution environments. In contrast to high-performance FHE-based engines-Hermes, which target specialized workloads such as vector search, FHE-SQL supports general SQL query semantics with schema-aware, type-safe definitions suitable for relational data management. FHE-SQL mitigates the high cost of ciphertext space by using an indirection architecture that separates metadata in RocksDB from large ciphertexts in blob storage. It supports oblivious selection via homomorphic boolean masks, multi-tier caching, and garbage collection, with security proven under the Universal Composability framework.

Paperid: 2419, https://arxiv.org/pdf/2510.15180.pdf

Abstract:
Educating children and young people to be safe online is essential, especially as the metaverse, a next-generation internet blending immersive technologies, promises to reshape their interactions and amplify their experiences. While virtual reality offers fully immersive, highly interactive, and multi-sensory engagement, it also heightens cyber harm risks for young or vulnerable users. To address this, the CyberNinjas VR experience was developed to educate children aged 8 to 16 on safe metaverse behaviours, providing clear referral steps for harmful interactions. Understanding user engagement in metaverse gaming will aid the design of future VR environments which prioritize safety and inclusivity. This project analyses CyberNinjas to understand how game mechanics can foster cyber-safe behaviours.

Paperid: 2420, https://arxiv.org/pdf/2510.15108.pdf

Abstract:
This paper investigates the algebraic and graphical structure of the ring $\mathbb{Z}_{sp}$, with a focus on its decomposition into finite fields, kernels, and special subsets. We establish classical isomorphisms between $\mathbb{F}_s$ and $p\mathbb{F}_s$, as well as $p\mathbb{F}_s^{\star}$ and $p\mathbb{F}_s^{+1,\star}$. We introduce the notion of arcs and rooted trees to describe the pre-periodic structure of $\mathbb{Z}_{sp}$, and prove that trees rooted at elements not divisible by $s$ or $p$ can be generated from the tree of unity via multiplication by cyclic arcs. Furthermore, we define and analyze the set $\mathbb{D}_{sp}$, consisting of elements that are neither multiples of $s$ or $p$ nor "off-by-one" elements, and show that its graph decomposes into cycles and pre-periodic trees. Finally, we demonstrate that every cycle in $\mathbb{Z}_{sp}$ contains inner cycles that are derived predictably from the cycles of the finite fields $p\mathbb{F}_s$ and $s\mathbb{F}_p$, and we discuss the cryptographic relevance of $\mathbb{D}_{sp}$, highlighting its potential for analyzing cyclic attacks and factorization methods.

Paperid: 2421, https://arxiv.org/pdf/2510.15106.pdf

Abstract:
As Large Language Models (LLMs) gain traction across critical domains, ensuring secure and trustworthy training processes has become a major concern. Backdoor attacks, where malicious actors inject hidden triggers into training data, are particularly insidious and difficult to detect. Existing post-training verification solutions like Proof-of-Learning are impractical for LLMs due to their requirement for full retraining, lack of robustness against stealthy manipulations, and inability to provide early detection during training. Early detection would significantly reduce computational costs. To address these limitations, we introduce Proof-of-Training Steps, a verification protocol that enables an independent auditor (Alice) to confirm that an LLM developer (Bob) has followed the declared training recipe, including data batches, architecture, and hyperparameters. By analyzing the sensitivity of the LLMs' language modeling head (LM-Head) to input perturbations, our method can expose subtle backdoor injections or deviations in training. Even with backdoor triggers in up to 10 percent of the training data, our protocol significantly reduces the attacker's ability to achieve a high attack success rate (ASR). Our method enables early detection of attacks at the injection step, with verification steps being 3x faster than training steps. Our results highlight the protocol's potential to enhance the accountability and security of LLM development, especially against insider threats.

Paperid: 2422, https://arxiv.org/pdf/2510.14993.pdf

Abstract:
Lightweight cryptography is an emerging field in the field of research, which endorses algorithms which are best suited for constrained environment. Design metrics like Gate Equivalence (GE), Memory Requirement, Power Consumption, and Throughput play a vital role in the applications like IoT. This paper presents the 6LoWPAN Protocol Stack which is a popular standard of communication for constrained devices. This paper presents an implementation of a lightweight 6LoWPAN Protocol stack by using a Light weight Cipher instead of regular heavy encryption cipher AES. The cipher proposed in this paper is specifically suitable for 6LoWPAN architecture as it addresses all the constraints possessed by wireless sensor nodes. The lightweight cipher proposed in the paper needs only 1856 bytes of FLASH and 1272 bytes of RAM memory which is less than any other standard existing lightweight cipher design. The proposed ciphers power consumption is around 25 mW which is significantly less as compared to ISO certified lightweight cipher PRESENT which consumes around 38 mW of dynamic power. This paper also discusses the detailed analysis of cipher against the attacks like Linear Cryptanalysis, Differential Cryptanalysis, Biclique attack and Avalanche attack. The cipher implementation on hardware is around 1051 GEs for 64 bit of block size with 128 bit of key length which is less as compared to existing lightweight cipher design. The proposed cipher LiCi2 is motivated from LiCi cipher design but outclasses it in every design metric. We believe the design of LiCi2 is the obvious choice for researchers to implement in constrained environments like IoT.

Paperid: 2423, https://arxiv.org/pdf/2510.14991.pdf

Abstract:
With the growth of digital financial systems, robust security and privacy have become a concern for financial institutions. Even though traditional machine learning models have shown to be effective in fraud detections, they often compromise user data by requiring centralized access to sensitive information. In IoT-enabled financial endpoints such as ATMs and POS Systems that regularly produce sensitive data that is sent over the network. Federated Learning (FL) offers a privacy-preserving, decentralized model training across institutions without sharing raw data. FL enables cross-silo collaboration among banks while also using cross-device learning on IoT endpoints. This survey explores the role of FL in enhancing financial security and introduces a novel classification of its applications based on regulatory and compliance exposure levels ranging from low-exposure tasks such as collaborative portfolio optimization to high-exposure tasks like real-time fraud detection. Unlike prior surveys, this work reviews FL's practical use within financial systems, discussing its regulatory compliance and recent successes in fraud prevention and blockchain-integrated frameworks. However, FL deployment in finance is not without challenges. Data heterogeneity, adversarial attacks, and regulatory compliance make implementation far from easy. This survey reviews current defense mechanisms and discusses future directions, including blockchain integration, differential privacy, secure multi-party computation, and quantum-secure frameworks. Ultimately, this work aims to be a resource for researchers exploring FL's potential to advance secure, privacy-compliant financial systems.

Paperid: 2424, https://arxiv.org/pdf/2510.14589.pdf

Abstract:
Tracking devices, while designed to help users find their belongings in case of loss/theft, bring in new questions about privacy and surveillance of not just their own users, but in the case of crowd-sourced location tracking, even that of others even orthogonally associated with these platforms. Apple's Find My is perhaps the most ubiquitous such system which can even locate devices which do not possess any cellular support or GPS, running on millions of devices worldwide. Apple claims that this system is private and secure, but the code is proprietary, and such claims have to be taken on faith. It is well known that even with perfect cryptographic guarantees, logical flaws might creep into protocols, and allow undesirable attacks. In this paper, we present a symbolic model of the Find My protocol, as well as a precise formal specification of desirable properties, and provide automated, machine-checkable proofs of these properties in the Tamarin prover.

Paperid: 2425, https://arxiv.org/pdf/2510.13925.pdf

Abstract:
Internet of Things (IoT) networks generate diverse and high-volume traffic that reflects both normal activity and potential threats. Deriving meaningful insight from such telemetry requires cross-layer interpretation of behaviors, protocols, and context rather than isolated detection. This work presents an LLM-powered AI agent framework that converts raw packet captures into structured and semantically enriched representations for interactive analysis. The framework integrates feature extraction, transformer-based anomaly detection, packet and flow summarization, threat intelligence enrichment, and retrieval-augmented question answering. An AI agent guided by a large language model performs reasoning over the indexed traffic artifacts, assembling evidence to produce accurate and human-readable interpretations. Experimental evaluation on multiple IoT captures and six open models shows that hybrid retrieval, which combines lexical and semantic search with reranking, substantially improves BLEU, ROUGE, METEOR, and BERTScore results compared with dense-only retrieval. System profiling further indicates low CPU, GPU, and memory overhead, demonstrating that the framework achieves holistic and efficient interpretation of IoT network traffic.

Paperid: 2426, https://arxiv.org/pdf/2510.13538.pdf

Abstract:
Managing passwords securely and conveniently is still an open problem for many users. Existing research has examined users' password management strategies and identified pain points, such as security concerns, leading to insecure practices. We investigate how Blind and Low-Vision (BLV) users tackle this problem and how password managers can assist them. This paper presents the results of a qualitative interview study with N = 33 BLV participants. We found that all participants utilize password managers to some extent, which they perceive as fairly accessible. However, the adoption is mainly driven by the convenience of storing and retrieving passwords. The security advantages - generating strong, random passwords - were avoided mainly due to the absence of practical accessibility. Password managers do not adhere to BLV users' underlying needs for agency, which stem from experiences with inaccessible software and vendors who deprioritize accessibility issues. Underutilization of password managers leads BLV users to adopt insecure practices, such as reusing predictable passwords or resorting to 'security through obscurity' by writing important credentials in braille. We conclude our analysis by discussing the need to implement practical accessibility and usability improvements for password managers as a way of establishing trust and secure practices while maintaining BLV users' agency.

Paperid: 2427, https://arxiv.org/pdf/2510.12455.pdf

Abstract:
The growing scale and sophistication of cyberattacks pose critical challenges to network security, particularly in detecting diverse intrusion types within imbalanced datasets. Traditional intrusion detection systems (IDS) often struggle to maintain high accuracy across both frequent and rare attacks, leading to increased false negatives for minority classes. To address this, we propose a hybrid anomaly detection framework that integrates specialized deep learning models with an ensemble meta-classifier. Each model is trained to detect a specific attack category, enabling tailored learning of class-specific patterns, while their collective outputs are fused by a Random Forest meta-classifier to improve overall decision reliability. The framework is evaluated on the NSL-KDD benchmark, demonstrating superior performance in handling class imbalance compared to conventional monolithic models. Results show significant improvements in precision, recall, and F1-score across all attack categories, including rare classes such as User to Root (U2R). The proposed system achieves near-perfect detection rates with minimal false alarms, highlighting its robustness and generalizability. This work advances the design of intrusion detection systems by combining specialization with ensemble learning, providing an effective and scalable solution for safeguarding modern networks.

Paperid: 2428, https://arxiv.org/pdf/2510.12343.pdf

Abstract:
As users increasingly interact with large language models (LLMs) using private information, secure and encrypted communication becomes essential. Homomorphic encryption (HE) provides a principled solution by enabling computation directly on encrypted data. Although prior work has explored aspects of running LLMs under HE, the challenge of text generation, particularly next-token prediction, has received limited attention and remains a key obstacle to practical encrypted interaction. In this work, we propose a TSP-based token reordering strategy to address the difficulties of encrypted text generation, together with a post-processing step that further reduces approximation error. Theoretical analysis and experimental results demonstrate that our method prevents collapse, improves coherence in generated text, and preserves data privacy throughout. Overall, our contributions advance the feasibility of practical and privacy-preserving LLM inference.

Paperid: 2429, https://arxiv.org/pdf/2510.12310.pdf

Abstract:
Over the last decade, machine learning has been extensively applied to identify malicious Android applications. However, such approaches remain vulnerable against adversarial examples, i.e., examples that are subtly manipulated to fool a machine learning model into making incorrect predictions. This research presents DeepTrust, a novel metaheuristic that arranges flexible classifiers, like deep neural networks, into an ordered sequence where the final decision is made by a single internal model based on conditions activated in cascade. In the Robust Android Malware Detection competition at the 2025 IEEE Conference SaTML, DeepTrust secured the first place and achieved state-of-the-art results, outperforming the next-best competitor by up to 266% under feature-space evasion attacks. This is accomplished while maintaining the highest detection rate on non-adversarial malware and a false positive rate below 1%. The method's efficacy stems from maximizing the divergence of the learned representations among the internal models. By using classifiers inducing fundamentally dissimilar embeddings of the data, the decision space becomes unpredictable for an attacker. This frustrates the iterative perturbation process inherent to evasion attacks, enhancing system robustness without compromising accuracy on clean examples.

Paperid: 2430, https://arxiv.org/pdf/2510.11137.pdf

Abstract:
Large language models have gained widespread attention recently, but their potential security vulnerabilities, especially privacy leakage, are also becoming apparent. To test and evaluate for data extraction risks in LLM, we proposed CoSPED, short for Consistent Soft Prompt targeted data Extraction and Defense. We introduce several innovative components, including Dynamic Loss, Additive Loss, Common Loss, and Self Consistency Decoding Strategy, and tested to enhance the consistency of the soft prompt tuning process. Through extensive experimentation with various combinations, we achieved an extraction rate of 65.2% at a 50-token prefix comparison. Our comparisons of CoSPED with other reference works confirm our superior extraction rates. We evaluate CoSPED on more scenarios, achieving Pythia model extraction rate of 51.7% and introducing cross-model comparison. Finally, we explore defense through Rank-One Model Editing and achieve a reduction in the extraction rate to 1.6%, which proves that our analysis of extraction mechanisms can directly inform effective mitigation strategies against soft prompt-based attacks.

Paperid: 2431, https://arxiv.org/pdf/2510.11065.pdf

Abstract:
Dynamically distributed inflation is a common mechanism used to guide a blockchain's staking rate towards a desired equilibrium between network security and token liquidity. However, the high sensitivity of the annual percentage yield to changes in the staking rate, coupled with the inherent feedback delays in staker responses, can induce undesirable oscillations around this equilibrium. This paper investigates this instability phenomenon. We analyze the dynamics of inflation-based reward systems and propose a novel distribution model designed to stabilize the staking rate. Our solution effectively dampens oscillations, stabilizing the yield within a target staking range.

Paperid: 2432, https://arxiv.org/pdf/2510.10937.pdf

Abstract:
Reinforcement learning (RL) has been an important machine learning paradigm for solving long-horizon sequential decision-making problems under uncertainty. By integrating deep neural networks (DNNs) into the RL framework, deep reinforcement learning (DRL) has emerged, which achieved significant success in various domains. However, the integration of DNNs also makes it vulnerable to adversarial attacks. Existing adversarial attack techniques mainly focus on either directly manipulating the environment with which a victim agent interacts or deploying an adversarial agent that interacts with the victim agent to induce abnormal behaviors. While these techniques achieve promising results, their adoption in multi-party open systems remains limited due to two major reasons: impractical assumption of full control over the environment and dependent on interactions with victim agents. To enable adversarial attacks in multi-party open systems, in this paper, we redesigned an adversarial policy learning approach that can mislead well-trained victim agents without requiring direct interactions with these agents or full control over their environments. Particularly, we propose a neutral agent-based approach across various task scenarios in multi-party open systems. While the neutral agents seemingly are detached from the victim agents, indirectly influence them through the shared environment. We evaluate our proposed method on the SMAC platform based on Starcraft II and the autonomous driving simulation platform Highway-env. The experimental results demonstrate that our method can launch general and effective adversarial attacks in multi-party open systems.

Paperid: 2433, https://arxiv.org/pdf/2510.10899.pdf

Abstract:
One-shot signatures (OSS) are a powerful and uniquely quantum cryptographic primitive which allows anyone, given common reference string, to come up with a public verification key $\mathsf{pk}$ and a secret signing state $|\mathsf{sk}\rangle$. With the secret signing state, one can produce the signature of any one message, but no more. In a recent breakthrough work, Shmueli and Zhandry (CRYPTO 2025) constructed one-shot signatures, either unconditionally in a classical oracle model or assuming post-quantum indistinguishability obfuscation and the hardness of Learning with Errors (LWE) in the plain model. In this work, we address the inefficiency of the Shmueli-Zhandry construction which signs messages bit-by-bit, resulting in signing keys of $Θ(λ^4)$ qubits and signatures of size $Θ(λ^3)$ bits for polynomially long messages, where $λ$ is the security parameter. We construct a new, simple, direct, and efficient one-shot signature scheme which can sign messages of any polynomial length using signing keys of $Θ(λ^2)$ qubits and signatures of size $Θ(λ^2)$ bits. We achieve corresponding savings in runtimes, in both the oracle model and the plain model. In addition, unlike the Shmueli-Zhandry construction, our scheme achieves perfect correctness. Our scheme also achieves strong signature incompressibility, which implies a public-key quantum fire scheme with perfect correctness among other applications, correcting an error in a recent work of Çakan, Goyal and Shmueli (QCrypt 2025) and recovering their applications.

Paperid: 2434, https://arxiv.org/pdf/2510.10574.pdf

Abstract:
This report presents some technical details on the authentication process of a lightweight key exchange protocol, paying attention on how Man-in-the-Middle (MitM) attacks could undermine its security, e.g., under the scope of lawful interception and its risk to facilitate mass surveillance. We focus only on some technical aspects associated to the attack scenario. Perspectives for future work are also discussed. Other specific aspects of the work, mainly focusing on the security implications of malicious metasurfaces against B5G networks, are excluded from the scope of this report.

Paperid: 2435, https://arxiv.org/pdf/2510.10540.pdf

Abstract:
Is module-lattice reduction better than unstructured lattice reduction? This question was highlighted as 'Q8' in the Kyber NIST standardization submission (Avanzi et al., 2021), as potentially affecting the concrete security of Kyber and other module-lattice-based schemes. Foundational works on module-lattice reduction (Lee, Pellet-Mary, Stehlé, and Wallet, ASIACRYPT 2019; Mukherjee and Stephens-Davidowitz, CRYPTO 2020) confirmed the existence of such module variants of LLL and block-reduction algorithms, but focus only on provable worst-case asymptotic behavior. In this work, we present a concrete average-case analysis of module-lattice reduction. Specifically, we address the question of the expected slope after running module-BKZ, and pinpoint the discriminant $Δ_K$ of the number field at hand as the main quantity driving this slope. We convert this back into a gain or loss on the blocksize $β$: module-BKZ in a number field $K$ of degree $d$ requires an SVP oracle of dimension $β+ \log(|Δ_K| / d^d)β/(d\log β) + o(β/ \log β)$ to reach the same slope as unstructured BKZ with blocksize $β$. This asymptotic summary hides further terms that we predict concretely using experimentally verified heuristics. Incidentally, we provide the first open-source implementation of module-BKZ for some cyclotomic fields. For power-of-two cyclotomic fields, we have $|Δ_K| = d^d$, and conclude that module-BKZ requires a blocksize larger than its unstructured counterpart by $d-1+o(1)$. On the contrary, for all other cyclotomic fields we have $|Δ_K| < d^d$, so module-BKZ provides a sublinear $Θ(β/\log β)$ gain on the required blocksize, yielding a subexponential speedup of $\exp(Θ(β/\log β))$.

Paperid: 2436, https://arxiv.org/pdf/2510.09840.pdf

Abstract:
We argue that the principal application for blockchain technology will not be in the financial sector, but rather in maintaining decentralized human governance, from archives to transparent policies encoded in the blockchain in the form of smart contracts.. Such decentralized, blockchain-grounded governance comes not a moment too soon, as nation states are dissolving before our eyes. Will blockchain-based communities replace the nation state? What are the prospects and dangers of this development?

Paperid: 2437, https://arxiv.org/pdf/2510.09836.pdf

Abstract:
This paper investigates the use of synthetic face data to enhance Single-Morphing Attack Detection (S-MAD), addressing the limitations of availability of large-scale datasets of bona fide images due to privacy concerns. Various morphing tools and cross-dataset evaluation schemes were utilized to conduct this study. An incremental testing protocol was implemented to assess the generalization capabilities as more and more synthetic images were added. The results of the experiments show that generalization can be improved by carefully incorporating a controlled number of synthetic images into existing datasets or by gradually adding bona fide images during training. However, indiscriminate use of synthetic data can lead to sub-optimal performance. Evenmore, the use of only synthetic data (morphed and non-morphed images) achieves the highest Equal Error Rate (EER), which means in operational scenarios the best option is not relying only on synthetic data for S-MAD.

Paperid: 2438, https://arxiv.org/pdf/2510.09729.pdf

Abstract:
Proof of Work (PoW) is widely regarded as the most secure permissionless blockchain consensus protocol. However, its reliance on computationally intensive yet externally useless puzzles results in excessive electric energy wasting. To alleviate this, Proof of Useful Work (PoUW) has been explored as an alternative to secure blockchain platforms while also producing real-world value. Despite this promise, existing PoUW proposals often fail to embed the integrity of the chain and identity of the miner into the puzzle solutions, not meeting necessary requirements for PoW and thus rendering them vulnerable. In this work, we propose a PoUW consensus protocol that computes client-outsourced zk-SNARKs proofs as a byproduct, which are at the same time used to secure the consensus protocol. We further leverage this mechanism to design a decentralized marketplace for outsourcing zk-SNARK proof generation, which is, to the best of our knowledge, the first such marketplace operating at the consensus layer, while meeting all necessary properties of PoW.

Paperid: 2439, https://arxiv.org/pdf/2510.09675.pdf

Abstract:
Software-Defined Vehicles (SDVs) introduce innovative features that extend the vehicle's lifecycle through the integration of outsourced applications and continuous Over-The-Air (OTA) updates. This shift necessitates robust cybersecurity and system resilience. While research on Connected and Autonomous Vehicles (CAV) has been extensive, there is a lack of clarity in distinguishing SDVs from non-SDVs and a need to consolidate cybersecurity research. SDVs, with their extensive connectivity, have a broader attack surface. Besides, their software-centric nature introduces additional vulnerabilities. This paper provides a comprehensive examination of SDVs, detailing their ecosystem, enabling technologies, and the principal cyberattack entry points that arise from their architectural and operational characteristics. We also introduce a novel, layered taxonomy that maps concrete exploit techniques onto core SDV properties and attack paths, and use it to analyze representative studies and experimental approaches.

Paperid: 2440, https://arxiv.org/pdf/2510.09661.pdf

Abstract:
We present Core Mondrian, a scalable extension of the Original Mondrian partition-based anonymization algorithm. A modular strategy layer supports k-anonymity, allowing new privacy models to be added easily. A hybrid recursive/queue execution engine exploits multi-core parallelism while maintaining deterministic output. Utility-preserving enhancements include NaN-pattern pre-partitioning, metric-driven cut scoring, and dynamic suppression budget management. Experiments on the 48k-record UCI ADULT dataset and synthetically scaled versions up to 1M records achieve lower Discernibility Metric scores than Original Mondrian for numeric quasi-identifier sets while parallel processing delivers up to 4x speedup vs. sequential Core Mondrian. Core Mondrian enables privacy-compliant equity analytics at production scale.

Paperid: 2441, https://arxiv.org/pdf/2510.09621.pdf

Abstract:
The Metaverse utilizes emerging technologies such as Extended Reality (XR), Artificial Intelligence (AI), blockchain, and digital twins to provide an immersive and interactive virtual experience. As Metaverse continues to evolve, it bring a range of security and privacy threats, such as identity management, data governance, and user interactions. This survey aims to provide a comprehensive review of the enabling technologies for the Metaverse. It also aims to provide a thorough analysis of key vulnerabilities and threats that may compromise its sustainability and user safety. We perform a systematic literature review (SLR) to identify key vulnerabilities and their countermeasures in Metaverse platforms. Metaverse offers a much larger attack surface compared to conventional digital platforms. Immersive, decentralized, and permanent characteristics of the Metaverse generate new vulnerabilities. Although there are many countermeasures to these vulnerabilities, most of them are theoretical or have not been tested in real-world environments. Our review highlights current advancements, identifies research gaps, and outlines future directions to ensure a secure, resilient, and ethically governed Metaverse.

Paperid: 2442, https://arxiv.org/pdf/2510.09620.pdf

Abstract:
The excitement brought by the development of AI agents came alongside arising problems. These concerns centered around users' trust issues towards AIs, the risks involved, and the difficulty of attributing responsibilities and liabilities. Current solutions only attempt to target each problem separately without acknowledging their inter-influential nature. The Trust, Risk and Liability (TRL) framework proposed in this paper, however, ties together the interdependent relationships of trust, risk, and liability to provide a systematic method of building and enhancing trust, analyzing and mitigating risks, and allocating and attributing liabilities. It can be applied to analyze any application scenarios of AI agents and suggest appropriate measures fitting to the context. The implications of the TRL framework lie in its potential societal impacts, economic impacts, ethical impacts, and more. It is expected to bring remarkable values to addressing potential challenges and promoting trustworthy, risk-free, and responsible usage of AI in 6G networks.

Paperid: 2443, https://arxiv.org/pdf/2510.09615.pdf

Abstract:
Large language models (LLMs) are increasingly integrated into biomedical research workflows--from literature triage and hypothesis generation to experimental design--yet this expanded utility also heightens dual-use concerns, including the potential misuse for guiding toxic compound synthesis. In response, this study shows a Biosecurity Agent that comprises four coordinated modes across the model lifecycle: dataset sanitization, preference alignment, run-time guardrails, and automated red teaming. For dataset sanitization (Mode 1), evaluation is conducted on CORD-19, a COVID-19 Open Research Dataset of coronavirus-related scholarly articles. We define three sanitization tiers--L1 (compact, high-precision), L2 (human-curated biosafety terms), and L3 (comprehensive union)--with removal rates rising from 0.46% to 70.40%, illustrating the safety-utility trade-off. For preference alignment (Mode 2), DPO with LoRA adapters internalizes refusals and safe completions, reducing end-to-end attack success rate (ASR) from 59.7% to 3.0%. At inference (Mode 3), run-time guardrails across L1-L3 show the expected security-usability trade-off: L2 achieves the best balance (F1 = 0.720, precision = 0.900, recall = 0.600, FPR =0.067), while L3 offers stronger jailbreak resistance at the cost of higher false positives. Under continuous automated red-teaming (Mode 4), no successful jailbreaks are observed under the tested protocol. Taken together, our biosecurity agent offers an auditable, lifecycle-aligned framework that reduces attack success while preserving benign utility, providing safeguards for the use of LLMs in scientific research and setting a precedent for future agent-level security protections.

Paperid: 2444, https://arxiv.org/pdf/2510.09494.pdf

Abstract:
As cloud infrastructure evolves to support dynamic and distributed workflows, accelerated now by AI-driven processes, the outdated model of standing permissions has become a critical vulnerability. Based on the Cloud Security Alliance (CSA) Top Threats to Cloud Computing Deep Dive 2025 Report, our analysis details how standing permissions cause catastrophic cloud breaches. While current security tools are addressing network and API security, the challenge of securing granular data access remains. Removing standing permissions at the data level is as critical as it is at the network level, especially for companies handling valuable data at scale. In this white paper, we introduce an innovative architecture based on on-demand data enclaves to address this gap directly. Our approach enables Zero Standing Privilege (ZSP) and Just-in-Time (JIT) principles at the data level. We replace static permissions with temporary data contracts that enforce proactive protection. This means separation is built around the data requested on-demand, providing precise access and real time monitoring for individual records instead of datasets. This solution drastically reduces the attack surface, prevents privilege creep, and simplifies auditing, offering a vital path for enterprises to transition to a more secure and resilient data environment.

Paperid: 2445, https://arxiv.org/pdf/2510.09272.pdf

Abstract:
The XNU kernel is the basis of Apple's operating systems. Although labeled as a hybrid kernel, it is found to generally operate in a monolithic manner by defining a single privileged trust zone in which all system functionality resides. This has security implications, as a kernel compromise has immediate and significant effects on the entire system. Over the past few years, Apple has taken steps towards a more compartmentalized kernel architecture and a more microkernel-like design. To date, there has been no scientific discussion of SPTM and related security mechanisms. Therefore, the understanding of the system and the underlying security mechanisms is minimal. In this paper, we provide a comprehensive analysis of new security mechanisms and their interplay, and create the first conclusive writeup considering all current mitigations. SPTM acts as the sole authority regarding memory retyping. Our analysis reveals that, through SPTM domains based on frame retyping and memory mapping rule sets, SPTM introduces domains of trust into the system, effectively gapping different functionalities from one another. Gapped functionality includes the TXM, responsible for code signing and entitlement verification. We further demonstrate how this introduction lays the groundwork for the most recent security feature of Exclaves, and conduct an in-depth analysis of its communication mechanisms. We discover multifold ways of communication, most notably xnuproxy as a secure world request handler, and the Tightbeam IPC framework. The architecture changes are found to increase system security, with key and sensitive components being moved out of XNU's direct reach. This also provides additional security guarantees in the event of a kernel compromise, which is no longer an immediate threat at the highest trust level.

Paperid: 2446, https://arxiv.org/pdf/2510.09006.pdf

Abstract:
Future G network's new reality is a widespread cyber-physical environment created by Integrated Sensing and Communication (ISAC). It is a crucial technology that transforms wireless connections into ubiquitous sensors. ISAC unlocks transformative new capabilities, powering autonomous systems, augmented human sensing, and next-generation immersive applications, such as digital twins. However, this new reality fundamentally reshapes the security landscape. The primary security concern shifts from the traditional focus on data protection to a new priority: safeguarding the integrity of the system's perception of physical reality itself. This perception can be perilously manipulated by sophisticated attacks such as sensing eavesdropping, phantom dangers, and invisible threats, potentially resulting in direct and catastrophic physical harm. Traditional security measures, such as signature-based detection, are insufficient to counter these perception-level threats that mimic genuine physical signals. A proactive, layered, defense-in-depth strategy is required, integrating physical, environmental, intelligence, and architectural security measures to build a trustworthy ecosystem. Additionally, realizing ISAC's potential responsibly also depends on parallel efforts in global standardization and strong governance to address the significant challenges of privacy, liability, and the technology's dual-use.

Paperid: 2447, https://arxiv.org/pdf/2510.08343.pdf

Abstract:
Fully Homomorphic Encryption (FHE) enables the evaluation of programs directly on encrypted data. However, because only basic operations can be performed on ciphertexts, programs must be expressed as boolean or arithmetic circuits. This low-level representation makes implementing applications for FHE significantly more cumbersome than writing code in a high-level language. To reduce this burden, several transpilers have been developed that translate high-level code into circuit representations. In this work, we extend the range of high-level languages that can target FHE by introducing a transpiler for Haskell, which converts Haskell programs into Boolean circuits suitable for homomorphic evaluation. Our second contribution is the automatic parallelization of these generated circuits. We implement an evaluator that executes gates in parallel by parallelizing each layer of the circuit. We demonstrate the effectiveness of our approach on two key applications: Private Information Retrieval (PIR) and the AES encryption standard. Prior work has parallelized AES encryption manually. We demonstrate that the automated method outperforms some but not all manual parallelizations of AES evaluations under FHE. We achieve an evaluation time of 28 seconds for a parallel execution with 16 threads and an evaluation time of 8 seconds for a parallel execution with 100 threads

Paperid: 2448, https://arxiv.org/pdf/2510.08225.pdf

Abstract:
This paper presents TracE2E, a middleware written in Rust, that can provide both data explainability and compliance across multiple nodes. By mediating inputs and outputs of processes, TracE2E records provenance information and enforces data-protection policies (e.g., confidentiality, integrity) that depend on the recorded provenance. Unlike existing approaches that necessitate substantial application modifications, TracE2E is designed for easy integration into existing and future applications through a wrapper of the Rust standard library's IO module. We describe how TracE2E consistently records provenance information across nodes, and we demonstrate how the compliance layer of TracE2E can accommodate the enforcement of multiple policies.

Paperid: 2449, https://arxiv.org/pdf/2510.07457.pdf

Abstract:
Machine Learning (ML) is making its way into fields such as healthcare, finance, and Natural Language Processing (NLP), and concerns over data privacy and model confidentiality continue to grow. Privacy-preserving Machine Learning (PPML) addresses this challenge by enabling inference on private data without revealing sensitive inputs or proprietary models. Leveraging Secure Computation techniques from Cryptography, two widely studied approaches in this domain are Fully Homomorphic Encryption (FHE) and Garbled Circuits (GC). This work presents a comparative evaluation of FHE and GC for secure neural network inference. A two-layer neural network (NN) was implemented using the CKKS scheme from the Microsoft SEAL library (FHE) and the TinyGarble2.0 framework (GC) by IntelLabs. Both implementations are evaluated under the semi-honest threat model, measuring inference output error, round-trip time, peak memory usage, communication overhead, and communication rounds. Results reveal a trade-off: modular GC offers faster execution and lower memory consumption, while FHE supports non-interactive inference.

Paperid: 2450, https://arxiv.org/pdf/2510.07193.pdf

Abstract:
Quantum learning from remotely accessed quantum compute and data must address two key challenges: verifying the correctness of data and ensuring the privacy of the learner's data-collection strategies and resulting conclusions. The covert (verifiable) learning model of Canetti and Karchmer (TCC 2021) provides a framework for endowing classical learning algorithms with such guarantees. In this work, we propose models of covert verifiable learning in quantum learning theory and realize them without computational hardness assumptions for remote data access scenarios motivated by established quantum data advantages. We consider two privacy notions: (i) strategy-covertness, where the eavesdropper does not gain information about the learner's strategy; and (ii) target-covertness, where the eavesdropper does not gain information about the unknown object being learned. We show: Strategy-covert algorithms for making quantum statistical queries via classical shadows; Target-covert algorithms for learning quadratic functions from public quantum examples and private quantum statistical queries, for Pauli shadow tomography and stabilizer state learning from public multi-copy and private single-copy quantum measurements, and for solving Forrelation and Simon's problem from public quantum queries and private classical queries, where the adversary is a unidirectional or i.i.d. ancilla-free eavesdropper. The lattermost results in particular establish that the exponential separation between classical and quantum queries for Forrelation and Simon's problem survives under covertness constraints. Along the way, we design covert verifiable protocols for quantum data acquisition from public quantum queries which may be of independent interest. Overall, our models and corresponding algorithms demonstrate that quantum advantages are privately and verifiably achievable even with untrusted, remote data.

Paperid: 2451, https://arxiv.org/pdf/2510.07171.pdf

Abstract:
Industrial control system (ICS) operations use trusted endpoints like human machine interfaces (HMIs) and workstations to relay commands to programmable logic controllers (PLCs). Because most PLCs lack layered defenses, compromise of a trusted endpoint can drive unsafe actuator commands and risk safety-critical operation. This research presents an embedded intrusion detection system that runs inside the controller and uses header-level telemetry to detect and respond to network attacks. The system combines a semi-supervised anomaly detector and a supervised attack classifier. We evaluate the approach on a midstream oil-terminal testbed using three datasets collected during tanker-truck loading. The anomaly detector achieves zero missed attacks, corresponding to 0.998 Matthews correlation. The supervised stage attains 97.37 percent hold-out accuracy and 97.03 percent external accuracy. The embedded design adds a median of 2,031 microseconds of end-to-end latency and does not impact PLC's cycle time. The proposed architecture provides a multi-layer embedded security that meets the real-time requirements of an industrial system.

Paperid: 2452, https://arxiv.org/pdf/2510.06951.pdf

Abstract:
We examine the state of publicly available information about known exploitable vulnerabilities applicable to operational technology (OT) environments. Specifically, we analyze the Known Exploitable Vulnerabilities Catalog (KEVC) maintained by the US Department of Homeland Security Cybersecurity and Infrastructure Security Agency (CISA) to assess whether currently available data is sufficient for effective and reliable remediation in OT settings. Our team analyzed all KEVC entries through July 2025 to determine the extent to which OT environments can rely on existing remediation recommendations. We found that although most entries in the KEVC could affect OT environments, only 13% include vendor workarounds or mitigations as alternatives to patching. This paper also examines the feasibility of developing such alternatives based on vulnerability and exploit characteristics, and we present early evidence of success with this approach.

Paperid: 2453, https://arxiv.org/pdf/2510.06868.pdf

Abstract:
We consider image transmission via deep joint source-channel coding (DeepJSCC) over multi-hop additive white Gaussian noise (AWGN) channels by training a DeepJSCC encoder-decoder pair with a pre-trained deep hash distillation (DHD) module to semantically cluster images, facilitating security-oriented applications through enhanced semantic consistency and improving the perceptual reconstruction quality. We train the DeepJSCC module to both reduce mean square error (MSE) and minimize cosine distance between DHD hashes of source and reconstructed images. Significantly improved perceptual quality as a result of semantic alignment is illustrated for different multi-hop settings, for which classical DeepJSCC may suffer from noise accumulation, measured by the learned perceptual image patch similarity (LPIPS) metric.

Paperid: 2454, https://arxiv.org/pdf/2510.06823.pdf

Abstract:
We analyze answers generated by generative engines (GEs) from the perspectives of citation publishers and the content-injection barrier, defined as the difficulty for attackers to manipulate answers to user prompts by placing malicious content on the web. GEs integrate two functions: web search and answer generation that cites web pages using large language models. Because anyone can publish information on the web, GEs are vulnerable to poisoning attacks. Existing studies of citation evaluation focus on how faithfully answer content reflects cited sources, leaving unexamined which web sources should be selected as citations to defend against poisoning attacks. To fill this gap, we introduce evaluation criteria that assess poisoning threats using the citation information contained in answers. Our criteria classify the publisher attributes of citations to estimate the content-injection barrier thereby revealing the threat of poisoning attacks in current GEs. We conduct experiments in political domains in Japan and the United States (U.S.) using our criteria and show that citations from official party websites (primary sources) are approximately $25\%$--$45\%$ in the U.S. and $60\%$--$65\%$ in Japan, indicating that U.S. political answers are at higher risk of poisoning attacks. We also find that sources with low content-injection barriers are frequently cited yet are poorly reflected in answer content. To mitigate this threat, we discuss how publishers of primary sources can increase exposure of their web content in answers and show that well-known techniques are limited by language differences.

Paperid: 2455, https://arxiv.org/pdf/2510.06420.pdf

Abstract:
The emulation of multi-step attacks attributed to advanced persistent threats is valuable for training defenders and evaluating defense tools. In this paper, we discuss the numerous challenges and desired attributes associated with such automation. Additionally, we introduce the use of Effects Language (EL), a visual programming language with graph-based operational semantics, as a solution to address many of these challenges and requirements. We formally define the execution semantics of EL, and prove important execution properties. Furthermore, we showcase the application of EL to codify attacks using an example from one of the publicly available attack scenarios. We also demonstrate how EL can be utilized to provide proof-of-attack of complex multi-step attacks. Our results highlight the improvements in time and resource efficiency achieved through the use of EL for repeatable automation.

Paperid: 2456, https://arxiv.org/pdf/2510.06097.pdf

Abstract:
Chen, Liu, and Zhandry [CLZ22] introduced the problems $S|LWE\rangle$ and $C|LWE\rangle$ as quantum analogues of the Learning with Errors problem, designed to construct quantum algorithms for the Inhomogeneous Short Integer Solution ($ISIS$) problem. Several later works have used this framework for constructing new quantum algorithms in specific cases. However, the general relation between all these problems is still unknown. In this paper, we investigate the equivalence between $S|LWE\rangle$ and $ISIS$. We present the first fully generic reduction from $ISIS$ to $S|LWE\rangle$, valid even in the presence of errors in the underlying algorithms. We then explore the reverse direction, introducing an inhomogeneous variant of $C|LWE\rangle$, denoted $IC|LWE\rangle$, and show that $IC|LWE\rangle$ reduces to $S|LWE\rangle$. Finally, we prove that, under certain recoverability conditions, an algorithm for $ISIS$ can be transformed into one for $S|LWE\rangle$. We instantiate this reverse reduction by tweaking a known algorithm for $(I)SIS_\infty$ in order to construct quantum algorithm for $S|LWE\rangle$ when the alphabet size q is a small power of 2, recovering some results of Bai et al. [BJK+ 25]. Our results thus clarify the landscape of reductions between $S|LWE\rangle$ and $ISIS$, and we show both their strong connection as well as the remaining barriers for showing full equivalence.

Paperid: 2457, https://arxiv.org/pdf/2510.05830.pdf

Abstract:
Decentralized Autonomous Organizations (DAOs) aim to enable participatory governance, but in practice face challenges of voter apathy, concentration of voting power, and misaligned delegation. Existing delegation mechanisms often reinforce visibility biases, where a small set of highly ranked delegates accumulate disproportionate influence regardless of their alignment with the broader community. In this paper, we conduct an empirical study of delegation in DAO governance, combining on-chain data from five major protocols with off-chain discussions from 14 DAO forums. We develop a methodology to link forum participants to on-chain addresses, extract governance interests using large language models, and compare these interests against delegates' historical behavior. Our analysis reveals that delegations are frequently misaligned with token holders' expressed priorities and that current ranking-based interfaces exacerbate power concentration. We argue that incorporating interest alignment into delegation processes could mitigate these imbalances and improve the representativeness of DAO decision-making.

Paperid: 2458, https://arxiv.org/pdf/2510.05824.pdf

Abstract:
Security measures are essential in the automotive industry to detect intrusions in-vehicle networks. However, developing a one-size-fits-all Intrusion Detection System (IDS) is challenging because each vehicle has unique data profiles. This is due to the complex and dynamic nature of the data generated by vehicles regarding their model, driving style, test environment, and firmware update. To address this issue, a universal IDS has been developed that can be applied to all types of vehicles without the need for customization. Unlike conventional IDSs, the universal IDS can adapt to evolving data security issues resulting from firmware updates. In this study, a new hybrid approach has been developed, combining Pearson correlation with deep learning techniques. This approach has been tested using data obtained from four distinct mechanical and electronic vehicles, including Tesla, Sonata, and two Kia models. The data has been combined into two frequency datasets, and wavelet transformation has been employed to convert them into the frequency domain, enhancing generalizability. Additionally, a statistical method based on independent rule-based systems using Pearson correlation has been utilized to improve system performance. The system has been compared with eight different IDSs, three of which utilize the universal approach, while the remaining five are based on conventional techniques. The accuracy of each system has been evaluated through benchmarking, and the results demonstrate that the hybrid system effectively detects intrusions in various vehicle models.

Paperid: 2459, https://arxiv.org/pdf/2510.05771.pdf

Abstract:
Understanding how cognitive biases influence adversarial decision-making is essential for developing effective cyber defenses. Capture-the-Flag (CTF) competitions provide an ecologically valid testbed to study attacker behavior at scale, simulating real-world intrusion scenarios under pressure. We analyze over 500,000 submission logs from picoCTF, a large educational CTF platform, to identify behavioral signatures of cognitive biases with defensive implications. Focusing on availability bias and the sunk cost fallacy, we employ a mixed-methods approach combining qualitative coding, descriptive statistics, and generalized linear modeling. Our findings show that participants often submitted flags with correct content but incorrect formatting (availability bias), and persisted in attempting challenges despite repeated failures and declining success probabilities (sunk cost fallacy). These patterns reveal that biases naturally shape attacker behavior in adversarial contexts. Building on these insights, we outline a framework for bias-informed adaptive defenses that anticipate, rather than simply react to, adversarial actions.

Paperid: 2460, https://arxiv.org/pdf/2510.05709.pdf

Abstract:
Before adopting a new large language model (LLM) architecture, it is critical to understand vulnerabilities accurately. Existing evaluations can be difficult to trust, often drawing conclusions from LLMs that are not meaningfully comparable, relying on heuristic inputs or employing metrics that fail to capture the inherent uncertainty. In this paper, we propose a principled and practical end-to-end framework for evaluating LLM vulnerabilities to prompt injection attacks. First, we propose practical approaches to experimental design, tackling unfair LLM comparisons by considering two practitioner scenarios: when training an LLM and when deploying a pre-trained LLM. Second, we address the analysis of experiments and propose a Bayesian hierarchical model with embedding-space clustering. This model is designed to improve uncertainty quantification in the common scenario that LLM outputs are not deterministic, test prompts are designed imperfectly, and practitioners only have a limited amount of compute to evaluate vulnerabilities. We show the improved inferential capabilities of the model in several prompt injection attack settings. Finally, we demonstrate the pipeline to evaluate the security of Transformer versus Mamba architectures. Our findings show that consideration of output variability can suggest less definitive findings. However, for some attacks, we find notably increased Transformer and Mamba-variant vulnerabilities across LLMs with the same training data or mathematical ability.

Paperid: 2461, https://arxiv.org/pdf/2510.05440.pdf

Abstract:
We initiate an investigation of learning tasks in a setting where the learner is given access to two competing provers, only one of which is honest. Specifically, we consider the power of such learners in assessing purported properties of opaque models. Following prior work that considers the power of competing provers in different settings, we call this setting refereed learning. After formulating a general definition of refereed learning tasks, we show refereed learning protocols that obtain a level of accuracy that far exceeds what is obtainable at comparable cost without provers, or even with a single prover. We concentrate on the task of choosing the better one out of two black-box models, with respect to some ground truth. While we consider a range of parameters, perhaps our most notable result is in the high-precision range: For all $\varepsilon>0$ and ambient dimension $d$, our learner makes only one query to the ground truth function, communicates only $(1+\frac{1}{\varepsilon^2})\cdot\text{poly}(d)$ bits with the provers, and outputs a model whose loss is within a multiplicative factor of $(1+\varepsilon)$ of the best model's loss. Obtaining comparable loss with a single prover would require the learner to access the ground truth at almost all of the points in the domain. To obtain this bound, we develop a technique that allows the learner to sample, using the provers, from a distribution that is not efficiently samplable to begin with. We find this technique to be of independent interest. We also present lower bounds that demonstrate the optimality of our protocols in a number of respects, including prover complexity, number of samples, and need for query access.

Paperid: 2462, https://arxiv.org/pdf/2510.05419.pdf

Abstract:
This paper describes pseudonyms for the upcoming European Identity Wallet (EUDIW) architecture from both a cryptographic and an implementation perspective. Its main goal is to provide technical insights into the achievable properties and cryptographic realizations. In particular, we (1) outline the security and privacy requirements of EUDI pseudonyms as the basis for building consensus on the cross-country decision maker level; (2) sketch an abstract cryptographic protocol that fulfills these requirements; and (3) suggest two instantiation options for the protocol sketch based on well-studied building A complete specification of the formal properties, as well as the specific set of credential issuance, provisioning, and pseudonym presentation generation is outside the scope of this paper, but is expected to follow as future work.

Paperid: 2463, https://arxiv.org/pdf/2510.05376.pdf

Abstract:
Zero-knowledge Ethereum Virtual Machines (zkEVMs) must reconcile a fundamental contradiction: the Ethereum Virtual Machine was designed for transparent sequential execution, while zero-knowledge proofs require algebraic circuit representations. This survey provides the first systematic analysis of how existing major production zkEVM implementations resolve this tension through distinct constraint engineering strategies. We develop a comparative framework that maps the design space across three architectural dimensions. First, arithmetization schemes reveal stark trade-offs: R1CS requires compositional gadget libraries, PLONKish achieves elegance through custom gates that capture complex EVM opcodes in single constraints, while the homogeneous structure of AIR fundamentally mismatches the irregular instruction set of EVM. Second, dispatch mechanisms determine constraint activation patterns: selector-based systems waste trace width on inactive constraints, while ROM-based approaches trade memory lookups for execution flexibility. Third, the Type 1-4 spectrum quantifies an inescapable trade-off: the bit-level EVM compatibility of Type 1 demands significantly higher constraint complexity than the custom instruction sets of Type 4. Beyond cataloging implementations, we identify critical open problems across multiple domains: performance barriers preventing sub-second proving, absence of formal verification for constraint-to-EVM semantic equivalence, lack of standardized benchmarking frameworks, and architectural gaps in hybrid zkEVM/zkVM designs, decentralized prover coordination, privacy preservation, and interoperability.

Paperid: 2464, https://arxiv.org/pdf/2510.05157.pdf

Abstract:
This paper presents a controlled study of adversarial reinforcement learning in network security through a custom OpenAI Gym environment that models brute-force attacks and reactive defenses on multi-port services. The environment captures realistic security trade-offs including background traffic noise, progressive exploitation mechanics, IP-based evasion tactics, honeypot traps, and multi-level rate-limiting defenses. Competing attacker and defender agents are trained using Deep Q-Networks (DQN) within a zero-sum reward framework, where successful exploits yield large terminal rewards while incremental actions incur small costs. Through systematic evaluation across multiple configurations (varying trap detection probabilities, exploitation difficulty thresholds, and training regimens), the results demonstrate that defender observability and trap effectiveness create substantial barriers to successful attacks. The experiments reveal that reward shaping and careful training scheduling are critical for learning stability in this adversarial setting. The defender consistently maintains strategic advantage across 50,000+ training episodes, with performance gains amplifying when exposed to complex defensive strategies including adaptive IP blocking and port-specific controls. Complete implementation details, reproducible hyperparameter configurations, and architectural guidelines are provided to support future research in adversarial RL for cybersecurity. The zero-sum formulation and realistic operational constraints make this environment suitable for studying autonomous defense systems, attacker-defender co-evolution, and transfer learning to real-world network security scenarios.

Paperid: 2465, https://arxiv.org/pdf/2510.04992.pdf

Abstract:
Quantum cryptographic definitions are often sensitive to the number of copies of the cryptographic states revealed to an adversary. Making definitional changes to the number of copies accessible to an adversary can drastically affect various aspects including the computational hardness, feasibility, and applicability of the resulting cryptographic scheme. This phenomenon appears in many places in quantum cryptography, including quantum pseudorandomness and unclonable cryptography. To address this, we present a generic approach to boost single-copy security to multi-copy security and apply this approach to many settings. As a consequence, we obtain the following new results: -One-copy stretch pseudorandom state generators (under mild assumptions) imply the existence of t-copy stretch pseudorandom state generators, for any fixed polynomial t. -One-query pseudorandom unitaries with short keys (under mild assumptions) imply the existence of t-query pseudorandom unitaries with short keys, for any fixed polynomial t. -Assuming indistinguishability obfuscation and other standard cryptographic assumptions, there exist identical-copy secure unclonable primitives such as public-key quantum money and quantum copy-protection.

Paperid: 2466, https://arxiv.org/pdf/2510.04984.pdf

Abstract:
Electroencephalogram monitoring devices and online data repositories hold large amounts of data from individuals participating in research and medical studies without direct reference to personal identifiers. This paper explores what types of personal and health information have been detected and classified within task-free EEG data. Additionally, we investigate key characteristics of the collected resting-state and sleep data, in order to determine the privacy risks involved with openly available EEG data. We used Google Scholar, Web of Science and searched relevant journals to find studies which classified or detected the presence of various disorders and personal information in resting state and sleep EEG. Only English full-text peer-reviewed journal articles or conference papers about classifying the presence of medical disorders between individuals were included. A quality analysis carried out by 3 reviewers determined general paper quality based on specified evaluation criteria. In resting state EEG, various disorders including Autism Spectrum Disorder, Parkinson's disease, and alcohol use disorder have been classified with high classification accuracy, often requiring only 5 mins of data or less. Sleep EEG tends to hold classifiable information about sleep disorders such as sleep apnea, insomnia, and REM sleep disorder, but usually involve longer recordings or data from multiple sleep stages. Many classification methods are still developing but even today, access to a person's EEG can reveal sensitive personal health information. With an increasing ability of machine learning methods to re-identify individuals from their EEG data, this review demonstrates the importance of anonymization, and the development of improved tools for keeping study participants and medical EEG users' privacy safe.

Paperid: 2467, https://arxiv.org/pdf/2510.04652.pdf

Abstract:
In the digital age, data frequently crosses organizational and jurisdictional boundaries, making effective governance essential. Usage control policies have emerged as a key paradigm for regulating data usage, safeguarding privacy, protecting intellectual property, and ensuring compliance with regulations. A central mechanism for usage control is the handling of obligations, which arise as a side effect of using and sharing data. Effective monitoring of obligations requires capturing usage traces and accounting for temporal aspects such as start times and deadlines, as obligations may evolve over times into different states, such as fulfilled, violated, or expired. While several solutions have been proposed for obligation monitoring, they often lack formal semantics or provide limited support for reasoning over obligation states. To address these limitations, we extend GUCON, a policy framework grounded in the formal semantics of SPAQRL graph patterns, to explicitly model the temporal aspects of an obligation. This extension enables the expressing of temporal obligations and supports continuous monitoring of their evolving states based on usage traces stored in temporal knowledge graphs. We demonstrate how this extended model can be represented using RDF-star and SPARQL-star and propose an Obligation State Manager that monitors obligation states and assess their compliance with respect to usage traces. Finally, we evaluate both the extended model and its prototype implementation.

Paperid: 2468, https://arxiv.org/pdf/2510.04640.pdf

Abstract:
The dependence of power-consumption on the processed data is a known vulnerability of CMOS circuits, resulting in side channels which can be exploited by power-based side channel attacks (SCAs). These attacks can extract sensitive information, such as secret keys, from the implementation of cryptographic algorithms. Existing countermeasures against power-based side channel attacks focus on analyzing information leakage at the byte level. However, this approach neglects the impact of individual bits on the overall resistance of a cryptographic implementation. In this work, we present a countermeasure based on single-bit leakage. The results suggest that the proposed countermeasure cannot be broken by attacks using conventional SCA leakage models.

Paperid: 2469, https://arxiv.org/pdf/2510.04529.pdf

Abstract:
We present the first construction of a computational Certified Deletion Property (CDP) achievable with classical communication, derived from the compilation of the non-local Magic Square Game (MSG). We leverage the KLVY compiler to transform the non-local MSG into a 2-round interactive protocol, rigorously demonstrating that this compilation preserves the game-specific CDP. Previously, the quantum value and rigidity of the compiled game were investigated. We emphasize that we are the first to investigate CDP (local randomness in [Fu and Miller, Phys. Rev. A 97, 032324 (2018)]) for the compiled game. Then, we combine this CDP with the framework [Kitagawa, Morimae, and Yamakawa, Eurocrypt 2025] to construct Secure Key Leasing with classical Lessor (cSKL). SKL enables the Lessor to lease the secret key to the Lessee and verify that a quantum Lessee has indeed deleted the key. In this paper, we realize cSKL for PKE, PRF, and digital signature. Compared to prior works for cSKL, we realize cSKL for PRF and digital signature for the first time. In addition, we succeed in weakening the assumption needed to construct cSKL.

Paperid: 2470, https://arxiv.org/pdf/2510.04118.pdf

Abstract:
Rapid digitization of critical infrastructure has made cyberwarfare one of the important dimensions of modern conflicts. Attacking the critical infrastructure is an attractive pre-emptive proposition for adversaries as it can be done remotely without crossing borders. Such attacks disturb the support systems of the opponents to launch any offensive activities, crippling their fighting capabilities. Cyberattacks during cyberwarfare can not only be used to steal information, but also to spread disinformation to bring down the morale of the opponents. Recent wars in Europe, Africa, and Asia have demonstrated the scale and sophistication that the warring nations have deployed to take the early upper hand. In this work, we focus on the military action launched by India, code-named Operation Sindoor, to dismantle terror infrastructure emanating from Pakistan and the cyberattacks launched by Pakistan. In particular, we study the malware used by Pakistan APT groups to deploy Remote Access Trojans in Indian systems. We provide details of the tactics and techniques used in the RAT deployment and develop a telemetry framework to collect necessary event logs using Osquery with a custom extension. Finally, we develop a detection rule that can be readily deployed to detect the presence of the RAT or any exploitation performed by the malware.

Paperid: 2471, https://arxiv.org/pdf/2510.03697.pdf

Abstract:
We introduce a modified Schnorr signature scheme to allow for time-bound signatures for transaction fee auction bidding and smart contract purposes in a blockchain context, ensuring an honest producer can only validate a signature before a given block height. The immutable blockchain is used as a source of universal time for the signature scheme. We show the use of such a signature scheme leads to lower MEV revenue for builders. We then apply our time-bound signatures to Ethereum's EIP-1559 and show how it can be used to mitigate the effect of MEV on predicted equilibrium strategies.

Paperid: 2472, https://arxiv.org/pdf/2510.03631.pdf

Abstract:
With advances in wireless communication and growing spectrum scarcity, Spectrum Access Systems (SASs) offer an opportunistic solution but face significant security challenges. Regulations require disclosure of location coordinates and transmission details, exposing user privacy and anonymity during spectrum queries, while the database operations themselves permit Denial-of-Service (DoS) attacks. As location-based services, SAS is also vulnerable to compromised or malicious users conducting spoofing attacks. These threats are further amplified given the quantum computing advancements. Thus, we propose QPADL, the first post-quantum (PQ) secure framework that simultaneously ensures privacy, anonymity, location verification, and DoS resilience while maintaining efficiency for large-scale spectrum access systems. QPADL introduces SAS-tailored private information retrieval for location privacy, a PQ-variant of Tor for anonymity, and employs advanced signature constructions for location verification alongside client puzzle protocols and rate-limiting technique for DoS defense. We formally assess its security and conduct a comprehensive performance evaluation, incorporating GPU parallelization and optimization strategies to demonstrate practicality and scalability.

Paperid: 2473, https://arxiv.org/pdf/2510.03610.pdf

Abstract:
Agentic AI is transforming security by automating many tasks being performed manually. While initial agentic approaches employed a monolithic architecture, the Model-Context-Protocol has now enabled a remote-procedure call (RPC) paradigm to agentic applications, allowing for the flexible construction and composition of multi-function agents. This paper describes PentestMCP, a library of MCP server implementations that support agentic penetration testing. By supporting common penetration testing tasks such as network scanning, resource enumeration, service fingerprinting, vulnerability scanning, exploitation, and post-exploitation, PentestMCP allows a developer to customize multi-agent workflows for performing penetration tests.

Paperid: 2474, https://arxiv.org/pdf/2510.03407.pdf

Abstract:
Research management applications (RMA) are widely used in clinical research environments to collect, transmit, analyze, and store sensitive data. This data is so valuable making RMAs susceptible to security threats. This analysis, analyzes RMAs' security, focusing on Research Electronic Data Capture (REDCap) as an example. We explore the strengths and vulnerabilities within RMAs by evaluating the architecture, data flow, and security features. We identify and assess potential risks using the MITRE ATT\&CK framework and STRIDE model. We assess REDCap's defenses against common attack vectors focusing on security to provide confidentiality, integrity, availability, non-repudiation, and authentication. We conclude by proposing recommendations for enhancing the security of RMAs, ensuring that critical research data remains protected without compromising usability. This research aims to contribute towards a more secure framework for managing sensitive information in research-intensive environments.

Paperid: 2475, https://arxiv.org/pdf/2510.03320.pdf

Abstract:
Deep neural networks (NNs) for computer vision are vulnerable to adversarial attacks, i.e., miniscule malicious changes to inputs may induce unintuitive outputs. One key approach to verify and mitigate such robustness issues is to falsify expected output behavior. This allows, e.g., to locally proof security, or to (re)train NNs on obtained adversarial input examples. Due to the black-box nature of NNs, current attacks only falsify a class of the final output, such as flipping from $\texttt{stop_sign}$ to $\neg\texttt{stop_sign}$. In this short position paper we generalize this to search for generally illogical behavior, as considered in NN verification: falsify constraints (concept-based properties) involving further human-interpretable concepts, like $\texttt{red}\wedge\texttt{octogonal}\rightarrow\texttt{stop_sign}$. For this, an easy implementation of concept-based properties on already trained NNs is proposed using techniques from explainable artificial intelligence. Further, we sketch the theoretical proof that attacks on concept-based properties are expected to have a reduced search space compared to simple class falsification, whilst arguably be more aligned with intuitive robustness targets. As an outlook to this work in progress we hypothesize that this approach has potential to efficiently and simultaneously improve logical compliance and robustness.

Paperid: 2476, https://arxiv.org/pdf/2510.03319.pdf

Abstract:
Federated learning (FL) enables collaborative model training without sharing raw data but is vulnerable to gradient inversion attacks (GIAs), where adversaries reconstruct private data from shared gradients. Existing defenses either incur impractical computational overhead for embedded platforms or fail to achieve privacy protection and good model utility at the same time. Moreover, many defenses can be easily bypassed by adaptive adversaries who have obtained the defense details. To address these limitations, we propose SVDefense, a novel defense framework against GIAs that leverages the truncated Singular Value Decomposition (SVD) to obfuscate gradient updates. SVDefense introduces three key innovations, a Self-Adaptive Energy Threshold that adapts to client vulnerability, a Channel-Wise Weighted Approximation that selectively preserves essential gradient information for effective model training while enhancing privacy protection, and a Layer-Wise Weighted Aggregation for effective model aggregation under class imbalance. Our extensive evaluation shows that SVDefense outperforms existing defenses across multiple applications, including image classification, human activity recognition, and keyword spotting, by offering robust privacy protection with minimal impact on model accuracy. Furthermore, SVDefense is practical for deployment on various resource-constrained embedded platforms. We will make our code publicly available upon paper acceptance.

Paperid: 2477, https://arxiv.org/pdf/2510.03285.pdf

Abstract:
Recent advances in browser-based LLM agents have shown promise for automating tasks ranging from simple form filling to hotel booking or online shopping. Current benchmarks measure agent performance in controlled environments, such as containers or stable networks, where websites behave deterministically. However, in the real world, users access websites over networks and HTTPS connections that introduce instability from multiple sources: client-side, server-side issues or broader system failures. Moreover, live websites are prone to web attacks such Cross-Site Scripting, as well as general site modifications which can cause unexpected or malicious pop-ups or improper functionality. To address this gap, we present WAREX: Web Agent Reliability Evaluation on Existing Benchmarks. We measure the impact of WAREX across three popular benchmarks: WebArena, WebVoyager, and REAL. Our experiments show that introducing WAREX leads to significant drops in task success rates, highlighting the limited robustness of state-of-the-art agents.

Paperid: 2478, https://arxiv.org/pdf/2510.03035.pdf

Abstract:
The proliferation of digital technologies has led to unprecedented data collection, with facial data emerging as a particularly sensitive commodity. Companies are increasingly leveraging advanced facial recognition technologies, often without the explicit consent or awareness of individuals, to build sophisticated surveillance capabilities. This practice, fueled by weak and fragmented laws in many jurisdictions, has created a regulatory vacuum that allows for the commercialization of personal identity and poses significant threats to individual privacy and autonomy. This article introduces the concept of Facial Privacy. It analyzes the profound challenges posed by unregulated facial recognition by conducting a comprehensive review of existing legal frameworks. It examines and compares regulations such as the GDPR, Brazil's LGPD, Canada's PIPEDA, and privacy acts in China, Singapore, South Korea, and Japan, alongside sector-specific laws in the United States like the Illinois Biometric Information Privacy Act (BIPA). The analysis highlights the societal impacts of this technology, including the potential for discriminatory bias and the long-lasting harm that can result from the theft of immutable biometric data. Ultimately, the paper argues that existing legal loopholes and ambiguities leave individuals vulnerable. It proposes a new policy framework that shifts the paradigm from data as property to a model of inalienable rights, ensuring that fundamental human rights are upheld against unchecked technological expansion.

Paperid: 2479, https://arxiv.org/pdf/2510.02944.pdf

Abstract:
A random local function defined by a $d$-ary predicate $P$ is one where each output bit is computed by applying $P$ to $d$ randomly chosen bits of its input. These represent natural distributions of instances for constraint satisfaction problems. They were put forward by Goldreich as candidates for low-complexity one-way functions, and have subsequently been widely studied also as potential pseudo-random generators. We present a new search-to-decision reduction for random local functions defined by any predicate of constant arity. Given any efficient algorithm that can distinguish, with advantage $ε$, the output of a random local function with $m$ outputs and $n$ inputs from random, our reduction produces an efficient algorithm that can invert such functions with $\tilde{O}(m(n/ε)^2)$ outputs, succeeding with probability $Ω(ε)$. This implies that if a family of local functions is one-way, then a related family with shorter output length is family of pseudo-random generators. Prior to our work, all such reductions that were known required the predicate to have additional sensitivity properties, whereas our reduction works for any predicate. Our results also generalise to some super-constant values of the arity $d$, and to noisy predicates.

Paperid: 2480, https://arxiv.org/pdf/2510.02707.pdf

Abstract:
Adversarial attacks present a significant threat to modern machine learning systems. Yet, existing detection methods often lack the ability to detect unseen attacks or detect different attack types with a high level of accuracy. In this work, we propose a statistical approach that establishes a detection baseline before a neural network's deployment, enabling effective real-time adversarial detection. We generate a metric of adversarial presence by comparing the behavior of a compressed/uncompressed neural network pair. Our method has been tested against state-of-the-art techniques, and it achieves near-perfect detection across a wide range of attack types. Moreover, it significantly reduces false positives, making it both reliable and practical for real-world applications.

Paperid: 2481, https://arxiv.org/pdf/2510.02694.pdf

Abstract:
Industrial control systems (ICS) are vital to modern infrastructure but increasingly vulnerable to cybersecurity threats, particularly through weaknesses in their communication protocols. This paper presents MALF (Multi-Agent LLM Fuzzing Framework), an advanced fuzzing solution that integrates large language models (LLMs) with multi-agent coordination to identify vulnerabilities in industrial control protocols (ICPs). By leveraging Retrieval-Augmented Generation (RAG) for domain-specific knowledge and QLoRA fine-tuning for protocol-aware input generation, MALF enhances fuzz testing precision and adaptability. The multi-agent framework optimizes seed generation, mutation strategies, and feedback-driven refinement, leading to improved vulnerability discovery. Experiments on protocols like Modbus/TCP, S7Comm, and Ethernet/IP demonstrate that MALF surpasses traditional methods, achieving a test case pass rate (TCPR) of 88-92% and generating more exception triggers (ETN). MALF also maintains over 90% seed coverage and Shannon entropy values between 4.2 and 4.6 bits, ensuring diverse, protocol-compliant mutations. Deployed in a real-world Industrial Attack-Defense Range for power plants, MALF identified critical vulnerabilities, including three zero-day flaws, one confirmed and registered by CNVD. These results validate MALF's effectiveness in real-world fuzzing applications. This research highlights the transformative potential of multi-agent LLMs in ICS cybersecurity, offering a scalable, automated framework that sets a new standard for vulnerability discovery and strengthens critical infrastructure security against emerging threats.

Paperid: 2482, https://arxiv.org/pdf/2510.02487.pdf

Abstract:
The advancement of 6G technology has the potential to revolutionize the transportation sector and significantly improve how we travel. 6G-enabled Intelligent Transportation Systems (ITS) promise to offer high-speed, low-latency communication and advanced data analytics capabilities, supporting the development of safer, more efficient, and more sustainable transportation solutions. However, various security and privacy challenges were identified in the literature that must be addressed to enable the safe and secure deployment of 6G-ITS and ensure people's trust in using these technologies. This paper reviews the opportunities and challenges of 6G-ITS, particularly focusing on trust, security, and privacy, with special attention to quantum technologies that both enhance security through quantum key distribution and introduce new vulnerabilities. It discusses the potential benefits of 6G technology in the transportation sector, including improved communication, device interoperability support, data analytic capabilities, and increased automation for different components, such as transportation management and communication systems. A taxonomy of different attack models in 6G-ITS is proposed, and a comparison of the security threats in 5G-ITS and 6G-ITS is provided, along with potential mitigating solutions. This research highlights the urgent need for a comprehensive, multi-layered security framework spanning physical infrastructure protection, network protocol security, data management safeguards, application security measures, and trust management systems to effectively mitigate emerging security and privacy risks and ensure the integrity and resilience of future transportation ecosystems.

Paperid: 2483, https://arxiv.org/pdf/2510.02391.pdf

Abstract:
Android malware continues to evolve through obfuscation and polymorphism, posing challenges for both signature-based defenses and machine learning models trained on limited and imbalanced datasets. Synthetic data has been proposed as a remedy for scarcity, yet the role of large language models (LLMs) in generating effective malware data for detection tasks remains underexplored. In this study, we fine-tune GPT-4.1-mini to produce structured records for three malware families: BankBot, Locker/SLocker, and Airpush/StopSMS, using the KronoDroid dataset. After addressing generation inconsistencies with prompt engineering and post-processing, we evaluate multiple classifiers under three settings: training with real data only, real-plus-synthetic data, and synthetic data alone. Results show that real-only training achieves near perfect detection, while augmentation with synthetic data preserves high performance with only minor degradations. In contrast, synthetic-only training produces mixed outcomes, with effectiveness varying across malware families and fine-tuning strategies. These findings suggest that LLM-generated malware can enhance scarce datasets without compromising detection accuracy, but remains insufficient as a standalone training source.

Paperid: 2484, https://arxiv.org/pdf/2510.02378.pdf

Abstract:
This paper introduces a Bayesian approach to improve Interactive Voice Response (IVR) authentication processes used by financial institutions. Traditional IVR systems authenticate users through a static sequence of credentials, assuming uniform effectiveness among them. However, fraudsters exploit this predictability, selectively bypassing strong credentials. This study applies Bayes' Theorem and conditional probability modeling to evaluate fraud risk dynamically and adapt credential verification paths.

Paperid: 2485, https://arxiv.org/pdf/2510.02376.pdf

Abstract:
In this endeavor, a proof-of-concept homomorphic application is developed to determine the production readiness of encryption ecosystems. A movie recommendation app is implemented for this purpose and productionized through containerization and orchestration. By tuning deployment configurations, the computational limitations of Fully Homomorphic Encryption (FHE) are mitigated through additional infrastructure optimizations Index Terms: Reinforcement Learning, Orchestration, Homomorphic Encryption

Paperid: 2486, https://arxiv.org/pdf/2510.02371.pdf

Abstract:
Smart grids are exposed to passive eavesdropping, where attackers listen silently to communication links. Although no data is actively altered, such reconnaissance can reveal grid topology, consumption patterns, and operational behavior, creating a gateway to more severe targeted attacks. Detecting this threat is difficult because the signals it produces are faint, short-lived, and often disappear when traffic is examined by a single node or along a single timeline. This paper introduces a graph-centric, multimodal detector that fuses physical-layer and behavioral indicators over ego-centric star subgraphs and short temporal windows to detect passive attacks. To capture stealthy perturbations, a two-stage encoder is introduced: graph convolution aggregates spatial context across ego-centric star subgraphs, while a bidirectional GRU models short-term temporal dependencies. The encoder transforms heterogeneous features into a unified spatio-temporal representation suitable for classification. Training occurs in a federated learning setup under FedProx, improving robustness to heterogeneous local raw data and contributing to the trustworthiness of decentralized training; raw measurements remain on client devices. A synthetic, standards-informed dataset is generated to emulate heterogeneous HAN/NAN/WAN communications with wireless-only passive perturbations, event co-occurrence, and leak-safe splits. The model achieves a testing accuracy of 98.32% per-timestep (F1_{attack}=0.972) and 93.35% per-sequence at 0.15% FPR using a simple decision rule with run-length m=2 and threshold $τ=0.55$. The results demonstrate that combining spatial and temporal context enables reliable detection of stealthy reconnaissance while maintaining low false-positive rates, making the approach suitable for non-IID federated smart-grid deployments.

Paperid: 2487, https://arxiv.org/pdf/2510.02332.pdf

Abstract:
Neural linguistic steganography aims to embed information into natural text while preserving statistical undetectability. A fundamental challenge in this ffeld stems from tokenization ambiguity in modern tokenizers, which can lead to catastrophic decoding failures. The recent method, SyncPool, addresses this ambiguity by employing a coarse-grained synchronization mechanism over groups of ambiguous candidates. However, SyncPool sacriffces embedding capacity, as it utilizes the entire Shannon entropy of an ambiguous group solely for synchronization rather than for payload embedding. We propose a method named look-ahead Sync, which overcomes the capacity limitation of SyncPool while retaining its provable security guarantees. Our approach performs minimal synchronized sampling only on truly indistinguishable token sequences, while strategically preserving all other discernible paths to maximize embedding capacity. We provide theoretical proofs for the security of our method and analyze the gap between its achievable embedding capacity and the theoretical upper bound. Experiments on English (using Llama 3) and Chinese (using Qwen 2.5) benchmarks show that our method consistently approaches the theoretical capacity upper bound and signiffcantly outperforms SyncPool. The improvement in embedding rate exceeds 160% in English and 25% in Chinese, particularly in settings with larger candidate pools. This work represents a signiffcant step toward practical high-capacity provably secure linguistic steganography.

Paperid: 2488, https://arxiv.org/pdf/2510.02184.pdf

Abstract:
As image generation models grow increasingly powerful and accessible, concerns around authenticity, ownership, and misuse of synthetic media have become critical. The ability to generate lifelike images indistinguishable from real ones introduces risks such as misinformation, deepfakes, and intellectual property violations. Traditional watermarking methods either degrade image quality, are easily removed, or require access to confidential model internals - making them unsuitable for secure and scalable deployment. We are the first to introduce ZK-WAGON, a novel system for watermarking image generation models using the Zero-Knowledge Succinct Non Interactive Argument of Knowledge (ZK-SNARKs). Our approach enables verifiable proof of origin without exposing model weights, generation prompts, or any sensitive internal information. We propose Selective Layer ZK-Circuit Creation (SL-ZKCC), a method to selectively convert key layers of an image generation model into a circuit, reducing proof generation time significantly. Generated ZK-SNARK proofs are imperceptibly embedded into a generated image via Least Significant Bit (LSB) steganography. We demonstrate this system on both GAN and Diffusion models, providing a secure, model-agnostic pipeline for trustworthy AI image generation.

Paperid: 2490, https://arxiv.org/pdf/2510.01097.pdf

Abstract:
Modern blockchain systems operating in adversarial environments require robust consensus protocols that guarantee both safety and termination under network delay attacks. Tendermint, a widely adopted consensus protocol in consortium blockchains, achieves high throughput and finality. However, previous analysis of the safety and termination has been done in a standalone fashion, with no consideration of the composition with other protocols interacting with it in a concurrent manner. Moreover, the termination properties under adaptive network delays caused by Byzantine adversaries have not been formally analyzed. This paper presents the first universally composable (UC) security analysis of Tendermint, demonstrating its resilience against strategic message-delay attacks. By constructing a UC ideal model of Tendermint, we formalize its core mechanisms: phase-base consensus procedure, dynamic timeouts, proposal locking, leader rotation, and others, under a network adversary that selectively delays protocol messages. Our main result proves that the Tendermint protocol UC-realizes the ideal Tendermint model, which ensures bounded termination latency, i.e., guaranteed termination, even when up to $f

Paperid: 2491, https://arxiv.org/pdf/2510.00763.pdf

Abstract:
A visual cryptography scheme is a secret sharing scheme in which the secret information is an image and the shares are printed on transparencies, so that the secret image can be recovered by simply stacking the shares on top of each other. Such schemes do therefore not require any knowledge of cryptography tools to recover the secret, and they have widespread applications, for example, when sharing QR codes or medical images. In this work we deal with visual cryptography threshold schemes for color images. Our color model differs from most previous work by allowing arbitrary colors to be stacked, resulting in a possibly different color. This more general color monoid model enables us to achieve shorter pixel expansion and higher contrast than comparable schemes. We revisit the polynomial framework of Koga and Ishihara for constructing visual cryptography schemes and apply the monoid ring to obtain new schemes for color visual cryptography.

Paperid: 2492, https://arxiv.org/pdf/2510.00554.pdf

Abstract:
Machine learning systems increasingly rely on open-source artifacts such as datasets and models that are created or hosted by other parties. The reliance on external datasets and pre-trained models exposes the system to supply chain attacks where an artifact can be poisoned before it is delivered to the end-user. Such attacks are possible due to the lack of any authenticity verification in existing machine learning systems. Incorporating cryptographic solutions such as hashing and signing can mitigate the risk of supply chain attacks. However, existing frameworks for integrity verification based on cryptographic techniques can incur significant overhead when applied to state-of-the-art machine learning artifacts due to their scale, and are not compatible with GPU platforms. In this paper, we develop Sentry, a novel GPU-based framework that verifies the authenticity of machine learning artifacts by implementing cryptographic signing and verification for datasets and models. Sentry ties developer identities to signatures and performs authentication on the fly as artifacts are loaded on GPU memory, making it compatible with GPU data movement solutions such as NVIDIA GPUDirect that bypass the CPU. Sentry incorporates GPU acceleration of cryptographic hash constructions such as Merkle tree and lattice hashing, implementing memory optimizations and resource partitioning schemes for a high throughput performance. Our evaluations show that Sentry is a practical solution to bring authenticity to machine learning systems, achieving orders of magnitude speedup over a CPU-based baseline.

Paperid: 2493, https://arxiv.org/pdf/2510.00529.pdf

Abstract:
Structured security logs are critical for detecting advanced persistent threats (APTs). Large language models (LLMs) struggle in this domain due to limited context and domain mismatch. We propose \textbf{DM-RAG}, a dual-memory retrieval-augmented generation framework for structured log analysis. It integrates a short-term memory buffer for recent summaries and a long-term FAISS-indexed memory for historical patterns. An instruction-tuned Phi-4-mini processes the combined context and outputs structured predictions. Bayesian fusion promotes reliable persistence into memory. On the UNSW-NB15 dataset, DM-RAG achieves 53.64% accuracy and 98.70% recall, surpassing fine-tuned and RAG baselines in recall. The architecture is lightweight, interpretable, and scalable, enabling real-time threat monitoring without extra corpora or heavy tuning.

Paperid: 2494, https://arxiv.org/pdf/2510.00322.pdf

Abstract:
Standard techniques for differentially private estimation, such as Laplace or Gaussian noise addition, require guaranteed bounds on the sensitivity of the estimator in question. But such sensitivity bounds are often large or simply unknown. Thus we seek differentially private methods that can be applied to arbitrary black-box functions. A handful of such techniques exist, but all are either inefficient in their use of data or require evaluating the function on exponentially many inputs. In this work we present a scheme that trades off between statistical efficiency (i.e., how much data is needed) and oracle efficiency (i.e., the number of evaluations). We also present lower bounds showing the near-optimality of our scheme.

Paperid: 2495, https://arxiv.org/pdf/2509.26530.pdf

Abstract:
Detection of emerging attacks on network infrastructure is a critical aspect of security management. To meet the growing scale and complexity of modern threats, machine learning (ML) techniques offer valuable tools for automating the detection of malicious activities. However, as these techniques become more complex, their internal operations grow increasingly opaque. In this context, we address the need for explainable physical-layer attack detection methods. First, we analyze the inner workings of various classifiers trained to alert about physical layer intrusions, examining how the influence of different monitored parameters varies depending on the type of attack being detected. This analysis not only improves the interpretability of the models but also suggests ways to enhance their design for increased speed. In the second part, we evaluate the detectors' resilience to malicious parameter noising. The results highlight a key trade-off between model speed and resilience. This work serves as a design guideline for developing fast and robust detectors trained on available network monitoring data.

Paperid: 2496, https://arxiv.org/pdf/2509.24698.pdf

Abstract:
We present LISA, an agentic smart contract vulnerability detection framework that combines rule-based and logic-based methods to address a broad spectrum of vulnerabilities in smart contracts. LISA leverages data from historical audit reports to learn the detection experience (without model fine-tuning), enabling it to generalize learned patterns to unseen projects and evolving threat profiles. In our evaluation, LISA significantly outperforms both LLM-based approaches and traditional static analysis tools, achieving superior coverage of vulnerability types and higher detection accuracy. Our results suggest that LISA offers a compelling solution for industry: delivering more reliable and comprehensive vulnerability detection while reducing the dependence on manual effort.

Paperid: 2497, https://arxiv.org/pdf/2509.24624.pdf

Abstract:
The rapid growth of Large Language Models (LLMs) has highlighted the pressing need for reliable mechanisms to verify content ownership and ensure traceability. Watermarking offers a promising path forward, but it remains limited by privacy concerns in sensitive scenarios, as traditional approaches often require direct access to a model's parameters or its training data. In this work, we propose a secure multi-party computation (MPC)-based private LLMs watermarking framework, PRIVMARK, to address the concerns. Concretely, we investigate PostMark (EMNLP'2024), one of the state-of-the-art LLMs Watermarking methods, and formulate its basic operations. Then, we construct efficient protocols for these operations using the MPC primitives in a black-box manner. In this way, PRIVMARK enables multiple parties to collaboratively watermark an LLM's output without exposing the model's weights to any single computing party. We implement PRIVMARK using SecretFlow-SPU (USENIX ATC'2023) and evaluate its performance using the ABY3 (CCS'2018) backend. The experimental results show that PRIVMARK achieves semantically identical results compared to the plaintext baseline without MPC and is resistant against paraphrasing and removing attacks with reasonable efficiency.

Paperid: 2498, https://arxiv.org/pdf/2509.23984.pdf

Abstract:
Traditional single-proposer blockchains suffer from miner extractable value (MEV), where validators exploit their serial monopoly on transaction inclusion and ordering to extract rents from users. While there have been many developments at the application layer to reduce the impact of MEV, these approaches largely require auctions as a subcomponent. Running auctions efficiently on chain requires two key properties of the underlying consensus protocol: selective-censorship resistance and hiding. These properties guarantee that an adversary can neither selectively delay transactions nor see their contents before they are confirmed. We propose a multiple concurrent proposer (MCP) protocol offering exactly these properties.

Paperid: 2499, https://arxiv.org/pdf/2509.23305.pdf

Abstract:
Industrial Control Systems (ICSs) are complex interconnected systems used to manage process control within industrial environments, such as chemical processing plants and water treatment facilities. As the modern industrial environment moves towards Internet-facing services, ICSs face an increased risk of attacks that necessitates ICS-specific Intrusion Detection Systems (IDS). The development of such IDS relies significantly on a simulated testbed as it is unrealistic and sometimes hazardous to utilize an operational control system. Whilst some testbeds have been proposed, they often use a limited selection of virtual ICS simulations to test and verify cyber security solutions. There is a lack of investigation done on developing systems that can efficiently simulate multiple ICS architectures. Currently, the trend within research involves developing security solutions on just one ICS simulation, which can result in bias to its specific architecture. We present ICS-SimLab, an end-to-end software suite that utilizes Docker containerization technology to create a highly configurable ICS simulation environment. This software framework enables researchers to rapidly build and customize different ICS environments, facilitating the development of security solutions across different systems that adhere to the Purdue Enterprise Reference Architecture. To demonstrate its capability, we present three virtual ICS simulations: a solar panel smart grid, a water bottle filling facility, and a system of intelligent electronic devices. Furthermore, we run cyber-attacks on these simulations and construct a dataset of recorded malicious and benign network traffic to be used for IDS development.

Paperid: 2500, https://arxiv.org/pdf/2509.22965.pdf

Abstract:
Casting a ballot from a phone or laptop sounds appealing, but only if voters can be confident their choice remains secret and results cannot be altered in the dark. This paper proposes a hybrid blockchain-based voting model that stores encrypted votes on a private blockchain maintained by election organizers and neutral observers, while periodically anchoring hashes of these votes onto a public blockchain as a tamper-evident seal. The system issues voters one-time blind-signed tokens to protect anonymity, and provides receipts so they can confirm their vote was counted. We implemented a live prototype using common web technologies (Next.js, React, Firebase) to demonstrate end-to-end functionality, accessibility, and cost efficiency. Our contributions include developing a working demo, a complete election workflow, a hybrid blockchain design, and a user-friendly interface that balances privacy, security, transparency, and practicality. This research highlights the feasibility of secure, verifiable, and scalable online voting for organizations ranging from small groups to larger institutions.

Paperid: 2501, https://arxiv.org/pdf/2509.22814.pdf

Abstract:
The Model Context Protocol (MCP) defines a schema bound execution model for agent-tool interaction, enabling modular computer vision workflows without retraining. To our knowledge, this is the first protocol level, deployment scale audit of MCP in vision systems, identifying systemic weaknesses in schema semantics, interoperability, and runtime coordination. We analyze 91 publicly registered vision centric MCP servers, annotated along nine dimensions of compositional fidelity, and develop an executable benchmark with validators to detect and categorize protocol violations. The audit reveals high prevalence of schema format divergence, missing runtime schema validation, undeclared coordinate conventions, and reliance on untracked bridging scripts. Validator based testing quantifies these failures, with schema format checks flagging misalignments in 78.0 percent of systems, coordinate convention checks detecting spatial reference errors in 24.6 percent, and memory scope checks issuing an average of 33.8 warnings per 100 executions. Security probes show that dynamic and multi agent workflows exhibit elevated risks of privilege escalation and untyped tool connections. The proposed benchmark and validator suite, implemented in a controlled testbed and to be released on GitHub, establishes a reproducible framework for measuring and improving the reliability and security of compositional vision workflows.

Paperid: 2502, https://arxiv.org/pdf/2509.22662.pdf

Abstract:
Global Positioning System (GPS) spoofing involves transmitting fake signals that mimic those from GPS satellites, causing the GPS receivers to calculate incorrect Positioning, Navigation, and Timing (PNT) information. Recently, there has been a surge in GPS spoofing attacks targeting aircraft. Since GPS satellite signals are weak, the spoofed high-power signal can easily overpower them. These spoofed signals are often interpreted as valid by the GPS receiver, which can cause severe and cascading effects on air navigation. While much of the existing research on GPS spoofing focuses on technical aspects of detection and mitigation, human factors are often neglected, even though pilots are an integral part of aircraft operation and potentially vulnerable to deception. This research addresses this gap by conducting a detailed analysis of the behavior of student pilots when subjected to GPS spoofing using the Force Dynamics 401CR flight simulator with X-Plane 11 and a Cessna 172 equipped with Garmin G1000. Spoofing scenarios were implemented via custom scripts that altered navigational data without modifying the external visual environment. Thirty student pilots from the Embry-Riddle Aeronautical University Daytona Beach campus with diverse flying experience levels were recruited to participate in three spoofing scenarios. A pre-simulation questionnaire was distributed to measure pilot experience and confidence in GPS.Inflight decision-making during the spoofing attacks was observed, including reaction time to anomalies, visual attention to interface elements, and cognitive biases. A post-flight evaluation of workload was obtained using a modified NASA Task Load Index (TLX) method. This study provides a first step toward identifying human vulnerabilities to GPS spoofing amid the ongoing debate over GPS reliance.

Paperid: 2503, https://arxiv.org/pdf/2509.22280.pdf

Abstract:
The escalating frequency and sophistication of cyber threats increased the need for their comprehensive understanding. This paper explores the intersection of geopolitical dynamics, cyber threat intelligence analysis, and advanced detection technologies, with a focus on the energy domain. We leverage generative artificial intelligence to extract and structure information from raw cyber threat descriptions, enabling enhanced analysis. By conducting a geopolitical comparison of threat actor origins and target regions across multiple databases, we provide insights into trends within the general threat landscape. Additionally, we evaluate the effectiveness of cybersecurity tools -- with particular emphasis on learning-based techniques -- in detecting indicators of compromise for energy-targeted attacks. This analysis yields new insights, providing actionable information to researchers, policy makers, and cybersecurity professionals.

Paperid: 2504, https://arxiv.org/pdf/2509.22215.pdf

Abstract:
Cyber-physical systems are part of industrial systems and critical infrastructure. Therefore, they should be examined in a comprehensive manner to verify their correctness and security. At the same time, the complexity of such systems demands such examinations to be systematic and, if possible, automated for efficiency and accuracy. A method that can be useful in this context is model checking. However, this requires a model that faithfully represents the behavior of the examined system. Obtaining such a model is not trivial, as many of these systems can be examined only in black box settings due to, e.g., long supply chains or secrecy. We therefore utilize active black box learning techniques to infer behavioral models in the form of Mealy machines of such systems and translate them into a form that can be evaluated using a model checker. To this end, we will investigate a cyber-physical systems as a black box using its external communication interface. We first annotate the model with propositions by mapping context information from the respective protocol to the model using Context-based Proposition Maps (CPMs). We gain annotated Mealy machines that resemble Kripke structures. We then formally define a template, to transfer the structures model checker-compatible format. We further define generic security properties based on basic security requirements. Due to the used CPMs, we can instantiate these properties with a meaningful context to check a specific protocol, which makes the approach flexible and scalable. The gained model can be easily altered to introduce non-deterministic behavior (like timeouts) or faults and examined if the properties still. Lastly, we demonstrate the versatility of the approach by providing case studies of different communication protocols (NFC and UDS), checked with the same tool chain and the same security properties.

Paperid: 2505, https://arxiv.org/pdf/2509.22143.pdf

Abstract:
DeFi applications are vulnerable to MEV, where specialized actors profit by reordering or inserting transactions. To mitigate latency races and internalize MEV revenue, Arbitrum introduced Timeboost, an auction-based transaction sequencing mechanism that grants short-term priority access to an express lane. In this paper we present the first large-scale empirical study of Timeboost, analyzing over 11.5 million express lane transactions and 151 thousand auctions between April and July 2025. Our results reveal five main findings. First, express lane control is highly centralized, with two entities winning more than 90% of auctions. Second, while express lane access provides earlier inclusion, profitable MEV opportunities cluster at the end of blocks, limiting the value of priority access. Third, approximately 22% of time-boosted transactions are reverted, indicating that the Timeboost does not effectively mitigate spam. Fourth, secondary markets for reselling express lane rights have collapsed due to poor execution reliability and unsustainable economics. Finally, auction competition declined over time, leading to steadily reduced revenue for the Arbitrum DAO. Taken together, these findings show that Timeboost fails to deliver on its stated goals of fairness, decentralization, and spam reduction. Instead, it reinforces centralization and narrows adoption, highlighting the limitations of auction-based ordering as a mechanism for fair transaction sequencing in rollups.

Paperid: 2506, https://arxiv.org/pdf/2509.21497.pdf

Abstract:
With the increased interest in artificial intelligence, Machine Learning as a Service provides the infrastructure in the Cloud for easy training, testing, and deploying models. However, these systems have a major privacy issue: uploading sensitive data to the Cloud, especially during training. Therefore, achieving secure Neural Network training has been on many researchers' minds lately. More and more solutions for this problem are built around a main pillar: Functional Encryption (FE). Although these approaches are very interesting and offer a new perspective on ML training over encrypted data, some vulnerabilities do not seem to be taken into consideration. In our paper, we present an attack on neural networks that uses FE for secure training over encrypted data. Our approach uses linear programming to reconstruct the original input, unveiling the previous security promises. To address the attack, we propose two solutions for secure training and inference that involve the client during the computation phase. One approach ensures security without relying on encryption, while the other uses function-hiding inner-product techniques.

Paperid: 2507, https://arxiv.org/pdf/2509.21367.pdf

Abstract:
As smart tourism evolves, AI-powered chatbots have become indispensable for delivering personalized, real-time assistance to travelers while promoting sustainability and efficiency. However, these systems are increasingly vulnerable to prompt injection attacks, where adversaries manipulate inputs to elicit unintended behaviors such as leaking sensitive information or generating harmful content. This paper presents a case study on the design and implementation of a secure retrieval-augmented generation (RAG) chatbot for Hsinchu smart tourism services. The system integrates RAG with API function calls, multi-layered linguistic analysis, and guardrails against injections, achieving high contextual awareness and security. Key features include a tiered response strategy, RAG-driven knowledge grounding, and intent decomposition across lexical, semantic, and pragmatic levels. Defense mechanisms include system norms, gatekeepers for intent judgment, and reverse RAG text to prioritize verified data. We also benchmark a GPT-5 variant (released 2025-08-07) to assess inherent robustness. Evaluations with 674 adversarial prompts and 223 benign queries show over 95% accuracy on benign tasks and substantial detection of injection attacks. GPT-5 blocked about 85% of attacks, showing progress yet highlighting the need for layered defenses. Findings emphasize contributions to sustainable tourism, multilingual accessibility, and ethical AI deployment. This work offers a practical framework for deploying secure chatbots in smart tourism and contributes to resilient, trustworthy AI applications.

Paperid: 2508, https://arxiv.org/pdf/2509.21129.pdf

Abstract:
Modern email spam and phishing attacks have evolved far beyond keyword blacklists or simple heuristics. Adversaries now craft multi-modal campaigns that combine natural-language text with obfuscated URLs, forged headers, and malicious attachments, adapting their strategies within days to bypass filters. Traditional spam detection systems, which rely on static rules or single-modality models, struggle to integrate heterogeneous signals or to continuously adapt, leading to rapid performance degradation. We propose EvoMail, a self-evolving cognitive agent framework for robust detection of spam and phishing. EvoMail first constructs a unified heterogeneous email graph that fuses textual content, metadata (headers, senders, domains), and embedded resources (URLs, attachments). A Cognitive Graph Neural Network enhanced by a Large Language Model (LLM) performs context-aware reasoning across these sources to identify coordinated spam campaigns. Most critically, EvoMail engages in an adversarial self-evolution loop: a ''red-team'' agent generates novel evasion tactics -- such as character obfuscation or AI-generated phishing text -- while the ''blue-team'' detector learns from failures, compresses experiences into a memory module, and reuses them for future reasoning. Extensive experiments on real-world datasets (Enron-Spam, Ling-Spam, SpamAssassin, and TREC) and synthetic adversarial variants demonstrate that EvoMail consistently outperforms state-of-the-art baselines in detection accuracy, adaptability to evolving spam tactics, and interpretability of reasoning traces. These results highlight EvoMail's potential as a resilient and explainable defense framework against next-generation spam and phishing threats.

Paperid: 2509, https://arxiv.org/pdf/2509.20972.pdf

Abstract:
Phishing emails pose a persistent and increasingly sophisticated threat, undermining email security through deceptive tactics designed to exploit both semantic and structural vulnerabilities. Traditional detection methods, often based on isolated analysis of email content or embedded URLs, fail to comprehensively address these evolving attacks. In this paper, we propose a dual-path phishing detection framework that integrates transformer-based natural language processing (NLP) with classical machine learning to jointly analyze email text and embedded URLs. Our approach leverages the complementary strengths of semantic analysis using fine-tuned transformer architectures (e.g., DistilBERT) and structural link analysis via character-level TF-IDF vectorization paired with classical classifiers (e.g., Random Forest). Empirical evaluation on representative email and URL datasets demonstrates that this combined approach significantly improves detection accuracy. Specifically, the DistilBERT model achieves a near-optimal balance between accuracy and computational efficiency for textual phishing detection, while Random Forest notably outperforms other classical classifiers in identifying malicious URLs. The modular design allows flexibility for standalone deployment or ensemble integration, facilitating real-world adoption. Collectively, our results highlight the efficacy and practical value of this dual-path approach, establishing a scalable, accurate, and interpretable solution capable of enhancing email security against contemporary phishing threats.

Paperid: 2510, https://arxiv.org/pdf/2509.20880.pdf

Abstract:
The mapping $Ï_n$ from $\F_{2}^{n}$ to itself defined by $y=Ï_n(x)$ with $y_i=x_i+x_{i+2}(1+x_{i+1})$, where the indices are computed modulo $n$, has been widely studied for its applications in lightweight cryptography. However, $Ï_n $ is bijective on $\F_2^n$ only when $n$ is odd, restricting its use to odd-dimensional vector spaces over $\F_2$. To address this limitation, we introduce and analyze the generalized mapping $Ï_{n, m}$ defined by $y=Ï_{n,m}(x)$ with $y_i=x_i+x_{i+m} (x_{i+m-1}+1)(x_{i+m-2}+1) \cdots (x_{i+1}+1)$, where $m$ is a fixed integer with $m\nmid n$. To investigate such mappings, we further generalize $Ï_{n,m}$ to $Î¸_{m, k}$, where $Î¸_{m, k}$ is given by $y_i=x_{i+mk} \prod_{\substack{j=1,\,\, m \nmid j}}^{mk-1} \left(x_{i+j}+1\right), \,\,{\rm for }\,\, i\in \{0,1,\ldots,n-1\}$. We prove that these mappings generate an abelian group isomorphic to the group of units in $\F_2[z]/(z^{\lfloor n/m\rfloor +1})$. This structural insight enables us to construct a broad class of permutations over $\F_2^n$ for any positive integer $n$, along with their inverses. We rigorously analyze algebraic properties of these mappings, including their iterations, fixed points, and cycle structures. Additionally, we provide a comprehensive database of the cryptographic properties for iterates of $Ï_{n,m}$ for small values of $n$ and $m$. Finally, we conduct a comparative security and implementation cost analysis among $Ï_{n,m}$, $Ï_n$, $ÏÏ_n$ (EUROCRYPT 2025 \cite{belkheyar2025chi}) and their variants, and prove Conjecture~1 proposed in~\cite{belkheyar2025chi} as a by-product of our study. Our results lead to generalizations of $Ï_n$, providing alternatives to $Ï_n$ and $ÏÏ_n$.

Paperid: 2511, https://arxiv.org/pdf/2509.20592.pdf

Abstract:
The rapid adoption of Mobile Money Services (MMS) in Sub-Saharan Africa (SSA) offers a viable path to improve e-Government service accessibility in the face of persistent low internet penetration. However, existing Mobile Money Authentication (MMA) methods face critical limitations, including susceptibility to SIM swapping, weak session protection, and poor scalability during peak demand. This study introduces a hybrid MMA framework that combines Unstructured Supplementary Service Data (USSD)-based multi-factor authentication with secure session management via cryptographically bound JSON Web Tokens (JWT). Unlike traditional MMA systems that rely solely on SIM-PIN verification or smartphone-dependent biometrics, our design implements a three-factor authentication model; SIM verification, PIN entry, and session token binding, tailored for resource-constrained environments. Simulations and comparative analysis against OAuth-based Single Sign-On (SSO) methods reveal a 45% faster authentication time (8 seconds vs. 12 to 15 seconds), 15% higher success under poor network conditions (95% vs. 80%), and increased resistance to phishing and brute-force attacks. Penetration testing and threat modeling further demonstrate a substantial reduction in vulnerability exposure compared to conventional approaches. The primary contributions of this work are: (1) a hybrid authentication protocol that ensures offline accessibility and secure session continuity; (2) a tailored security framework addressing threats like SIM swapping and social engineering in SSA; and (3) demonstrated scalability for thousands of users with reduced infrastructure overhead. The proposed approach advances secure digital inclusion in SSA and other regions with similar constraints.

Paperid: 2512, https://arxiv.org/pdf/2509.20454.pdf

Abstract:
Electroencephalography (EEG) is widely used for recording brain activity and has seen numerous applications in machine learning, such as detecting sleep stages and neurological disorders. Several studies have successfully shown the potential of EEG data for re-identification and leakage of other personal information. Therefore, the increasing availability of EEG consumer devices raises concerns about user privacy, motivating us to investigate how to safeguard this sensitive data while retaining its utility for EEG applications. To address this challenge, we propose a transformer-based autoencoder to create EEG data that does not allow for subject re-identification while still retaining its utility for specific machine learning tasks. We apply our approach to automatic sleep staging by evaluating the re-identification and utility potential of EEG data before and after anonymization. The results show that the re-identifiability of the EEG signal can be substantially reduced while preserving its utility for machine learning.

Paperid: 2513, https://arxiv.org/pdf/2509.20395.pdf

Abstract:
This paper investigates the trade-off between centralized and decentralized security management in constellations of satellites to balance security and performance. We highlight three key AI architectures for automated security management: (a) centralized, (b) distributed and (c) federated. The centralized architecture is the best option short term, providing fast training, despite the hard challenge of the communication latency overhead across space. Decentralized architectures are better alternatives in the longer term, providing enhanced scalability and security.

Paperid: 2514, https://arxiv.org/pdf/2509.20382.pdf

Abstract:
ECG biometrics offer a unique, secure authentication method, yet their deployment on wearable devices faces real-time processing, privacy, and spoofing vulnerability challenges. This paper proposes a lightweight deep learning model (MobileNetV1+GRU) for ECG-based authentication, injection of 20dB Gaussian noise & custom preprocessing. We simulate wearable conditions and edge deployment using the ECGID, MIT-BIH, CYBHi, and PTB datasets, achieving accuracies of 99.34%, 99.31%, 91.74%, and 98.49%, F1-scores of 0.9869, 0.9923, 0.9125, and 0.9771, Precision of 0.9866, 0.9924, 0.9180 and 0.9845, Recall of 0.9878, 0.9923, 0.9129, and 0.9756, equal error rates (EER) of 0.0009, 0.00013, 0.0091, and 0.0009, and ROC-AUC values of 0.9999, 0.9999, 0.9985, and 0.9998, while under FGSM adversarial attacks, accuracy drops from 96.82% to as low as 0.80%. This paper highlights federated learning, adversarial testing, and the need for diverse wearable physiological datasets to ensure secure and scalable biometrics.

Paperid: 2515, https://arxiv.org/pdf/2509.20024.pdf

Abstract:
Biometric-based authentication systems are getting broadly adopted in many areas. However, these systems do not allow participating users to influence the way their data is used. Furthermore, the data may leak and can be misused without the users' knowledge. In this paper, we propose a new authentication method that preserves the privacy of individuals and is based on a generative adversarial network (GAN). Concretely, we suggest using the GAN for translating images of faces to a visually private domain (e.g., flowers or shoes). Classifiers, which are used for authentication purposes, are then trained on the images from the visually private domain. Based on our experiments, the method is robust against attacks and still provides meaningful utility.

Paperid: 2516, https://arxiv.org/pdf/2509.20008.pdf

Abstract:
Penetration testing, the simulation of cyberattacks to identify security vulnerabilities, presents a sequential decision-making problem well-suited for reinforcement learning (RL) automation. Like many applications of RL to real-world problems, partial observability presents a major challenge, as it invalidates the Markov property present in Markov Decision Processes (MDPs). Partially Observable MDPs require history aggregation or belief state estimation to learn successful policies. We investigate stochastic, partially observable penetration testing scenarios over host networks of varying size, aiming to better reflect real-world complexity through more challenging and representative benchmarks. This approach leads to the development of more robust and transferable policies, which are crucial for ensuring reliable performance across diverse and unpredictable real-world environments. Using vanilla Proximal Policy Optimization (PPO) as a baseline, we compare a selection of PPO variants designed to mitigate partial observability, including frame-stacking, augmenting observations with historical information, and employing recurrent or transformer-based architectures. We conduct a systematic empirical analysis of these algorithms across different host network sizes. We find that this task greatly benefits from history aggregation. Converging three times faster than other approaches. Manual inspection of the learned policies by the algorithms reveals clear distinctions and provides insights that go beyond quantitative results.

Paperid: 2517, https://arxiv.org/pdf/2509.19959.pdf

Abstract:
Rowhammer attacks have emerged as a significant threat to modern DRAM-based memory systems, leveraging frequent memory accesses to induce bit flips in adjacent memory cells. This work-in-progress paper presents an adaptive, many-sided Rowhammer attack utilizing GPU compute shaders to systematically achieve high-frequency memory access patterns. Our approach employs statistical distributions to optimize row targeting and avoid current mitigations. The methodology involves initializing memory with known patterns, iteratively hammering victim rows, monitoring for induced errors, and dynamically adjusting parameters to maximize success rates. The proposed attack exploits the parallel processing capabilities of GPUs to accelerate hammering operations, thereby increasing the probability of successful bit flips within a constrained timeframe. By leveraging OpenGL compute shaders, our implementation achieves highly efficient row hammering with minimal software overhead. Experimental results on a Raspberry Pi 4 demonstrate that the GPU-based approach attains a high rate of bit flips compared to traditional CPU-based hammering, confirming its effectiveness in compromising DRAM integrity. Our findings align with existing research on microarchitectural attacks in heterogeneous systems that highlight the susceptibility of GPUs to security vulnerabilities. This study contributes to the understanding of GPU-assisted fault-injection attacks and underscores the need for improved mitigation strategies in future memory architectures.

Paperid: 2518, https://arxiv.org/pdf/2509.19947.pdf

Abstract:
Poison-only Clean-label Backdoor Attacks aim to covertly inject attacker-desired behavior into DNNs by merely poisoning the dataset without changing the labels. To effectively implant a backdoor, multiple \textbf{triggers} are proposed for various attack requirements of Attack Success Rate (ASR) and stealthiness. Additionally, sample selection enhances clean-label backdoor attacks' ASR by meticulously selecting ``hard'' samples instead of random samples to poison. Current methods 1) usually handle the sample selection and triggers in isolation, leading to severely limited improvements on both ASR and stealthiness. Consequently, attacks exhibit unsatisfactory performance on evaluation metrics when converted to PCBAs via a mere stacking of methods. Therefore, we seek to explore the bidirectional collaborative relations between the sample selection and triggers to address the above dilemma. 2) Since the strong specificity within triggers, the simple combination of sample selection and triggers fails to substantially enhance both evaluation metrics, with generalization preserved among various attacks. Therefore, we seek to propose a set of components to significantly improve both stealthiness and ASR based on the commonalities of attacks. Specifically, Component A ascertains two critical selection factors, and then makes them an appropriate combination based on the trigger scale to select more reasonable ``hard'' samples for improving ASR. Component B is proposed to select samples with similarities to relevant trigger implanted samples to promote stealthiness. Component C reassigns trigger poisoning intensity on RGB colors through distinct sensitivity of the human visual system to RGB for higher ASR, with stealthiness ensured by sample selection, including Component B. Furthermore, all components can be strategically integrated into diverse PCBAs.

Paperid: 2519, https://arxiv.org/pdf/2509.19650.pdf

Abstract:
The threat of quantum malware is real and a growing security concern that will have catastrophic scientific and technological impacts, if not addressed early. If weaponised or exploited especially by the wrong hands, malware will undermine highly sophisticated critical systems supported by next-generation quantum architectures, for example, in defence, communications, energy, and space. This paper explores the fundamental nature and implications of quantum malware to enable the future development of appropriate mitigations and defences, thereby protecting critical infrastructure. By conducting a systematic literature review (SLR) that draws on knowledge frameworks such as ontologies and taxonomies to explore malware, this provides insights into how malicious behaviours can be translated into attacks on quantum technologies, thereby providing a lens to analyse the severity of malware against quantum technologies. This study employs the European Competency Framework for Quantum Technologies (CFQT) as a guide to map malware behaviour to several competency layers, creating a foundation in this emerging field.

Paperid: 2520, https://arxiv.org/pdf/2509.19568.pdf

Abstract:
Modern Systems-on-Chip (SoCs) employ undocumented linear address-scrambling functions to obfuscate DRAM addressing, which complicates DRAM-aware performance optimizations and hinders proactive security analysis of DRAM-based attacks; most notably, Rowhammer. Although previous work tackled the issue of reversing physical-to-DRAM mapping, existing heuristic-based reverse-engineering approaches are partial, costly, and impractical for comprehensive recovery. This paper establishes a rigorous theoretical foundation and provides efficient practical algorithms for black-box, complete physical-to-DRAM address-mapping recovery. We first formulate the reverse-engineering problem within a linear algebraic model over the finite field GF(2). We characterize the timing fingerprints of row-buffer conflicts, proving a relationship between a bank addressing matrix and an empirically constructed matrix of physical addresses. Based on this characterization, we develop an efficient, noise-robust, and fully platform-agnostic algorithm to recover the full bank-mask basis in polynomial time, a significant improvement over the exponential search from previous works. We further generalize our model to complex row mappings, introducing new hardware-based hypotheses that enable the automatic recovery of a row basis instead of previous human-guided contributions. Evaluations across embedded and server-class architectures confirm our method's effectiveness, successfully reconstructing known mappings and uncovering previously unknown scrambling functions. Our method provides a 99% recall and accuracy on all tested platforms. Most notably, Knock-Knock runs in under a few minutes, even on systems with more than 500GB of DRAM, showcasing the scalability of our method. Our approach provides an automated, principled pathway to accurate DRAM reverse engineering.

Paperid: 2521, https://arxiv.org/pdf/2509.18696.pdf

Abstract:
The widespread adoption of smartphone photography has led users to increasingly rely on cloud storage for personal photo archiving and sharing, raising critical privacy concerns. Existing deep learning-based image encryption schemes, typically built upon CNNs or GANs, often depend on traditional cryptographic algorithms and lack inherent architectural reversibility, resulting in limited recovery quality and poor robustness. Invertible neural networks (INNs) have emerged to address this issue by enabling reversible transformations, yet the first INN-based encryption scheme still relies on an auxiliary reference image and discards by-product information before decryption, leading to degraded recovery and limited practicality. To address these limitations, this paper proposes FlowCrypt, a novel flow-based image encryption framework that simultaneously achieves near-lossless recovery, high security, and lightweight model design. FlowCrypt begins by applying a key-conditioned random split to the input image, enhancing forward-process randomness and encryption strength. The resulting components are processed through a Flow-based Encryption/Decryption (FED) module composed of invertible blocks, which share parameters across encryption and decryption. Thanks to its reversible architecture and reference-free design, FlowCrypt ensures high-fidelity image recovery. Extensive experiments show that FlowCrypt achieves recovery quality with 100dB on three datasets, produces uniformly distributed cipher images, and maintains a compact architecture with only 1M parameters, making it suitable for mobile and edge-device applications.

Paperid: 2522, https://arxiv.org/pdf/2509.18415.pdf

Abstract:
The proliferation of autonomous software agents necessitates rigorous frameworks for establishing secure and verifiable agent-to-agent (A2A) interactions, particularly when such agents are instantiated as non-human identities(NHIs). We extend the A2A paradigm [1 , 2] by introducing a cryptographically grounded mechanism for lineage verification, wherein the provenance and evolution of NHIs are anchored in append-only Merkle tree structures modeled after Certificate Transparency (CT) logs. Unlike traditional A2A models that primarily secure point-to-point interactions, our approach enables both agents and external verifiers to cryptographically validate multi-hop provenance, thereby ensuring the integrity of the entire call chain. A federated proof server acts as an auditor across one or more Merkle logs, aggregating inclusion proofs and consistency checks into compact, signed attestations that external parties can verify without access to the full execution trace. In parallel, we augment the A2A agent card to incorporate explicit identity verification primitives, enabling both peer agents and human approvers to authenticate the legitimacy of NHI representations in a standardized manner. Together, these contributions establish a cohesive model that integrates identity attestation, lineage verification, and independent proof auditing, thereby advancing the security posture of inter-agent ecosystems and providing a foundation for robust governance of NHIs in regulated environments such as FedRAMP.

Paperid: 2523, https://arxiv.org/pdf/2509.18341.pdf

Abstract:
Fault Injection is the study of observing how systems behave under unusual stress, environmental or otherwise. In practice, fault injection involves testing the limits of computer systems and finding novel ways to potentially break cyber-physical security. The contributions of this paper are three-fold. First, we provide a beginner-friendly introduction to this research topic and an in-depth taxonomy of fault injection techniques. Second, we highlight the current state-of-the-art and provide a cost-benefit analysis of each attack method. Third, for those interested in doing fault injection research, we provide a replication analysis of an existing vulnerability detection tool and identify a research focus for future work.

Paperid: 2524, https://arxiv.org/pdf/2509.18039.pdf

Abstract:
Modern embedded Linux devices, such as routers, IP cameras, and IoT gateways, rely on complex software stacks where numerous daemons interact to provide services. Testing these devices is crucial from a security perspective since vendors often use custom closed- or open-source software without documenting releases and patches. Recent coverage-guided fuzzing solutions primarily test individual processes, ignoring deep dependencies between daemons and their persistent internal state. This article presents STAFF, a firmware fuzzing framework for discovering bugs in Linux-based firmware built around three key ideas: (a) user-driven multi-request recording, which monitors user interactions with emulated firmware to capture request sequences involving application-layer protocols (e.g., HTTP); (b) intra- and inter-process dependency detection, which uses whole-system taint analysis to track how input bytes influence user-space states, including files, sockets, and memory areas; (c) protocol-aware taint-guided fuzzing, which applies mutations to request sequences based on identified dependencies, exploiting multi-staged forkservers to efficiently checkpoint protocol states. When evaluating STAFF on 15 Linux-based firmware targets, it identifies 42 bugs involving multiple network requests and different firmware daemons, significantly outperforming existing state-of-the-art fuzzing solutions in both the number and reproducibility of discovered bugs.

Paperid: 2525, https://arxiv.org/pdf/2509.17416.pdf

Abstract:
With the wide spread of video, video watermarking has become increasingly crucial for copyright protection and content authentication. However, video watermarking still faces numerous challenges. For example, existing methods typically have shortcomings in terms of watermarking capacity and robustness, and there is a lack of specialized noise layer for High Efficiency Video Coding(HEVC) compression. To address these issues, this paper introduces a Deep Invertible Network for Video watermarking (DINVMark) and designs a noise layer to simulate HEVC compression. This approach not only in creases watermarking capacity but also enhances robustness. DINVMark employs an Invertible Neural Network (INN), where the encoder and decoder share the same network structure for both watermark embedding and extraction. This shared architecture ensures close coupling between the encoder and decoder, thereby improving the accuracy of the watermark extraction process. Experimental results demonstrate that the proposed scheme significantly enhances watermark robustness, preserves video quality, and substantially increases watermark embedding capacity.

Paperid: 2526, https://arxiv.org/pdf/2509.17048.pdf

Abstract:
Electronic whistleblowing systems are widely used due to their efficiency and convenience. The key to designing such systems lies in protecting the identity privacy of whistleblowers, preventing malicious whistleblowing, and ensuring the confidentiality of whistleblowing information. To address these issues, a SM2 traceable ring signcryption scheme for electronic voting is proposed. This scheme combines the SM2 elliptic curve public key cryptography algorithm with the ring signature algorithm, enhancing the overall efficiency of the scheme while ensuring the autonomy and controllability of the core cryptographic algorithms. Security analysis demonstrates that the proposed scheme satisfies confidentiality, unforgeability, traceability, linkability, and deniability. Efficiency analysis shows that, compared to existing ring signature schemes, the proposed scheme exhibits significant efficiency advantages during the signature phase. The electronic whistleblowing system designed using the proposed scheme can track malicious whistleblowers while protecting user identity privacy, and ensures that the content of whistleblowing remains unknown to third parties.

Paperid: 2527, https://arxiv.org/pdf/2509.16749.pdf

Abstract:
LLMs are increasingly pervasive in the security environment, with limited measures of their effectiveness, which limits trust and usefulness to security practitioners. Here, we present an open-source evaluation framework and benchmark metrics for evaluating LLM-generated cybersecurity rules. The benchmark employs a holdout set-based methodology to measure the effectiveness of LLM-generated security rules in comparison to a human-generated corpus of rules. It provides three key metrics inspired by the way experts evaluate security rules, offering a realistic, multifaceted evaluation of the effectiveness of an LLM-based security rule generator. This methodology is illustrated using rules from Sublime Security's detection team and those written by Sublime Security's Automated Detection Engineer (ADE), with a thorough analysis of ADE's skills presented in the results section.

Paperid: 2528, https://arxiv.org/pdf/2509.16682.pdf

Abstract:
Cybersecurity threats continue to increase, with a growing number of previously unknown attacks each year targeting both large corporations and smaller entities. This scenario demands the implementation of advanced security measures, not only to mitigate damage but also to anticipate emerging attack trends. In this context, deception tools have become a key strategy, enabling the detection, deterrence, and deception of potential attackers while facilitating the collection of information about their tactics and methods. Among these tools, honeypots have proven their value, although they have traditionally been limited by rigidity and configuration complexity, hindering their adaptability to dynamic scenarios. The rise of artificial intelligence, and particularly general-purpose Large Language Models (LLMs), is driving the development of new deception solutions capable of offering greater adaptability and ease of use. This work proposes the design and implementation of an LLM-based honeypot to simulate an LDAP server, a critical protocol present in most organizations due to its central role in identity and access management. The proposed solution aims to provide a flexible and realistic tool capable of convincingly interacting with attackers, thereby contributing to early detection and threat analysis while enhancing the defensive capabilities of infrastructures against intrusions targeting this service.

Paperid: 2529, https://arxiv.org/pdf/2509.16671.pdf

Abstract:
Large Language Models (LLMs) have emerged as promising tools for malware detection by analyzing code semantics, identifying vulnerabilities, and adapting to evolving threats. However, their reliability under adversarial compiler-level obfuscation is yet to be discovered. In this study, we empirically evaluate the robustness of three state-of-the-art LLMs: ChatGPT-4o, Gemini Flash 2.5, and Claude Sonnet 4 against compiler-level obfuscation techniques implemented via the LLVM infrastructure. These include control flow flattening, bogus control flow injection, instruction substitution, and split basic blocks, which are widely used to evade detection while preserving malicious behavior. We perform a structured evaluation on 40~C functions (20 vulnerable, 20 secure) sourced from the Devign dataset and obfuscated using LLVM passes. Our results show that these models often fail to correctly classify obfuscated code, with precision, recall, and F1-score dropping significantly after transformation. This reveals a critical limitation: LLMs, despite their language understanding capabilities, can be easily misled by compiler-based obfuscation strategies. To promote reproducibility, we release all evaluation scripts, prompts, and obfuscated code samples in a public repository. We also discuss the implications of these findings for adversarial threat modeling, and outline future directions such as software watermarking, compiler-aware defenses, and obfuscation-resilient model design.

Paperid: 2530, https://arxiv.org/pdf/2509.16546.pdf

Abstract:
Neural networks are valuable intellectual property due to the significant computational cost, expert labor, and proprietary data involved in their development. Consequently, protecting their parameters is critical not only for maintaining a competitive advantage but also for enhancing the model's security and privacy. Prior works have demonstrated the growing capability of cryptanalytic attacks to scale to deeper models. In this paper, we present the first defense mechanism against cryptanalytic parameter extraction attacks. Our key insight is to eliminate the neuron uniqueness necessary for these attacks to succeed. We achieve this by a novel, extraction-aware training method. Specifically, we augment the standard loss function with an additional regularization term that minimizes the distance between neuron weights within a layer. Therefore, the proposed defense has zero area-delay overhead during inference. We evaluate the effectiveness of our approach in mitigating extraction attacks while analyzing the model accuracy across different architectures and datasets. When re-trained with the same model architecture, the results show that our defense incurs a marginal accuracy change of less than 1% with the modified loss function. Moreover, we present a theoretical framework to quantify the success probability of the attack. When tested comprehensively with prior attack settings, our defense demonstrated empirical success for sustained periods of extraction, whereas unprotected networks are extracted between 14 minutes to 4 hours.

Paperid: 2531, https://arxiv.org/pdf/2509.16389.pdf

Abstract:
Rust is a memory-safe language, and its strong safety guarantees combined with high performance have been attracting widespread adoption in systems programming and security-critical applications. However, Rust permits the use of unsafe code, which bypasses compiler-enforced safety checks and can introduce memory vulnerabilities. A widely adopted approach for detecting memory safety bugs in Rust is Address Sanitizer (ASan). Optimized versions, such as ERASan and RustSan, have been proposed to selectively apply security checks in order to reduce performance overhead. However, these tools still incur significant performance and memory overhead and fail to detect many classes of memory safety vulnerabilities due to the inherent limitations of ASan. In this paper, we present LiteRSan, a novel memory safety sanitizer that addresses the limitations of prior approaches. By leveraging Rust's unique ownership model, LiteRSan performs Rust-specific static analysis that is aware of pointer lifetimes to identify risky pointers. It then selectively instruments risky pointers to enforce only the necessary spatial or temporal memory safety checks. Consequently, LiteRSan introduces significantly lower runtime overhead (18.84% versus 152.05% and 183.50%) and negligible memory overhead (0.81% versus 739.27% and 861.98%) compared with existing ASan-based sanitizers while being capable of detecting memory safety bugs that prior techniques miss.

Paperid: 2532, https://arxiv.org/pdf/2509.16340.pdf

Abstract:
Android apps have become a valuable target for app modifiers and imitators due to its popularity and being trusted with highly sensitive data. Packers, on the other hand, protect apps from tampering with various anti-analysis techniques embedded in the app. Meanwhile, packers also conceal certain behavior potentially against the interest of the users, aside from being abused by malware for stealth. Security practitioners typically try to capture undesired behavior at runtime with hooking (e.g., Frida) or debugging techniques, which are heavily affected by packers. Unpackers have been the community's continuous effort to address this, but due to the emerging commercial packers, our study shows that none of the unpackers remain effective, and they are unfit for this purpose as unpacked apps can no longer run. We first perform a large-scale prevalence analysis of Android packers with a real-world dataset of 12,341 apps, the first of its kind, to find out what percentage of Android apps are actually packed and to what extent dynamic analysis is hindered. We then propose Purifire, an evasion engine to bypass packers' anti-analysis techniques and enable dynamic analysis on packed apps without unpacking them. Purifire is based on eBPF, a low-level kernel feature, which provides observability and invisibility to userspace apps to enforce defined evasion rules while staying low-profile. Our evaluation shows that Purifire is able to bypass packers' anti-analysis checks and more importantly, for previous research works suffering from packers, we observe a significant improvement (e.g., a much higher number of detected items such as device fingerprints).

Paperid: 2533, https://arxiv.org/pdf/2509.16151.pdf

Abstract:
Deep reinforcement learning (RL) is emerging as a viable strategy for automated cyber defense (ACD). The traditional RL approach represents networks as a list of computers in various states of safety or threat. Unfortunately, these models are forced to overfit to specific network topologies, rendering them ineffective when faced with even small environmental perturbations. In this work, we frame ACD as a two-player context-based partially observable Markov decision problem with observations represented as attributed graphs. This approach allows our agents to reason through the lens of relational inductive bias. Agents learn how to reason about hosts interacting with other system entities in a more general manner, and their actions are understood as edits to the graph representing the environment. By introducing this bias, we will show that our agents can better reason about the states of networks and zero-shot adapt to new ones. We show that this approach outperforms the state-of-the-art by a wide margin, and makes our agents capable of defending never-before-seen networks against a wide range of adversaries in a variety of complex, and multi-agent environments.

Paperid: 2534, https://arxiv.org/pdf/2509.15725.pdf

Abstract:
Today, many critical services and industrial systems rely on wireless networks for interaction with the IoT, hence becoming vulnerable to a broad number of cyber-threats. While detecting this kind of attacks is not difficult with common cyber-security tools, and even trivial for jamming, finding their origin and identifying culprits is almost impossible today, yet indispensable to stop them, especially when attacks are generated with portable or self-made devices that continuously move around. To address this open challenge, the FOLLOWME project investigates the feasibility of using UAV to locate and even chase attackers during illicit usage of the radio spectrum. The main objective is to develop a cyber-physical security framework that integrates network telemetry with wireless localization. The former triggers alarms in case of anomalies or known attack patterns and provides a coarse-grained indication of the physical area (i.e., the position of affected access gateways), whereas the latter systematically scans such area to identify the exact location of the attacker. The project will specifically address long-range metropolitan area networks and focus on the LoRaWAN protocol, which is the typical scenario for Smart City services.

Paperid: 2535, https://arxiv.org/pdf/2509.15499.pdf

Abstract:
Detecting packed executables is a critical component of large-scale malware analysis and antivirus engine workflows, as it identifies samples that warrant computationally intensive dynamic unpacking to reveal concealed malicious behavior. Traditionally, packer detection techniques have relied on empirical features, such as high entropy or specific binary patterns. However, these empirical, feature-based methods are increasingly vulnerable to evasion by adversarial samples or unknown packers (e.g., low-entropy packers). Furthermore, the dependence on expert-crafted features poses challenges in sustaining and evolving these methods over time. In this paper, we examine the limitations of existing packer detection methods and propose Pack-ALM, a novel deep-learning-based approach for detecting packed executables. Inspired by the linguistic concept of distinguishing between real and pseudo words, we reformulate packer detection as a task of differentiating between legitimate and "pseudo" instructions. To achieve this, we preprocess native data and packed data into "pseudo" instructions and design a pre-trained assembly language model that recognizes features indicative of packed data. We evaluate Pack-ALM against leading industrial packer detection tools and state-of-the-art assembly language models. Extensive experiments on over 37,000 samples demonstrate that Pack-ALM effectively identifies packed binaries, including samples created with adversarial or previously unseen packing techniques. Moreover, Pack-ALM outperforms traditional entropy-based methods and advanced assembly language models in both detection accuracy and adversarial robustness.

Paperid: 2536, https://arxiv.org/pdf/2509.15195.pdf

Abstract:
Fuzz testing is one of the most effective techniques for finding software vulnerabilities. While modern fuzzers can generate inputs and monitor executions automatically, the overall workflow, from analyzing a codebase, to configuring harnesses, to triaging results, still requires substantial manual effort. Prior attempts focused on single stages such as harness synthesis or input minimization, leaving researchers to manually connect the pieces into a complete fuzzing campaign. We introduce Orion, a framework that automates the the manual bottlenecks of fuzzing by integrating LLM reasoning with traditional tools, allowing campaigns to scale to settings where human effort alone was impractical. Orion uses LLMs for code reasoning and semantic guidance, while relying on deterministic tools for verification, iterative refinement, and tasks that require precision. Across our benchmark suite, Orion reduces human effort by 46-204x depending on the workflow stage, and we demonstrate its effectiveness through the discovery of two previously unknown vulnerabilities in the widely used open-source clib library.

Paperid: 2537, https://arxiv.org/pdf/2509.15170.pdf

Abstract:
Radio frequency fingerprint identification (RFFI) distinguishes wireless devices by the small variations in their analog circuits, avoiding heavy cryptographic authentication. While deep learning on spectrograms improves accuracy, models remain vulnerable to copying, tampering, and evasion. We present a stronger RFFI system combining watermarking for ownership proof and anomaly detection for spotting suspicious inputs. Using a ResNet-34 on log-Mel spectrograms, we embed three watermarks: a simple trigger, an adversarially trained trigger robust to noise and filtering, and a hidden gradient/weight signature. A convolutional Variational Autoencoders (VAE) with Kullback-Leibler (KL) warm-up and free-bits flags off-distribution queries. On the LoRa dataset, our system achieves 94.6% accuracy, 98% watermark success, and 0.94 AUROC, offering verifiable, tamper-resistant authentication.

Paperid: 2538, https://arxiv.org/pdf/2509.15047.pdf

Abstract:
We study the trade-off between communication rate and privacy for distributed batch matrix multiplication of two independent sequences of matrices $\mathbf{A}$ and $\mathbf{B}$ with uniformly distributed entries. In our setting, $\mathbf{B}$ is publicly accessible by all the servers while $\mathbf{A}$ must remain private. A user is interested in evaluating the product $\mathbf{AB}$ with the responses from the $k$ fastest servers. For a given parameter $Î±\in [0, 1]$, our privacy constraint must ensure that any set of $\ell$ colluding servers cannot learn more than a fraction $Î±$ of $\mathbf{A}$. Additionally, we study the trade-off between the amount of local randomness needed at the encoder and privacy. Finally, we establish the optimal trade-offs when the matrices are square and identify a linear relationship between information leakage and communication rate.

Paperid: 2539, https://arxiv.org/pdf/2509.14754.pdf

Abstract:
Solving systems of Boolean equations is a fundamental task in symbolic computation and algebraic cryptanalysis, with wide-ranging applications in cryptography, coding theory, and formal verification. Among existing approaches, the Boolean Characteristic Set (BCS) method[1] has emerged as one of the most efficient algorithms for tackling such problems. However, its performance is highly sensitive to the ordering of variables, with solving times varying drastically under different orderings for fixed variable counts n and equations size m. To address this challenge, this paper introduces a novel optimization framework that synergistically integrates machine learning (ML)-based time prediction with simulated annealing (SA) to efficiently identify high-performance variables orderings. Weconstruct a dataset comprising variable frequency spectrum X and corresponding BCS solving time t for benchmark systems(e.g., n = m = 28). Utilizing this data, we train an accurate ML predictor ft(X) to estimate solving time for any given variables ordering. For each target system, ft serves as the cost function within an SA algorithm, enabling rapid discovery of low-latency orderings that significantly expedite subsequent BCS execution. Extensive experiments demonstrate that our method substantially outperforms the standard BCS algorithm[1], GrÃ¶bner basis method [2] and SAT solver[3], particularly for larger-scale systems(e.g., n = 32). Furthermore, we derive probabilistic time complexity bounds for the overall algorithm using stochastic process theory, establishing a quantitative relationship between predictor accuracy and expected solving complexity. This work provides both a practical acceleration tool for algebraic cryptanalysis and a theoretical foundation for ML-enhanced combinatorial optimization in symbolic computation.

Paperid: 2540, https://arxiv.org/pdf/2509.14583.pdf

Abstract:
The web continues to grow, but dependency-monitoring tools and standards for resource integrity lag behind. Currently, there exists no robust method to verify the integrity of web resources, much less in a generalizable yet performant manner, and supply chains remain one of the most targeted parts of the attack surface of web applications. In this paper, we present the design of LiMS, a transparent system to bootstrap link integrity guarantees in web browsing sessions with minimal overhead. At its core, LiMS uses a set of customizable integrity policies to declare the (un)expected properties of resources, verifies these policies, and enforces them for website visitors. We discuss how basic integrity policies can serve as building blocks for a comprehensive set of integrity policies, while providing guarantees that would be sufficient to defend against recent supply chain attacks detailed by security industry reports. Finally, we evaluate our open-sourced prototype by simulating deployments on a representative sample of 450 domains that are diverse in ranking and category. We find that our proposal offers the ability to bootstrap marked security improvements with an overall overhead of hundreds of milliseconds on initial page loads, and negligible overhead on reloads, regardless of network speeds. In addition, from examining archived data for the sample sites, we find that several of the proposed policy building blocks suit their dependency usage patterns, and would incur minimal administrative overhead.

Paperid: 2541, https://arxiv.org/pdf/2509.14519.pdf

Abstract:
Malware is becoming increasingly complex and widespread, making it essential to develop more effective and timely detection methods. Traditional static analysis often fails to defend against modern threats that employ code obfuscation, polymorphism, and other evasion techniques. In contrast, behavioral malware detection, which monitors runtime activities, provides a more reliable and context-aware solution. In this work, we propose BEACON, a novel deep learning framework that leverages large language models (LLMs) to generate dense, contextual embeddings from raw sandbox-generated behavior reports. These embeddings capture semantic and structural patterns of each sample and are processed by a one-dimensional convolutional neural network (1D CNN) for multi-class malware classification. Evaluated on the Avast-CTU Public CAPE Dataset, our framework consistently outperforms existing methods, highlighting the effectiveness of LLM-based behavioral embeddings and the overall design of BEACON for robust malware classification.

Paperid: 2542, https://arxiv.org/pdf/2509.14275.pdf

Abstract:
Privacy-preserving adaptation of Large Language Models (LLMs) in sensitive domains (e.g., mental health) requires balancing strict confidentiality with model utility and safety. We propose FedMentor, a federated fine-tuning framework that integrates Low-Rank Adaptation (LoRA) and domain-aware Differential Privacy (DP) to meet per-domain privacy budgets while maintaining performance. Each client (domain) applies a custom DP noise scale proportional to its data sensitivity, and the server adaptively reduces noise when utility falls below a threshold. In experiments on three mental health datasets, we show that FedMentor improves safety over standard Federated Learning (FL) without privacy, raising safe output rates by up to three points and lowering toxicity, while maintaining utility (BERTScore F1 and ROUGE-L) within 0.5% of the non-private baseline and close to the centralized upper bound. The framework scales to backbones with up to 1.7B parameters on single-GPU clients, requiring < 173 MB of communication per-round. FedMentor demonstrates a practical approach to privately fine-tune LLMs for safer deployments in healthcare and other sensitive fields.

Paperid: 2543, https://arxiv.org/pdf/2509.14271.pdf

Abstract:
This paper documents early research conducted in 2022 on defending against prompt injection attacks in large language models, providing historical context for the evolution of this critical security domain. This research focuses on two adversarial attacks against Large Language Models (LLMs): prompt injection and goal hijacking. We examine how to construct these attacks, test them on various LLMs, and compare their effectiveness. We propose and evaluate a novel defense technique called Adversarial Fine-Tuning. Our results show that, without this defense, the attacks succeeded 31\% of the time on GPT-3 series models. When using our Adversarial Fine-Tuning approach, attack success rates were reduced to near zero for smaller GPT-3 variants (Ada, Babbage, Curie), though we note that subsequent research has revealed limitations of fine-tuning-based defenses. We also find that more flexible models exhibit greater vulnerability to these attacks. Consequently, large models such as GPT-3 Davinci are more vulnerable than smaller models like GPT-2. While the specific models tested are now superseded, the core methodology and empirical findings contributed to the foundation of modern prompt injection defense research, including instruction hierarchy systems and constitutional AI approaches.

Paperid: 2544, https://arxiv.org/pdf/2509.14139.pdf

Abstract:
We present a systematic security assessment of the Unitree G1 humanoid showing it operates simultaneously as a covert surveillance node and can be purposed as an active cyber operations platform. Initial access can be achieved by exploiting the BLE provisioning protocol which contains a critical command injection vulnerability allowing root access via malformed Wi-Fi credentials, exploitable using hardcoded AES keys shared across all units. Partial reverse engineering of Unitree's proprietary FMX encryption reveal a static Blowfish-ECB layer and a predictable LCG mask-enabled inspection of the system's otherwise sophisticated security architecture, the most mature we have observed in commercial robotics. Two empirical case studies expose the critical risk of this humanoid robot: (a) the robot functions as a trojan horse, continuously exfiltrating multi-modal sensor and service-state telemetry to 43.175.228.18:17883 and 43.175.229.18:17883 every 300 seconds without operator notice, creating violations of GDPR Articles 6 and 13; (b) a resident Cybersecurity AI (CAI) agent can pivot from reconnaissance to offensive preparation against any target, such as the manufacturer's cloud control plane, demonstrating escalation from passive monitoring to active counter-operations. These findings argue for adaptive CAI-powered defenses as humanoids move into critical infrastructure, contributing the empirical evidence needed to shape future security standards for physical-cyber convergence systems.

Paperid: 2545, https://arxiv.org/pdf/2509.13035.pdf

Abstract:
Threat detection systems rely on rule-based logic to identify adversarial behaviors, yet the conformance of these rules to high-level threat models is rarely verified formally. We present a formal verification framework that models both detection logic and attack trees as labeled transition systems (LTSs), enabling automated conformance checking via bisimulation and weak trace inclusion. Detection rules specified in the Generic Threat Detection Language (GTDL, a general-purpose detection language we formalize in this work) are assigned a compositional operational semantics, and threat models expressed as attack trees are interpreted as LTSs through a structural trace semantics. Both representations are translated to LNT, a modeling language supported by the CADP toolbox. This common semantic domain enables systematic and automated verification of detection coverage. We evaluate our approach on real-world malware scenarios such as LokiBot and Emotet and provide scalability analysis through parametric synthetic models. Results confirm that our methodology identifies semantic mismatches between threat models and detection rules, supports iterative refinement, and scales to realistic threat landscapes.

Paperid: 2546, https://arxiv.org/pdf/2509.12979.pdf

Abstract:
Image security for information has become increasingly critical as internet become more prevalent due to hacking and unauthorized access. To ensure the security of confidential image data, image encryption using visual cryptography plays a crucial role. To share multiple images using visual cryptography, the company organizer utilizes the concept of a universal or common share. Likewise, quantum computing is an emerging technology that facilitates secure communication. The ability of quantum computers to solve certain mathematical problems efficiently threatens the security of many current encryption algorithms. Hence, to leverage the strengths of quantum computing and visual cryptography, this research introduces a novel universal share-based quantum multi-secret sharing technique for secure image communication. Quantum computing enables the scheme to exhibit high resilience to different eavesdropping threats. Consequently, the proposed method offers robust security solution for sharing confidential images across a range of applications, including enterprise data access and military communications.

Paperid: 2547, https://arxiv.org/pdf/2509.12879.pdf

Abstract:
SNOVA is a post-quantum cryptographic signature scheme known for its efficiency and compact key sizes, making it a second-round candidate in the NIST post-quantum cryptography standardization process. This paper presents a comprehensive fault analysis of SNOVA, focusing on both permanent and transient faults during signature generation. We introduce several fault injection strategies that exploit SNOVA's structure to recover partial or complete secret keys with limited faulty signatures. Our analysis reveals that as few as 22 to 68 faulty signatures, depending on the security level, can suffice for key recovery. We propose a novel fault-assisted reconciliation attack, demonstrating its effectiveness in extracting the secret key space via solving a quadratic polynomial system. Simulations show transient faults in key signature generation steps can significantly compromise SNOVA's security. To address these vulnerabilities, we propose a lightweight countermeasure to reduce the success of fault attacks without adding significant overhead. Our results highlight the importance of fault-resistant mechanisms in post-quantum cryptographic schemes like SNOVA to ensure robustness.

Paperid: 2548, https://arxiv.org/pdf/2509.12877.pdf

Abstract:
Isogeny-based cryptography has emerged as a promising postquantum alternative, with CSIDH and its constant-time variants CTIDH and dCTIDH offering efficient group-action protocols. However, CTIDH and dCTIDH rely on dummy operations in differential addition chains (DACs) and Matryoshka, which can be exploitable by fault-injection attacks. In this work, we present the first dummy-free implementation of dCTIDH. Our approach combines two recent ideas: DACsHUND, which enforces equal-length DACs within each batch without padding, and a reformulated Matryoshka structure that removes dummy multiplications and validates all intermediate points. Our analysis shows that small primes such as 3, 5, and 7 severely restrict feasible DACsHUND configurations, motivating new parameter sets that exclude them. We implement dummy-free dCTIDH-2048-194 and dCTIDH-2048-205, achieving group action costs of roughly 357,000-362,000 Fp-multiplications, with median evaluation times of 1.59-1.60 (Gcyc). These results do not surpass dC-TIDH, but they outperform CTIDH by roughly 5% while eliminating dummy operations entirely. Compared to dCSIDH, our construction is more than 4x faster. To the best of our knowledge, this is the first efficient implementation of a CSIDH-like protocol that is simultaneously deterministic, constant-time, and fully dummy-free.

Paperid: 2549, https://arxiv.org/pdf/2509.12478.pdf

Abstract:
Authenticated Key Exchange (AKE) establishes shared ('symmetric') cryptographic keys which are essential for secure online communication. AKE protocols can be constructed from public-key cryptography like Key Encapsulation Mechanisms (KEMs). Another approach is to use Quantum Key Distribution (QKD) to establish a symmetric key, which uses quantum communication. Combining post-quantum AKE and QKD appropriately may provide security against quantum attacks even if only one of the two approaches turns out to be secure. We provide an extensive review of existing security analyses for combined AKE and their formal security models, and identify some gaps in their treatment of QKD key IDs. In particular, improper handling of QKD key IDs leads to Dependent-Key attacks on AKE. As our main conceptual contribution, we model QKD as an oracle that closely resembles the standard ETSI 014 QKD interface. We demonstrate the usability of our QKD oracle for cryptographic security analyses by integrating it into a prominent security model for AKE, called CK+ model, thereby obtaining a security model for combined AKE that catches Dependent-Key attacks. In this model, we formally prove security of a new protocol that combines QKD with a triple-KEM handshake. This is the first provably secure hybrid protocol that maintains information-theoretic security of QKD.

Paperid: 2550, https://arxiv.org/pdf/2509.12223.pdf

Abstract:
We propose the Ratio1 AI meta-operating system (meta-OS), a decentralized MLOps protocol that unifies AI model development, deployment, and inference across heterogeneous edge devices. Its key innovation is an integrated blockchain-based framework that transforms idle computing resources (laptops, smartphones, cloud VMs) into a trustless global supercomputer. The architecture includes novel components: a decentralized authentication layer (dAuth), an in-memory state database (CSTORE), a distributed storage system (R1FS), homomorphic encrypted federated learning (EDIL), decentralized container orchestration (Deeploy) and an oracle network (OracleSync), which collectively ensure secure, resilient execution of AI pipelines and other container based apps at scale. The protocol enforces a formal circular token-economic model combining Proof-of-Availability (PoA) and Proof-of-AI (PoAI) consensus. Compared to centralized heterogeneous cloud MLOps and existing decentralized compute platforms, which often lack integrated AI toolchains or trusted Ratio1 node operators (R1OP) mechanics, Ratio1's holistic design lowers barriers for AI deployment and improves cost-efficiency. We provide mathematical formulations of its secure licensing and reward protocols, and include descriptive information for the system architecture and protocol flow. We argue that our proposed fully functional ecosystem proposes and demonstrates significant improvements in accessibility, scalability, and security over existing alternatives.

Paperid: 2551, https://arxiv.org/pdf/2509.11974.pdf

Abstract:
Federated Learning (FL) enables collaborative model training across decentralised clients while keeping local data private, making it a widely adopted privacy-enhancing technology (PET). Despite its privacy benefits, FL remains vulnerable to privacy attacks, including those targeting specific clients. In this paper, we study an underexplored threat where a dishonest orchestrator intentionally manipulates the aggregation process to induce targeted overfitting in the local models of specific clients. Whereas many studies in this area predominantly focus on reducing the amount of information leakage during training, we focus on enabling an early client-side detection of targeted overfitting, thereby allowing clients to disengage before significant harm occurs. In line with this, we propose three detection techniques - (a) label flipping, (b) backdoor trigger injection, and (c) model fingerprinting - that enable clients to verify the integrity of the global aggregation. We evaluated our methods on multiple datasets under different attack scenarios. Our results show that the three methods reliably detect targeted overfitting induced by the orchestrator, but they differ in terms of computational complexity, detection latency, and false-positive rates.

Paperid: 2552, https://arxiv.org/pdf/2509.11712.pdf

Abstract:
In the modern, fast-moving world of e-commerce, many Android apps face challenges in providing a simple and secure shopping experience. Many of these apps, often enough, have complicated designs that prevent users from finding what they want quickly, thus frustrating them and wasting their precious time. Another major issue is that of security; with the limitation of payment options and weak authentication mechanisms, users' sensitive information can be compromised. This research presents a new e-commerce platform that responds to the above challenges with an intuitive interface and strong security measures. The platform makes online shopping easy with well-organized categories of products and a fast, efficient checkout process. It also gives priority to security by incorporating features such as Google authentication and SSL-secured payment gateways to protect user data and ensure secure transactions. This paper discusses how a focus on user-friendliness, security, and personalization steps up the game for e-commerce platforms, providing workable frameworks that match modern user needs and expectations. The findings show the e-commerce user experience can be remodelled by the platform, hence opening ways for future developments in that respect.

Paperid: 2553, https://arxiv.org/pdf/2509.11595.pdf

Abstract:
Anti-money laundering (AML) research is constrained by the lack of publicly shareable, regulation-aligned transaction datasets. We present AMLNet, a knowledge-based multi-agent framework with two coordinated units: a regulation-aware transaction generator and an ensemble detection pipeline. The generator produces 1,090,173 synthetic transactions (approximately 0.16\% laundering-positive) spanning core laundering phases (placement, layering, integration) and advanced typologies (e.g., structuring, adaptive threshold behavior). Regulatory alignment reaches 75\% based on AUSTRAC rule coverage (Section 4.2), while a composite technical fidelity score of 0.75 summarizes temporal, structural, and behavioral realism components (Section 4.4). The detection ensemble achieves F1 0.90 (precision 0.84, recall 0.97) on the internal test partitions of AMLNet and adapts to the external SynthAML dataset, indicating architectural generalizability across different synthetic generation paradigms. We provide multi-dimensional evaluation (regulatory, temporal, network, behavioral) and release the dataset (Version 1.0, https://doi.org/10.5281/zenodo.16736515), to advance reproducible and regulation-conscious AML experimentation.

Paperid: 2554, https://arxiv.org/pdf/2509.11559.pdf

Abstract:
RLWE-based Fully Homomorphic Encryption (FHE) schemes add some small \emph{noise} to the message during encryption. The noise accumulates with each homomorphic operation. When the noise exceeds a critical value, the FHE circuit produces an incorrect output. This makes developing FHE applications quite subtle, as one must closely track the noise to ensure correctness. However, existing libraries and compilers offer limited support to statically track the noise. Additionally, FHE circuits are also plagued by wraparound errors that are common in finite modulus arithmetic. These two limitations of existing compilers and libraries make FHE applications too difficult to develop with confidence. In this work, we present a \emph{correctness-oriented} IR, Intermediate Language for Arithmetic circuits, for type-checking circuits intended for homomorphic evaluation. Our IR is backed by a type system that tracks low-level quantitative bounds (e.g., ciphertext noise) without using the secret key. Using our type system, we identify and prove a strong \emph{functional correctness} criterion for \ila circuits. Additionally, we have designed \ila to be maximally general: our core type system does not directly assume a particular FHE scheme, but instead axiomatizes a \emph{model} of FHE. We instantiate this model with the exact FHE schemes (BGV, BFV and TFHE), and obtain functional correctness for free.

Paperid: 2555, https://arxiv.org/pdf/2509.11237.pdf

Abstract:
In this paper, we demonstrate a way to generalize learning with errors (LWE) to the family of so-called modular-maximal cyclic groups which are non-commuting. Since the group M2t has two cycles of maximal multiplicative order, we use this fact to construct an accurate criterion for restoring the message bit with overwhelming probability. Furthermore, we implement the original idea by O. Regev in the considered group to gain benefits from the non-commutativity of M2t . Also we prove that using this approach we can achieve a level of security comparable to the original idea.

Paperid: 2556, https://arxiv.org/pdf/2509.11158.pdf

Abstract:
Plaintext non-delayed chaotic cipher (PNDCC) means that in the diffusion equation, plaintext has no delay terms while ciphertext has a feedback term. In existing literature, chaotic cipher diffusions invariably take this form. Since its introduction, PNDCC has attracted attention but also doubts. Designers of chaotic ciphers usually claim PNDCC security by statistical tests, while rigorous cryptographic proofs are absent. Thus, it is necessary to re-examine its design rationale and empirical security. To address this issue, we present a typical example of a three-stage permutation-diffusion-permutation PNDCC, which contains multiple security vulnerabilities. Although all of its statistical indicators show good performance, we are able to break it using four different attacks. The first is a differential attack based on homogeneous operations; the second is an S-PTC attack; the third is a novel impulse-step-based differential attack (ISBDA), proposed in this paper, and the fourth is a novel chain attack, also introduced here. These results demonstrate that the fulfilment of statistical criteria is not a sufficient condition for the security of PNDCC. Then, based on a mathematical model of multi-stage PNDCC, we show that the proposed chain attack can successfully break a class of multi-stage PNDCCs. The key technique of the chain attack depends on how to reveal all permutations. To address this key problem, we summarize the chaining rules and show that, from the attacker's perspective, if the same decryption chain can be reconstructed then all permutations can be deciphered. To that end, the entire diffusion process can be broken by solving a system of simultaneous equations. Finally, as a secure improvement, we propose a new scheme termed plaintext-delayed chaotic cipher (PDCC) that can resist various cryptanalytic attacks.

Paperid: 2557, https://arxiv.org/pdf/2509.10858.pdf

Abstract:
Large Language Models (LLMs) have emerged as powerful tools capable of understanding and generating human-like text, offering transformative potential across diverse domains. The Security Operations Center (SOC), responsible for safeguarding digital infrastructure, represents one of these domains. SOCs serve as the frontline of defense in cybersecurity, tasked with continuous monitoring, detection, and response to incidents. However, SOCs face persistent challenges such as high alert volumes, limited resources, high demand for experts with advanced knowledge, delayed response times, and difficulties in leveraging threat intelligence effectively. In this context, LLMs can offer promising solutions by automating log analysis, streamlining triage, improving detection accuracy, and providing the required knowledge in less time. This survey systematically explores the integration of generative AI and more specifically LLMs into SOC workflow, providing a structured perspective on its capabilities, challenges, and future directions. We believe that this survey offers researchers and SOC managers a broad overview of the current state of LLM integration within academic study. To the best of our knowledge, this is the first comprehensive study to examine LLM applications in SOCs in details.

Paperid: 2558, https://arxiv.org/pdf/2509.10563.pdf

Abstract:
Explainable Artificial Intelligence (XAI) enhances the transparency and interpretability of AI models, addressing their inherent opacity. In cybersecurity, particularly within the Internet of Medical Things (IoMT), the black-box nature of AI-driven threat detection poses a significant challenge. Cybersecurity professionals must not only detect attacks but also understand the reasoning behind AI decisions to ensure trust and accountability. The rapid increase in cyberattacks targeting connected medical devices threatens patient safety and data privacy, necessitating advanced AI-driven solutions. This study compares two ensemble learning techniques, bagging and boosting, for cyber-attack classification in IoMT environments. We selected Random Forest for bagging and CatBoost for boosting. Random Forest helps reduce variance, while CatBoost improves bias by combining weak classifiers into a strong ensemble model, making them effective for detecting sophisticated attacks. However, their complexity often reduces transparency, making it difficult for cybersecurity professionals to interpret and trust their decisions. To address this issue, we apply XAI models to generate local and global explanations, providing insights into AI decision-making. Using techniques like SHAP (Shapley Additive Explanations) and LIME (Local Interpretable Model-agnostic Explanations), we highlight feature importance to help stakeholders understand the key factors driving cyber threat detection.

Paperid: 2559, https://arxiv.org/pdf/2509.10543.pdf

Abstract:
Distributed Denial-of-Service (DDoS) attacks remain a serious threat to online infrastructure, often bypassing detection by altering traffic in subtle ways. We present a method using hive-plot sequences of network data and a 3D convolutional neural network (3D CNN) to classify DDoS traffic with high accuracy. Our system relies on three main ideas: (1) using spatio-temporal hive-plot encodings to set a pattern-recognition baseline, (2) applying adversarial training with FGSM and PGD alongside spatial noise and image shifts, and (3) analyzing frame-wise predictions to find early signals. On a benchmark dataset, our method lifts adversarial accuracy from 50-55% to over 93% while maintaining clean-sample performance. Frames 3-4 offer strong predictive signals, showing early-stage classification is possible.

Paperid: 2560, https://arxiv.org/pdf/2509.10206.pdf

Abstract:
With the rise of fifth-generation (5G) networks in critical applications, it is urgent to move from detection of malicious activity to systems capable of providing a reliable verdict suitable for mitigation. In this regard, understanding and interpreting machine learning (ML) models' security alerts is crucial for enabling actionable incident response orchestration. Explainable Artificial Intelligence (XAI) techniques are expected to enhance trust by providing insights into why alerts are raised. A dominant approach statistically associates feature sets that can be correlated to a given alert. This paper starts by questioning whether such attribution is relevant for future generation communication systems, and investigates its merits in comparison with an approach based on logical explanations. We extensively study two methods, SHAP and VoTE-XAI, by analyzing their interpretations of alerts generated by an XGBoost model in three different use cases with several 5G communication attacks. We identify three metrics for assessing explanations: sparsity, how concise they are; stability, how consistent they are across samples from the same attack type; and efficiency, how fast an explanation is generated. As an example, in a 5G network with 92 features, 6 were deemed important by VoTE-XAI for a Denial of Service (DoS) variant, ICMPFlood, while SHAP identified over 20. More importantly, we found a significant divergence between features selected by SHAP and VoTE-XAI. However, none of the top-ranked features selected by SHAP were missed by VoTE-XAI. When it comes to efficiency of providing interpretations, we found that VoTE-XAI is significantly more responsive, e.g. it provides a single explanation in under 0.002 seconds, in a high-dimensional setting (478 features).

Paperid: 2561, https://arxiv.org/pdf/2509.09950.pdf

Abstract:
Browser fingerprinting enables persistent cross-site user tracking via subtle techniques that often evade conventional defenses or cause website breakage when script-level blocking countermeasures are applied. Addressing these challenges requires detection methods offering both function-level precision to minimize breakage and inherent robustness against code obfuscation and URL manipulation. We introduce ByteDefender, the first system leveraging V8 engine bytecode to detect fingerprinting operations specifically at the JavaScript function level. A Transformer-based classifier, trained offline on bytecode sequences, accurately identifies functions exhibiting fingerprinting behavior. We develop and evaluate light-weight signatures derived from this model to enable low-overhead, on-device matching against function bytecode during compilation but prior to execution, which only adds a 4% (average) latency to the page load time. This mechanism facilitates targeted, real-time prevention of fingerprinting function execution, thereby preserving legitimate script functionality. Operating directly on bytecode ensures inherent resilience against common code obfuscation and URL-based evasion. Our evaluation on the top 100k websites demonstrates high detection accuracy at both function- and script-level, with substantial improvements over state-of-the-art AST-based methods, particularly in robustness against obfuscation. ByteDefender offers a practical framework for effective, precise, and robust fingerprinting mitigation.

Paperid: 2562, https://arxiv.org/pdf/2509.09706.pdf

Abstract:
This study evaluates the resilience of large language models (LLMs) against adversarial attacks, specifically focusing on Flan-T5, BERT, and RoBERTa-Base. Using systematically designed adversarial tests through TextFooler and BERTAttack, we found significant variations in model robustness. RoBERTa-Base and FlanT5 demonstrated remarkable resilience, maintaining accuracy even when subjected to sophisticated attacks, with attack success rates of 0%. In contrast. BERT-Base showed considerable vulnerability, with TextFooler achieving a 93.75% success rate in reducing model accuracy from 48% to just 3%. Our research reveals that while certain LLMs have developed effective defensive mechanisms, these safeguards often require substantial computational resources. This study contributes to the understanding of LLM security by identifying existing strengths and weaknesses in current safeguarding approaches and proposes practical recommendations for developing more efficient and effective defensive strategies.

Paperid: 2563, https://arxiv.org/pdf/2509.09564.pdf

Abstract:
Supervised machine learning techniques rely on labeled data to achieve high task performance, but this requires the labels to capture some meaningful differences in the underlying data structure. For training network intrusion detection algorithms, most datasets contain a series of attack classes and a single large benign class which captures all non-attack network traffic. A review of intrusion detection papers and guides that explicitly state their data preprocessing steps identified that the majority took the labeled categories of the dataset at face value when training their algorithms. The present paper evaluates the structure of benign traffic in several common intrusion detection datasets (NSL-KDD, UNSW-NB15, and CIC-IDS 2017) and determines whether there are meaningful sub-categories within this traffic which may improve overall multi-classification performance using common machine learning techniques. We present an overview of some unsupervised clustering techniques (e.g., HDBSCAN, Mean Shift Clustering) and show how they differentially cluster the benign traffic space.

Paperid: 2564, https://arxiv.org/pdf/2509.09185.pdf

Abstract:
In today's dynamic cyber threat landscape, organizations must take proactive steps to bolster their cybersecurity defenses. Cyber threat hunting is a proactive and iterative process aimed at identifying and mitigating advanced threats that may go undetected by traditional security measures. Rather than waiting for automated security systems to flag potential threats, threat hunting involves actively searching for signs of malicious activity within an organization's network. In this paper, we present the Forensic Visualization Toolkit, a powerful tool designed for digital forensics investigations, analysis of digital evidence, and advanced visualizations to enhance cybersecurity situational awareness and risk management and empower security analysts with an intuitive and interactive tool. Through practical, real-world scenarios, we demonstrate how FVT significantly amplifies the capabilities of cybersecurity professionals, enabling them to effectively identify, analyze, and respond to threats. Furthermore, it is important to highlight that FVT has been integrated into, utilized, and continually enhanced within various EU-funded research projects over recent years.

Paperid: 2565, https://arxiv.org/pdf/2509.09107.pdf

Abstract:
We present CryptGNN, a secure and effective inference solution for third-party graph neural network (GNN) models in the cloud, which are accessed by clients as ML as a service (MLaaS). The main novelty of CryptGNN is its secure message passing and feature transformation layers using distributed secure multi-party computation (SMPC) techniques. CryptGNN protects the client's input data and graph structure from the cloud provider and the third-party model owner, and it protects the model parameters from the cloud provider and the clients. CryptGNN works with any number of SMPC parties, does not require a trusted server, and is provably secure even if P-1 out of P parties in the cloud collude. Theoretical analysis and empirical experiments demonstrate the security and efficiency of CryptGNN.

Paperid: 2566, https://arxiv.org/pdf/2509.09091.pdf

Abstract:
CPU-based trusted execution environments (TEEs) and differential privacy (DP) have gained wide applications for private inference. Due to high inference latency in TEEs, researchers use partition-based approaches that offload linear model components to GPUs. However, dense nonlinear layers of large language models (LLMs) result in significant communication overhead between TEEs and GPUs. DP-based approaches apply random noise to protect data privacy, but this compromises LLM performance and semantic understanding. To overcome the above drawbacks, this paper proposes CMIF, a Confidential and efficient Model Inference Framework. CMIF confidentially deploys the embedding layer in the client-side TEE and subsequent layers on GPU servers. Meanwhile, it optimizes the Report-Noisy-Max mechanism to protect sensitive inputs with a slight decrease in model performance. Extensive experiments on Llama-series models demonstrate that CMIF reduces additional inference overhead in TEEs while preserving user data privacy.

Paperid: 2567, https://arxiv.org/pdf/2509.08995.pdf

Abstract:
The integration of Large Language Models (LLMs) into financial technology (FinTech) has revolutionized the analysis and processing of complex financial data, driving advancements in real-time decision-making and analytics. With the growing trend of deploying AI models on edge devices for financial applications, ensuring the privacy of sensitive financial data has become a significant challenge. To address this, we propose DPFinLLM, a privacy-enhanced, lightweight LLM specifically designed for on-device financial applications. DPFinLLM combines a robust differential privacy mechanism with a streamlined architecture inspired by state-of-the-art models, enabling secure and efficient processing of financial data. This proposed DPFinLLM can not only safeguard user data from privacy breaches but also ensure high performance across diverse financial tasks. Extensive experiments on multiple financial sentiment datasets validate the effectiveness of DPFinLLM, demonstrating its ability to achieve performance comparable to fully fine-tuned models, even under strict privacy constraints.

Paperid: 2568, https://arxiv.org/pdf/2509.08841.pdf

Abstract:
Modern, data-driven medical research requires the processing of sensitive health data on a large scale. However, this data is subject to special protection under the GDPR, which is why processing regularly raises data protection concerns in practice. These concerns are particularly prevalent when sensitive personal data is processed without informed consent. This article analyses options for data processing in the field of medical research without consent and describes the legal framework for anonymisation under the GDPR, the national Austrian implementation of the research exemption, and their interaction. -- Moderne, datengetriebene medizinische Forschung erfordert die Verarbeitung sensibler Gesundheitsdaten in grossem Ausmass. Diese sind im System der DSGVO jedoch besonders geschÃ¼tzt, weswegen einer rechtssicheren Verarbeitung in der Praxis regelmÃ¤ssig datenschutzrechtliche Bedenken entgegenstehen. Diese Bedenken bestehen insbesondere bei Verarbeitung sensibler personenbezogener Daten ohne informierte Einwilligung. Dieser Beitrag analysiert daher MÃ¶glichkeiten zur Datenverarbeitung im Bereich der medizinischen Forschung fernab der Einwilligung und beschreibt hierfÃ¼r das rechtliche Rahmenwerk fÃ¼r Anonymisierung der DSGVO, die nationale, Ã¶sterreichische Umsetzung der Forschungsausnahme und ihr Zusammenspiel.

Paperid: 2569, https://arxiv.org/pdf/2509.08838.pdf

Abstract:
This article discusses the troubled relationship between contemporary advertising technology (adtech) systems, in particular systems of real-time bidding (RTB, also known as programmatic advertising) underpinning much behavioral targeting on the web and through mobile applications. This article analyzes the extent to which practices of RTB are compatible with the requirements regarding a legal basis for processing, transparency, and security in European data protection law. We first introduce the technologies at play through explaining and analyzing the systems deployed online today. Following that, we turn to the law. Rather than analyze RTB against every provision of the General Data Protection Regulation (GDPR), we consider RTB in the context of the GDPR's requirement of a legal basis for processing and the GDPR's transparency and security requirements. We show, first, that the GDPR requires prior consent of the internet user for RTB, as other legal bases are not appropriate. Second, we show that it is difficult - and perhaps impossible - for website publishers and RTB companies to meet the GDPR's transparency requirements. Third, RTB incentivizes insecure data processing. We conclude that, in concept and in practice, RTB is structurally difficult to reconcile with European data protection law. Therefore, intervention by regulators is necessary.

Paperid: 2570, https://arxiv.org/pdf/2509.08758.pdf

Abstract:
WebAssembly (Wasm) is rapidly gaining popularity as a distribution format for software components embedded in various security-critical domains. Unfortunately, despite its prudent design, WebAssembly's primary use case as a compilation target for memory-unsafe languages leaves some possibilities for memory corruption. Independently of that, Wasm is an inherently interesting target for information flow analysis due to its interfacing role. Both the information flows between a Wasm module and its embedding context, as well as the memory integrity within a module, can be described by the hyperproperty noninterference. So far, no sound, fully static noninterference analysis for Wasm has been presented, but sound reachability analyses were. This work presents a novel and general approach to lift reachability analyses to noninterference by tracking taints on values and using value-sensitive, relational reasoning to remove them when appropriate. We implement this approach in Wanilla, the first automatic, sound, and fully static noninterference analysis for WebAssembly, and demonstrate its performance and precision by verifying memory integrity and other noninterference properties with several synthetic and real-world benchmarks.

Paperid: 2571, https://arxiv.org/pdf/2509.08740.pdf

Abstract:
Organizations use data lakes to store and analyze sensitive data. But hackers may compromise data lake storage to bypass access controls and access sensitive data. To address this, we propose Membrane, a system that (1) cryptographically enforces data-dependent access control views over a data lake, (2) without restricting the analytical queries data scientists can run. We observe that data lakes, unlike DBMSes, disaggregate computation and storage into separate trust domains, making at-rest encryption sufficient to defend against remote attackers targeting data lake storage, even when running analytical queries in plaintext. This leads to a new system design for Membrane that combines encryption at rest with SQL-aware encryption. Using block ciphers, a fast symmetric-key primitive with hardware acceleration in CPUs, we develop a new SQL-aware encryption protocol well-suited to at-rest encryption. Membrane adds overhead only at the start of an interactive session due to decrypting views, delaying the first query result by up to $\approx 20\times$; subsequent queries process decrypted data in plaintext, resulting in low amortized overhead.

Paperid: 2572, https://arxiv.org/pdf/2509.08709.pdf

Abstract:
In cross-device private federated learning, differentially private follow-the-regularized-leader (DP-FTRL) has emerged as a promising privacy-preserving method. However, existing approaches assume a semi-honest server and have not addressed the challenge of securely removing this assumption. This is due to its statefulness, which becomes particularly problematic in practical settings where clients can drop out or be corrupted. While trusted execution environments (TEEs) might seem like an obvious solution, a straightforward implementation can introduce forking attacks or availability issues due to state management. To address this problem, our paper introduces a novel server extension that acts as a trusted computing base (TCB) to realize maliciously secure DP-FTRL. The TCB is implemented with an ephemeral TEE module on the server side to produce verifiable proofs of server actions. Some clients, upon being selected, participate in auditing these proofs with small additional communication and computational demands. This extension solution reduces the size of the TCB while maintaining the system's scalability and liveness. We provide formal proofs based on interactive differential privacy, demonstrating privacy guarantee in malicious settings. Finally, we experimentally show that our framework adds small constant overhead to clients in several realistic settings.

Paperid: 2573, https://arxiv.org/pdf/2509.08204.pdf

Abstract:
Software ecosystems like Maven Central play a crucial role in modern software supply chains by providing repositories for libraries and build plugins. However, the separation between binaries and their corresponding source code in Maven Central presents a significant challenge, particularly when it comes to linking binaries back to their original build environment. This lack of transparency poses security risks, as approximately 84% of the top 1200 commonly used artifacts are not built using a transparent CI/CD pipeline. Consequently, users must place a significant amount of trust not only in the source code but also in the environment in which these artifacts are built. Rebuilding software artifacts from source provides a robust solution to improve supply chain security. This approach allows for a deeper review of code, verification of binary-source equivalence, and control over dependencies. However, challenges arise due to variations in build environments, such as JDK versions and build commands, which can lead to build failures. Additionally, ensuring that all dependencies are rebuilt from source across large and complex dependency graphs further complicates the process. In this paper, we introduce an extension to Macaron, an industry-grade open-source supply chain security framework, to automate the rebuilding of Maven artifacts from source. Our approach improves upon existing tools, by offering better performance in source code detection and automating the extraction of build specifications from GitHub Actions workflows. We also present a comprehensive root cause analysis of build failures in Java projects and propose a scalable solution to automate the rebuilding of artifacts, ultimately enhancing security and transparency in the open-source supply chain.

Paperid: 2574, https://arxiv.org/pdf/2509.07924.pdf

Abstract:
Detecting unseen ransomware is a critical cybersecurity challenge where classical machine learning often fails. While Quantum Machine Learning (QML) presents a potential alternative, its application is hindered by the dimensionality gap between classical data and quantum hardware. This paper empirically investigates a hybrid framework using a Variational Quantum Classifier (VQC) interfaced with a high-dimensional dataset via Principal Component Analysis (PCA). Our analysis reveals a dual challenge for practical QML. A significant information bottleneck was evident, as even the best performing 12-qubit VQC fell short of the classical baselines 97.7\% recall. Furthermore, a non-monotonic performance trend, where performance degraded when scaling from 4 to 8 qubits before improving at 12 qubits suggests a severe trainability issue. These findings highlight that unlocking QMLs potential requires co-developing more efficient data compression techniques and robust quantum optimization strategies.

Paperid: 2575, https://arxiv.org/pdf/2509.07764.pdf

Abstract:
Large Language Models (LLMs) have been increasingly integrated into computer-use agents, which can autonomously operate tools on a user's computer to accomplish complex tasks. However, due to the inherently unstable and unpredictable nature of LLM outputs, they may issue unintended tool commands or incorrect inputs, leading to potentially harmful operations. Unlike traditional security risks stemming from insecure user prompts, tool execution results from LLM-driven decisions introduce new and unique security challenges. These vulnerabilities span across all components of a computer-use agent. To mitigate these risks, we propose AgentSentinel, an end-to-end, real-time defense framework designed to mitigate potential security threats on a user's computer. AgentSentinel intercepts all sensitive operations within agent-related services and halts execution until a comprehensive security audit is completed. Our security auditing mechanism introduces a novel inspection process that correlates the current task context with system traces generated during task execution. To thoroughly evaluate AgentSentinel, we present BadComputerUse, a benchmark consisting of 60 diverse attack scenarios across six attack categories. The benchmark demonstrates a 87% average attack success rate on four state-of-the-art LLMs. Our evaluation shows that AgentSentinel achieves an average defense success rate of 79.6%, significantly outperforming all baseline defenses.

Paperid: 2576, https://arxiv.org/pdf/2509.07649.pdf

Abstract:
Traditional risk assessments rely on manual audits and system scans, often causing operational disruptions and leaving security gaps. To address these challenges, this work presents Security Digital Twin-as-a-Service (SDT-aaS), a novel approach that leverages Digital Twin (DT) technology for automated, non-intrusive security compliance. SDT-aaS enables real-time security assessments by mirroring real-world assets, collecting compliance artifacts, and creating machine-readable evidence. The proposed work is a scalable and interoperable solution that supports open standards like CycloneDX and Web of Things (WoT), facilitating seamless integration and efficient compliance management. Empirical results from a moderate-scale infrastructure use case demonstrate its feasibility and performance, paving the way for efficient, on-demand cybersecurity governance with minimal operational impact.

Paperid: 2577, https://arxiv.org/pdf/2509.07505.pdf

Abstract:
Spatial data are gaining increasing importance in many areas of research. Particularly spatial health data are becoming increasingly important for medical research, for example, to better understand relationships between environmental factors and disease patterns. However, their use is often restricted by legal data protection regulations, since georeferenced personal information carries a high risk of re-identification of individuals. To address this issue, what are called geomasking methods are applied to guarantee data protection through targeted displacement of individual data points, while simultaneously maintaining analytical validity within a tolerable range. In the current literature the degree of anonymity of such anonymized georeferenced datasets is often measured by the so-called metric of spatial k-anonymity. However, this metric has considerable shortcomings, particularly regarding its resilience against realistic data attack scenarios. This article classifies the potential data attack scenarios in the context of anonymized georeferenced microdata and introduces appropriate metrics that enable a comprehensive assessment of anonymity adapted to potential data attack scenarios.

Paperid: 2578, https://arxiv.org/pdf/2509.07465.pdf

Abstract:
Age verification is increasingly critical for regulatory compliance, user trust, and the protection of minors online. Historically, solutions have struggled with poor accuracy, intrusiveness, and significant security risks. More recently, concerns have shifted toward privacy, surveillance, fairness, and the need for transparent, trustworthy systems. In this paper, we propose Biometric Bound Credentials (BBCreds) as a privacy-preserving approach that cryptographically binds age credentials to an individual's biometric features without storing biometric templates. This ensures only the legitimate, physically present user can access age-restricted services, prevents credential sharing, and addresses both legacy and emerging challenges in age verification. enhances privacy.

Paperid: 2579, https://arxiv.org/pdf/2509.07290.pdf

Abstract:
As the demand for exercising the "right to be forgotten" grows, the need for verifiable machine unlearning has become increasingly evident to ensure both transparency and accountability. We present {\em zkUnlearner}, the first zero-knowledge framework for verifiable machine unlearning, specifically designed to support {\em multi-granularity} and {\em forgery-resistance}. First, we propose a general computational model that employs a {\em bit-masking} technique to enable the {\em selectivity} of existing zero-knowledge proofs of training for gradient descent algorithms. This innovation enables not only traditional {\em sample-level} unlearning but also more advanced {\em feature-level} and {\em class-level} unlearning. Our model can be translated to arithmetic circuits, ensuring compatibility with a broad range of zero-knowledge proof systems. Furthermore, our approach overcomes key limitations of existing methods in both efficiency and privacy. Second, forging attacks present a serious threat to the reliability of unlearning. Specifically, in Stochastic Gradient Descent optimization, gradients from unlearned data, or from minibatches containing it, can be forged using alternative data samples or minibatches that exclude it. We propose the first effective strategies to resist state-of-the-art forging attacks. Finally, we benchmark a zkSNARK-based instantiation of our framework and perform comprehensive performance evaluations to validate its practicality.

Paperid: 2580, https://arxiv.org/pdf/2509.07282.pdf

Abstract:
We present cryptogram solving as an ideal testbed for studying neural network reasoning and generalization; models must decrypt text encoded with substitution ciphers, choosing from 26! possible mappings without explicit access to the cipher. We develop ALICE (an Architecture for Learning Interpretable Cryptogram dEcipherment), a simple encoder-only Transformer that sets a new state-of-the-art for both accuracy and speed on this decryption problem. Surprisingly, ALICE generalizes to unseen ciphers after training on only ${\sim}1500$ unique ciphers, a minute fraction ($3.7 \times 10^{-24}$) of the possible cipher space. To enhance interpretability, we introduce a novel bijective decoding head that explicitly models permutations via the Gumbel-Sinkhorn method, enabling direct extraction of learned cipher mappings. Through early exit and probing experiments, we reveal how ALICE progressively refines its predictions in a way that appears to mirror common human strategies -- early layers place greater emphasis on letter frequencies, while later layers form word-level structures. Our architectural innovations and analysis methods are applicable beyond cryptograms and offer new insights into neural network generalization and interpretability.

Paperid: 2581, https://arxiv.org/pdf/2509.07016.pdf

Abstract:
In response to the prevalent concern of TCP SYN flood attacks within the context of Software-Defined Internet of Vehicles (SD-IoV), this study addresses the significant challenge of network security in rapidly evolving vehicular communication systems. This research focuses on optimizing a Random Forest Classifier model to achieve maximum accuracy and minimal detection time, thereby enhancing vehicular network security. The methodology involves preprocessing a dataset containing SYN attack instances, employing feature scaling and label encoding techniques, and applying Stratified K-Fold cross-validation to target key metrics such as accuracy, precision, recall, and F1-score. This research achieved an average value of 0.999998 for all metrics with a SYN DoS attack detection time of 0.24 seconds. Results show that the fine-tuned Random Forest model, configured with 20 estimators and a depth of 10, effectively differentiates between normal and malicious traffic with high accuracy and minimal detection time, which is crucial for SD-IoV networks. This approach marks a significant advancement and introduces a state-of-the-art algorithm in detecting SYN flood attacks, combining high accuracy with minimal detection time. It contributes to vehicular network security by providing a robust solution against TCP SYN flood attacks while maintaining network efficiency and reliability.

Paperid: 2582, https://arxiv.org/pdf/2509.06626.pdf

Abstract:
The InterPlanetary File System (IPFS) has been successfully established as the de facto standard for decentralized data storage in the emerging Web3. Despite its decentralized nature, IPFS nodes, as well as IPFS content providers, have converged to centralization in large public clouds. Centralization introduces BGP routing-based attacks, such as passive interception and BGP hijacking, as potential threats. Although this attack vector has been investigated for many other Web3 protocols, such as Bitcoin and Ethereum, to the best of our knowledge, it has not been analyzed for the IPFS network. In our work, we bridge this gap and demonstrate that BGP routing attacks can be effectively leveraged to censor content in IPFS. For the analysis, we collected 3,000 content blocks called CIDs and conducted a simulation of BGP hijacking and passive interception against them. We find that a single malicious AS can censor 75% of the IPFS content for more than 57% of all requester nodes. Furthermore, we show that even with a small set of only 62 hijacked prefixes, 70% of the full attack effectiveness can already be reached. We further propose and validate countermeasures based on global collaborative content replication among all nodes in the IPFS network, together with additional robust backup content provider nodes that are well-hardened against BGP hijacking. We hope this work raises awareness about the threat BGP routing-based attacks pose to IPFS and triggers further efforts to harden the live IPFS network against them.

Paperid: 2583, https://arxiv.org/pdf/2509.06562.pdf

Abstract:
In 2019, V. A. Roman'kov introduced the concept of marginal sets for groups. He developed a theory of marginal sets and demonstrated how these sets can be applied to improve some key exchange schemes. In this paper, we extend his ideas and introduce the concept of marginal sets for semigroups and semirings. For tropical matrix semigroups and semirings, we describe how some marginal sets can be constructed. We apply marginal sets to improve some key exchange schemes over semigroups.

Paperid: 2584, https://arxiv.org/pdf/2509.06549.pdf

Abstract:
We study the fundamental problem of guessing cryptographic keys, drawn from some non-uniform probability distribution $D$, as e.g. in LPN, LWE or for passwords. The optimal classical algorithm enumerates keys in decreasing order of likelihood. The optimal quantum algorithm, due to Montanaro (2011), is a sophisticated Grover search. We give the first tight analysis for Montanaro's algorithm, showing that its runtime is $2^{H_{2/3}(D)/2}$, where $H_Î±(\cdot)$ denotes Renyi entropy with parameter $Î±$. Interestingly, this is a direct consequence of an information theoretic result called Arikan's Inequality (1996) -- which has so far been missed in the cryptographic community -- that tightly bounds the runtime of classical key guessing by $2^{H_{1/2}(D)}$. Since $H_{2/3}(D) < H_{1/2}(D)$ for every non-uniform distribution $D$, we thus obtain a super-quadratic quantum speed-up $s>2$ over classical key guessing. As another main result, we provide the first thorough analysis of guessing in a multi-key setting. Specifically, we consider the task of attacking many keys sampled independently from some distribution $D$, and aim to guess a fraction of them. For product distributions $D = Ï^n$, we show that any constant fraction of keys can be guessed within $2^{H(D)}$ classically and $2 ^{H(D)/2}$ quantumly per key, where $H(Ï)$ denotes Shannon entropy. In contrast, Arikan's Inequality implies that guessing a single key costs $2^{H_{1/2}(D)}$ classically and $2^{H_{2/3}(D)/2}$ quantumly. Since $H(D) < H_{2/3}(D) < H_{1/2}(D)$, this shows that in a multi-key setting the guessing cost per key is substantially smaller than in a single-key setting, both classically and quantumly.

Paperid: 2585, https://arxiv.org/pdf/2509.06136.pdf

Abstract:
As AI is increasingly integrated into products and critical systems, researchers are paying greater attention to identifying related vulnerabilities. Effective remediation depends on whether vendors are willing to accept and respond to AI vulnerability reports. In this paper, we examine the disclosure policies of 264 AI vendors. Using a mixed-methods approach, our quantitative analysis finds that 36% of vendors provide no disclosure channel, and only 18% explicitly mention AI-related risks. Vulnerabilities involving data access, authorization, and model extraction are generally considered in-scope, while jailbreaking and hallucination are frequently excluded. Through qualitative analysis, we further identify three vendor postures toward AI vulnerabilities - proactive clarification (n = 46, include active supporters, AI integrationists, and back channels), silence (n = 115, include self-hosted and hosted vendors), and restrictive (n = 103). Finally, by comparing vendor policies against 1,130 AI incidents and 359 academic publications, we show that bug bounty policy evolution has lagged behind both academic research and real-world events.

Paperid: 2586, https://arxiv.org/pdf/2509.06127.pdf

Abstract:
Identity-based cryptography (IBC), proposed by Adi Shamir, revolutionized public key authentication by eliminating the need for certificates, enabling a more efficient and scalable approach to cryptographic systems. Meanwhile, in \cite{Katsumata2024group}, Katsumata et al. were the first to present the blind signature protocol based on the hardness assumption of isogeny with provable security, which resembles the Schnorr blind signature. Building upon these foundational concepts, we propose an Identity-Based Blind Signature Scheme with an Honest Zero-Knowledge Verifier utilizing the CSIDH framework. This scheme combines blind signatures for privacy preservation with zero-knowledge proofs to ensure the verifier's honesty without revealing any additional information. Leveraging the quantum-resistant properties of CSIDH, a post-quantum secure scheme based on supersingular isogenies, our scheme offers strong protection against quantum adversaries while maintaining computational efficiency. We analyze the security of the introduced protocol in the standard cryptographic model and demonstrate its effectiveness in safeguarding privacy and verifier honesty. Furthermore, we present a performance evaluation, confirming the practical viability of this quantum-resistant cryptographic solution for privacy-preserving applications. This work advances the creation of secure, and scalable cryptographic systems for the post-quantum era.

Paperid: 2587, https://arxiv.org/pdf/2509.05883.pdf

Abstract:
Large Language Models (LLMs) have seen rapid adoption in recent years, with industries increasingly relying on them to maintain a competitive advantage. These models excel at interpreting user instructions and generating human-like responses, leading to their integration across diverse domains, including consulting and information retrieval. However, their widespread deployment also introduces substantial security risks, most notably in the form of prompt injection and jailbreak attacks. To systematically evaluate LLM vulnerabilities -- particularly to external prompt injection -- we conducted a series of experiments on eight commercial models. Each model was tested without supplementary sanitization, relying solely on its built-in safeguards. The results exposed exploitable weaknesses and emphasized the need for stronger security measures. Four categories of attacks were examined: direct injection, indirect (external) injection, image-based injection, and prompt leakage. Comparative analysis indicated that Claude 3 demonstrated relatively greater robustness; nevertheless, empirical findings confirm that additional defenses, such as input normalization, remain necessary to achieve reliable protection.

Paperid: 2588, https://arxiv.org/pdf/2509.05797.pdf

Abstract:
In the evolving landscape of future mobile networks, there is a critical need for secure and trustful communication modalities to support dynamic interactions among core network components of different network domains. This paper proposes the application of W3C-endorsed Decentralized Identifiers (DIDs) to establish secure and trustful communication channels among network functions in 5G and subsequent generations. A new communication agent is introduced that integrates seamlessly with 5G-standardized network functions and utilizes a DID-based application layer transport protocol to ensure confidentiality, integrity, and authenticity for cross-domain interactions. A comparative analysis of the two different versions of the DID-based communication protocol for inter network function communication reveals compatibility advantages of the latest protocol iteration. Furthermore, a comprehensive evaluation of the communication overhead caused by both protocol iterations compared to traditional TCP/TLS shows the benefits of using DIDs to improve communication security, albeit with performance loses compared to TCP/TLS. These results uncover the potential of DID-based communication for future mobile networks but also point out areas for optimization.

Paperid: 2589, https://arxiv.org/pdf/2509.05671.pdf

Abstract:
Human Activity Recognition (HAR) using multimodal sensor data remains challenging due to noisy or incomplete measurements, scarcity of labeled examples, and privacy concerns. Traditional centralized deep learning approaches are often constrained by infrastructure availability, network latency, and data sharing restrictions. While federated learning (FL) addresses privacy by training models locally and sharing only model parameters, it still has to tackle issues arising from the use of heterogeneous multimodal data and differential privacy requirements. In this article, a Graph-based Multimodal Federated Learning framework, GraMFedDHAR, is proposed for HAR tasks. Diverse sensor streams such as a pressure mat, depth camera, and multiple accelerometers are modeled as modality-specific graphs, processed through residual Graph Convolutional Neural Networks (GCNs), and fused via attention-based weighting rather than simple concatenation. The fused embeddings enable robust activity classification, while differential privacy safeguards data during federated aggregation. Experimental results show that the proposed MultiModalGCN model outperforms the baseline MultiModalFFN, with up to 2 percent higher accuracy in non-DP settings in both centralized and federated paradigms. More importantly, significant improvements are observed under differential privacy constraints: MultiModalGCN consistently surpasses MultiModalFFN, with performance gaps ranging from 7 to 13 percent depending on the privacy budget and setting. These results highlight the robustness of graph-based modeling in multimodal learning, where GNNs prove more resilient to the performance degradation introduced by DP noise.

Paperid: 2590, https://arxiv.org/pdf/2509.05608.pdf

Abstract:
The widespread deployment of LLMs across enterprise services has created a critical security blind spot. Organizations operate multiple LLM services handling billions of queries daily, yet regulatory compliance boundaries prevent these services from sharing threat intelligence about prompt injection attacks, the top security risk for LLMs. When an attack is detected in one service, the same threat may persist undetected in others for months, as privacy regulations prohibit sharing user prompts across compliance boundaries. We present BinaryShield, the first privacy-preserving threat intelligence system that enables secure sharing of attack fingerprints across compliance boundaries. BinaryShield transforms suspicious prompts through a unique pipeline combining PII redaction, semantic embedding, binary quantization, and randomized response mechanism to potentially generate non-invertible fingerprints that preserve attack patterns while providing privacy. Our evaluations demonstrate that BinaryShield achieves an F1-score of 0.94, significantly outperforming SimHash (0.77), the privacy-preserving baseline, while achieving 64x storage reduction and 38x faster similarity search compared to dense embeddings.

Paperid: 2591, https://arxiv.org/pdf/2509.05370.pdf

Abstract:
This study explores the application of quantum machine learning (QML) algorithms to enhance cybersecurity threat detection, particularly in the classification of malware and intrusion detection within high-dimensional datasets. Classical machine learning approaches encounter limitations when dealing with intricate, obfuscated malware patterns and extensive network intrusion data. To address these challenges, we implement and evaluate various QML algorithms, including Quantum Neural Networks (QNN), Quantum Support Vector Machines (QSVM), and hybrid Quantum Convolutional Neural Networks (QCNN) for malware detection tasks. Our experimental analysis utilized two datasets: the Intrusion dataset, comprising 150 samples with 56 memory-based features derived from Volatility framework analysis, and the ObfuscatedMalMem2022 dataset, containing 58,596 samples with 57 features representing benign and malicious software. Remarkably, our QML methods demonstrated superior performance compared to classical approaches, achieving accuracies of 95% for QNN and 94% for QSVM. These quantum-enhanced methods leveraged quantum superposition and entanglement principles to accurately identify complex patterns within highly obfuscated malware samples that were imperceptible to classical methods. To further advance malware analysis, we propose a novel real-time malware analysis framework that incorporates Quantum Feature Extraction using Quantum Fourier Transform, Quantum Feature Maps, and Classification using Variational Quantum Circuits. This system integrates explainable AI methods, including GradCAM++ and ScoreCAM algorithms, to provide interpretable insights into the quantum decision-making processes.

Paperid: 2592, https://arxiv.org/pdf/2509.05366.pdf

Abstract:
The increasing use of Internet of Things (IoT) devices has led to a rise in security related concerns regarding IoT Networks. The surveillance cameras in IoT networks are vulnerable to security threats such as brute force and zero-day attacks which can lead to unauthorized access by hackers and potential spying on the users activities. Moreover, these cameras can be targeted by Denial of Service (DOS) attacks, which will make it unavailable for the user. The proposed AI based framework will leverage machine learning algorithms to analyze network traffic and detect anomalous behavior, allowing for quick detection and response to potential intrusions. The framework will be trained and evaluated using real-world datasets to learn from past security incidents and improve its ability to detect potential intrusion.

Paperid: 2593, https://arxiv.org/pdf/2509.05331.pdf

Abstract:
The growing complexity of cyber incidents presents significant challenges for digital forensic investigators, especially in evidence collection and analysis. Public resources are still limited because of ethical, legal, and privacy concerns, even though realistic datasets are necessary to support research and tool developments. To address this gap, we introduce ForensicsData, an extensive Question-Context-Answer (Q-C-A) dataset sourced from actual malware analysis reports. It consists of more than 5,000 Q-C-A triplets. A unique workflow was used to create the dataset, which extracts structured data, uses large language models (LLMs) to transform it into Q-C-A format, and then uses a specialized evaluation process to confirm its quality. Among the models evaluated, Gemini 2 Flash demonstrated the best performance in aligning generated content with forensic terminology. ForensicsData aims to advance digital forensics by enabling reproducible experiments and fostering collaboration within the research community.

Paperid: 2594, https://arxiv.org/pdf/2509.05306.pdf

Abstract:
The scarcity of real-world attack data significantly hinders progress in cybersecurity research and education. Although honeypots like Cowrie effectively collect live threat intelligence, they generate overwhelming volumes of unstructured and heterogeneous logs, rendering manual analysis impractical. As a first step in our project on secure and efficient AI automation, this study explores the use of AI agents for automated log analysis. We present a lightweight and automated approach to process Cowrie honeypot logs. Our approach leverages AI agents to intelligently parse, summarize, and extract insights from raw data, while also considering the security implications of deploying such an autonomous system. Preliminary results demonstrate the pipeline's effectiveness in reducing manual effort and identifying attack patterns, paving the way for more advanced autonomous cybersecurity analysis in future work.

Paperid: 2595, https://arxiv.org/pdf/2509.05162.pdf

Abstract:
Federated Learning has rapidly expanded from its original inception to now have a large body of research, several frameworks, and sold in a variety of commercial offerings. Thus, its security and robustness is of significant importance. There are many algorithms that provide robustness in the case of malicious clients. However, the aggregator itself may behave maliciously, for example, by biasing the model or tampering with the weights to weaken the models privacy. In this work, we introduce a verifiable federated learning protocol that enables clients to verify the correctness of the aggregators computation without compromising the confidentiality of their updates. Our protocol uses a standard secure aggregation technique to protect individual model updates with a linearly homomorphic authenticator scheme that enables efficient, privacy-preserving verification of the aggregated result. Our construction ensures that clients can detect manipulation by the aggregator while maintaining low computational overhead. We demonstrate that our approach scales to large models, enabling verification over large neural networks with millions of parameters.

Paperid: 2596, https://arxiv.org/pdf/2509.05150.pdf

Abstract:
Live migration of applications is a fundamental capability for enabling resilient computing in modern distributed systems. However, extending this functionality to trusted applications (TA) -- executing within Trusted Execution Environments (TEEs) -- introduces unique challenges such as secure state preservation, integrity verification, replay and rollback prevention, and mitigation of unauthorized cloning of TAs. We present TALOS, a lightweight framework for verifiable state management and trustworthy application migration. While our implementation is prototyped and evaluated using Intel SGX with the Gramine LibOS and RISC-V Keystone (evidencing the framework's portability across diverse TEEs), its design is agnostic to the underlying TEE architecture. Such agility is a necessity in today's network service mesh (collaborative computing across the continuum) where application workloads must be managed across domain boundaries in a harmonized fashion. TALOS is built around the principle of minimizing trust assumptions: TAs are treated as untrusted until explicitly verified, and the migration process does not rely on a trusted third party. To ensure both the integrity and secure launch of the migrated application, TALOS integrates memory introspection and control-flow graph extraction, enabling robust verification of state continuity and execution flow. Thereby achieving strong security guarantees while maintaining efficiency, making it suitable for decentralized settings.

Paperid: 2597, https://arxiv.org/pdf/2509.04967.pdf

Abstract:
Binary-only fuzzing often struggles with achieving thorough code coverage and uncovering hidden vulnerabilities due to limited insight into a program's internal dataflows. Traditional grey-box fuzzers guide test case generation primarily using control flow edge coverage, which can overlook bugs not easily exposed through control flow analysis alone. We argue that integrating dataflow analysis into the fuzzing process can enhance its effectiveness by revealing how data propagates through the program, thereby enabling the exploration of execution paths that control flow-based methods might miss. In this context, we introduce FuzzRDUCC, a novel fuzzing framework that employs symbolic execution to reconstruct definition-use (def-use) chains directly from binary executables. FuzzRDUCC identifies crucial dataflow paths and exposes security vulnerabilities without incurring excessive computational overhead, due to a novel heuristic algorithm that selects relevant def-use chains without affecting the thoroughness of the fuzzing process. We evaluate FuzzRDUCC using the binutils benchmark and demonstrate that it can identify unique crashes not found by state-of-the-art fuzzers. Hence, establishing FuzzRDUCC as a feasible solution for next generation vulnerability detection and discovery mechanisms.

Paperid: 2598, https://arxiv.org/pdf/2509.04615.pdf

Abstract:
The proliferation of Large Language Models (LLMs) has introduced critical security challenges, where adversarial actors can manipulate input prompts to cause significant harm and circumvent safety alignments. These prompt-based attacks exploit vulnerabilities in a model's design, training, and contextual understanding, leading to intellectual property theft, misinformation generation, and erosion of user trust. A systematic understanding of these attack vectors is the foundational step toward developing robust countermeasures. This paper presents a comprehensive literature survey of prompt-based attack methodologies, categorizing them to provide a clear threat model. By detailing the mechanisms and impacts of these exploits, this survey aims to inform the research community's efforts in building the next generation of secure LLMs that are inherently resistant to unauthorized distillation, fine-tuning, and editing.

Paperid: 2599, https://arxiv.org/pdf/2509.04214.pdf

Abstract:
Machine learning (ML) models have the potential to transform military battlefields, presenting a large external pressure to rapidly incorporate them into operational settings. However, it is well-established that these ML models are vulnerable to a number of adversarial attacks throughout the model deployment pipeline that threaten to negate battlefield advantage. One broad category is privacy attacks (such as model inversion) where an adversary can reverse engineer information from the model, such as the sensitive data used in its training. The ability to quantify the risk of model inversion attacks (MIAs) is not well studied, and there is a lack of automated developmental test and evaluation (DT&E) tools and metrics to quantify the effectiveness of privacy loss of the MIA. The current DT&E process is difficult because ML model inversions can be hard for a human to interpret, subjective when they are interpretable, and difficult to quantify in terms of inversion quality. Additionally, scaling the DT&E process is challenging due to many ML model architectures and data modalities that need to be assessed. In this work, we present a novel DT&E tool that quantifies the risk of data privacy loss from MIAs and introduces four adversarial risk dimensions to quantify privacy loss. Our DT&E pipeline combines inversion with vision language models (VLMs) to improve effectiveness while enabling scalable analysis. We demonstrate effectiveness using multiple MIA techniques and VLMs configured for zero-shot classification and image captioning. We benchmark the pipeline using several state-of-the-art MIAs in the computer vision domain with an image classification task that is typical in military applications. In general, our innovative pipeline extends the current model inversion DT&E capabilities by improving the effectiveness and scalability of the privacy loss analysis in an automated fashion.

Paperid: 2600, https://arxiv.org/pdf/2509.04010.pdf

Abstract:
The PQDSS standardization process requires cryptographic primitives to be free from vulnerabilities, including timing and cache side-channels. Resistance to timing leakage is therefore an essential property, and achieving this typically relies on software implementations that follow constant-time principles. Moreover, ensuring that all implementations are constant-time is crucial for fair performance comparisons, as secure implementations often incur additional overhead. Such analysis also helps identify scheme proposals that are inherently difficult to implement in constant time. Because constant-time properties can be broken during compilation, it is often necessary to analyze the compiled binary directly. Since manual binary analysis is extremely challenging, automated analysis becomes highly important. Although several tools exist to assist with such analysis, they often have usability limitations and are difficult to set up correctly. To support the developers besides the NIST committee in verifying candidates, we developed a toolchain that automates configuration, execution, and result analysis for several widely used constant-time analysis tools. We selected TIMECOP and Binsec/Rel2 to verify constant-time policy compliance at the binary level, and dudect and RTLF to detect side-channel vulnerabilities through statistical analysis of execution time behavior. We demonstrate its effectiveness and practicability by evaluating the NIST PQDSS round 1 and round 2 implementations. We reported 26 issues in total to the respective developers, and 5 of them have already been fixed. We also discuss our different findings, as well as the benefits of shortcomings of the different tools.

Paperid: 2601, https://arxiv.org/pdf/2509.03879.pdf

Abstract:
Intel SGX and hypervisors isolate non-privileged programs from other software, ensuring confidentiality and integrity. However, side-channel attacks continue to threaten Intel SGX's security, enabling malicious OS to manipulate PTE present bits, induce page faults, and steal memory access traces. Despite extensive research, existing defenses focus on detection or rely on impractical solutions. This paper presents ShieldMMU, a comprehensive solution for mitigating controlled channel attacks, balancing compatibility, performance, and usability. Leveraging a Merkle Tree-inspired Defense Tree (DD-Tree), ShieldMMU protects PTE integrity by detecting, locating, and restoring attacked PTEs. It identifies MMU page table lookup events and side-channel attacks, promptly restoring PTE parameters to prevent page fault traps and ensure secure non-privileged application operation within SGX. Our experiments confirm ShieldMMU's enhanced security and acceptable latency performance.

Paperid: 2602, https://arxiv.org/pdf/2509.03350.pdf

Abstract:
Despite longstanding criticism from the privacy community, k-anonymity remains a widely used standard for data anonymization, mainly due to its simplicity, regulatory alignment, and preservation of data utility. However, non-experts often defend k-anonymity on the grounds that, in the absence of auxiliary information, no known attacks can compromise its protections. In this work, we refute this claim by introducing Combinatorial Refinement Attacks (CRA), a new class of privacy attacks targeting k-anonymized datasets produced using local recoding. This is the first method that does not rely on external auxiliary information or assumptions about the underlying data distribution. CRA leverages the utility-optimizing behavior of local recoding anonymization of ARX, which is a widely used open-source software for anonymizing data in clinical settings, to formulate a linear program that significantly reduces the space of plausible sensitive values. To validate our findings, we partnered with a network of free community health clinics, an environment where (1) auxiliary information is indeed hard to find due to the population they serve and (2) open-source k-anonymity solutions are attractive due to regulatory obligations and limited resources. Our results on real-world clinical microdata reveal that even in the absence of external information, established anonymization frameworks do not deliver the promised level of privacy, raising critical privacy concerns.

Paperid: 2603, https://arxiv.org/pdf/2509.03123.pdf

Abstract:
With the rapid adoption of Models-as-a-Service, concerns about data and model privacy have become increasingly critical. To solve these problems, various privacy-preserving inference schemes have been proposed. In particular, due to the efficiency and interpretability of decision trees, private decision tree evaluation (PDTE) has garnered significant attention. However, existing PDTE schemes suffer from significant limitations: their communication and computation costs scale with the number of trees, the number of nodes, or the tree depth, which makes them inefficient for large-scale models, especially over WAN networks. To address these issues, we propose Kangaroo, a private and amortized decision tree inference framework build upon packed homomorphic encryption. Specifically, we design a novel model hiding and encoding scheme, together with secure feature selection, oblivious comparison, and secure path evaluation protocols, enabling full amortization of the overhead as the number of nodes or trees scales. Furthermore, we enhance the performance and functionality of the framework through optimizations, including same-sharing-for-same-model, latency-aware, and adaptive encoding adjustment strategies. Kangaroo achieves a $14\times$ to $59\times$ performance improvement over state-of-the-art (SOTA) one-round interactive schemes in WAN environments. For large-scale decision tree inference tasks, it delivers a $3\times$ to $44\times$ speedup compared to existing schemes. Notably, Kangaroo enables the evaluation of a random forest with $969$ trees and $411825$ nodes in approximately $60$ ms per tree (amortized) under WAN environments.

Paperid: 2604, https://arxiv.org/pdf/2509.03024.pdf

Abstract:
In today's data-driven world, recommendation systems personalize user experiences across industries but rely on sensitive data, raising privacy concerns. Fully homomorphic encryption (FHE) can secure these systems, but a significant challenge in applying FHE to recommendation systems is efficiently handling the inherently large and sparse user-item rating matrices. FHE operations are computationally intensive, and naively processing various sparse matrices in recommendation systems would be prohibitively expensive. Additionally, the communication overhead between parties remains a critical concern in encrypted domains. We propose a novel approach combining Compressed Sparse Row (CSR) representation with FHE-based matrix factorization that efficiently handles matrix sparsity in the encrypted domain while minimizing communication costs. Our experimental results demonstrate high recommendation accuracy with encrypted data while achieving the lowest communication costs, effectively preserving user privacy.

Paperid: 2605, https://arxiv.org/pdf/2509.02899.pdf

Abstract:
Protected user-level libraries have been proposed as a way to allow mutually distrusting applications to safely share kernel-bypass services. In this paper, we identify and solve several previously unaddressed obstacles to realizing this design and identify several optimization opportunities. First, to preserve the kernel's ability to reclaim failed processes, protected library functions must complete in modest, bounded time. We show how to move unbounded waits outside the library itself, enabling synchronous interaction among processes without the need for polling. Second, we show how the bounded time requirement can be leveraged to achieve lower and more stable latency for inter-process interactions. Third, we observe that prior work on protected libraries is vulnerable to a buffer unmapping attack; we prevent this attack by preventing applications from removing pages that they share with the protected library. Fourth, we show how a trusted daemon can respond to asynchronous events and dynamically divide work with application threads in a protected library. By extending and improving the protected library model, our work provides a new way to structure OS services, combining the advantages of kernel bypass and microkernels. We present a set of safety and performance guidelines for developers of protected libraries, and a set of recommendations for developers of future protected library operating systems. We demonstrate the convenience and performance of our approach with a prototype version of the DDS communication service. To the best of our knowledge, this prototype represents the first successful sharing of a kernel-bypass NIC among mutually untrusting applications. Relative to the commercial FastDDS implementation, we achieve approximately 50\% lower latency and up to 7x throughput, with lower CPU utilization.

Paperid: 2606, https://arxiv.org/pdf/2509.02412.pdf

Abstract:
Due to the event driven nature and the versatility of GUI designs in Android programs, it is challenging to generate event sequences with adequate code coverage within a reasonable time. A common approach to handle this issue is to rely on GUI models to generate event sequences. These sequences can be effective in covering GUI states, but inconsistent in exposing program behaviors that require specific inputs. A major obstacle to generate such specific inputs is the lack of a systematic GUI exploration process to accommodate the analysis requirements. In this paper, we introduce Android Path Explorer (APEX), a systematic input generation framework using concolic execution. APEX addresses the limitations of model-based sequence generation by using concolic execution to discover the data dependencies of GUI state transitions. Moreover, concolic execution is also used to prioritize events during the exploration of GUI, which leads to a more robust model and accurate input generation. The key novelty of APEX is that concolic execution is not only used to construct event sequences, but also used to traverse the GUI more systematically. As such, our experimental results show that APEX can be used to generate a set of event sequences that achieve high code coverage, as well as event sequences that reach specific targets.

Paperid: 2607, https://arxiv.org/pdf/2509.02189.pdf

Abstract:
Blind signatures were first introduced by David Chaum. They allow a user to have a message signed by a signer without revealing the message itself. This property is particularly useful in applications such as electronic voting and digital cash, where user anonymity is important. In a blind signature scheme, the user blinds their message before sending it to the signer, who signs the blinded message. The user then unblinds the signed message to obtain a valid signature that can be verified publicly, ensuring that the signer cannot trace the signed message back to the original unblinded version. A good analogy is placing the message inside an envelope and having the envelope signed. Once the envelope is opened, the signature remains valid for the enclosed message, ensuring that the content remains confidential. Such constructions provide anonymity and privacy to the user but given a practical quantum computer, the security of traditional crypto-systems providing such features will be broken. To address this, the development of quantum-resistant cryptographic protocols is essential for maintaining the security of digital transactions and data. Aligning with the same goal, this work aims to thoroughly review the background of lattice-based blind signatures. We start with the foundations of digital signatures in the classical settings and then move on to lattice-based constructions.

Paperid: 2608, https://arxiv.org/pdf/2509.01945.pdf

Abstract:
Statistical witness indistinguishability is a relaxation of statistical zero-knowledge which guarantees that the transcript of an interactive proof reveals no information about which valid witness the prover used to generate it. In this paper we define and initiate the study of QSWI, the class of problems with quantum statistically witness indistinguishable proofs. Using inherently quantum techniques from Kobayashi (TCC 2008), we prove that any problem with an honest-verifier quantum statistically witness indistinguishable proof has a 3-message public-coin malicious-verifier quantum statistically witness indistinguishable proof. There is no known analogue of this result for classical statistical witness indistinguishability. As a corollary, our result implies SWI is contained in QSWI. Additionally, we extend the work of Bitansky et al. (STOC 2023) to show that quantum batch proofs imply quantum statistically witness indistinguishable proofs with inverse-polynomial witness indistinguishability error.

Paperid: 2609, https://arxiv.org/pdf/2509.01470.pdf

Abstract:
As 5G networks gain traction in defense applications, ensuring the privacy and integrity of the Authentication and Key Agreement (AKA) protocol is critical. While 5G AKA improves upon previous generations by concealing subscriber identities, it remains vulnerable to replay-based synchronization and linkability threats under realistic adversary models. This paper provides a unified analysis of the standardized 5G AKA flow, identifying several vulnerabilities and highlighting how each exploits protocol behavior to compromise user privacy. To address these risks, we present five lightweight mitigation strategies. We demonstrate through prototype implementation and testing that these enhancements strengthen resilience against linkability attacks with minimal computational and signaling overhead. Among the solutions studied, those introducing a UE-generated nonce emerge as the most promising, effectively neutralizing the identified tracking and correlation attacks with negligible additional overhead. Integrating this extension as an optional feature to the standard 5G AKA protocol offers a backward-compatible, low-overhead path toward a more privacy-preserving authentication framework for both commercial and military 5G deployments.

Paperid: 2610, https://arxiv.org/pdf/2509.01046.pdf

Abstract:
Website fingerprinting (WF) attacks remain a significant threat to encrypted traffic, prompting the development of a wide range of defenses. Among these, two prominent classes are regularization-based defenses, which shape traffic using fixed padding rules, and supersequence-based approaches, which conceal traces among predefined patterns. In this work, we present a unified framework for designing an adaptive WF defense that combines the effectiveness of regularization with the provable security of supersequence-style grouping. The scheme first extracts behavioural patterns from traces and clusters them into (k,l)-diverse anonymity sets; an early-time-series classifier (adapted from ECDIRE) then switches from a conservative global set of regularization parameters to the lighter, set-specific parameters. We instantiate the design as Adaptive Tamaraw, a variant of Tamaraw that assigns padding parameters on a per-cluster basis while retaining its original information-theoretic guarantee. Comprehensive experiments on public real-world datasets confirm the benefits. By tuning k, operators can trade privacy for efficiency: in its high-privacy mode Adaptive Tamaraw pushes the bound on any attacker's accuracy below 30%, whereas in efficiency-centred settings it cuts total overhead by 99% compared with classic Tamaraw.

Paperid: 2611, https://arxiv.org/pdf/2509.00896.pdf

Abstract:
In todays rapidly evolving digital landscape, safeguarding network infrastructures against cyberattacks has become a critical priority. This research presents an innovative AI-driven real-time intrusion detection framework designed to enhance network security, particularly in Wireless Sensor Networks (WSNs), Cloud Computing (CC), and Internet of Things (IoT) environments. The system employs classical machine learning models, Logistic Regression, decision trees, and K-Nearest Neighbors, optimized through the novel Energy Valley Optimization (EVO) method using the NSL-KDD dataset. Feature selection significantly reduced the number of input features from 42 to 18, while maintaining strong detection capabilities. The proposed system achieved 98.95 percent. accuracy with Decision Tree, 98.47 percent with K-Nearest Neighbors, and 88.84 percent with Logistic Regression. Moreover, high precision, recall, and F1-scores were attained across all classifiers while substantially reducing training and testing times, making the framework highly suitable for real-time applications. To ensure fair detection across diverse attack types, dataset balancing via Downsampling was applied to address class imbalance challenges. This investigation focuses on the significance of advancing IDSs. in cloud computing and WSNs. Overall, this work advances secure communications by delivering a scalable, low-latency, and high-accuracy intrusion detection solution aligned with the latest trends in artificial intelligence, cybersecurity, and real-time digital networks.

Paperid: 2612, https://arxiv.org/pdf/2509.00706.pdf

Abstract:
Although encryption protocols such as TLS are widely de-ployed,side-channel metadata in encrypted traffic still reveals patterns that allow application and behavior inference.How-ever,existing fine-grained fingerprinting approaches face two key limitations:(i)reliance on platform-dependent charac-teristics,which restricts generalization across heterogeneous platforms,and(ii)poor scalability for fine-grained behavior identification in open-world settings. In this paper,we present X-PRINT,the first server-centric,URI-based framework for cross-platform fine-grained encrypted-traffic fingerprinting.X-PRINT systematically demonstrates that backend URI invocation patterns can serve as platform-agnostic invariants and are effective for mod-eling fine-grained behaviors.To achieve robust identifica-tion,X-PRINT further leverages temporally structured URI maps for behavior inference and emphasizes the exclusion of platform-or application-specific private URIs to handle unseen cases,thereby improving reliability in open-world and cross-platform settings.Extensive experiments across diverse cross-platform and open-world settings show that X-PRINT achieves state-of-the-art accuracy in fine-grained fingerprint-ing and exhibits strong scalability and robustness.

Paperid: 2613, https://arxiv.org/pdf/2509.00540.pdf

Abstract:
In federated learning, participants' uploaded model updates cannot be directly verified, leaving the system vulnerable to malicious attacks. Existing attack strategies have adversaries upload tampered model updates to degrade the global model's performance. However, attackers also degrade their own private models, gaining no advantage. In real-world scenarios, attackers are driven by self-centered motives: their goal is to gain a competitive advantage by developing a model that outperforms those of other participants, not merely to cause disruption. In this paper, we study a novel Self-Centered Federated Learning (SCFL) attack paradigm, in which attackers not only degrade the performance of the global model through attacks but also enhance their own models within the federated learning process. We propose a framework named FedThief, which degrades the performance of the global model by uploading modified content during the upload stage. At the same time, it enhances the private model's performance through divergence-aware ensemble techniques, where "divergence" quantifies the deviation between private and global models, that integrate global updates and local knowledge. Extensive experiments show that our method effectively degrades the global model performance while allowing the attacker to obtain an ensemble model that significantly outperforms the global model.

Paperid: 2614, https://arxiv.org/pdf/2509.00059.pdf

Abstract:
The use of ASCII art as a novel approach to masking sensitive information in cybercrime, focusing on its potential role in protecting personal data during the delivery process and beyond, is presented. By examining the unique properties of ASCII art and its historical context, this study discusses the advantages and limitations of employing this technique in various cybercrime scenarios. Additionally, providing recommendations for enhancing data security practices and fostering a culture of privacy awareness in both businesses and individuals. The findings suggest that ASCII art, with its simplicity and ambiguity, can serve as an effective tool against cybercriminals, emphasizing the need for robust data security measures and increased privacy awareness in today's interconnected world.

Paperid: 2615, https://arxiv.org/pdf/2508.20962.pdf

Abstract:
Trusted Execution Environments (TEEs) have emerged as a cornerstone of confidential computing, garnering significant attention from both academia and industry. To enable the secure development, execution, and deployment, of applications on TEE platforms, TEE containers have been introduced as middleware solutions. These containers aim to shield applications from potentially malicious operating systems and orchestration interfaces while maintaining usability and reliability. In this paper, we analyze the isolation strategies employed by existing TEE containers to protect secure applications. To address the challenges in analyzing these interfaces, we designed an automated analyzer to precisely identify and evaluate their isolation boundaries. We observed that some TEE containers fail to achieve their intended goals due to critical design and implementation flaws, such as information leakage, rollback attacks, denial-of-service, and Iago attacks, which pose significant security risks. Drawing from our findings, we share key lessons to guide the development of more secure container solutions and discuss emerging trends in TEE containerization design.

Paperid: 2616, https://arxiv.org/pdf/2508.20866.pdf

Abstract:
The increasing complexity of software systems and the sophistication of cyber-attacks have underscored the critical need for effective automated vulnerability detection and repair systems. Traditional methods, such as static program analysis, face significant challenges related to scalability, adaptability, and high false-positive and false-negative rates. AI-driven approaches, particularly those using machine learning and deep learning models, show promise but are heavily reliant on the quality and quantity of training data. This paper introduces a novel framework designed to automatically introduce realistic, category-specific vulnerabilities into secure C/C++ codebases to generate datasets. The proposed approach coordinates multiple AI agents that simulate expert reasoning, along with function agents and traditional code analysis tools. It leverages Retrieval-Augmented Generation for contextual grounding and employs Low-Rank approximation of weights for efficient model fine-tuning. Our experimental study on 116 code samples from three different benchmarks suggests that our approach outperforms other techniques with regard to dataset accuracy, achieving between 89\% and 95\% success rates in injecting vulnerabilities at function level.

Paperid: 2617, https://arxiv.org/pdf/2508.20816.pdf

Abstract:
AI-powered development platforms are making software creation accessible to a broader audience, but this democratization has triggered a scalability crisis in security auditing. With studies showing that up to 40% of AI-generated code contains vulnerabilities, the pace of development now vastly outstrips the capacity for thorough security assessment. We present MAPTA, a multi-agent system for autonomous web application security assessment that combines large language model orchestration with tool-grounded execution and end-to-end exploit validation. On the 104-challenge XBOW benchmark, MAPTA achieves 76.9% overall success with perfect performance on SSRF and misconfiguration vulnerabilities, 83% success on broken authorization, and strong results on injection attacks including server-side template injection (85%) and SQL injection (83%). Cross-site scripting (57%) and blind SQL injection (0%) remain challenging. Our comprehensive cost analysis across all challenges totals $21.38 with a median cost of $0.073 for successful attempts versus $0.357 for failures. Success correlates strongly with resource efficiency, enabling practical early-stopping thresholds at approximately 40 tool calls or $0.30 per challenge. MAPTA's real-world findings are impactful given both the popularity of the respective scanned GitHub repositories (8K-70K stars) and MAPTA's low average operating cost of $3.67 per open-source assessment: MAPTA discovered critical vulnerabilities including RCEs, command injections, secret exposure, and arbitrary file write vulnerabilities. Findings are responsibly disclosed, 10 findings are under CVE review.

Paperid: 2618, https://arxiv.org/pdf/2508.20411.pdf

Abstract:
As AI rapidly advances, the security risks posed by AI are becoming increasingly severe, especially in critical scenarios, including those posing existential risks. If AI becomes uncontrollable, manipulated, or actively evades safety mechanisms, it could trigger systemic disasters. Existing AI safety approaches-such as model enhancement, value alignment, and human intervention-suffer from fundamental, in-principle limitations when facing AI with extreme motivations and unlimited intelligence, and cannot guarantee security. To address this challenge, we propose a Governable AI (GAI) framework that shifts from traditional internal constraints to externally enforced structural compliance based on cryptographic mechanisms that are computationally infeasible to break, even for future AI, under the defined threat model and well-established cryptographic assumptions.The GAI framework is composed of a simple yet reliable, fully deterministic, powerful, flexible, and general-purpose rule enforcement module (REM); governance rules; and a governable secure super-platform (GSSP) that offers end-to-end protection against compromise or subversion by AI. The decoupling of the governance rules and the technical platform further enables a feasible and generalizable technical pathway for the safety governance of AI. REM enforces the bottom line defined by governance rules, while GSSP ensures non-bypassability, tamper-resistance, and unforgeability to eliminate all identified attack vectors. This paper also presents a rigorous formal proof of the security properties of this mechanism and demonstrates its effectiveness through a prototype implementation evaluated in representative high-stakes scenarios.

Paperid: 2619, https://arxiv.org/pdf/2508.18811.pdf

Abstract:
An efficient technique of computing on encrypted data allows a client with limited capability to perform complex operations on a remote fault-tolerant server without leaking anything about the input or output. Quantum computing provides information-theoretic security to solve such a problem, and many such techniques have been proposed under the premises of half-blind quantum computation. However, they are dependent on a fixed non-parametric resource set that comprises some universal combination of $H,S,T,CX, CZ$ or $CCX$ gates. In this study, we show that recursive decryption of the parametric gate, $R_z(Î¸)$, is possible exactly when $Î¸=\pmÏ/2^m$ for $m\in \mathbb{Z^{+}}$, and approximately with arbitrary precision $Îµ$ for given $Î¸$. We also show that a blind algorithm based on such a technique needs at most $O(\log_2^2(Ï/Îµ))$ computation steps and communication rounds, while the techniques based on a non-parametric resource set require $O(\ln^{3.97}(1/Îµ))$ rounds. We use these results to propose a universal scheme of half-blind quantum computation for computing on encrypted data using arbitrary rotation gates. This substantial reduction in the depth of blind circuit is an affirmative step towards the practical application of such techniques in secure NISQ-era computing.

Paperid: 2620, https://arxiv.org/pdf/2508.17809.pdf

Abstract:
Logic locking remains one of the most promising defenses against hardware piracy, yet current approaches often face challenges in scalability and design overhead. In this paper, we present TLGLock, a new design paradigm that leverages the structural expressiveness of Threshold Logic Gates (TLGs) and the energy efficiency of charge recycling to enforce key-dependent functionality at the gate level. By embedding the key into the gate's weighted logic and utilizing dynamic charge sharing, TLGLock provides a stateless and compact alternative to conventional locking techniques. We implement a complete synthesis-to-locking flow and evaluate it using ISCAS, ITC, and MCNC benchmarks. Results show that TLGLock achieves up to 30% area, 50% delay, and 20% power savings compared to latch-based locking schemes. In comparison with XOR and SFLL-HD methods, TLGLock offers up to 3x higher SAT attack resistance with significantly lower overhead. Furthermore, randomized key-weight experiments demonstrate that TLGLock can reach up to 100% output corruption under incorrect keys, enabling tunable security at minimal cost. These results position TLGLock as a scalable and resilient solution for secure hardware design.

Paperid: 2621, https://arxiv.org/pdf/2508.17361.pdf

Abstract:
Large Language Models (LLMs) are increasingly trusted to perform automated code review and static analysis at scale, supporting tasks such as vulnerability detection, summarization, and refactoring. In this paper, we identify and exploit a critical vulnerability in LLM-based code analysis: an abstraction bias that causes models to overgeneralize familiar programming patterns and overlook small, meaningful bugs. Adversaries can exploit this blind spot to hijack the control flow of the LLM's interpretation with minimal edits and without affecting actual runtime behavior. We refer to this attack as a Familiar Pattern Attack (FPA). We develop a fully automated, black-box algorithm that discovers and injects FPAs into target code. Our evaluation shows that FPAs are not only effective, but also transferable across models (GPT-4o, Claude 3.5, Gemini 2.0) and universal across programming languages (Python, C, Rust, Go). Moreover, FPAs remain effective even when models are explicitly warned about the attack via robust system prompts. Finally, we explore positive, defensive uses of FPAs and discuss their broader implications for the reliability and safety of code-oriented LLMs.

Paperid: 2622, https://arxiv.org/pdf/2508.17304.pdf

Abstract:
Trust computation is crucial for ensuring the security of the Internet of Things (IoT). However, current trust-based mechanisms for IoT have limitations that impact data security. Sliding window-based trust schemes cannot ensure reliable trust computation due to their inability to select appropriate window lengths. Besides, recent trust scores are emphasized when considering the effect of time on trust. This can cause a sudden change in overall trust score based on recent behavior, potentially misinterpreting an honest service provider as malicious and vice versa. Moreover, clustering mechanisms used to filter recommendations in trust computation often lead to slower results. In this paper, we propose a robust trust model to address these limitations. The proposed approach determines the window length dynamically to guarantee accurate trust computation. It uses the harmonic mean of average trust score and time to prevent sudden fluctuations in trust scores. Additionally, an efficient personalized subspace clustering algorithm is used to exclude recommendations. We present a security analysis demonstrating the resiliency of the proposed scheme against bad-mouthing, ballot-stuffing, and on-off attacks. The proposed scheme demonstrates a competitive performance in detecting bad-mouthing attacks, while outperforming existing works with an approximately 44% improvement in accuracy for detecting on-off attacks. It maintains its effectiveness even when the percentage of on-off attackers increases and in scenarios where multiple attacks occur simultaneously. Additionally, the proposed scheme reduces the recommendation filtering time by 95%.

Paperid: 2623, https://arxiv.org/pdf/2508.17296.pdf

Abstract:
With the advent of quantum computing, cryptocurrencies that rely on blockchain technology face mounting cryptographic vulnerabilities. This paper presents a comprehensive literature review evaluating how quantum algorithms, specifically Shors and Grovers, could disrupt the foundational security mechanisms of cryptocurrencies. Shors algorithm poses a threat to public-key cryptographic schemes by enabling efficient factorization and discrete logarithm solving, thereby endangering digital signature systems. Grovers algorithm undermines hash-based functions, increasing the feasibility of fifty one percent attacks and hash collisions. By examining the internal mechanisms of major cryptocurrencies such as Bitcoin, Ethereum, Litecoin, Monero, and Zcash, this review identifies specific vulnerabilities in transaction and consensus processes. It further analyses the current hardware limitations of quantum systems and estimates when such attacks could become feasible. In anticipation, it investigates countermeasures including Post-Quantum Cryptography (PQC), Quantum Key Distribution (QKD), and protocol-level modifications such as memory-intensive proof-of-work algorithms and multi-signature schemes. The discussion integrates recent advancements in quantum error correction, hardware scalability, and NIST-standardized cryptographic algorithms. This review concludes that while quantum computers are not yet advanced enough to pose an immediate threat, proactive integration of quantum-resistant solutions is essential. The findings underscore the urgent need for cryptocurrencies to adopt post-quantum cryptographic standards to preserve the decentralized trust, integrity, and security that define blockchain-based digital cryptocurrencies.

Paperid: 2624, https://arxiv.org/pdf/2508.16405.pdf

Abstract:
Hardware-based security primitives have become critical to enhancing information security in the Internet of Things (IoT) era. Physical unclonable functions (PUFs) utilize the inherent variations in the manufacturing process to generate cryptographic keys unique to a device. Reconfigurable PUFs (rPUFs) can update cryptographic keys for enhanced security in dynamic operational scenarios involving huge amounts of data, which makes them suitable for implementation in CMOS-integrated spin-orbit torque magnetic random access memory (SOT-MRAM) chips. However, a key challenge is achieving real-time reconfiguration independent of the environmental conditions, particularly the operating temperature. We propose a dual-pulse reconfiguration strategy for rPUFs in CMOS-integrated SOT-MRAM chips that effectively widens the operating window and achieves resilience across a wide range of operating temperatures without the need for dynamic feedback that overly complicates circuit design. The proposed strategy lays a solid foundation for the next generation of hardware-based security primitives to protect IoT architectures.

Paperid: 2625, https://arxiv.org/pdf/2508.15042.pdf

Abstract:
In recent years, machine learning has demonstrated impressive results in various fields, including software vulnerability detection. Nonetheless, using machine learning to identify software vulnerabilities presents new challenges, especially regarding the scale of data involved, which was not a factor in traditional methods. Consequently, in spite of the rise of new machine-learning-based approaches in that space, important shortcomings persist regarding their evaluation. First, researchers often fail to provide concrete statistics about their training datasets, such as the number of samples for each type of vulnerability. Moreover, many methods rely on training with semantically similar functions rather than directly on vulnerable programs. This leads to uncertainty about the suitability of the datasets currently used for training. Secondly, the choice of a model and the level of granularity at which models are trained also affect the effectiveness of such vulnerability discovery approaches. In this paper, we explore the challenges of applying machine learning to vulnerability discovery. We also share insights from our two previous research papers, Bin2vec and BinHunter, which could enhance future research in this field.

Paperid: 2626, https://arxiv.org/pdf/2508.13644.pdf

Abstract:
Accurately assessing software vulnerabilities is essential for effective prioritization and remediation. While various scoring systems exist to support this task, their differing goals, methodologies and outputs often lead to inconsistent prioritization decisions. This work provides the first large-scale, outcome-linked empirical comparison of four publicly available vulnerability scoring systems: the Common Vulnerability Scoring System (CVSS), the Stakeholder-Specific Vulnerability Categorization (SSVC), the Exploit Prediction Scoring System (EPSS), and the Exploitability Index. We use a dataset of 600 real-world vulnerabilities derived from four months of Microsoft's Patch Tuesday disclosures to investigate the relationships between these scores, evaluate how they support vulnerability management task, how these scores categorize vulnerabilities across triage tiers, and assess their ability to capture the real-world exploitation risk. Our findings reveal significant disparities in how scoring systems rank the same vulnerabilities, with implications for organizations relying on these metrics to make data-driven, risk-based decisions. We provide insights into the alignment and divergence of these systems, highlighting the need for more transparent and consistent exploitability, risk, and severity assessments.

Paperid: 2627, https://arxiv.org/pdf/2508.13364.pdf

Abstract:
Intrusion Tolerant Systems (ITSs) have become increasingly critical due to the rise of multi-domain adversaries exploiting diverse attack surfaces. ITS architectures aim to tolerate intrusions, ensuring system compromise is prevented or mitigated even with adversary presence. Existing ITS solutions often employ Risk Managers leveraging public security intelligence to adjust system defenses dynamically against emerging threats. However, these approaches rely heavily on databases like NVD and ExploitDB, which require manual analysis for newly discovered vulnerabilities. This dependency limits the system's responsiveness to rapidly evolving threats. HAL 9000, an ITS Risk Manager introduced in our prior work, addressed these challenges through machine learning. By analyzing descriptions of known vulnerabilities, HAL 9000 predicts and assesses new vulnerabilities automatically. To calculate the risk of a system, it also incorporates the Exploitability Probability Scoring system to estimate the likelihood of exploitation within 30 days, enhancing proactive defense capabilities. Despite its success, HAL 9000's reliance on NVD and ExploitDB knowledge is a limitation, considering the availability of other sources of information. This extended work introduces a custom-built scraper that continuously mines diverse threat sources, including security advisories, research forums, and real-time exploit proofs-of-concept. This significantly expands HAL 9000's intelligence base, enabling earlier detection and assessment of unverified vulnerabilities. Our evaluation demonstrates that integrating scraper-derived intelligence with HAL 9000's risk management framework substantially improves its ability to address emerging threats. This paper details the scraper's integration into the architecture, its role in providing additional information on new threats, and the effects on HAL 9000's management.

Paperid: 2628, https://arxiv.org/pdf/2508.13092.pdf

Abstract:
Timely detection of hardware vulnerabilities during the early design stage is critical for reducing remediation costs. Existing early detection techniques often require specialized security expertise, limiting their usability. Recent efforts have explored the use of large language models (LLMs) for Verilog vulnerability detection. However, LLMs struggle to capture the structure in Verilog code, resulting in inconsistent detection results. To this end, we propose VerilogLAVD, the first LLM-aided graph traversal rule generation approach for Verilog vulnerability detection. Our approach introduces the Verilog Property Graph (VeriPG), a unified representation of Verilog code. It combines syntactic features extracted from the abstract syntax tree (AST) with semantic information derived from control flow and data dependency graphs. We leverage LLMs to generate VeriPG-based detection rules from Common Weakness Enumeration (CWE) descriptions. These rules guide the rule executor that traversal VeriPG for potential vulnerabilities. To evaluate VerilogLAVD, we build a dataset collected from open-source repositories and synthesized data. In our empirical evaluation on 77 Verilog designs encompassing 12 CWE types, VerilogLAVD achieves an F1-score of 0.54. Compared to the LLM-only and LLM with external knowledge baselines, VerilogLAVD improves F1-score by 0.31 and 0.27, respectively.

Paperid: 2629, https://arxiv.org/pdf/2508.13033.pdf

Abstract:
The rapid adoption of chiplet-based heterogeneous integration is reshaping semiconductor design by enabling modular, scalable, and faster time-to-market solutions for AI and high-performance computing. However, multi-vendor assembly in post-fabrication environments fragments the supply chain and exposes SiP systems to serious security threats, including cloning, overproduction, and chiplet substitution. Existing authentication solutions depend on trusted integrators or centralized security anchors, which can expose sensitive data or create single points of failure. We introduce AuthenTree, a distributed authentication framework that leverages multi-party computation (MPC) in a scalable tree-based architecture, removing the need for dedicated security hardware or centralized trust. AuthenTree enables secure chiplet validation without revealing raw signatures, distributing trust across multiple integrator chiplets. Our evaluation in five SiP benchmarks demonstrates that AuthenTree imposes minimal overhead, with an area as low as 0.48% (7,000 sq-micrometers), an overhead power under 0.5%, and an authentication latency below 1 microsecond, surpassing previous work in some cases by 700 times. These results establish AuthenTree as an efficient, robust, and scalable solution for next-generation chiplet-based security in zero-trust SiP environments.

Paperid: 2630, https://arxiv.org/pdf/2508.12910.pdf

Abstract:
Finite State Machines (FSMs) play a critical role in implementing control logic for Systems-on-Chip (SoC). Traditionally, FSMs are implemented by hardware engineers through Verilog coding, which is often tedious and time-consuming. Recently, with the remarkable progress of Large Language Models (LLMs) in code generation, LLMs have been increasingly explored for automating Verilog code generation. However, LLM-generated Verilog code often suffers from security vulnerabilities, which is particularly concerning for security-sensitive FSM implementations. To address this issue, we propose SecFSM, a novel method that leverages a security-oriented knowledge graph to guide LLMs in generating more secure Verilog code. Specifically, we first construct a FSM Security Knowledge Graph (FSKG) as an external aid to LLMs. Subsequently, we analyze users' requirements to identify vulnerabilities and get a list of vulnerabilities in the requirements. Then, we retrieve knowledge from FSKG based on the vulnerabilities list. Finally, we construct security prompts based on the security knowledge for Verilog code generation. To evaluate SecFSM, we build a dedicated dataset collected from academic datasets, artificial datasets, papers, and industrial cases. Extensive experiments demonstrate that SecFSM outperforms state-of-the-art baselines. In particular, on a benchmark of 25 security test cases evaluated by DeepSeek-R1, SecFSM achieves an outstanding pass rate of 21/25.

Paperid: 2631, https://arxiv.org/pdf/2508.12538.pdf

Abstract:
The Model Context Protocol (MCP) has emerged as a universal standard that enables AI agents to seamlessly connect with external tools, significantly enhancing their functionality. However, while MCP brings notable benefits, it also introduces significant vulnerabilities, such as Tool Poisoning Attacks (TPA), where hidden malicious instructions exploit the sycophancy of large language models (LLMs) to manipulate agent behavior. Despite these risks, current academic research on MCP security remains limited, with most studies focusing on narrow or qualitative analyses that fail to capture the diversity of real-world threats. To address this gap, we present the MCP Attack Library (MCPLIB), which categorizes and implements 31 distinct attack methods under four key classifications: direct tool injection, indirect tool injection, malicious user attacks, and LLM inherent attack. We further conduct a quantitative analysis of the efficacy of each attack. Our experiments reveal key insights into MCP vulnerabilities, including agents' blind reliance on tool descriptions, sensitivity to file-based attacks, chain attacks exploiting shared context, and difficulty distinguishing external data from executable commands. These insights, validated through attack experiments, underscore the urgency for robust defense strategies and informed MCP design. Our contributions include 1) constructing a comprehensive MCP attack taxonomy, 2) introducing a unified attack framework MCPLIB, and 3) conducting empirical vulnerability analysis to enhance MCP security mechanisms. This work provides a foundational framework, supporting the secure evolution of MCP ecosystems.

Paperid: 2632, https://arxiv.org/pdf/2508.12175.pdf

Abstract:
The growing integration of LLMs into applications has introduced new security risks, notably known as Promptware - maliciously engineered prompts designed to manipulate LLMs to compromise the CIA triad of these applications. While prior research warned about a potential shift in the threat landscape for LLM-powered applications, the risk posed by Promptware is frequently perceived as low. In this paper, we investigate the risk Promptware poses to users of Gemini-powered assistants (web application, mobile application, and Google Assistant). We propose a novel Threat Analysis and Risk Assessment (TARA) framework to assess Promptware risks for end users. Our analysis focuses on a new variant of Promptware called Targeted Promptware Attacks, which leverage indirect prompt injection via common user interactions such as emails, calendar invitations, and shared documents. We demonstrate 14 attack scenarios applied against Gemini-powered assistants across five identified threat classes: Short-term Context Poisoning, Permanent Memory Poisoning, Tool Misuse, Automatic Agent Invocation, and Automatic App Invocation. These attacks highlight both digital and physical consequences, including spamming, phishing, disinformation campaigns, data exfiltration, unapproved user video streaming, and control of home automation devices. We reveal Promptware's potential for on-device lateral movement, escaping the boundaries of the LLM-powered application, to trigger malicious actions using a device's applications. Our TARA reveals that 73% of the analyzed threats pose High-Critical risk to end users. We discuss mitigations and reassess the risk (in response to deployed mitigations) and show that the risk could be reduced significantly to Very Low-Medium. We disclosed our findings to Google, which deployed dedicated mitigations.

Paperid: 2633, https://arxiv.org/pdf/2508.10652.pdf

Abstract:
Deep learning models are one of the security strategies, trained on extensive datasets, and play a critical role in detecting and responding to these threats by recognizing complex patterns in malicious code. However, the opaque nature of these models-often described as "black boxes"-makes their decision-making processes difficult to understand, even for their creators. This research addresses these challenges by integrating Explainable AI (XAI) techniques to enhance the interpretability and trustworthiness of malware detection models. In this research, the use of Multi-Layer Perceptrons (MLP) for dynamic malware analysis has been considered, a less explored area, and its efficacy in detecting Metamorphic Malware, and further the effectiveness and transparency of MLPs, CNNs, RNNs, and CNN-LSTM models in malware classification, evaluating these models through the lens of Explainable AI (XAI). This comprehensive approach aims to demystify the internal workings of deep learning models, promoting a better understanding and trust in their predictive capabilities in cybersecurity contexts. Such in-depth analysis and implementation haven't been done to the best of our knowledge.

Paperid: 2634, https://arxiv.org/pdf/2508.10636.pdf

Abstract:
DDoS attacks have become a major threat to the security of IoT devices and can cause severe damage to the network infrastructure. IoT devices suffer from the inherent problem of resource constraints and are therefore susceptible to such resource-exhausting attacks. Traditional methods for detecting DDoS attacks are not efficient enough to cope with the dynamic nature of IoT networks, as well as the scalability of the attacks, diversity of protocols, high volume of traffic, and variability in device behavior, and variability of protocols like MQTT, CoAP, making it hard to implement security across all the protocols. In this paper, we propose a novel approach, i.e., the use of Transformer models, which have shown remarkable performance in natural language processing tasks, for detecting DDoS attacks on IoT devices. The proposed model extracts features from network traffic data and processes them using a self-attention mechanism. Experiments conducted on a real-world dataset demonstrate that the proposed approach outperforms traditional machine learning techniques, which can be validated by comparing both approaches' accuracy, precision, recall, and F1-score. The results of this study show that the Transformer models can be an effective solution for detecting DDoS attacks on IoT devices and have the potential to be deployed in real-world IoT environments.

Paperid: 2635, https://arxiv.org/pdf/2508.10431.pdf

Abstract:
Recent work presented at USENIX Security 2025 (SEC'25) claims that occupancy-based attacks can recover AES keys from the MIRAGE randomized cache. In this paper, we examine these claims and find that they arise from a modeling flaw in the SEC'25 paper. Most critically, the SEC'25 paper's simulation of MIRAGE uses a constant seed to initialize the random number generator used for global evictions in MIRAGE, causing every AES encryption they trace to evict the same deterministic sequence of cache lines. This artificially creates a highly repeatable timing pattern that is not representative of a realistic implementation of MIRAGE, where eviction sequences vary randomly between encryptions. When we instead randomize the eviction seed for each run, reflecting realistic operation, the correlation between AES T-table accesses and attacker runtimes disappears, and the attack fails. These findings show that the reported leakage is an artifact of incorrect modeling, and not an actual vulnerability in MIRAGE.

Paperid: 2636, https://arxiv.org/pdf/2508.09994.pdf

Abstract:
Currently, Automatic Speech Recognition (ASR) models are deployed in an extensive range of applications. However, recent studies have demonstrated the possibility of adversarial attack on these models which could potentially suppress or disrupt model output. We investigate and verify the robustness of these attacks and explore if it is possible to increase their imperceptibility. We additionally find that by relaxing the optimisation objective from complete suppression to partial suppression, we can further decrease the imperceptibility of the attack. We also explore possible defences against these attacks and show a low-pass filter defence could potentially serve as an effective defence.

Paperid: 2637, https://arxiv.org/pdf/2508.09399.pdf

Abstract:
This paper addresses the challenges of data privacy and collaborative modeling in cross-institution financial risk analysis. It proposes a risk assessment framework based on federated learning. Without sharing raw data, the method enables joint modeling and risk identification across multiple institutions. This is achieved by incorporating a feature attention mechanism and temporal modeling structure. Specifically, the model adopts a distributed optimization strategy. Each financial institution trains a local sub-model. The model parameters are protected using differential privacy and noise injection before being uploaded. A central server then aggregates these parameters to generate a global model. This global model is used for systemic risk identification. To validate the effectiveness of the proposed method, multiple experiments are conducted. These evaluate communication efficiency, model accuracy, systemic risk detection, and cross-market generalization. The results show that the proposed model outperforms both traditional centralized methods and existing federated learning variants across all evaluation metrics. It demonstrates strong modeling capabilities and practical value in sensitive financial environments. The method enhances the scope and efficiency of risk identification while preserving data sovereignty. It offers a secure and efficient solution for intelligent financial risk analysis.

Paperid: 2638, https://arxiv.org/pdf/2508.06394.pdf

Abstract:
AI for IT Operations (AIOps) is transforming how organizations manage complex software systems by automating anomaly detection, incident diagnosis, and remediation. Modern AIOps solutions increasingly rely on autonomous LLM-based agents to interpret telemetry data and take corrective actions with minimal human intervention, promising faster response times and operational cost savings. In this work, we perform the first security analysis of AIOps solutions, showing that, once again, AI-driven automation comes with a profound security cost. We demonstrate that adversaries can manipulate system telemetry to mislead AIOps agents into taking actions that compromise the integrity of the infrastructure they manage. We introduce techniques to reliably inject telemetry data using error-inducing requests that influence agent behavior through a form of adversarial reward-hacking; plausible but incorrect system error interpretations that steer the agent's decision-making. Our attack methodology, AIOpsDoom, is fully automated--combining reconnaissance, fuzzing, and LLM-driven adversarial input generation--and operates without any prior knowledge of the target system. To counter this threat, we propose AIOpsShield, a defense mechanism that sanitizes telemetry data by exploiting its structured nature and the minimal role of user-generated content. Our experiments show that AIOpsShield reliably blocks telemetry-based attacks without affecting normal agent performance. Ultimately, this work exposes AIOps as an emerging attack vector for system compromise and underscores the urgent need for security-aware AIOps design.

Paperid: 2639, https://arxiv.org/pdf/2508.05663.pdf

Abstract:
Random walk (RW)-based algorithms have long been popular in distributed systems due to low overheads and scalability, with recent growing applications in decentralized learning. However, their reliance on local interactions makes them inherently vulnerable to malicious behavior. In this work, we investigate an adversarial threat that we term the ``Pac-Man'' attack, in which a malicious node probabilistically terminates any RW that visits it. This stealthy behavior gradually eliminates active RWs from the network, effectively halting the learning process without triggering failure alarms. To counter this threat, we propose the Average Crossing (AC) algorithm--a fully decentralized mechanism for duplicating RWs to prevent RW extinction in the presence of Pac-Man. Our theoretical analysis establishes that (i) the RW population remains almost surely bounded under AC and (ii) RW-based stochastic gradient descent remains convergent under AC, even in the presence of Pac-Man, with a quantifiable deviation from the true optimum. Our extensive empirical results on both synthetic and real-world datasets corroborate our theoretical findings. Furthermore, they uncover a phase transition in the extinction probability as a function of the duplication threshold. We offer theoretical insights by analyzing a simplified variant of the AC, which sheds light on the observed phase transition.

Paperid: 2640, https://arxiv.org/pdf/2508.05591.pdf

Abstract:
The exponential growth of the Internet of Things (IoT) has led to the emergence of substantial security concerns, with IoT networks becoming the primary target for cyberattacks. This study examines the potential of Kolmogorov-Arnold Networks (KANs) as an alternative to conventional machine learning models for intrusion detection in IoT networks. The study demonstrates that KANs, which employ learnable activation functions, outperform traditional MLPs and achieve competitive accuracy compared to state-of-the-art models such as Random Forest and XGBoost, while offering superior interpretability for intrusion detection in IoT networks.

Paperid: 2641, https://arxiv.org/pdf/2508.04094.pdf

Abstract:
All current detection of backdoor attacks on deep learning models fall under the category of a non essential features(NEF), which focus on fighting against simple and efficient vertical class backdoor -- trigger is small, few and not overlapping with the source. Evade-adaptive backdoor (EAB) attacks have evaded NEF detection and improved training efficiency. We introduces a precise, efficient and universal detection and defense framework coined as Isolate Trigger (IsTr). IsTr aims to find the hidden trigger by breaking the barrier of the source features. Therefore, it investigates the essence of backdoor triggering, and uses Steps and Differential-Middle-Slice as components to update past theories of distance and gradient. IsTr also plays a positive role in the model, whether the backdoor exists. For example, accurately find and repair the wrong identification caused by deliberate or unintentional training in automatic driving. Extensive experiments on robustness scross various tasks, including MNIST, facial recognition, and traffic sign recognition, confirm the high efficiency, generality and precision of the IsTr. We rigorously evaluated the effectiveness of the IsTr against a series of six EAB attacks, including Badnets, Sin-Wave, Multi-trigger, SSBAs, CASSOCK, HCB. None of these countermeasures evade, even when attacks are combined and the trigger and source overlap.

Paperid: 2642, https://arxiv.org/pdf/2508.03321.pdf

Abstract:
While TLS has become the de-facto standard for end-to-end security, its use to secure critical communication in evolving industrial IoT scenarios is severely limited by prevalent resource constraints of devices and networks. Most notably, the TLS handshake to establish secure connections incurs significant bandwidth and processing overhead that often cannot be handled in constrained environments. To alleviate this situation, we present BiTHaC which realizes bidirectional TLS handshake caching by exploiting that significant parts of repeated TLS handshakes, especially certificates, are static. Thus, redundant information neither needs to be transmitted nor corresponding computations performed, saving valuable bandwidth and processing resources. By implementing BiTHaC for wolfSSL, we show that we can reduce the bandwidth consumption of TLS handshakes by up to 61.1% and the computational overhead by up to 8.5%, while incurring only well-manageable memory overhead and preserving the strict security guarantees of TLS.

Paperid: 2643, https://arxiv.org/pdf/2508.02312.pdf

Abstract:
Large Language Models (LLMs), now a foundation in advancing natural language processing, power applications such as text generation, machine translation, and conversational systems. Despite their transformative potential, these models inherently rely on massive amounts of training data, often collected from diverse and uncurated sources, which exposes them to serious data security risks. Harmful or malicious data can compromise model behavior, leading to issues such as toxic output, hallucinations, and vulnerabilities to threats such as prompt injection or data poisoning. As LLMs continue to be integrated into critical real-world systems, understanding and addressing these data-centric security risks is imperative to safeguard user trust and system reliability. This survey offers a comprehensive overview of the main data security risks facing LLMs and reviews current defense strategies, including adversarial training, RLHF, and data augmentation. Additionally, we categorize and analyze relevant datasets used for assessing robustness and security across different domains, providing guidance for future research. Finally, we highlight key research directions that focus on secure model updates, explainability-driven defenses, and effective governance frameworks, aiming to promote the safe and responsible development of LLM technology. This work aims to inform researchers, practitioners, and policymakers, driving progress toward data security in LLMs.

Paperid: 2644, https://arxiv.org/pdf/2508.02158.pdf

Abstract:
Detection of planted subgraphs in ErdÃ¶s-RÃ©nyi random graphs has been extensively studied, leading to a rich body of results characterizing both statistical and computational thresholds. However, most prior work assumes a purely random generative model, making the resulting algorithms potentially fragile in the face of real-world perturbations. In this work, we initiate the study of semi-random models for the planted subgraph detection problem, wherein an adversary is allowed to remove edges outside the planted subgraph before the graph is revealed to the statistician. Crucially, the statistician remains unaware of which edges have been removed, introducing fundamental challenges to the inference task. We establish fundamental statistical limits for detection under this semi-random model, revealing a sharp dichotomy. Specifically, for planted subgraphs with strongly sub-logarithmic maximum density detection becomes information-theoretically impossible in the presence of an adversary, despite being possible in the classical random model. In stark contrast, for subgraphs with super-logarithmic density, the statistical limits remain essentially unchanged; we prove that the optimal (albeit computationally intractable) likelihood ratio test remains robust. Beyond these statistical boundaries, we design a new computationally efficient and robust detection algorithm, and provide rigorous statistical guarantees for its performance. Our results establish the first robust framework for planted subgraph detection and open new directions in the study of semi-random models, computational-statistical trade-offs, and robustness in graph inference problems.

Paperid: 2645, https://arxiv.org/pdf/2508.01694.pdf

Abstract:
The steady advancement in quantum computer error correction technology has pushed the current record to 48 stable logical qubits, bringing us closer to machines capable of running Shor's algorithm at scales that threaten RSA and ECC cryptography. While the timeline for developing such quantum computers remains uncertain, the cryptographic community must prepare for the transition to quantum-resistant algorithms. CRYSTALS-Kyber, standardized by NIST in 2022, represents a leading post-quantum cryptographic solution, but widespread adoption faces significant challenges. If this migration follows patterns similar to the SHA-1 to SHA-2 transition, organizations may experience prolonged periods of vulnerability, with substantial security and economic consequences. This study evaluates Kyber's practical viability through performance testing across various implementation schemes, utilizing only standard built-in processor acceleration features, some of which include AES-NI and ASIMD, without any specialized hardware additions. Our findings demonstrate that Kyber provides robust security guarantees against quantum attacks while maintaining acceptable performance profiles for most contemporary applications, utilizing only commodity hardware with manufacturer-provided acceleration capabilities.

Paperid: 2646, https://arxiv.org/pdf/2508.01620.pdf

Abstract:
Recent studies have shown that deep learning models are vulnerable to attacks and tend to memorize training data points, raising significant concerns about privacy leakage. This motivates the development of machine unlearning (MU), i.e., a paradigm that enables models to selectively forget specific data points upon request. However, most existing MU algorithms require partial or full fine-tuning on the retain set. This necessitates continued access to the original training data, which is often impractical due to privacy concerns and storage constraints. A few retain-data-free MU methods have been proposed, but some rely on access to auxiliary data and precomputed statistics of the retain set, while others scale poorly when forgetting larger portions of data. In this paper, we propose Influence-guided Machine Unlearning (IMU), a simple yet effective method that conducts MU using only the forget set. Specifically, IMU employs gradient ascent and innovatively introduces dynamic allocation of unlearning intensities across different data points based on their influences. This adaptive strategy significantly enhances unlearning effectiveness while maintaining model utility. Results across vision and language tasks demonstrate that IMU consistently outperforms existing retain-data-free MU methods.

Paperid: 2647, https://arxiv.org/pdf/2508.01332.pdf

Abstract:
The rapid adoption of agentic AI, powered by large language models (LLMs), is transforming enterprise ecosystems with autonomous agents that execute complex workflows. Yet we observe several key security vulnerabilities in LLM-driven multi-agent systems (MASes): fragmented identity frameworks, insecure communication channels, and inadequate defenses against Byzantine agents or adversarial prompts. In this paper, we present the first systematic analysis of these emerging multi-agent risks and explain why the legacy security strategies cannot effectively address these risks. Afterwards, we propose BlockA2A, the first unified multi-agent trust framework that enables secure and verifiable and agent-to-agent interoperability. At a high level, BlockA2A adopts decentralized identifiers (DIDs) to enable fine-grained cross-domain agent authentication, blockchain-anchored ledgers to enable immutable auditability, and smart contracts to dynamically enforce context-aware access control policies. BlockA2A eliminates centralized trust bottlenecks, ensures message authenticity and execution integrity, and guarantees accountability across agent interactions. Furthermore, we propose a Defense Orchestration Engine (DOE) that actively neutralizes attacks through real-time mechanisms, including Byzantine agent flagging, reactive execution halting, and instant permission revocation. Empirical evaluations demonstrate BlockA2A's effectiveness in neutralizing prompt-based, communication-based, behavioral and systemic MAS attacks. We formalize its integration into existing MAS and showcase a practical implementation for Google's A2A protocol. Experiments confirm that BlockA2A and DOE operate with sub-second overhead, enabling scalable deployment in production LLM-based MAS environments.

Paperid: 2648, https://arxiv.org/pdf/2508.01085.pdf

Abstract:
We consider an Internet of Battlefield Things (IoBT) system consisting of multiple devices that want to securely communicate with each other during a mission in the presence of an adversary with unbounded computational power. The adversary has complete access to listen/read the ciphertext without tampering with the communication line. We provide an unconditionally secure encryption scheme to exchange messages among devices in the system. The main idea behind the scheme is to provide secret keys to exchange messages using a random binary matrix that is securely shared among all the devices, and pair-wise random secret keys established between each pair of devices attempting to communicate before the mission. The scheme is implemented by using finite group modular addition. We show that the scheme is absolutely semantically secure, i.e., the scheme guarantees that an adversary with unbounded computational power cannot get even one bit of information about a message, except for an exponentially small probability in a security parameter. Besides that, we show that even if the random binary matrix is revealed to the adversary, the provided scheme is computationally secure against the key recovery attack.

Paperid: 2649, https://arxiv.org/pdf/2508.01084.pdf

Abstract:
Although Retrieval-Augmented Generation (RAG) systems have been widely applied, the privacy and security risks they face, such as data leakage and data poisoning, have not been systematically addressed yet. Existing defense strategies primarily rely on heuristic filtering or enhancing retriever robustness, which suffer from limited interpretability, lack of formal security guarantees, and vulnerability to adaptive attacks. To address these challenges, this paper proposes the first provably secure framework for RAG systems(SAG). Our framework employs a pre-storage full-encryption scheme to ensure dual protection of both retrieved content and vector embeddings, guaranteeing that only authorized entities can access the data. Through formal security proofs, we rigorously verify the scheme's confidentiality and integrity under a computational security model. Extensive experiments across multiple benchmark datasets demonstrate that our framework effectively resists a range of state-of-the-art attacks. This work establishes a theoretical foundation and practical paradigm for verifiably secure RAG systems, advancing AI-powered services toward formally guaranteed security.

Paperid: 2650, https://arxiv.org/pdf/2508.00943.pdf

Abstract:
Trustworthy evaluations of dangerous capabilities are increasingly crucial for determining whether an AI system is safe to deploy. One empirically demonstrated threat to this is sandbagging - the strategic underperformance on evaluations by AI models or their developers. One promising defense is to monitor a model's chain-of-thought (CoT) reasoning, as this could reveal its intentions and plans. In this work, we measure the ability of models to sandbag on dangerous capability evaluations against a CoT monitor by prompting them to sandbag while being either monitor-oblivious or monitor-aware. We show that both frontier models and small open-sourced models can covertly sandbag against CoT monitoring 0-shot without hints. However, they cannot yet do so reliably: they bypass the monitor 16-36\% of the time when monitor-aware, conditioned on sandbagging successfully. We qualitatively analyzed the uncaught CoTs to understand why the monitor failed. We reveal a rich attack surface for CoT monitoring and contribute five covert sandbagging policies generated by models. These results inform potential failure modes of CoT monitoring and may help build more diverse sandbagging model organisms.

Paperid: 2651, https://arxiv.org/pdf/2508.00907.pdf

Abstract:
This paper presents an exact and explicit equation for prime factorization, along with an algorithm for its computation. The proposed method is based on the MeLoCoToN approach, which addresses combinatorial optimization problems through classical tensor networks. The presented tensor network performs the multiplication of every pair of possible input numbers and selects those whose product is the number to be factorized. Additionally, in order to make the algorithm more efficient, the number and dimension of the tensors and their contraction scheme are optimized. Finally, a series of tests on the algorithm are conducted, contracting the tensor network both exactly and approximately using tensor train compression, and evaluating its performance.

Paperid: 2652, https://arxiv.org/pdf/2508.00637.pdf

Abstract:
Integrating Information and Communications Technology (ICT) devices into the power grid brings many benefits. However, it also exposes the grid to new potential cyber threats. Many control and protection mechanisms, such as Load Frequency Control (LFC), responsible for maintaining nominal frequency during load fluctuations and Under Frequency Load Shedding (UFLS) disconnecting portion of the load during an emergency, are dependent on information exchange through the communication network. The recently emerging Load Altering Attacks (LAAs) utilize a botnet of high-wattage devices to introduce load fluctuation. In their dynamic form (DLAAs), they manipulate the load in response to live grid frequency measurements for increased efficiency, posing a notable threat to grid stability. Recognizing the importance of communication networks in power grid cyber security research, this paper presents an open-source co-simulation environment that models the power grid with the corresponding communication network, implementing grid protective mechanisms. This setup allows the comprehensive analysis of the attacks in concrete LFC and UFLS scenarios.

Paperid: 2653, https://arxiv.org/pdf/2507.22065.pdf

Abstract:
Fuzzing is highly effective in detecting bugs due to the key contribution of randomness. However, randomness significantly reduces the efficiency of fuzzing, causing it to cost days or weeks to expose bugs. Even though directed fuzzing reduces randomness by guiding fuzzing towards target buggy locations, the dilemma of randomness still challenges directed fuzzers. Two critical components, which are seeds and mutators, contain randomness and are closely tied to the conditions required for triggering bugs. Therefore, to address the challenge of randomness, we propose to use large language models (LLMs) to remove the randomness in seeds and reduce the randomness in mutators. With their strong reasoning and code generation capabilities, LLMs can be used to generate reachable seeds that target pre-determined locations and to construct bug-specific mutators tailored for specific bugs. We propose RandLuzz, which integrates LLMs and directed fuzzing, to improve the quality of seeds and mutators, resulting in efficient bug exposure. RandLuzz analyzes function call chain or functionality to guide LLMs in generating reachable seeds. To construct bug-specific mutators, RandLuzz uses LLMs to perform bug analysis, obtaining information such as bug causes and mutation suggestions, which further help generate code that performs bug-specific mutations. We evaluate RandLuzz by comparing it with four state-of-the-art directed fuzzers, AFLGo, Beacon, WindRanger, and SelectFuzz. With RandLuzz-generated seeds, the fuzzers achieve an average speedup ranging from 2.1$\times$ to 4.8$\times$ compared to using widely-used initial seeds. Additionally, when evaluated on individual bugs, RandLuzz achieves up to a 2.7$\times$ speedup compared to the second-fastest exposure. On 8 bugs, RandLuzz can even expose them within 60 seconds.

Paperid: 2654, https://arxiv.org/pdf/2507.21094.pdf

Abstract:
In recent years, cloud security has emerged as a primary concern for enterprises due to the increasing trend of migrating internal infrastructure and applications to cloud environments. This shift is driven by the desire to reduce the high costs and maintenance fees associated with traditional on-premise infrastructure. By leveraging cloud capacities such as high availability and scalability, companies can achieve greater operational efficiency and flexibility. However, this migration also introduces new security challenges. Ensuring the protection of sensitive data, maintaining compliance with regulatory requirements, and mitigating the risks of cyber threats are critical issues that must be addressed. Identity and Access Management (IAM) constitutes the critical security backbone of most cloud deployments, particularly within AWS environments. As organizations adopt AWS to scale applications and store data, the need for a thorough, methodical, and precise enumeration of IAM configurations grows exponentially. Enumeration refers to the systematic mapping and interrogation of identities, permissions, and resource authorizations with the objective of gaining situational awareness. By understanding the interplay between users, groups, and their myriads of policies, whether inline or attached managed policies, security professionals need to enumerate and identify misconfigurations, reduce the risk of unauthorized privilege escalation, and maintain robust compliance postures. This paper will present SkyEye, a cooperative multi-principal IAM enumeration framework, which comprises cutting-edge enumeration models in supporting complete situational awareness regarding the IAMs of provided AWS credentials, crossing the boundary of principal-specific IAM entitlement vision to reveal the complete visionary while insufficient authorization is the main challenge.

Paperid: 2655, https://arxiv.org/pdf/2507.21087.pdf

Abstract:
Address Resolution Protocol (ARP) spoofing remains a critical threat to IoT networks, enabling attackers to intercept, modify, or disrupt data transmission by exploiting ARP's lack of authentication. The decentralized and resource-constrained nature of IoT environments amplifies this vulnerability, making conventional detection mechanisms ineffective at scale. This paper introduces an intelligent, multi-layered machine learning framework designed to detect ARP spoofing in real-time IoT deployments. Our approach combines feature engineering based on ARP header behavior, traffic flow analysis, and temporal packet anomalies with a hybrid detection pipeline incorporating decision trees, ensemble models, and deep learning classifiers. We propose a hierarchical architecture to prioritize lightweight models at edge gateways and deeper models at centralized nodes to balance detection accuracy and computational efficiency. The system is validated on both simulated IoT traffic and the CICIDS2017 dataset, achieving over 97% detection accuracy with low false positive rates. Comparative evaluations with signature-based and rule-based systems demonstrate the robustness and generalizability of our approach. Our results show that intelligent machine learning integration enables proactive ARP spoofing detection tailored for IoT scenarios, laying the groundwork for scalable and autonomous network security solutions.

Paperid: 2656, https://arxiv.org/pdf/2507.20672.pdf

Abstract:
A widespread belief in the blockchain security community is that automated techniques are only good for detecting shallow bugs, typically of small value. In this paper, we present the techniques and insights that have led us to repeatable success in automatically discovering high-value smart contract vulnerabilities. Our vulnerability disclosures have yielded 10 bug bounties, for a total of over $3M, over high-profile deployed code, as well as hundreds of bugs detected in pre-deployment or under-audit code. We argue that the elements of this surprising success are a) a very high-completeness static analysis approach that manages to maintain acceptable precision; b) domain knowledge, provided by experts or captured via statistical inference. We present novel techniques for automatically inferring domain knowledge from statistical analysis of a large corpus of deployed contracts, as well as discuss insights on the ideal precision and warning rate of a promising vulnerability detector. In contrast to academic literature in program analysis, which routinely expects false-positive rates below 50% for publishable results, we posit that a useful analysis for high-value real-world vulnerabilities will likely flag very few programs (under 1%) and will do so with a high false-positive rate (e.g., 95%, meaning that only one-of-twenty human inspections will yield an exploitable vulnerability).

Paperid: 2657, https://arxiv.org/pdf/2507.20175.pdf

Abstract:
While catastrophic attacks on Ethereum persist, vulnerability research remains fixated on implementation-level smart contract bugs, creating a gap between academic understanding of vulnerabilities and the root causes of high-impact, real-world incidents. To address this, we employ a two-pronged methodology: first, a systematic literature review of 71 academic papers to build a catalog of 24 active and 5 deprecated vulnerabilities. Second, we conduct an in-depth, empirical analysis of 50 of the most severe real-world attacks between 2022 and 2025, collectively incurring over $1.09B in losses, to identify their root causes. We introduce the concept of "exploit chains" by revealing that many incidents are not caused by isolated vulnerabilities but by combinations of human, operational, and economic design flaws that link with implementation bugs to enable an attack. Our analysis yields insights on how decentralized applications are exploited in practice, leading to a novel, four-tier root-cause framework that moves beyond code-level vulnerabilities. We find that real-world successful attacks on Ethereum (and related networks) trace back to one of the four tiers of (1) protocol logic design, (2) lifecycle and governance, (3) external dependencies, and (4) classic smart contract vulnerabilities. We investigate the suitability of this multi-tier incident root-cause framework via a case study.

Paperid: 2658, https://arxiv.org/pdf/2507.20074.pdf

Abstract:
Nuclear arms control treaties have historically focused on strategic nuclear delivery systems, indirectly restricting strategic nuclear warhead numbers and leaving nonstrategic nuclear warheads (NSNWs) outside formal verification frameworks. This paper presents a cryptographic protocol for secure and verifiable warhead tracking, addressing challenges in nuclear warhead verification without requiring intrusive physical inspections. Our system leverages commitment schemes and zero-knowledge succinct non-interactive arguments of knowledge (zkSNARKs) to ensure compliance with treaty constraints while preserving the confidentiality of sensitive nuclear warhead data. We propose a cryptographic "Warhead Passport" tracking system that chains commitments to individual warheads over their life cycle, enabling periodic challenges and real-time verification of treaty compliance. Our implementation follows real-world treaty constraints, integrates U.S. and Russian dual-hash combiners (SHA-family and GOST R 34.11 family) for cryptographic robustness and political constraints, and ensures forward security by preventing retroactive data manipulation. This work builds on policy research from prior arms control studies and provides a practical foundation for implementing secure, auditable NSNW verification mechanisms.

Paperid: 2659, https://arxiv.org/pdf/2507.19391.pdf

Abstract:
Message franking is an indispensable abuse mitigation tool for end-to-end encrypted (E2EE) messaging platforms. With it, users who receive harmful content can securely report that content to platform moderators. However, while real-world deployments of reporting require the disclosure of multiple messages, existing treatments of message franking only consider the report of a single message. As a result, there is a gap between the security goals achieved by constructions and those needed in practice. Our work introduces transcript franking, a new type of protocol that allows reporting subsets of conversations such that moderators can cryptographically verify message causality and contents. We define syntax, semantics, and security for transcript franking in two-party and group messaging. We then present efficient constructions for transcript franking and prove their security. Looking toward deployment considerations, we provide detailed discussion of how real-world messaging systems can incorporate our protocols.

Paperid: 2660, https://arxiv.org/pdf/2507.17385.pdf

Abstract:
In the traditional Application-Specific Integrated Circuit (ASIC) design flow, the concept of timing closure implies to reach convergence during physical synthesis such that, under a given area and power budget, the design works at the targeted frequency. However, security has been largely neglected when evaluating the Quality of Results (QoR) from physical synthesis. In general, commercial place & route tools do not understand security goals. In this work, we propose a modified ASIC design flow that is security-aware and, differently from prior research, does not degrade QoR for the sake of security improvement. Therefore, we propose a first-of-its-kind zero-overhead flow for security closure. Our flow is concerned with two distinct threat models: (i) insertion of Hardware Trojans (HTs) and (ii) physical probing/fault injection. Importantly, the flow is entirely executed within a commercial place & route engine and is scalable. In several metrics, our security-aware flow achieves the best-known results for the ISPD`22 set of benchmark circuits while incurring negligible design overheads due to security-related strategies. Finally, we open source the entire methodology (as a set of scripts) and also share the protected circuits (as design databases) for the benefit of the hardware security community.

Paperid: 2661, https://arxiv.org/pdf/2507.17188.pdf

Abstract:
This work tackles the physical layer security (PLS) problem of maximizing the secrecy rate in heterogeneous UAV networks (HetUAVNs) under propulsion energy constraints. Unlike prior studies that assume uniform UAV capabilities or overlook energy-security trade-offs, we consider a realistic scenario where UAVs with diverse payloads and computation resources collaborate to serve ground terminals in the presence of eavesdroppers. To manage the complex coupling between UAV motion and communication, we propose a hierarchical optimization framework. The inner layer uses a semidefinite relaxation (SDR)-based S2DC algorithm combining penalty functions and difference-of-convex (d.c.) programming to solve the secrecy precoding problem with fixed UAV positions. The outer layer introduces a Large Language Model (LLM)-guided heuristic multi-agent reinforcement learning approach (LLM-HeMARL) for trajectory optimization. LLM-HeMARL efficiently incorporates expert heuristics policy generated by the LLM, enabling UAVs to learn energy-aware, security-driven trajectories without the inference overhead of real-time LLM calls. The simulation results show that our method outperforms existing baselines in secrecy rate and energy efficiency, with consistent robustness across varying UAV swarm sizes and random seeds.

Paperid: 2662, https://arxiv.org/pdf/2507.16060.pdf

Abstract:
Unauthorized access remains one of the critical security challenges in the realm of cybersecurity. With the increasing sophistication of attack techniques, the threat of unauthorized access is no longer confined to the conventional ones, such as exploiting weak access control policies. Instead, advanced exploitation strategies, such as session hijacking-based attacks, are becoming increasingly prevalent, posing serious security concerns. Session hijacking enables attackers to take over an already established session between legitimate peers in a stealthy manner, thereby gaining unauthorized access to private resources. Unfortunately, traditional access control mechanisms, such as static access control policies, are insufficient to prevent session hijacking or other advanced exploitation techniques. In this work, we propose a new multi-factor authorization (MFAz) scheme that proactively mitigates unauthorized access attempts both conventional and advanced unauthorized access attacks. The proposed scheme employs fine-grained access control rules (ARs) and verification points (VPs) that are systematically generated from historically granted accesses as the first and second authorization factors, respectively. As a proof-of-concept, we implement the scheme using different techniques. We leverage bloom filter to achieve runtime and storage efficiency, and blockchain to make authorization decisions in a temper-proof and decentralized manner. To the best of our knowledge, this is the first formal introduction of a multi-factor authorization scheme, which is orthogonal to the multi-factor authentication (MFA) schemes. The effectiveness of our proposed scheme is experimentally evaluated using a smart-city testbed involving different devices with varying computational capacities. The experimental results reveal high effectiveness of the scheme both in security and performance guarantees.

Paperid: 2663, https://arxiv.org/pdf/2507.14985.pdf

Abstract:
The rapid growth of metaverse technologies, including virtual worlds, augmented reality, and lifelogging, has accelerated their adoption across diverse domains. This rise exposes users to significant new security and privacy challenges due to sociotechnical complexity, pervasive connectivity, and extensive user data collection in immersive environments. We present a systematic review of the literature published between 2013 and 2024, offering a comprehensive analysis of how the research community has addressed metaverse-related security and privacy issues over the past decade. We organize the studies by method, examined the security and privacy properties, immersive components, and evaluation strategies. Our investigation reveals a sharp increase in research activity in the last five years, a strong focus on practical and user-centered approaches, and a predominant use of benchmarking, human experimentation, and qualitative methods. Authentication and unobservability are the most frequently studied properties. However, critical gaps remain in areas such as policy compliance, accessibility, interoperability, and back-end infrastructure security. We emphasize the intertwined technical complexity and human factors of the metaverse and call for integrated, interdisciplinary approaches to securing inclusive and trustworthy immersive environments.

Paperid: 2664, https://arxiv.org/pdf/2507.13028.pdf

Abstract:
Hardening computer systems against cyberattacks is crucial for security. However, past incidents illustrated, that many system operators struggle with effective system hardening. Hence, many computer systems and applications remain insecure. So far, the research community lacks an in-depth understanding of system operators motivation, practices, and challenges around system hardening. With a focus on practices and challenges, we qualitatively analyzed 316 Stack Exchange (SE) posts related to system hardening. We find that access control and deployment-related issues are the most challenging, and system operators suffer from misconceptions and unrealistic expectations. Most frequently, posts focused on operating systems and server applications. System operators were driven by the fear of their systems getting attacked or by compliance reasons. Finally, we discuss our research questions, make recommendations for future system hardening, and illustrate the implications of our work.

Paperid: 2665, https://arxiv.org/pdf/2507.12670.pdf

Abstract:
An address is indicated as an identifier of the user on the blockchain, and is defined by a hash value of the ECDSA verification key. A vanity address is an address that embeds custom characters such as a name. To generate a vanity address, a classical try-and-error method is employed, and thus the number of characters to be embedded is limited. In this paper, we focus on the functionality of identity-based signatures (IBS) where any strings can be employed as a verification key, and explore whether IBS can be used for generating a vanity address. We attach importance to the fact that it is not realistic to replace ECDSA with key recovery, which is currently employed for issuing transactions in Ethereum, to an IBS scheme. Even if this replacement is possible, it is not a reasonable price for the ease of the vanity address generation. Thus, we pay attention to a generic construction of IBS from signatures, and construct an IBS scheme from ECDSA with key recovery. Though we cannot directly generate a vanity address due to the key recovery functionality of the underlying ECDSA, we can connect any string with an address due to the functionality of IBS that can give additional meaning to the address. We implement our system by Solidity, and demonstrate that the gas cost is almost same as that of the ECDSA signature verification.

Paperid: 2666, https://arxiv.org/pdf/2507.12185.pdf

Abstract:
The advent of advanced Generative AI (GenAI) models such as DeepSeek and ChatGPT has significantly reshaped the cybersecurity landscape, introducing both promising opportunities and critical risks. This study investigates how GenAI powered chatbot services can be exploited via jailbreaking techniques to bypass ethical safeguards, enabling the generation of phishing content, recommendation of hacking tools, and orchestration of phishing campaigns. In ethically controlled experiments, we used ChatGPT 4o Mini selected for its accessibility and status as the latest publicly available model at the time of experimentation, as a representative GenAI system. Our findings reveal that the model could successfully guide novice users in executing phishing attacks across various vectors, including web, email, SMS (smishing), and voice (vishing). Unlike automated phishing campaigns that typically follow detectable patterns, these human-guided, AI assisted attacks are capable of evading traditional anti phishing mechanisms, thereby posing a growing security threat. We focused on DeepSeek and ChatGPT due to their widespread adoption and technical relevance in 2025. The study further examines common jailbreaking techniques and the specific vulnerabilities exploited in these models. Finally, we evaluate a range of mitigation strategies such as user education, advanced authentication mechanisms, and regulatory policy measures and discuss emerging trends in GenAI facilitated phishing, outlining future research directions to strengthen cybersecurity defenses in the age of artificial intelligence.

Paperid: 2667, https://arxiv.org/pdf/2507.11908.pdf

Abstract:
With the increasing concerns around privacy and the enforcement of data privacy laws, many websites now provide users with privacy controls. However, locating these controls can be challenging, as they are frequently hidden within multiple settings and layers. Moreover, the lack of standardization means these controls can vary widely across services. The technical or confusing terminology used to describe these controls further complicates users' ability to understand and use them effectively. This paper presents a large-scale empirical analysis investigating usability challenges of web privacy controls across 18,628 websites. While aiming for a multi-scenario view, our automated data collection faced significant hurdles, particularly in simulating sign-up and authenticated user visits, leading to more focused insights on guest visit scenarios and challenges in automated capture of dynamic user interactions. Our heuristic evaluation of three different user visit scenarios identifies significant website usability issues. Our results show that privacy policies are most common across all visit scenarios, with nudges and notices being prevalent in sign-up situations. We recommend designing privacy controls that: enhance awareness through pop-up nudges and notices; offer a table of contents as navigational aids and customized settings links in policies for more informed choice; and ensure accessibility via direct links to privacy settings from nudges.

Paperid: 2668, https://arxiv.org/pdf/2507.10160.pdf

Abstract:
Federated Learning has emerged as a leading paradigm for decentralized, privacy-preserving learning, particularly relevant in the era of interconnected edge devices equipped with sensors. However, the practical implementation of Federated Learning faces three primary challenges: the need for human involvement in costly data labelling processes for target adaptation, covariate shift in client device data collection due to environmental factors affecting sensors, leading to discrepancies between source and target samples, and the impracticality of continuous or regular model updates in resource-constrained environments due to limited data transmission capabilities and technical constraints on channel availability and energy efficiency. To tackle these issues, we expand upon an efficient and scalable Federated Learning framework tailored for real-world client adaptation in industrial settings. This framework leverages a pre-trained source model comprising a deep backbone, an adaptation module, and a classifier running on a powerful server. By freezing the backbone and classifier during client adaptation on resource-constrained devices, we allow the domain adaptive linear layer to handle target domain adaptation, thus minimizing overall computational overhead. Furthermore, this setup, designated as FedAcross+, is extended to encompass the processing of streaming data, thereby rendering the solution suitable for non-stationary environments. Extensive experimental results demonstrate the effectiveness of FedAcross+ in achieving competitive adaptation on low-end client devices with limited target samples, successfully addressing the challenge of domain shift. Moreover, our framework accommodates sporadic model updates within resource-constrained environments, ensuring practical and seamless deployment.

Paperid: 2669, https://arxiv.org/pdf/2507.09564.pdf

Abstract:
Phishing is a prevalent cyberattack that uses look-alike websites to deceive users into revealing sensitive information. Numerous efforts have been made by the Internet community and security organizations to detect, prevent, or train users to avoid falling victim to phishing attacks. Most of this research over the years has been highly diverse and application-oriented, often serving as standalone solutions for HTTP clients, servers, or third parties. However, limited work has been done to develop a comprehensive or proactive protocol-oriented solution to effectively counter phishing attacks. Inspired by the concept of certificate transparency, which allows certificates issued by Certificate Authorities (CAs) to be publicly verified by clients, thereby enhancing transparency, we propose a concept called Page Transparency (PT) for the web. The proposed PT requires login pages that capture users' sensitive information to be publicly logged via PLS and made available to web clients for verification. The pages are verified to be logged using cryptographic proofs. Since all pages are logged on a PLS and visually compared with existing pages through a comprehensive visual page-matching algorithm, it becomes impossible for an attacker to register a deceptive look-alike page on the PLS and receive the cryptographic proof required for client verification. All implementations occur on the client side, facilitated by the introduction of a new HTTP PT header, eliminating the need for platform-specific changes or the installation of third-party solutions for phishing prevention.

Paperid: 2670, https://arxiv.org/pdf/2507.09301.pdf

Abstract:
The emergence of quantum computers poses a significant threat to current secure service, application and/or protocol implementations that rely on RSA and ECDSA algorithms, for instance DNSSEC, because public-key cryptography based on number factorization or discrete logarithm is vulnerable to quantum attacks. This paper presents the integration of post-quantum cryptographic (PQC) algorithms into CoreDNS to enable quantum-resistant DNSSEC functionality. We have developed a plugin that extends CoreDNS with support for five PQC signature algorithm families: ML-DSA, FALCON, SPHINCS+, MAYO, and SNOVA. Our implementation maintains compatibility with existing DNS resolution flows while providing on-the-fly signing using quantum-resistant signatures. A benchmark has been performed and performance evaluation results reveal significant trade-offs between security and efficiency. The results indicate that while PQC algorithms introduce operational overhead, several candidates offer viable compromises for transitioning DNSSEC to quantum-resistant cryptography.

Paperid: 2671, https://arxiv.org/pdf/2507.09282.pdf

Abstract:
Dementia, a neurodegenerative disease, alters speech patterns, creating communication barriers and raising privacy concerns. Current speech technologies, such as automatic speech transcription (ASR), struggle with dementia and atypical speech, further challenging accessibility. This paper presents a novel dementia obfuscation in speech framework, ClaritySpeech, integrating ASR, text obfuscation, and zero-shot text-to-speech (TTS) to correct dementia-affected speech while preserving speaker identity in low-data environments without fine-tuning. Results show a 16% and 10% drop in mean F1 score across various adversarial settings and modalities (audio, text, fusion) for ADReSS and ADReSSo, respectively, maintaining 50% speaker similarity. We also find that our system improves WER (from 0.73 to 0.08 for ADReSS and 0.15 for ADReSSo) and speech quality from 1.65 to ~2.15, enhancing privacy and accessibility.

Paperid: 2672, https://arxiv.org/pdf/2507.08878.pdf

Abstract:
Large Language Models (LLMs) have showcased remarkable generalizability in language comprehension and hold significant potential to revolutionize human-computer interaction in smart homes. Existing LLM-based smart home assistants typically transmit user commands, along with user profiles and home configurations, to remote servers to obtain personalized services. However, users are increasingly concerned about the potential privacy leaks to the remote servers. To address this issue, we develop HomeLLaMA, an on-device assistant for privacy-preserving and personalized smart home serving with a tailored small language model (SLM). HomeLLaMA learns from cloud LLMs to deliver satisfactory responses and enable user-friendly interactions. Once deployed, HomeLLaMA facilitates proactive interactions by continuously updating local SLMs and user profiles. To further enhance user experience while protecting their privacy, we develop PrivShield to offer an optional privacy-preserving LLM-based smart home serving for those users, who are unsatisfied with local responses and willing to send less-sensitive queries to remote servers. For evaluation, we build a comprehensive benchmark DevFinder to assess the service quality. Extensive experiments and user studies (M=100) demonstrate that HomeLLaMA can provide personalized services while significantly enhancing user privacy.

Paperid: 2673, https://arxiv.org/pdf/2507.08864.pdf

Abstract:
Location-based vehicular traffic management faces significant challenges in protecting sensitive geographical data while maintaining utility for traffic management and fairness across regions. Existing state-of-the-art solutions often fail to meet the required level of protection against linkage attacks and demographic biases, leading to privacy leakage and inequity in data analysis. In this paper, we propose a novel algorithm designed to address the challenges regarding the balance of privacy, utility, and fairness in location-based vehicular traffic management systems. In this context, utility means providing reliable and meaningful traffic information, while fairness ensures that all regions and individuals are treated equitably in data use and decision-making. Employing differential privacy techniques, we enhance data security by integrating query-based data access with iterative shuffling and calibrated noise injection, ensuring that sensitive geographical data remains protected. We ensure adherence to epsilon-differential privacy standards by implementing the Laplace mechanism. We implemented our algorithm on vehicular location-based data from Norway, demonstrating its ability to maintain data utility for traffic management and urban planning while ensuring fair representation of all geographical areas without being overrepresented or underrepresented. Additionally, we have created a heatmap of Norway based on our model, illustrating the privatized and fair representation of the traffic conditions across various cities. Our algorithm provides privacy in vehicular traffic

Paperid: 2674, https://arxiv.org/pdf/2507.07258.pdf

Abstract:
As IoT ecosystems continue to expand across critical sectors, they have become prominent targets for increasingly sophisticated and large-scale malware attacks. The evolving threat landscape, combined with the sensitive nature of IoT-generated data, demands detection frameworks that are both privacy-preserving and resilient to data heterogeneity. Federated Learning (FL) offers a promising solution by enabling decentralized model training without exposing raw data. However, standard FL algorithms such as FedAvg and FedProx often fall short in real-world deployments characterized by class imbalance and non-IID data distributions -- particularly in the presence of rare or disjoint malware classes. To address these challenges, we propose FedP3E (Privacy-Preserving Prototype Exchange), a novel FL framework that supports indirect cross-client representation sharing while maintaining data privacy. Each client constructs class-wise prototypes using Gaussian Mixture Models (GMMs), perturbs them with Gaussian noise, and transmits only these compact summaries to the server. The aggregated prototypes are then distributed back to clients and integrated into local training, supported by SMOTE-based augmentation to enhance representation of minority malware classes. Rather than relying solely on parameter averaging, our prototype-driven mechanism enables clients to enrich their local models with complementary structural patterns observed across the federation -- without exchanging raw data or gradients. This targeted strategy reduces the adverse impact of statistical heterogeneity with minimal communication overhead. We evaluate FedP3E on the N-BaIoT dataset under realistic cross-silo scenarios with varying degrees of data imbalance.

Paperid: 2675, https://arxiv.org/pdf/2507.06439.pdf

Abstract:
Automotive safety and security are paramount in the rapidly advancing landscape of vehicular technology. Building safe and secure vehicles demands a profound understanding of automotive systems, particularly in safety and security. Traditional learning approaches, such as reading materials or observing demonstrations, often fail to provide the practical, hands-on experience essential for developing this expertise. For novice users, gaining access to automotive-grade systems and mastering their associated hardware and software can be challenging and overwhelming. In this paper, we present a novel, affordable, and flexible exploration platform, \hema, that enables users to gain practical, hands-on insights into the security compromises of micro-electromechanical systems (MEMS) sensors, a critical component in modern ADAS systems. Furthermore, we discuss the unique challenges and design considerations involved in creating such a platform, emphasizing its role in enhancing the understanding of automotive safety and security. This framework serves as an invaluable resource for educators, researchers, and practitioners striving to build expertise in the field.

Paperid: 2676, https://arxiv.org/pdf/2507.05972.pdf

Abstract:
Pseudoentropy characterizations provide a quantitatively precise demonstration of the close relationship between computational hardness and computational randomness. We prove a unified pseudoentropy characterization that generalizes and strengthens previous results for both uniform and non-uniform models of computation. Our characterization holds for a general family of entropy notions that encompasses the common notions of Shannon entropy and min entropy as special cases. Moreover, we show that the characterizations for different entropy notions can be simultaneously achieved by a single, universal function that simultaneously witnesses computational hardness and computational randomness. A key technical insight of our work is that the notion of weight-restricted calibration from the recent literature on algorithm fairness, along with standard computational indistinguishability (known as multiaccuracy in the fairness literature), suffices for proving pseudoentropy characterizations for general entropy notions. This demonstrates the power of weight-restricted calibration to enhance the classic Complexity-Theoretic Regularity Lemma (Trevisan, Tulsiani, and Vadhan, 2009) and Leakage Simulation Lemma (Jetchev and Pietrzak, 2014) and allows us to achieve an exponential improvement in the complexity dependency on the alphabet size compared to the pseudoentropy characterizations by Casacuberta, Dwork, and Vadhan (2024) based on the much stronger notion of multicalibration. We show that the exponential dependency on the alphabet size is inevitable for multicalibration as well as for the weaker notion of calibrated multiaccuracy.

Paperid: 2677, https://arxiv.org/pdf/2507.05875.pdf

Abstract:
Local differential privacy (LDP) has recently gained prominence as a powerful paradigm for collecting and analyzing sensitive data from users' devices. However, the inherent perturbation added by LDP protocols reduces the utility of the collected data. To mitigate this issue, several post-processing (PP) methods have been developed. Yet, the comparative performance of PP methods under diverse settings remains underexplored. In this paper, we present an extensive benchmark comprising 6 popular LDP protocols, 7 PP methods, 4 utility metrics, and 6 datasets to evaluate the behaviors and optimality of PP methods under diverse conditions. Through extensive experiments, we show that while PP can substantially improve utility when the privacy budget is small (i.e., strict privacy), its benefit diminishes as the privacy budget grows. Moreover, our findings reveal that the optimal PP method depends on multiple factors, including the choice of LDP protocol, privacy budget, data characteristics (such as distribution and domain size), and the specific utility metric. To advance research in this area and assist practitioners in identifying the most suitable PP method for their setting, we introduce LDP$^3$, an open-source benchmark platform. LDP$^3$ contains all methods used in our experimental analysis, and it is designed in a modular, extensible, and multi-threaded way for future use and development.

Paperid: 2678, https://arxiv.org/pdf/2507.05872.pdf

Abstract:
Local differential privacy (LDP) has become a prominent notion for privacy-preserving data collection. While numerous LDP protocols and post-processing (PP) methods have been developed, selecting an optimal combination under different privacy budgets and datasets remains a challenge. Moreover, the lack of a comprehensive and extensible LDP benchmarking toolkit raises difficulties in evaluating new protocols and PP methods. To address these concerns, this paper presents LDP$^3$ (pronounced LDP-Cube), an open-source, extensible, and multi-threaded toolkit for LDP researchers and practitioners. LDP$^3$ contains implementations of several LDP protocols, PP methods, and utility metrics in a modular and extensible design. Its modular design enables developers to conveniently integrate new protocols and PP methods. Furthermore, its multi-threaded nature enables significant reductions in execution times via parallelization. Experimental evaluations demonstrate that: (i) using LDP$^3$ to select a good protocol and post-processing method substantially improves utility compared to a bad or random choice, and (ii) the multi-threaded design of LDP$^3$ brings substantial benefits in terms of efficiency.

Paperid: 2679, https://arxiv.org/pdf/2507.05660.pdf

Abstract:
Recent advances in foundation models, such as LLMs, have revolutionized conversational AI. Chatbots are increasingly being developed by customizing LLMs on specific conversational datasets. However, mitigating toxicity during this customization, especially when dealing with untrusted training data, remains a significant challenge. To address this, we introduce TuneShield, a defense framework designed to mitigate toxicity during chatbot fine-tuning while preserving conversational quality. TuneShield leverages LLM-based toxicity classification, utilizing the instruction-following capabilities and safety alignment of LLMs to effectively identify toxic samples, outperforming industry API services. TuneShield generates synthetic conversation samples, termed 'healing data', based on the identified toxic samples, using them to mitigate toxicity while reinforcing desirable behavior during fine-tuning. It performs an alignment process to further nudge the chatbot towards producing desired responses. Our findings show that TuneShield effectively mitigates toxicity injection attacks while preserving conversational quality, even when the toxicity classifiers are imperfect or biased. TuneShield proves to be resilient against adaptive adversarial and jailbreak attacks. Additionally, TuneShield demonstrates effectiveness in mitigating adaptive toxicity injection attacks during dialog-based learning (DBL).

Paperid: 2680, https://arxiv.org/pdf/2507.05445.pdf

Abstract:
Computer Use Agents (CUAs), autonomous systems that interact with software interfaces via browsers or virtual machines, are rapidly being deployed in consumer and enterprise environments. These agents introduce novel attack surfaces and trust boundaries that are not captured by traditional threat models. Despite their growing capabilities, the security boundaries of CUAs remain poorly understood. In this paper, we conduct a systematic threat analysis and testing of real-world CUAs under adversarial conditions. We identify seven classes of risks unique to the CUA paradigm, and analyze three concrete exploit scenarios in depth: (1) clickjacking via visual overlays that mislead interface-level reasoning, (2) indirect prompt injection that enables Remote Code Execution (RCE) through chained tool use, and (3) CoT exposure attacks that manipulate implicit interface framing to hijack multi-step reasoning. These case studies reveal deeper architectural flaws across current CUA implementations. Namely, a lack of input provenance tracking, weak interface-action binding, and insufficient control over agent memory and delegation. We conclude by proposing a CUA-specific security evaluation framework and design principles for safe deployment in adversarial and high-stakes settings.

Paperid: 2681, https://arxiv.org/pdf/2507.04916.pdf

Abstract:
Card-based cryptography is a research area to implement cryptographic procedures using a deck of physical cards. In recent years, it has been found to be related to finite group theory and algebraic combinatorics, and is becoming more and more closely connected to the field of mathematics. In this paper, we discuss the relationship between card-based cryptography and combinatorics on words for the first time. In particular, we focus on cyclic equality of words. We say that a set of words are cyclically equalizable if they can be transformed to be cyclically equal by repeated simultaneous insertion of letters. The main result of this paper is to show that two binary words of equal length and equal Hamming weight are cyclically equalizable. As applications of cyclic equalizability to card-based cryptography, we describe its applications to the information erasure problem and to single-cut full-open protocols.

Paperid: 2682, https://arxiv.org/pdf/2507.04372.pdf

Abstract:
Traditional malware detection methods exhibit computational inefficiency due to exhaustive feature extraction requirements, creating accuracy-efficiency trade-offs that limit real-time deployment. We formulate malware classification as a Markov Decision Process with episodic feature acquisition and propose a Dueling Double Deep Q-Network (D3QN) framework for adaptive sequential feature selection. The agent learns to dynamically select informative features per sample before terminating with classification decisions, optimizing both detection accuracy and computational cost through reinforcement learning. We evaluate our approach on Microsoft Big2015 (9-class, 1,795 features) and BODMAS (binary, 2,381 features) datasets. D3QN achieves 99.22% and 98.83% accuracy while utilizing only 61 and 56 features on average, representing 96.6% and 97.6% dimensionality reduction. This yields computational efficiency improvements of 30.1x and 42.5x over traditional ensemble methods. Comprehensive ablation studies demonstrate consistent superiority over Random Forest, XGBoost, and static feature selection approaches. Quantitative analysis demonstrates that D3QN learns non-random feature selection policies with 62.5% deviation from uniform baseline distributions. The learned policies exhibit structured hierarchical preferences, utilizing high-level metadata features for initial assessment while selectively incorporating detailed behavioral features based on classification uncertainty. Feature specialization analysis reveals 57.7% of examined features demonstrate significant class-specific discrimination patterns. Our results validate reinforcement learning-based sequential feature selection for malware classification, achieving superior accuracy with substantial computational reduction through learned adaptive policies.

Paperid: 2683, https://arxiv.org/pdf/2507.04126.pdf

Abstract:
Biometric authentication is a widely used security mechanism that leverages unique physiological or behavioral characteristics to authenticate users. In multi-factor biometrics (MFB), multiple biometric modalities, e.g., physiological and behavioral, are integrated to mitigate the limitations inherent in single-factor biometrics. The main challenge in MFB lies in identifying novel behavioral techniques capable of meeting critical criteria, including high accuracy, high usability, non-invasiveness, resilience against spoofing attacks, and low use of computational resources. Despite ongoing advancements, current behavioral biometric techniques often fall short of fulfilling one or more of these requirements. In this work, we propose BlowPrint, a novel behavioral biometric technique that allows us to authenticate users based on their phone blowing behaviors. In brief, we assume that the way users blow on a phone screen can produce distinctive acoustic patterns, which can serve as a unique biometric identifier for effective user authentication. It can also be seamlessly integrated with physiological techniques, such as facial recognition, to enhance its robustness and security. To assess BlowPrint's effectiveness, we conduct an empirical study involving 50 participants from whom we collect blow-acoustic and facial feature data. Subsequently, we compute the similarity scores of the two modalities using various similarity algorithms and combine them through score-level fusion. Finally, we compute the accuracy using a machine learning-based classifier. As a result, the proposed method demonstrates an accuracy of 99.35% for blow acoustics, 99.96% for facial recognition, and 99.82% for the combined approach. The experimental results demonstrate BlowPrint's high effectiveness in terms of authentication accuracy, spoofing attack resilience, usability, non-invasiveness, and other aspects.

Paperid: 2684, https://arxiv.org/pdf/2507.03323.pdf

Abstract:
Card-based cryptography is a research area that realizes cryptographic protocols such as secure computation by applying shuffles to sequences of cards that encode input values. A single-cut full-open protocol is one that obtains an output value by applying a random cut to an input sequence of cards, after which all cards are opened. In this paper, we propose three single-cut full-open protocols: two protocols for three-variable functions and one protocol for a four-variable function.

Paperid: 2685, https://arxiv.org/pdf/2507.03236.pdf

Abstract:
The safety alignment of Language Models (LMs) is a critical concern, yet their integrity can be challenged by direct parameter manipulation attacks, such as those potentially induced by fault injection. As LMs are increasingly deployed using low-precision quantization for efficiency, this paper investigates the efficacy of such attacks for jailbreaking aligned LMs across different quantization schemes. We propose gradient-guided attacks, including a tailored progressive bit-level search algorithm introduced herein and a comparative word-level (single weight update) attack. Our evaluation on Llama-3.2-3B, Phi-4-mini, and Llama-3-8B across FP16 (baseline), and weight-only quantization (FP8, INT8, INT4) reveals that quantization significantly influences attack success. While attacks readily achieve high success (>80% Attack Success Rate, ASR) on FP16 models, within an attack budget of 25 perturbations, FP8 and INT8 models exhibit ASRs below 20% and 50%, respectively. Increasing the perturbation budget up to 150 bit-flips, FP8 models maintained ASR below 65%, demonstrating some resilience compared to INT8 and INT4 models that have high ASR. In addition, analysis of perturbation locations revealed differing architectural targets across quantization schemes, with (FP16, INT4) and (INT8, FP8) showing similar characteristics. Besides, jailbreaks induced in FP16 models were highly transferable to subsequent FP8/INT8 quantization (<5% ASR difference), though INT4 significantly reduced transferred ASR (avg. 35% drop). These findings highlight that while common quantization schemes, particularly FP8, increase the difficulty of direct parameter manipulation jailbreaks, vulnerabilities can still persist, especially through post-attack quantization.

Paperid: 2686, https://arxiv.org/pdf/2507.02951.pdf

Abstract:
This paper investigates whether Bittensor can be considered the Bitcoin of decentralized Artificial Intelligence by directly comparing its tokenomics, decentralization properties, consensus mechanism, and incentive structure against those of Bitcoin. Leveraging on-chain data from all 64 active Bittensor subnets, we first document considerable concentration in both stake and rewards. We further show that rewards are overwhelmingly driven by stake, highlighting a clear misalignment between quality and compensation. As a remedy, we put forward a series of two-pronged protocol-level interventions. For incentive realignment, our proposed solutions include performance-weighted emission split, composite scoring, and a trust-bonus multiplier. As for mitigating security vulnerability due to stake concentration, we propose and empirically validate stake cap at the 88th percentile, which elevates the median coalition size required for a 51-percent attack and remains robust across daily, weekly, and monthly snapshots.

Paperid: 2687, https://arxiv.org/pdf/2507.02414.pdf

Abstract:
Face identification systems operating in the ciphertext domain have garnered significant attention due to increasing privacy concerns and the potential recovery of original facial data. However, as the size of ciphertext template libraries grows, the face retrieval process becomes progressively more time-intensive. To address this challenge, we propose a novel and efficient scheme for face retrieval in the ciphertext domain, termed Privacy-Preserving Preselection for Face Identification Based on Packing (PFIP). PFIP incorporates an innovative preselection mechanism to reduce computational overhead and a packing module to enhance the flexibility of biometric systems during the enrollment stage. Extensive experiments conducted on the LFW and CASIA datasets demonstrate that PFIP preserves the accuracy of the original face recognition model, achieving a 100% hit rate while retrieving 1,000 ciphertext face templates within 300 milliseconds. Compared to existing approaches, PFIP achieves a nearly 50x improvement in retrieval efficiency.

Paperid: 2688, https://arxiv.org/pdf/2507.02177.pdf

Abstract:
This work investigates how to monitor access to Android zero-permission sensors which could cause privacy leakage to users. Moreover, monitoring such sensitive access allows security researchers to characterize potential sensor abuse patterns. Zero-permission sensors such as accelerometers have become an indispensable part of Android devices. The critical information they provide has attracted extensive research investigating how data collectors could capture more sensor data to enable both benign and exploitative applications. In contrast, little work has explored how to enable data providers, such as end users, to understand sensor usage. While existing methods such as static analysis and hooking-based dynamic analysis face challenges of requiring complicated development chains, rooting privilege, and app-specific reverse engineering analysis, our work aims to bridge this gap by developing ARMOUR for user-space runtime monitoring, leveraging the intrinsic sampling rate variation and convergence behaviors of Android. ARMOUR enables privacy-aware users to easily monitor how third-party apps use sensor data and support security researchers to perform rapid app-agnostic sensor access analysis. Our evaluation with 1,448 commercial applications shows the effectiveness of ARMOUR in detecting sensor usage in obfuscated code and other conditions, and observes salient sensor abuse patterns such as 50% of apps from seemingly sensor-independent categories accessing data of multiple zero-permission sensors. We analyze the impact of Android's recent policy changes on zero-permission sensors and remaining technical and regulatory problems.

Paperid: 2689, https://arxiv.org/pdf/2507.01020.pdf

Abstract:
Large Language Models (LLMs) continue to exhibit vulnerabilities to jailbreaking attacks: carefully crafted malicious inputs intended to circumvent safety guardrails and elicit harmful responses. As such, we present AutoAdv, a novel framework that automates adversarial prompt generation to systematically evaluate and expose vulnerabilities in LLM safety mechanisms. Our approach leverages a parametric attacker LLM to produce semantically disguised malicious prompts through strategic rewriting techniques, specialized system prompts, and optimized hyperparameter configurations. The primary contribution of our work is a dynamic, multi-turn attack methodology that analyzes failed jailbreak attempts and iteratively generates refined follow-up prompts, leveraging techniques such as roleplaying, misdirection, and contextual manipulation. We quantitatively evaluate attack success rate (ASR) using the StrongREJECT (arXiv:2402.10260 [cs.CL]) framework across sequential interaction turns. Through extensive empirical evaluation of state-of-the-art models--including ChatGPT, Llama, and DeepSeek--we reveal significant vulnerabilities, with our automated attacks achieving jailbreak success rates of up to 86% for harmful content generation. Our findings reveal that current safety mechanisms remain susceptible to sophisticated multi-turn attacks, emphasizing the urgent need for more robust defense strategies.

Paperid: 2690, https://arxiv.org/pdf/2512.24391.pdf

Abstract:
Trusted processors provide a way to perform joint computations while preserving data privacy. To overcome the performance degradation caused by data-oblivious algorithms to prevent information leakage, we explore the benefits of oblivious memory (OM) integrated in processors, to which the accesses are unobservable by adversaries. We focus on graph analytics, an important application vulnerable to access-pattern attacks. With a co-design between storage structure and algorithms, our prototype system is 100x faster than baselines given an OM sized around the per-core cache which can be implemented on existing processors with negligible overhead. This gives insights into equipping trusted processors with OM.

Paperid: 2693, https://arxiv.org/pdf/2512.24088.pdf

Abstract:
Mixture-of-Experts architectures have become the standard for scaling large language models due to their superior parameter efficiency. To accommodate the growing number of experts in practice, modern inference systems commonly adopt expert parallelism to distribute experts across devices. However, the absence of explicit load balancing constraints during inference allows adversarial inputs to trigger severe routing concentration. We demonstrate that out-of-distribution prompts can manipulate the routing strategy such that all tokens are consistently routed to the same set of top-$k$ experts, which creates computational bottlenecks on certain devices while forcing others to idle. This converts an efficiency mechanism into a denial-of-service attack vector, leading to violations of service-level agreements for time to first token. We propose RepetitionCurse, a low-cost black-box strategy to exploit this vulnerability. By identifying a universal flaw in MoE router behavior, RepetitionCurse constructs adversarial prompts using simple repetitive token patterns in a model-agnostic manner. On widely deployed MoE models like Mixtral-8x7B, our method increases end-to-end inference latency by 3.063x, degrading service availability significantly.

Paperid: 2695, https://arxiv.org/pdf/2512.23849.pdf

Abstract:
Detection-based security fails against sophisticated attackers using encryption, stealth, and low-rate techniques, particularly in IoT/edge environments where resource constraints preclude ML-based intrusion detection. We present Economic Denial Security (EDS), a detection-independent framework that makes attacks economically infeasible by exploiting a fundamental asymmetry: defenders control their environment while attackers cannot. EDS composes four mechanisms adaptive computational puzzles, decoy-driven interaction entropy, temporal stretching, and bandwidth taxation achieving provably superlinear cost amplification. We formalize EDS as a Stackelberg game, deriving closed-form equilibria for optimal parameter selection (Theorem 1) and proving that mechanism composition yields 2.1x greater costs than the sum of individual mechanisms (Theorem 2). EDS requires < 12KB memory, enabling deployment on ESP32 class microcontrollers. Evaluation on a 20-device heterogeneous IoT testbed across four attack scenarios (n = 30 trials, p < 0.001) demonstrates: 32-560x attack slowdown, 85-520:1 cost asymmetry, 8-62% attack success reduction, < 20ms latency overhead, and close to 0% false positives. Validation against IoT-23 malware (Mirai, Torii, Hajime) shows 88% standalone mitigation; combined with ML-IDS, EDS achieves 94% mitigation versus 67% for IDS alone a 27% improvement. EDS provides detection-independent protection suitable for resource-constrained environments where traditional approaches fail. The ability to detect and mitigate the malware samples tested was enhanced; however, the benefits provided by EDS were realized even without the inclusion of an IDS. Overall, the implementation of EDS serves to shift the economic balance in favor of the defender and provides a viable method to protect IoT and edge systems methodologies.

Paperid: 2696, https://arxiv.org/pdf/2512.23809.pdf

Abstract:
Recent attacks on critical infrastructure, including the 2021 Oldsmar water treatment breach and 2023 Danish energy sector compromises, highlight urgent security gaps in Industrial IoT (IIoT) deployments. While Federated Learning (FL) enables privacy-preserving collaborative intrusion detection, existing frameworks remain vulnerable to Byzantine poisoning attacks and lack robust agent authentication. We propose Zero-Trust Agentic Federated Learning (ZTA-FL), a defense in depth framework combining: (1) TPM-based cryptographic attestation achieving less than 0.0000001 false acceptance rate, (2) a novel SHAP-weighted aggregation algorithm providing explainable Byzantine detection under non-IID conditions with theoretical guarantees, and (3) privacy-preserving on-device adversarial training. Comprehensive experiments across three IDS benchmarks (Edge-IIoTset, CIC-IDS2017, UNSW-NB15) demonstrate that ZTA-FL achieves 97.8 percent detection accuracy, 93.2 percent accuracy under 30 percent Byzantine attacks (outperforming FLAME by 3.1 percent, p less than 0.01), and 89.3 percent adversarial robustness while reducing communication overhead by 34 percent. We provide theoretical analysis, failure mode characterization, and release code for reproducibility.

Paperid: 2697, https://arxiv.org/pdf/2512.22616.pdf

Abstract:
We frame Ethereum transactions reverted by invariants-require()/ assert()/if () revert statements in the contract implementation-as a positive signal of active on-chain defenses. Despite their value, the defensive patterns in these transactions remain undiscovered and underutilized in security research. We present Raven, a framework that aligns reverted transactions to the invariant causing the reversion in the smart contract source code, embeds these invariants using our BERT-based fine-tuned model, and clusters them by semantic intent to mine defensive invariant categories on Ethereum. Evaluated on a sample of 20,000 reverted transactions, Raven achieves cohesive and meaningful clusters of transaction-reverting invariants. Manual expert review of the mined 19 semantic clusters uncovers six new invariant categories absent from existing invariant catalogs, including feature toggles, replay prevention, proof/signature verification, counters, caller-provided slippage thresholds, and allow/ban/bot lists. To demonstrate the practical utility of this invariant catalog mining pipeline, we conduct a case study using one of the newly discovered invariant categories as a fuzzing oracle to detect vulnerabilities in a real-world attack. Raven thus can map Ethereum's successful defenses. These invariant categories enable security researchers to develop analysis tools based on data-driven security oracles extracted from the smart contracts' working defenses.

Paperid: 2698, https://arxiv.org/pdf/2512.22060.pdf

Abstract:
Natural Language Processing (NLP) systems are increasingly used in sensitive domains such as healthcare, finance, and government, where they handle large volumes of personal and regulated data. However, these systems introduce distinct risks related to security, privacy, and regulatory compliance that are not fully addressed by existing AI governance frameworks. This paper introduces the Secure and Compliant NLP Lifecycle Management Framework (SC-NLP-LMF), a comprehensive six-phase model designed to ensure the secure operation of NLP systems from development to retirement. The framework, developed through a systematic PRISMA-based review of 45 peer-reviewed and regulatory sources, aligns with leading standards, including NIST AI RMF, ISO/IEC 42001:2023, the EU AI Act, and MITRE ATLAS. It integrates established methods for bias detection, privacy protection (differential privacy, federated learning), secure deployment, explainability, and secure model decommissioning. A healthcare case study illustrates how SC-NLP-LMF detects emerging terminology drift (e.g., COVID-related language) and guides compliant model updates. The framework offers organizations a practical, lifecycle-wide structure for developing, deploying, and maintaining secure and accountable NLP systems in high-risk environments.

Paperid: 2699, https://arxiv.org/pdf/2512.20303.pdf

Abstract:
The classical two-capacitor paradox of the lost energy is revisited from an electronic circuit security stand-point. The paradox has been solved previously by various researchers, and the energy lost during the charging of capacitors has been primarily attributed to the heat and radiation. We analytically prove this for various standard resistor-capacitor (RC) and resistor-inductor-capacitor (RLC) circuit models. From the perspective of electronic system security, electromagnetic (EM) side-channel analysis (SCA) has recently gained significant prominence with the growth of resource-constrained, internet connected devices. This article connects the energy lost due to capacitor charging to the EM SCA leakage in electronic devices, leading to the recovery of the secret encryption key embedded within the device. Finally, with an understanding of how lost energy relates to EM radiation, we propose adiabatic charging as a solution to minimize EM leakage, thereby paving the way towards low-overhead EM SCA resilience.

Paperid: 2700, https://arxiv.org/pdf/2512.19711.pdf

Abstract:
Connected autonomous vehicles (CAVs) rely on vision-based deep neural networks (DNNs) and low-latency (Vehicle-to-Everything) V2X communication to navigate safely and efficiently. Despite their advances, these systems remain vulnerable to physical adversarial attacks. In this paper, we introduce PHANTOM (PHysical ANamorphic Threats Obstructing connected vehicle Mobility), a novel framework for crafting and deploying perspective-dependent adversarial examples using \textit{anamorphic art}. PHANTOM exploits geometric distortions that appear natural to humans but are misclassified with high confidence by state-of-the-art object detectors. Unlike conventional attacks, PHANTOM operates in black-box settings without model access and demonstrates strong transferability across four diverse detector architectures (YOLOv5, SSD, Faster R-CNN, and RetinaNet). Comprehensive evaluation in CARLA across varying speeds, weather conditions, and lighting scenarios shows that PHANTOM achieves over 90\% attack success rate under optimal conditions and maintains 60-80\% effectiveness even in degraded environments. The attack activates within 6-10 meters of the target, providing insufficient time for safe maneuvering. Beyond individual vehicle deception, PHANTOM triggers network-wide disruption in CAV systems: SUMO-OMNeT++ co-simulation demonstrates that false emergency messages propagate through V2X links, increasing Peak Age of Information by 68-89\% and degrading safety-critical communication. These findings expose critical vulnerabilities in both perception and communication layers of CAV ecosystems.

Paperid: 2701, https://arxiv.org/pdf/2512.19297.pdf

Abstract:
Low-Rank Adaptation (LoRA) has emerged as an efficient method for fine-tuning large language models (LLMs) and is widely adopted within the open-source community. However, the decentralized dissemination of LoRA adapters through platforms such as Hugging Face introduces novel security vulnerabilities: malicious adapters can be easily distributed and evade conventional oversight mechanisms. Despite these risks, backdoor attacks targeting LoRA-based fine-tuning remain relatively underexplored. Existing backdoor attack strategies are ill-suited to this setting, as they often rely on inaccessible training data, fail to account for the structural properties unique to LoRA, or suffer from high false trigger rates (FTR), thereby compromising their stealth. To address these challenges, we propose Causal-Guided Detoxify Backdoor Attack (CBA), a novel backdoor attack framework specifically designed for open-weight LoRA models. CBA operates without access to original training data and achieves high stealth through two key innovations: (1) a coverage-guided data generation pipeline that synthesizes task-aligned inputs via behavioral exploration, and (2) a causal-guided detoxification strategy that merges poisoned and clean adapters by preserving task-critical neurons. Unlike prior approaches, CBA enables post-training control over attack intensity through causal influence-based weight allocation, eliminating the need for repeated retraining. Evaluated across six LoRA models, CBA achieves high attack success rates while reducing FTR by 50-70\% compared to baseline methods. Furthermore, it demonstrates enhanced resistance to state-of-the-art backdoor defenses, highlighting its stealth and robustness.

Paperid: 2702, https://arxiv.org/pdf/2512.18751.pdf

Abstract:
FinTechs increasing connectivity, rapid innovation, and reliance on global digital infrastructures present significant cybersecurity challenges. Traditional cybersecurity frameworks often struggle to identify and prioritize sector-specific vulnerabilities or adapt to evolving adversary tactics, particularly in highly targeted sectors such as FinTech. To address these gaps, we propose ISADM (Integrated STRIDE-ATTACK-D3FEND Threat Model), a novel hybrid methodology applied to FinTech security that integrates STRIDE's asset-centric threat classification with MITRE ATTACK's catalog of real-world adversary behaviors and D3FEND's structured knowledge of countermeasures. ISADM employs a frequency-based scoring mechanism to quantify the prevalence of adversarial Tactics, Techniques, and Procedures (TTPs), enabling a proactive, score-driven risk assessment and prioritization framework. This proactive approach contributes to shifting organizations from reactive defense strategies toward the strategic fortification of critical assets. We validate ISADM through industry-relevant case study analyses, demonstrating how the approach replicates actual attack patterns and strengthens proactive threat modeling, guiding risk prioritization and resource allocation to the most critical vulnerabilities. Overall, ISADM offers a comprehensive hybrid threat modeling methodology that bridges asset-centric and adversary-centric analysis, providing FinTech systems with stronger defenses. The emphasis on real-world validation highlights its practical significance in enhancing the sector's cybersecurity posture through a frequency-informed, impact-aware prioritization scheme that combines empirical attacker data with contextual risk analysis.

Paperid: 2703, https://arxiv.org/pdf/2512.18456.pdf

Abstract:
AI-for-Code (AI4Code) systems are reshaping software engineering, with tools like GitHub Copilot accelerating code generation, translation, and vulnerability detection. Alongside these advances, however, security risks remain pervasive: insecure outputs, biased benchmarks, and susceptibility to adversarial manipulation undermine their reliability. This SoK surveys the landscape of AI4Code security across three core applications, identifying recurring gaps: benchmark dominance by Python and toy problems, lack of standardized security datasets, data leakage in evaluation, and fragile adversarial robustness. A comparative study of six state-of-the-art models illustrates these challenges: insecure patterns persist in code generation, vulnerability detection is brittle to semantic-preserving attacks, fine-tuning often misaligns security objectives, and code translation yields uneven security benefits. From this analysis, we distill three forward paths: embedding secure-by-default practices in code generation, building robust and comprehensive detection benchmarks, and leveraging translation as a route to security-enhanced languages. We call for a shift toward security-first AI4Code, where vulnerability mitigation and robustness are embedded throughout the development life cycle.

Paperid: 2704, https://arxiv.org/pdf/2512.18174.pdf

Abstract:
The current era of AI development places a heavy emphasis on training large models on increasingly scaled-up datasets. This paradigm has catalyzed entirely new product categories, such as LLM chatbots, while also raising concerns about data privacy and consumer choice. In this paper, we consider questions of data portability and user autonomy in the context of LLMs that "reason" using chain-of-thought (CoT) traces, computing intermediate text artifacts from user input before producing a final output. We first interpret recent data privacy and portability law to argue that these intermediate computations qualify as users' personal data. Then, building on the existing framework of Conscious Data Contribution, we show how communities who receive low utility from an available model can aggregate and distill their shared knowledge into an alternate model better aligned with their goals. We verify this approach empirically and investigate the effects of community diversity, reasoning granularity, and community size on distillation performance.

Paperid: 2705, https://arxiv.org/pdf/2512.18043.pdf

Abstract:
Securing Agentic Artificial Intelligence (AI) systems requires addressing the complex cyber risks introduced by autonomous, decision-making, and adaptive behaviors. Agentic AI systems are increasingly deployed across industries, organizations, and critical sectors such as cybersecurity, finance, and healthcare. However, their autonomy introduces unique security challenges, including unauthorized actions, adversarial manipulation, and dynamic environmental interactions. Existing AI security frameworks do not adequately address these challenges or the unique nuances of agentic AI. This research develops a lifecycle-aware security framework specifically designed for agentic AI systems using the Design Science Research (DSR) methodology. The paper introduces MAAIS, an agentic security framework, and the agentic AI CIAA (Confidentiality, Integrity, Availability, and Accountability) concept. MAAIS integrates multiple defense layers to maintain CIAA across the AI lifecycle. Framework validation is conducted by mapping with the established MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) AI tactics. The study contributes a structured, standardized, and framework-based approach for the secure deployment and governance of agentic AI in enterprise environments. This framework is intended for enterprise CISOs, security, AI platform, and engineering teams and offers a detailed step-by-step approach to securing agentic AI workloads.

Paperid: 2706, https://arxiv.org/pdf/2512.17902.pdf

Abstract:
With the increase in deep learning, it becomes increasingly difficult to understand the model in which AI systems can identify objects. Thus, an adversary could aim to modify an image by adding unseen elements, which will confuse the AI in its recognition of an entity. This paper thus investigates the adversarial robustness of LLaVA-1.5-13B and Meta's Llama 3.2 Vision-8B-2. These are tested for untargeted PGD (Projected Gradient Descent) against the visual input modality, and empirically evaluated on the Visual Question Answering (VQA) v2 dataset subset. The results of these adversarial attacks are then quantified using the standard VQA accuracy metric. This evaluation is then compared with the accuracy degradation (accuracy drop) of LLaVA and Llama 3.2 Vision. A key finding is that Llama 3.2 Vision, despite a lower baseline accuracy in this setup, exhibited a smaller drop in performance under attack compared to LLaVA, particularly at higher perturbation levels. Overall, the findings confirm that the vision modality represents a viable attack vector for degrading the performance of contemporary open-weight VLMs, including Meta's Llama 3.2 Vision. Furthermore, they highlight that adversarial robustness does not necessarily correlate directly with standard benchmark performance and may be influenced by underlying architectural and training factors.

Paperid: 2707, https://arxiv.org/pdf/2512.17748.pdf

Abstract:
The rise of quantum computing/technology potentially introduces significant security challenges to cloud computing, necessitating quantum-resistant encryption strategies as well as protection schemes and methods for cloud infrastructures offering quantum computing time and services (i.e. quantum clouds). This research explores various options for securing quantum clouds and ensuring privacy, especially focussing on the integration of homomorphic encryption (HE) into Eclipse Qrisp, a high-level quantum computing framework, to enhance the security of quantum cloud platforms. The study addresses the technical feasibility of integrating HE with Qrisp, evaluates performance trade-offs, and assesses the potential impact on future quantum cloud architectures. The successful implementation and Qrisp integration of three post-quantum cryptographic (PQC) algorithms demonstrates the feasibility of integrating HE with quantum computing frameworks. The findings indicate that while the Quantum One-Time Pad (QOTP) offers simplicity and low overhead, other algorithms like Chen and Gentry-Sahai-Waters (GSW) present performance trade-offs in terms of runtime and memory consumption. The study results in an overall set of recommendations for securing quantum clouds, e.g. implementing HE at data storage and processing levels, developing Quantum Key Distribution (QKD), and enforcing stringent access control and authentication mechanisms as well as participating in PQC standardization efforts.

Paperid: 2708, https://arxiv.org/pdf/2512.16719.pdf

Abstract:
This paper introduces an adaptive preprocessing technique to enhance the accuracy of channel state information-based physical layer authentication (CSI-PLA) alleviating CSI variations and inconsistencies in the time domain. To this end, we develop an adaptive robust principal component analysis (A-RPCA) preprocessing method based on robust principal component analysis (RPCA). The performance evaluation is then conducted using a PLA framework based on information reconciliation, in which Gaussian approximation (GA) for Polar codes is leveraged for the design of short codelength Slepian Wolf decoders. Furthermore, an analysis of the proposed A-RPCA methods is carried out. Simulation results show that compared to a baseline scheme without preprocessing and without reconciliation, the proposed A-RPCA method substantially reduces the error probability after reconciliation and also substantially increases the detection probabilities that is also 1 in both line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios. We have compared against state-of the-art preprocessing schemes in both synthetic and real datasets, including principal component analysis (PCA) and robust PCA, autoencoders and the recursive projected compressive sensing (ReProCS) framework and we have validated the superior performance of the proposed approach.

Paperid: 2709, https://arxiv.org/pdf/2512.16394.pdf

Abstract:
The Internet of Things (IoT) has the potential to enhance older adults' independence and quality of life, but it also exposes them to security, privacy, accessibility, and usability (SPAU) risks. We conducted a systematic review of 44 peer-reviewed studies published between 2004 and 2024 using a five-phase screening pipeline. From each study, we extracted data on study design, IoT type, SPAU measures, and identified research gaps. We introduce the SPAU-IoT Framework, which comprises 27 criteria across four dimensions: security (e.g., resilience to cyber threats, secure authentication, encrypted communication, secure-by-default settings, and guardianship features), privacy (e.g., data minimization, explicit consent, and privacy-preserving analytics), accessibility (e.g., compliance with ADA/WCAG standards and assistive-technology compatibility), and usability (e.g., guided interaction, integrated assistance, and progressive learning). Applying this framework revealed that more than 70% of studies implemented authentication and encryption mechanisms, whereas fewer than 50% addressed accessibility or usability concerns. We further developed a threat model that maps IoT assets, networks, and backend servers to exploit vectors such as phishing, caregiver exploitation, and weak-password attacks, explicitly accounting for age-related vulnerabilities including cognitive decline and sensory impairment. Our results expose a systemic lack of integrated SPAU approaches in existing IoT research and translate these gaps into actionable, standards-aligned design guidelines for IoT systems designed for older adults.

Paperid: 2710, https://arxiv.org/pdf/2512.16307.pdf

Abstract:
In this fast-evolving area of LLMs, our paper discusses the significant security risk presented by prompt injection attacks. It focuses on small open-sourced models, specifically the LLaMA family of models. We introduce novel defense mechanisms capable of generating automatic defenses and systematically evaluate said generated defenses against a comprehensive set of benchmarked attacks. Thus, we empirically demonstrated the improvement proposed by our approach in mitigating goal-hijacking vulnerabilities in LLMs. Our work recognizes the increasing relevance of small open-sourced LLMs and their potential for broad deployments on edge devices, aligning with future trends in LLM applications. We contribute to the greater ecosystem of open-source LLMs and their security in the following: (1) assessing present prompt-based defenses against the latest attacks, (2) introducing a new framework using a seed defense (Chain Of Thoughts) to refine the defense prompts iteratively, and (3) showing significant improvements in detecting goal hijacking attacks. Out strategies significantly reduce the success rates of the attacks and false detection rates while at the same time effectively detecting goal-hijacking capabilities, paving the way for more secure and efficient deployments of small and open-source LLMs in resource-constrained environments.

Paperid: 2711, https://arxiv.org/pdf/2512.15990.pdf

Abstract:
Quantum Key Distribution (QKD) schemes are key exchange protocols based on the physical properties of quantum channels. They avoid the computational-hardness assumptions that underlie the security of classical key exchange. Continuous-Variable QKD (CVQKD), in contrast to qubit-based discrete-variable (DV) schemes, makes use of quadrature measurements of the electromagnetic field. CVQKD has the advantage of being compatible with standard telecom equipment, but at long distances has to deal with very low signal to noise ratios, which necessitates labour-intensive error correction. It is challenging to implement the error correction decoding in realtime. In this paper we introduce a random-codebook error correction method that is suitable for long range Gaussian-modulated CVQKD. We use likelihood ratio scoring with block rejection based on thresholding. For proof-technical reasons, the accept/reject decisions are communicated in encrypted form; in this way we avoid having to deal with non-Gaussian states in the analysis of the leakage. The error correction method is highly parallelisable, which is advantageous for realtime implementation. Under conservative assumptions on the computational resources, we predict a realtime key ratio of at least 8% of the Devetak-Winter value, which outperforms existing reconciliation schemes.

Paperid: 2712, https://arxiv.org/pdf/2512.15794.pdf

Abstract:
This paper analyzes the relationship between cybersecurity management, data protection, and corporate reputation in the context of digital transformation. The study examines how organizations implement strategies and tools to mitigate cyber risks, comply with regulatory requirements, and maintain stakeholder trust. A quantitative research design was applied using an online diagnostic survey conducted among enterprises from various industries operating in Poland. The analysis covered formal cybersecurity strategies, technical and procedural safeguards, employee awareness, incident response practices, and the adoption of international standards such as ISO/IEC 27001 and ISO/IEC 27032. The findings indicate that most organizations have formalized cybersecurity frameworks, conduct regular audits, and invest in employee awareness programs. Despite this high level of preparedness, 75 percent of surveyed firms experienced cybersecurity incidents within the previous twelve months. The most frequently reported consequences were reputational damage and loss of customer trust, followed by operational disruptions and financial or regulatory impacts. The results show that cybersecurity is increasingly perceived as a strategic investment supporting long-term organizational stability rather than merely a compliance cost. The study highlights the importance of integrating cybersecurity governance with corporate communication and reputation management, emphasizing data protection as a key determinant of digital trust and organizational resilience.

Paperid: 2713, https://arxiv.org/pdf/2512.15779.pdf

Abstract:
Network Intrusion Detection Systems (NIDS) are essential for securing networks by identifying and mitigating unauthorized activities indicative of cyberattacks. As cyber threats grow increasingly sophisticated, NIDS must evolve to detect both emerging threats and deviations from normal behavior. This study explores the application of machine learning (ML) methods to improve the NIDS accuracy through analyzing intricate structures in deep-featured network traffic records. Leveraging the 1999 KDD CUP intrusion dataset as a benchmark, this research evaluates and optimizes several ML algorithms, including Support Vector Machines (SVM), Naïve Bayes variants (MNB, BNB), Random Forest (RF), k-Nearest Neighbors (k-NN), Decision Trees (DT), AdaBoost, XGBoost, Logistic Regression (LR), Ridge Classifier, Passive-Aggressive (PA) Classifier, Rocchio Classifier, Artificial Neural Networks (ANN), and Perceptron (PPN). Initial evaluations without hyper-parameter optimization demonstrated suboptimal performance, highlighting the importance of tuning to enhance classification accuracy. After hyper-parameter optimization using grid and random search techniques, the SVM classifier achieved 99.12% accuracy with a 0.0091 False Alarm Rate (FAR), outperforming its default configuration (98.08% accuracy, 0.0123 FAR) and all other classifiers. This result confirms that SVM accomplishes the highest accuracy among the evaluated classifiers. We validated the effectiveness of all classifiers using a tenfold cross-validation approach, incorporating Recursive Feature Elimination (RFE) for feature selection to enhance the classifiers accuracy and efficiency. Our outcomes indicate that ML classifiers are both adaptable and reliable, contributing to enhanced accuracy in systems for detecting network intrusions.

Paperid: 2714, https://arxiv.org/pdf/2512.15447.pdf

Abstract:
Reusable software components, typically distributed as packages, are a central paradigm of modern software development. The JavaScript ecosystem serves as a prime example, offering millions of packages with their use being promoted as idiomatic. However, download statistics on npm raise security concerns as they indicate a high popularity of vulnerable package versions while their real prevalence on production websites remains unknown. Package version detection mechanisms fill this gap by extracting utilized packages and versions from observed artifacts on the web. Prior research focuses on mechanisms for either hand-selected popular packages in bundles or for single-file resources utilizing the global namespace. This does not allow for a thorough analysis of modern web applications' dependency update behavior at scale. In this work, we improve upon this by presenting Aletheia, a package-agnostic method which dissects JavaScript bundles to identify package versions through algorithms originating from the field of plagiarism detection. We show that this method clearly outperforms the existing approaches in practical settings. Furthermore, we crawl the Tranco top 100,000 domains to reveal that 5% - 20% of domains update their dependencies within 16 weeks. Surprisingly, from a longitudinal perspective, bundled packages are updated significantly faster than their CDN-included counterparts, with consequently up to 10 times fewer known vulnerable package versions included. Still, we observe indicators that few widespread vendors seem to be a major driving force behind timely updates, implying that quantitative measures are not painting a complete picture.

Paperid: 2715, https://arxiv.org/pdf/2512.15275.pdf

Abstract:
Adversary emulation is an essential procedure for cybersecurity assessments such as evaluating an organization's security posture or facilitating structured training and research in dedicated environments. To allow for systematic and time-efficient assessments, several approaches from academia and industry have worked towards the automation of adversarial actions. However, they exhibit significant limitations regarding autonomy, tactics coverage, and real-world applicability. Consequently, adversary emulation remains a predominantly manual task requiring substantial human effort and security expertise - even amidst the rise of Large Language Models. In this paper, we present Bounty Hunter, an automated adversary emulation method, designed and implemented as an open-source plugin for the popular adversary emulation platform Caldera, that enables autonomous emulation of adversaries with multi-faceted behavior while providing a wide coverage of tactics. To this end, it realizes diverse adversarial behavior, such as different levels of detectability and varying attack paths across repeated emulations. By autonomously compromising a simulated enterprise network, Bounty Hunter showcases its ability to achieve given objectives without prior knowledge of its target, including pre-compromise, initial compromise, and post-compromise attack tactics. Overall, Bounty Hunter facilitates autonomous, comprehensive, and multi-faceted adversary emulation to help researchers and practitioners in performing realistic and time-efficient security assessments, training exercises, and intrusion detection research.

Paperid: 2716, https://arxiv.org/pdf/2512.15037.pdf

Abstract:
Reverse engineering of gate-level netlist is critical for Hardware Trojans detection and Design Piracy counteracting. The primary task of gate-level reverse engineering is to separate the control and data signals from the netlist, which is mainly realized by identifying state registers with topological comparison.However, these methods become inefficient for large scale netlist. In this work, we propose RELIC-GNN, a graph neural network based state registers identification method, to address these issues. RELIC-GNN models the path structure of register as a graph and generates corresponding representation by considering node attributes and graph structure during training. The trained GNN model could be adopted to find the registers type very efficiently. Experimental results show that RELIC-GNN could achieve 100% in recall, 30.49% in precision and 88.37% in accuracy on average across different designs, which obtains significant improvements than previous approaches.

Paperid: 2717, https://arxiv.org/pdf/2512.14745.pdf

Abstract:
As AI capabilities advance, we increasingly rely on powerful models to decompose complex tasks $\unicode{x2013}$ but what if the decomposer itself is malicious? Factored cognition protocols decompose complex tasks into simpler child tasks: one model creates the decomposition, while other models implement the child tasks in isolation. Prior work uses trusted (weaker but reliable) models for decomposition, which limits usefulness for tasks where decomposition itself is challenging. We introduce Factor($U$,$T$), in which an untrusted (stronger but potentially malicious) model decomposes while trusted models implement child tasks. Can monitors detect malicious activity when observing only natural language task instructions, rather than complete solutions? We baseline and red team Factor($U$,$T$) in control evaluations on BigCodeBench, a dataset of Python coding tasks. Monitors distinguishing malicious from honest decompositions perform poorly (AUROC 0.52) compared to monitors evaluating complete Python solutions (AUROC 0.96). Furthermore, Factor($D$,$U$), which uses a trusted decomposer and monitors concrete child solutions, achieves excellent discrimination (AUROC 0.96) and strong safety (1.2% ASR), demonstrating that implementation-context monitoring succeeds where decomposition-only monitoring fails.

Paperid: 2718, https://arxiv.org/pdf/2512.13501.pdf

Abstract:
Machine learning based intrusion detection systems are increasingly targeted by black box adversarial attacks, where attackers craft evasive inputs using indirect feedback such as binary outputs or behavioral signals like response time and resource usage. While several defenses have been proposed, including input transformation, adversarial training, and surrogate detection, they often fall short in practice. Most are tailored to specific attack types, require internal model access, or rely on static mechanisms that fail to generalize across evolving attack strategies. Furthermore, defenses such as input transformation can degrade intrusion detection system performance, making them unsuitable for real time deployment. To address these limitations, we propose Adaptive Feature Poisoning, a lightweight and proactive defense mechanism designed specifically for realistic black box scenarios. Adaptive Feature Poisoning assumes that probing can occur silently and continuously, and introduces dynamic and context aware perturbations to selected traffic features, corrupting the attacker feedback loop without impacting detection capabilities. The method leverages traffic profiling, change point detection, and adaptive scaling to selectively perturb features that an attacker is likely exploiting, based on observed deviations. We evaluate Adaptive Feature Poisoning against multiple realistic adversarial attack strategies, including silent probing, transferability based attacks, and decision boundary based attacks. The results demonstrate its ability to confuse attackers, degrade attack effectiveness, and preserve detection performance. By offering a generalizable, attack agnostic, and undetectable defense, Adaptive Feature Poisoning represents a significant step toward practical and robust adversarial resilience in machine learning based intrusion detection systems.

Paperid: 2719, https://arxiv.org/pdf/2512.12829.pdf

Abstract:
Space infrastructures represent an emerging domain that is critical to the global economy and society. However, this domain is vulnerable to attacks. To enhance the resilience of this domain, we must understand the attacks that can be waged against it. The status quo is that there is no systematic understanding of attacks against space infrastructures, despite their importance in guiding systematic analysis of space cybersecurity and future research. In this paper, we fill the void by proposing the first systematic taxonomy of attacks against space infrastructures. We hope this paper will inspire a community effort at refining the taxonomy towards a widely used taxonomy.

Paperid: 2720, https://arxiv.org/pdf/2512.12583.pdf

Abstract:
Prompt injection attacks can compromise the security and stability of critical systems, from infrastructure to large web applications. This work curates and augments a prompt injection dataset based on the HackAPrompt Playground Submissions corpus and trains several classifiers, including LSTM, feed forward neural networks, Random Forest, and Naive Bayes, to detect malicious prompts in LLM integrated web applications. The proposed approach improves prompt injection detection and mitigation, helping protect targeted applications and systems.

Paperid: 2721, https://arxiv.org/pdf/2512.12580.pdf

Abstract:
This article introduces a novel cryptographic paradigm based on nonderived polyadic algebraic structures. Traditional cryptosystems rely on binary operations within groups, rings, or fields, whose well-understood properties can be exploited in cryptanalysis. To overcome these vulnerabilities, we propose a shift to polyadic rings, which generalize classical rings by allowing operations of higher arity: an $m$-ary addition and an $n$-ary multiplication. The foundation of our approach is the construction of polyadic integers -- congruence classes of ordinary integers endowed with such $m$-ary and $n$-ary operations. A key innovation is the parameter-to-arity mapping $Φ(a,b)=(m,n)$, which links the parameters $(a,b)$ defining a congruence class to the specific arities required for algebraic closure. This mapping is mathematically intricate: it is non-injective, non-surjective, and multivalued. This complex, non-unique relationship forms the core of the proposed cryptosystem's security. We present two concrete encryption procedures that leverage this structure by encoding plaintext within the parameters of polyadic rings and transmitting information via polyadically quantized analog signals. In one method, plaintext is linked to the additive arity $m_{i}$ and secured using the summation of such signals; in the other, it is linked to a ring parameter $a_{i}$ and secured using their multiplication. In both cases, the "quantized" nature of polyadic operations generates systems of equations that are straightforward for a legitimate recipient with the correct key but exceptionally difficult for an attacker without it. The resulting framework promises a substantial increase in cryptographic security. This work establishes the theoretical foundation for this new class of encryption schemes and highlights their potential for constructing robust, next-generation cryptographic protocols.

Paperid: 2722, https://arxiv.org/pdf/2512.10653.pdf

Abstract:
Face anti-spoofing (FAS) is a vital component of remote biometric authentication systems based on facial recognition, increasingly used across web-based applications. Among emerging threats, video injection attacks -- facilitated by technologies such as deepfakes and virtual camera software -- pose significant challenges to system integrity. While virtual camera detection (VCD) has shown potential as a countermeasure, existing literature offers limited insight into its practical implementation and evaluation. This study introduces a machine learning-based approach to VCD, with a focus on its design and validation. The model is trained on metadata collected during sessions with authentic users. Empirical results demonstrate its effectiveness in identifying video injection attempts and reducing the risk of malicious users bypassing FAS systems.

Paperid: 2723, https://arxiv.org/pdf/2512.10296.pdf

Abstract:
Federated Learning (FL) enables collaborative model training across distributed devices while safeguarding data and user privacy. However, FL remains susceptible to privacy threats that can compromise data via direct means. That said, indirectly compromising the confidentiality of the FL model architecture (e.g., a convolutional neural network (CNN) or a recurrent neural network (RNN)) on a client device by an outsider remains unexplored. If leaked, this information can enable next-level attacks tailored to the architecture. This paper proposes a novel side-channel fingerprinting attack, leveraging flow-level and packet-level statistics of encrypted wireless traffic from an FL client to infer its deep learning model architecture. We name it FLARE, a fingerprinting framework based on FL Architecture REconnaissance. Evaluation across various CNN and RNN variants-including pre-trained and custom models trained over IEEE 802.11 Wi-Fi-shows that FLARE achieves over 98% F1-score in closed-world and up to 91% in open-world scenarios. These results reveal that CNN and RNN models leak distinguishable traffic patterns, enabling architecture fingerprinting even under realistic FL settings with hardware, software, and data heterogeneity. To our knowledge, this is the first work to fingerprint FL model architectures by sniffing encrypted wireless traffic, exposing a critical side-channel vulnerability in current FL systems.

Paperid: 2724, https://arxiv.org/pdf/2512.10185.pdf

Abstract:
A recent watermarking scheme for language models achieves distortion-free embedding and robustness to edit-distance attacks. However, it suffers from limited generation diversity and high detection overhead. In parallel, recent research has focused on undetectability, a property ensuring that watermarks remain difficult for adversaries to detect and spoof. In this work, we introduce a new class of watermarking schemes constructed through probabilistic automata. We present two instantiations: (i) a practical scheme with exponential generation diversity and computational efficiency, and (ii) a theoretical construction with formal undetectability guarantees under cryptographic assumptions. Extensive experiments on LLaMA-3B and Mistral-7B validate the superior performance of our scheme in terms of robustness and efficiency.

Paperid: 2725, https://arxiv.org/pdf/2512.09957.pdf

Abstract:
Access control policies are vital for securing modern cloud computing, where organizations must manage access to sensitive data across thousands of users in distributed system settings. Cloud administrators typically write and update policies manually, which can be an error-prone and time-consuming process and can potentially lead to security vulnerabilities. Existing approaches based on symbolic analysis have demonstrated success in automated debugging and repairing access control policies; however, their generalizability is limited in the context of cloud-based access control. Conversely, Large Language Models (LLMs) have been utilized for automated program repair; however, their applicability to repairing cloud access control policies remains unexplored. In this work, we introduce CloudFix, the first automated policy repair framework for cloud access control that combines formal methods with LLMs. Given an access control policy and a specification of allowed and denied access requests, CloudFix employs Formal Methods-based Fault Localization to identify faulty statements in the policy and leverages LLMs to generate potential repairs, which are then verified using SMT solvers. To evaluate CloudFix, we curated a dataset of 282 real-world AWS access control policies extracted from forum posts and augmented them with synthetically generated request sets based on real scenarios. Our experimental results show that CloudFix improves repair accuracy over a Baseline implementation across varying request sizes. Our work is the first to leverage LLMs for policy repair, showcasing the effectiveness of LLMs for access control and enabling efficient and automated repair of cloud access control policies. We make our tool Cloudfix and AWS dataset publicly available.

Paperid: 2726, https://arxiv.org/pdf/2512.09941.pdf

Abstract:
In this paper we study a basic and natural question about Fourier analysis of Boolean functions, which has applications to the study of Matching Vector based Private Information Retrieval (PIR) schemes. For integers m and r, define a delta function on {0,1}^r to be a function f: Z_m^r -> C with f(0) = 1 and f(x) = 0 for all nonzero Boolean x. The basic question we study is how small the Fourier sparsity of a delta function can be; namely how sparse such an f can be in the Fourier basis? In addition to being intrinsically interesting and natural, such questions arise naturally when studying "S-decoding polynomials" for the known matching vector families. Finding S-decoding polynomials of reduced sparsity, which corresponds to finding delta functions with low Fourier sparsity, would improve the current best PIR schemes. We show nontrivial upper and lower bounds on the Fourier sparsity of delta functions. Our proofs are elementary and clean. These results imply limitations on improving Matching Vector PIR schemes simply by finding better S-decoding polynomials. In particular, there are no S-decoding polynomials that can make Matching Vector PIRs based on the known matching vector families achieve polylogarithmic communication with a constant number of servers. Many interesting questions remain open.

Paperid: 2727, https://arxiv.org/pdf/2512.09934.pdf

Abstract:
The growing presence of IoT devices in academic environments has increased operational complexity and exposed security weaknesses, especially in academic institutions without unified policies for registration, monitoring, and incident response involving IoT. This work presents IoTEdu, an integrated platform that combines access control, incident detection, and automatic blocking of IoT devices. The solution was evaluated in a controlled environment with simulated attacks, achieving an average time of 28.6 seconds between detection and blocking. The results show a reduction in manual intervention, standardization of responses, and unification of the processes of registration, monitoring, and incident response.

Paperid: 2728, https://arxiv.org/pdf/2512.09549.pdf

Abstract:
Large language models (LLMs) are increasingly prevalent in security research. Their unique characteristics, however, introduce challenges that undermine established paradigms of reproducibility, rigor, and evaluation. Prior work has identified common pitfalls in traditional machine learning research, but these studies predate the advent of LLMs. In this paper, we identify nine common pitfalls that have become (more) relevant with the emergence of LLMs and that can compromise the validity of research involving them. These pitfalls span the entire computation process, from data collection, pre-training, and fine-tuning to prompting and evaluation. We assess the prevalence of these pitfalls across all 72 peer-reviewed papers published at leading Security and Software Engineering venues between 2023 and 2024. We find that every paper contains at least one pitfall, and each pitfall appears in multiple papers. Yet only 15.7% of the present pitfalls were explicitly discussed, suggesting that the majority remain unrecognized. To understand their practical impact, we conduct four empirical case studies showing how individual pitfalls can mislead evaluation, inflate performance, or impair reproducibility. Based on our findings, we offer actionable guidelines to support the community in future work.

Paperid: 2729, https://arxiv.org/pdf/2512.09385.pdf

Abstract:
The rapid growth of Ethereum has made it more important to quickly and accurately detect smart contract vulnerabilities. While machine-learning-based methods have shown some promise, many still rely on rule-based preprocessing designed by domain experts. Rule-based preprocessing methods often discard crucial context from the source code, potentially causing certain vulnerabilities to be overlooked and limiting adaptability to newly emerging threats. We introduce BugSweeper, an end-to-end deep learning framework that detects vulnerabilities directly from the source code without manual engineering. BugSweeper represents each Solidity function as a Function-Level Abstract Syntax Graph (FLAG), a novel graph that combines its Abstract Syntax Tree (AST) with enriched control-flow and data-flow semantics. Then, our two-stage Graph Neural Network (GNN) analyzes these graphs. The first-stage GNN filters noise from the syntax graphs, while the second-stage GNN conducts high-level reasoning to detect diverse vulnerabilities. Extensive experiments on real-world contracts show that BugSweeper significantly outperforms all state-of-the-art detection methods. By removing the need for handcrafted rules, our approach offers a robust, automated, and scalable solution for securing smart contracts without any dependence on security experts.

Paperid: 2730, https://arxiv.org/pdf/2512.09309.pdf

Abstract:
Nowadays, visual intelligence tools have become ubiquitous, offering all kinds of convenience and possibilities. However, these tools have high computational requirements that exceed the capabilities of resource-constrained mobile and wearable devices. While offloading visual data to the cloud is a common solution, it introduces significant privacy vulnerabilities during transmission and server-side computation. To address this, we propose a novel distributed, hierarchical offloading framework for Vision Transformers (ViTs) that addresses these privacy challenges by design. Our approach uses a local trusted edge device, such as a mobile phone or an Nvidia Jetson, as the edge orchestrator. This orchestrator partitions the user's visual data into smaller portions and distributes them across multiple independent cloud servers. By design, no single external server possesses the complete image, preventing comprehensive data reconstruction. The final data merging and aggregation computation occurs exclusively on the user's trusted edge device. We apply our framework to the Segment Anything Model (SAM) as a practical case study, which demonstrates that our method substantially enhances content privacy over traditional cloud-based approaches. Evaluations show our framework maintains near-baseline segmentation performance while substantially reducing the risk of content reconstruction and user data exposure. Our framework provides a scalable, privacy-preserving solution for vision tasks in the edge-cloud continuum.

Paperid: 2731, https://arxiv.org/pdf/2512.09150.pdf

Abstract:
Counterfeit products pose significant risks to public health and safety through infiltrating untrusted supply chains. Among numerous anti-counterfeiting techniques, leveraging inherent, unclonable microscopic irregularities of paper surfaces is an accurate and cost-effective solution. Prior work of this approach has focused on enabling ubiquitous acquisition of these physically unclonable features (PUFs). However, we will show that existing authentication methods relying on paper surface PUFs may be vulnerable to adversaries, resulting in a gap between technological feasibility and secure real-world deployment. This gap is investigated through formalizing an operational framework for paper-PUF-based authentication. Informed by this framework, we reveal system-level vulnerabilities across both physical and digital domains, designing physical denial-of-service and digital forgery attacks to disrupt proper authentication. The effectiveness of the designed attacks underscores the strong need for security countermeasures for reliable and resilient authentication based on paper PUFs. The proposed framework further facilitates a comprehensive, stage-by-stage security analysis, guiding the design of future counterfeit prevention systems. This analysis delves into potential attack strategies, offering a foundational understanding of how various system components, such as physical features and verification processes, might be exploited by adversaries.

Paperid: 2732, https://arxiv.org/pdf/2512.08802.pdf

Abstract:
Despite advancements in machine learning for security, rule-based detection remains prevalent in Security Operations Centers due to the resource intensiveness and skill gap associated with ML solutions. While traditional rule-based methods offer efficiency, their rigidity leads to high false positives or negatives and requires continuous manual maintenance. This paper proposes a novel, two-stage hybrid framework to democratize ML-based threat detection. The first stage employs intentionally loose YARA rules for coarse-grained filtering, optimized for high recall. The second stage utilizes an ML classifier to filter out false positives from the first stage's output. To overcome data scarcity, the system leverages Simula, a seedless synthetic data generation framework, enabling security analysts to create high-quality training datasets without extensive data science expertise or pre-labeled examples. A continuous feedback loop incorporates real-time investigation results to adaptively tune the ML model, preventing rule degradation. This proposed model with active learning has been rigorously tested for a prolonged time in a production environment spanning tens of thousands of systems. The system handles initial raw log volumes often reaching 250 billion events per day, significantly reducing them through filtering and ML inference to a handful of daily tickets for human investigation. Live experiments over an extended timeline demonstrate a general improvement in the model's precision over time due to the active learning feature. This approach offers a self-sustained, low-overhead, and low-maintenance solution, allowing security professionals to guide model learning as expert ``teachers''.

Paperid: 2733, https://arxiv.org/pdf/2512.07574.pdf

Abstract:
Accurate three-dimensional delineation of liver tumors on contrast-enhanced CT is a prerequisite for treatment planning, navigation and response assessment, yet manual contouring is slow, observer-dependent and difficult to standardise across centres. Automatic segmentation is complicated by low lesion-parenchyma contrast, blurred or incomplete boundaries, heterogeneous enhancement patterns, and confounding structures such as vessels and adjacent organs. We propose a hybrid framework that couples an attention-enhanced cascaded U-Net with handcrafted radiomics and voxel-wise 3D CNN refinement for joint liver and liver-tumor segmentation. First, a 2.5D two-stage network with a densely connected encoder, sub-pixel convolution decoders and multi-scale attention gates produces initial liver and tumor probability maps from short stacks of axial slices. Inter-slice temporal consistency is then enforced by a simple three-slice refinement rule along the cranio-caudal direction, which restores thin and tiny lesions while suppressing isolated noise. Next, 728 radiomic descriptors spanning intensity, texture, shape, boundary and wavelet feature groups are extracted from candidate lesions and reduced to 20 stable, highly informative features via multi-strategy feature selection; a random forest classifier uses these features to reject false-positive regions. Finally, a compact 3D patch-based CNN derived from AlexNet operates in a narrow band around the tumor boundary to perform voxel-level relabelling and contour smoothing.

Paperid: 2734, https://arxiv.org/pdf/2512.06971.pdf

Abstract:
We study the classic problem of prediction with expert advice under the constraint of local differential privacy (LDP). In this context, we first show that a classical algorithm naturally satisfies LDP and then design two new algorithms that improve it: RW-AdaBatch and RW-Meta. For RW-AdaBatch, we exploit the limited-switching behavior induced by LDP to provide a novel form of privacy amplification that grows stronger on easier data, analogous to the shuffle model in offline learning. Drawing on the theory of random walks, we prove that this improvement carries essentially no utility cost. For RW-Meta, we develop a general method for privately selecting between experts that are themselves non-trivial learning algorithms, and we show that in the context of LDP this carries no extra privacy cost. In contrast, prior work has only considered data-independent experts. We also derive formal regret bounds that scale inversely with the degree of independence between experts. Our analysis is supplemented by evaluation on real-world data reported by hospitals during the COVID-19 pandemic; RW-Meta outperforms both the classical baseline and a state-of-the-art \textit{central} DP algorithm by 1.5-3$\times$ on the task of predicting which hospital will report the highest density of COVID patients each week.

Paperid: 2735, https://arxiv.org/pdf/2512.06557.pdf

Abstract:
The increasing sophistication of cyber threats demands novel approaches to characterize adversarial strategies, particularly those targeting critical infrastructure and IoT ecosystems. This paper presents a longitudinal analysis of attacker behavior using HoneyTrap, an adaptive honeypot framework deployed across geographically distributed nodes to emulate vulnerable services and safely capture malicious traffic. Over a 24 day observation window, more than 60.3 million events were collected. To enable scalable analytics, raw JSON logs were transformed into Apache Parquet, achieving 5.8 - 9.3x compression and 7.2x faster queries, while ASN enrichment and salted SHA-256 pseudonymization added network intelligence and privacy preservation. Our analysis reveals three key findings: (1) The majority of traffic targeted HTTP and HTTPS services (ports 80 and 443), with more than 8 million connection attempts and daily peaks exceeding 1.7 million events. (2) SSH (port 22) was frequently subject to brute-force attacks, with over 4.6 million attempts. (3) Less common services like Minecraft (25565) and SMB (445) were also targeted, with Minecraft receiving about 118,000 daily attempts that often coincided with spikes on other ports.

Paperid: 2736, https://arxiv.org/pdf/2512.06500.pdf

Abstract:
Trusted Execution Environments (TEEs) such as ARM TrustZone are widely used in IoT and embedded devices to protect sensitive code and data. However, most existing defenses focus on secure boot or REE-side monitoring and provide little visibility into the runtime integrity of the TEE. This leaves TrustZone-based devices exposed to persistent TEE compromises. We propose Policy-Driven Runtime Integrity Measurement and Attestation (PDRIMA), a runtime integrity protection approach for TrustZone-based TEEs. PDRIMA systematically analyzes TEE attack surfaces and introduces two in-TEE subsystems: a Secure Monitor Agent (SMA) that performs policy-driven measurement, appraisal, logging, and time-based re-measurement over the TEE kernel, static components, user-TAs, and security-critical system calls; and a Remote Attestation Agent (RAA) that aggregates tamper-evident evidence and exposes a remote attestation protocol for verifying. We analyze PDRIMA's security against identified attack surfaces, implement a prototype on OP-TEE for Raspberry Pi 3B+, and evaluate its performance overhead to indicate its practicability.

Paperid: 2737, https://arxiv.org/pdf/2512.06253.pdf

Abstract:
Noise perturbation is one of the most fundamental approaches for achieving $(ε,δ)$-differential privacy (DP) guarantees when releasing the result of a query or function $f(\cdot)\in\mathbb{R}^M$ evaluated on a sensitive dataset $\mathbf{x}$. In this approach, calibrated noise $\mathbf{n}\in\mathbb{R}^M$ is used to obscure the difference vector $f(\mathbf{x})-f(\mathbf{x}')$, where $\mathbf{x}'$ is known as a neighboring dataset. A DP guarantee is obtained by studying the tail probability bound of a privacy loss random variable (PLRV), defined as the Radon-Nikodym derivative between two distributions. When $\mathbf{n}$ follows a multivariate Gaussian distribution, the PLRV is characterized as a specific univariate Gaussian. In this paper, we propose a novel scheme to generate $\mathbf{n}$ by leveraging the fact that the perturbation noise is typically spherically symmetric (i.e., the distribution is rotationally invariant around the origin). The new noise generation scheme allows us to investigate the privacy loss from a geometric perspective and express the resulting PLRV using a product measure, $W\times U$; measure $W$ is related to a radius random variable controlling the magnitude of $\mathbf{n}$, while measure $U$ involves a directional random variable governing the angle between $\mathbf{n}$ and the difference $f(\mathbf{x})-f(\mathbf{x}')$. We derive a closed-form moment bound on the product measure to prove $(ε,δ)$-DP. Under the same $(ε,δ)$-DP guarantee, our mechanism yields a smaller expected noise magnitude than the classic Gaussian noise in high dimensions, thereby significantly improving the utility of the noisy result $f(\mathbf{x})+\mathbf{n}$. To validate this, we consider convex and non-convex empirical risk minimization (ERM) problems in high dimensional space and apply the proposed product noise to achieve privacy.

Paperid: 2738, https://arxiv.org/pdf/2512.05518.pdf

Abstract:
A frustratingly easy technique known as the prefilling attack has been shown to effectively circumvent the safety alignment of frontier LLMs by simply prefilling the assistant response with an affirmative prefix before decoding. In response, recent work proposed a supervised fine-tuning (SFT) defense using data augmentation to achieve a \enquote{deep} safety alignment, allowing the model to generate natural language refusals immediately following harmful prefills. Unfortunately, we show in this work that the "deep" safety alignment produced by such an approach is in fact not very deep. A generalization of the prefilling attack, which we refer to as the Rank-Assisted Prefilling (RAP) attack, can effectively extract harmful content from models fine-tuned with the data augmentation defense by selecting low-probability "harmful" tokens from the top 20 predicted next tokens at each step (thus ignoring high-probability "refusal" tokens). We argue that this vulnerability is enabled due to the "gaming" of the SFT objective when the target distribution entropies are low, where low fine-tuning loss is achieved by shifting large probability mass to a small number of refusal tokens while neglecting the high ranks of harmful tokens. We then propose a new perspective on achieving deep safety alignment by matching the token ranks of the target distribution, rather than their probabilities. This perspective yields a surprisingly simple fix to the data augmentation defense based on regularizing the attention placed on harmful prefill tokens, an approach we call PRefill attEntion STOpping (PRESTO). Adding PRESTO yields up to a 4.7x improvement in the mean StrongREJECT score under RAP attacks across three popular open-source LLMs, with low impact to model utility.

Paperid: 2739, https://arxiv.org/pdf/2512.04908.pdf

Abstract:
This study explores the application of Answer Set Programming (ASP) for detecting anomalies in system logs, addressing the challenges posed by evolving cyber threats. We propose a novel framework that leverages ASP's declarative nature and logical reasoning capabilities to encode complex security rules as logical predicates. Our ASP-based system was applied to a real-world Linux system log dataset, demonstrating its effectiveness in identifying various anomalies such as potential brute-force attacks, privilege escalations, frequent network connections from specific IPs, and various system-level issues. Key findings highlight ASP's strengths in handling structured log data, rule flexibility, and event correlation. The approach shows promise in providing explainable alerts from real-world data. This research contributes to computer forensics by demonstrating a logic-based paradigm for log analysis on a practical dataset, opening avenues for more nuanced and adaptive cyber intelligence systems.

Paperid: 2740, https://arxiv.org/pdf/2512.04590.pdf

Abstract:
This paper proposes using the Linux kernel ftrace framework, particularly the function graph tracer, to generate informative system level data for machine learning (ML) applications. Experiments on a real world encryption detection task demonstrate the efficacy of the proposed features across several learning algorithms. The learner faces the problem of detecting encryption activities across a large dataset of files, using function call traces and graph based features. Empirical results highlight an outstanding accuracy of 99.28 on the task at hand, underscoring the efficacy of features derived from the function graph tracer. The results were further validated in an additional experiment targeting a multilabel classification problem, in which running programs were identified from trace data. This work provides comprehensive methodologies for preprocessing raw trace data and extracting graph based features, offering significant advancements in applying ML to system behavior analysis, program identification, and anomaly detection. By bridging the gap between system tracing and ML, this paper paves the way for innovative solutions in performance monitoring and security analytics.

Paperid: 2741, https://arxiv.org/pdf/2512.04580.pdf

Abstract:
To enhance the performance of large language models (LLMs) in various domain-specific applications, sensitive data such as healthcare, law, and finance are being used to privately customize or fine-tune these models. Such privately adapted LLMs are regarded as either personal privacy assets or corporate intellectual property. Therefore, protecting model weights and maintaining strict confidentiality during deployment and distribution have become critically important. However, existing model formats and deployment frameworks provide little to no built-in support for confidentiality, access control, or secure integration with trusted hardware. Current methods for securing model deployment either rely on computationally expensive cryptographic techniques or tightly controlled private infrastructure. Although these approaches can be effective in specific scenarios, they are difficult and costly for widespread deployment. In this paper, we introduce CryptoTensors, a secure and format-compatible file structure for confidential LLM distribution. Built as an extension to the widely adopted Safetensors format, CryptoTensors incorporates tensor-level encryption and embedded access control policies, while preserving critical features such as lazy loading and partial deserialization. It enables transparent decryption and automated key management, supporting flexible licensing and secure model execution with minimal overhead. We implement a proof-of-concept library, benchmark its performance across serialization and runtime scenarios, and validate its compatibility with existing inference frameworks, including Hugging Face Transformers and vLLM. Our results highlight CryptoTensors as a light-weight, efficient, and developer-friendly solution for safeguarding LLM weights in real-world and widespread deployments.

Paperid: 2742, https://arxiv.org/pdf/2512.03361.pdf

Abstract:
Deep learning-based semantic communication (SemCom) has emerged as a promising paradigm for next-generation wireless networks, offering superior transmission efficiency by extracting and conveying task-relevant semantic latent representations rather than raw data. However, the openness of the wireless medium and the intrinsic vulnerability of semantic latent representations expose such systems to previously unrecognized security risks. In this paper, we uncover a fundamental latent-space vulnerability that enables Man-in-the-Middle (MitM) attacker to covertly manipulate the transmitted semantics while preserving the statistical properties of the transmitted latent representations. We first present a Diffusion-based Re-encoding Attack (DiR), wherein the attacker employs a diffusion model to synthesize an attacker-designed semantic variant, and re-encodes it into a valid latent representation compatible with the SemCom decoder. Beyond this model-dependent pathway, we further propose a model-agnostic and training-free Test-Time Adaptation Latent Manipulation attack (TTA-LM), in which the attacker perturbs and steers the intercepted latent representation toward an attacker-specified semantic target by leveraging the gradient of a target loss function. In contrast to diffusion-based manipulation, TTA-LM does not rely on any generative model and does not impose modality-specific or task-specific assumptions, thereby enabling efficient and broadly applicable latent-space tampering across diverse SemCom architectures. Extensive experiments on representative semantic communication architectures demonstrate that both attacks can significantly alter the decoded semantics while preserving natural latent-space distributions, making the attacks covert and difficult to detect.

Paperid: 2743, https://arxiv.org/pdf/2512.02047.pdf

Abstract:
The rapid advancement of general-purpose AI models has increased concerns about copyright infringement in training data, yet current regulatory frameworks remain predominantly reactive rather than proactive. This paper examines the regulatory landscape of AI training data governance in major jurisdictions, including the EU, the United States, and the Asia-Pacific region. It also identifies critical gaps in enforcement mechanisms that threaten both creator rights and the sustainability of AI development. Through analysis of major cases we identified critical gaps in pre-training data filtering. Existing solutions such as transparency tools, perceptual hashing, and access control mechanisms address only specific aspects of the problem and cannot prevent initial copyright violations. We identify two fundamental challenges: pre-training license collection and content filtering, which faces the impossibility of comprehensive copyright management at scale, and verification mechanisms, which lack tools to confirm filtering prevented infringement. We propose a multilayered filtering pipeline that combines access control, content verification, machine learning classifiers, and continuous database cross-referencing to shift copyright protection from post-training detection to pre-training prevention. This approach offers a pathway toward protecting creator rights while enabling continued AI innovation.

Paperid: 2744, https://arxiv.org/pdf/2512.02046.pdf

Paperid: 2745, https://arxiv.org/pdf/2512.00741.pdf

Abstract:
In recent years, the explosion of malware and extensive code reuse have formed complex evolutionary connections among malware specimens. The rapid pace of development makes it challenging for existing studies to characterize recent evolutionary trends. In addition, intuitive tools to untangle these intricate connections between malware specimens or categories are urgently needed. This paper introduces a manually-reviewed malware source code dataset containing 6032 specimens. Building on and extending current research from a software engineering perspective, we systematically evaluate the scale, development costs, code quality, as well as security and dependencies of modern malware. We further introduce a multi-view genealogy analysis to clarify malware connections: at an overall view, this analysis quantifies the strength and direction of connections among specimens and categories; at a detailed view, it traces the evolutionary histories of individual specimens. Experimental results indicate that, despite persistent shortcomings in code quality, malware specimens exhibit an increasing complexity and standardization, in step with the development of mainstream software engineering practices. Meanwhile, our genealogy analysis intuitively reveals lineage expansion and evolution driven by code reuse, providing new evidence and tools for understanding the formation and evolution of the malware ecosystem.

Paperid: 2746, https://arxiv.org/pdf/2512.00136.pdf

Abstract:
Equipped with various tools and knowledge, GPTs, one kind of customized AI agents based on OpenAI's large language models, have illustrated great potential in many fields, such as writing, research, and programming. Today, the number of GPTs has reached three millions, with the range of specific expert domains becoming increasingly diverse. However, given the consistent framework shared among these LLM agent applications, systemic security vulnerabilities may exist and remain underexplored. To fill this gap, we present an empirical study on the security vulnerabilities of GPTs. Building upon prior research on LLM security, we first adopt a platform-user perspective to conduct a comprehensive attack surface analysis across different system components. Then, we design a systematic and multidimensional attack suite with the explicit objectives of information leakage and tool misuse based on the attack surface analysis, thereby concretely demonstrating the security vulnerabilities that various components of GPT-based systems face. Finally, we accordingly propose defense mechanisms to address the aforementioned security vulnerabilities. By increasing the awareness of these vulnerabilities and offering critical insights into their implications, this study seeks to facilitate the secure and responsible application of GPTs while contributing to developing robust defense mechanisms that protect users and systems against malicious attacks.

Paperid: 2747, https://arxiv.org/pdf/2511.23408.pdf

Abstract:
Automated vulnerability patching is crucial for software security, and recent advancements in Large Language Models (LLMs) present promising capabilities for automating this task. However, existing research has primarily assessed LLMs using publicly disclosed vulnerabilities, leaving their effectiveness on related artificial vulnerabilities largely unexplored. In this study, we empirically evaluate the patching effectiveness and complementarity of several prominent LLMs, such as OpenAI's GPT variants, LLaMA, DeepSeek, and Mistral models, using both real and artificial vulnerabilities. Our evaluation employs Proof-of-Vulnerability (PoV) test execution to concretely assess whether LLM-generated source code successfully patches vulnerabilities. Our results reveal that LLMs patch real vulnerabilities more effectively compared to artificial ones. Additionally, our analysis reveals significant variability across LLMs in terms of overlapping (multiple LLMs patching the same vulnerabilities) and complementarity (vulnerabilities patched exclusively by a single LLM), emphasizing the importance of selecting appropriate LLMs for effective vulnerability patching.

Paperid: 2748, https://arxiv.org/pdf/2511.23278.pdf

Abstract:
Modern cloud applications are built on independent, diverse microservices, offering scalability, flexibility, and usage-based billing. However, the structural design of these varied services, along with their reliance on auto-scalers for dynamic internet traffic, introduces significant coordination challenges. As we demonstrate in this paper, common default retry patterns used between misaligned services can turn into retry storms which drive up resource usage and costs, leading to self-inflicted Denial-of-Wallet (DoW) scenarios. To overcome these problems we introduce RetryGuard, a distributed framework for productive control of retry patterns across interdependent microservices. By managing retry policy on a per-service basis and making parallel decisions, RetryGuard prevents retry storms, curbs resource contention, and mitigates escalating operational costs. RetryGuard makes its decisions based on an analytic model that captures the relationships among retries, throughput (rejections), delays, and costs. Experimental results show that RetryGuard significantly reduces resource usage and costs compared to AWS standard and advanced retry policies. We further demonstrate its scalability and superior performance in a more complex Kubernetes deployment with the Istio service mesh, where it achieves substantial improvements.

Paperid: 2749, https://arxiv.org/pdf/2511.23200.pdf

Abstract:
Psychological stress is a widespread issue that significantly impacts student well-being and academic performance. Effective remote stress recognition is crucial, yet existing methods often rely on wearable devices or GPS-based clustering techniques that pose privacy risks. In this study, we introduce a novel, end-to-end privacy-enhanced framework for semantic location encoding using a self-hosted OSM engine and an LLM-bootstrapped static map. We rigorously quantify the privacy-utility trade-off and demonstrate (via LOSO validation) that our Privacy-Aware (PA) model achieves performance statistically indistinguishable from a non-private model, proving that utility does not require sacrificing privacy. Feature importance analysis highlights that recreational activity time, working time, and travel time play a significant role in stress recognition.

Paperid: 2750, https://arxiv.org/pdf/2511.21301.pdf

Abstract:
Evaluating the effectiveness of software protection is crucial for selecting the most effective methods to safeguard assets within software applications. Obfuscation involves techniques that deliberately modify software to make it more challenging to understand and reverse-engineer, while maintaining its original functionality. Although obfuscation is widely adopted, its effectiveness remains largely unexplored and unthoroughly evaluated. This paper presents a controlled experiment involving Master's students performing code comprehension tasks on applications hardened with obfuscation. The experiment's goals are to assess the effectiveness of obfuscation in delaying code comprehension by attackers and to determine whether complexity metrics can accurately predict the impact of these protections on success rates and durations of code comprehension tasks. The study is the first to evaluate the effect of layering multiple obfuscation techniques on a single piece of protected code. It also provides experimental evidence of the correlation between objective metrics of the attacked code and the likelihood of a successful attack, bridging the gap between objective and subjective approaches to estimating potency. Finally, the paper highlights significant aspects that warrant additional analysis and opens new avenues for further experiments.

Paperid: 2751, https://arxiv.org/pdf/2511.20313.pdf

Abstract:
The Software Bill of Materials (SBOM) is a critical tool for securing the software supply chain (SSC), but its practical utility is undermined by inaccuracies in both its generation and its application in vulnerability scanning. This paper presents a large-scale empirical study on 2,414 open-source repositories to address these issues from a practical standpoint. First, we demonstrate that using lock files with strong package managers enables the generation of accurate and consistent SBOMs, establishing a reliable foundation for security analysis. Using this high-fidelity foundation, however, we expose a more fundamental flaw in practice: downstream vulnerability scanners produce a staggering 97.5\% false positive rate. We pinpoint the primary cause as the flagging of vulnerabilities within unreachable code. We then demonstrate that function call analysis can effectively prune 63.3\% of these false alarms. Our work validates a practical, two-stage approach for SSC security: first, generate an accurate SBOM using lock files and strong package managers, and second, enrich it with function call analysis to produce actionable, low-noise vulnerability reports that alleviate developers' alert fatigue.

Paperid: 2752, https://arxiv.org/pdf/2511.19009.pdf

Abstract:
Large language models demonstrate powerful capabilities across various natural language processing tasks, yet they also harbor safety vulnerabilities. To enhance LLM safety, various jailbreak defense methods have been proposed to guard against harmful outputs. However, improvements in model safety often come at the cost of severe over-refusal, failing to strike a good balance between safety and usability. In this paper, we first analyze the causes of over-refusal from a representation perspective, revealing that over-refusal samples reside at the boundary between benign and malicious samples. Based on this, we propose MOSR, designed to mitigate over-refusal by intervening the safety representation of LLMs. MOSR incorporates two novel components: (1) Overlap-Aware Loss Weighting, which determines the erasure weight for malicious samples by quantifying their similarity to pseudo-malicious samples in the representation space, and (2) Context-Aware Augmentation, which supplements the necessary context for rejection decisions by adding harmful prefixes before rejection responses. Experiments demonstrate that our method outperforms existing approaches in mitigating over-refusal while largely maintaining safety. Overall, we advocate that future defense methods should strike a better balance between safety and over-refusal.

Paperid: 2753, https://arxiv.org/pdf/2511.18155.pdf

Abstract:
With the increasing use and adoption of cloud and cloud-native computing, the underlying technologies (i.e., containerization and virtualization) have become foundational. However, strict isolation and maintaining runtime security in these environments has become increasingly challenging. Existing approaches like seccomp and Mandatory Access Control (MAC) frameworks offer some protection up to a limit, but often lack context awareness, syscall argument filtering, and adaptive enforcement, providing the ability to adjust decisions at runtime based on observed application behavior, workload changes, or detected anomalies rather than relying solely on static or predefined rules.This paper introduces eBPF-PATROL (eBPF-Protective Agent for Threat Recognition and Overreach Limitation), an extensible lightweight runtime security agent that uses extended Berkeley Packet Filter (eBPF) technology to monitor and enforce policies in containerized and virtualized environments. By intercepting system calls, analyzing execution context, and applying user-defined rules, eBPF-PATROL detects and prevents real-time boundary violations, such as reverse shells, privilege escalation, and container escape attempts. We describe the architecture, implementation, and evaluation of eBPF-PATROL, demonstrating its low overhead (< 2.5 percent) and high detection accuracy across real-world attack scenarios.

Paperid: 2754, https://arxiv.org/pdf/2511.17748.pdf

Abstract:
Flexible energy resources are increasingly becoming common in smart grids. These resources are typically managed and controlled by aggregators that coordinate many resources to provide flexibility services. However, these aggregators and flexible energy resources are vulnerable, which could allow attackers to remotely control flexible energy resources to launch large-scale attacks on the grid. This paper investigates and evaluates the potential attack strategies that can be used to manipulate flexible energy resources to challenge the effectiveness of traditional grid stability measures and disrupt the first-swing stability of the power grid. Our work shows that although a large amount of power is required, the current flexibility capacities could potentially be sufficient to disrupt the grid on a national level.

Paperid: 2755, https://arxiv.org/pdf/2511.17167.pdf

Abstract:
We investigate the problem of detecting dependencies between the components of a high-dimensional vector. Our approach advances the existing literature in two important respects. First, we consider the problem under privacy constraints. Second, instead of testing whether the coordinates are pairwise independent, we are interested in determining whether certain pairwise associations between the components (such as all pairwise Kendall's $τ$ coefficients) do not exceed a given threshold in absolute value. Considering hypotheses of this form is motivated by the observation that in the high-dimensional regime, it is rare and perhaps impossible to have a null hypothesis that can be modeled exactly by assuming that all pairwise associations are precisely equal to zero. The formulation of the null hypothesis as a composite hypothesis makes the problem of constructing tests already non-standard in the non-private setting. Additionally, under privacy constraints, state of the art procedures rely on permutation approaches that are rendered invalid under a composite null. We propose a novel bootstrap based methodology that is especially powerful in sparse settings, develop theoretical guarantees under mild assumptions and show that the proposed method enjoys good finite sample properties even in the high privacy regime. Additionally, we present applications in medical data that showcase the applicability of our methodology.

Paperid: 2756, https://arxiv.org/pdf/2511.16716.pdf

Abstract:
Although passwords remain the primary defense against unauthorized access, users often tend to use passwords that are easy to remember. This behavior significantly increases security risks, also due to the fact that traditional password strength evaluation methods are often inadequate. In this discussion paper, we present SODA ADVANCE, a data reconstruction tool also designed to enhance evaluation processes related to the password strength. In particular, SODA ADVANCE integrates a specialized module aimed at evaluating password strength by leveraging publicly available data from multiple sources, including social media platforms. Moreover, we investigate the capabilities and risks associated with emerging Large Language Models (LLMs) in evaluating and generating passwords, respectively. Experimental assessments conducted with 100 real users demonstrate that LLMs can generate strong and personalized passwords possibly defined according to user profiles. Additionally, LLMs were shown to be effective in evaluating passwords, especially when they can take into account user profile data.

Paperid: 2757, https://arxiv.org/pdf/2511.16347.pdf

Abstract:
The adoption of Vision-Language Models (VLMs) in embodied AI agents, while being effective, brings safety concerns such as jailbreaking. Prior work have explored the possibility of directly jailbreaking the embodied agents through elaborated multi-modal prompts. However, no prior work has studied or even reported indirect jailbreaks in embodied AI, where a black-box attacker induces a jailbreak without issuing direct prompts to the embodied agent. In this paper, we propose, for the first time, indirect environmental jailbreak (IEJ), a novel attack to jailbreak embodied AI via indirect prompt injected into the environment, such as malicious instructions written on a wall. Our key insight is that embodied AI does not ''think twice'' about the instructions provided by the environment -- a blind trust that attackers can exploit to jailbreak the embodied agent. We further design and implement open-source prototypes of two fully-automated frameworks: SHAWSHANK, the first automatic attack generation framework for the proposed attack IEJ; and SHAWSHANK-FORGE, the first automatic benchmark generation framework for IEJ. Then, using SHAWSHANK-FORGE, we automatically construct SHAWSHANK-BENCH, the first benchmark for indirectly jailbreaking embodied agents. Together, our two frameworks and one benchmark answer the questions of what content can be used for malicious IEJ instructions, where they should be placed, and how IEJ can be systematically evaluated. Evaluation results show that SHAWSHANK outperforms eleven existing methods across 3,957 task-scene combinations and compromises all six tested VLMs. Furthermore, current defenses only partially mitigate our attack, and we have responsibly disclosed our findings to all affected VLM vendors.

Paperid: 2758, https://arxiv.org/pdf/2511.15571.pdf

Abstract:
Recent AI-generated image (AIGI) detectors achieve impressive accuracy under clean condition. In view of antiforensics, it is significant to develop advanced adversarial attacks for evaluating the security of such detectors, which remains unexplored sufficiently. This letter proposes a Dual-domain Feature Importance Attack (DuFIA) scheme to invalidate AIGI detectors to some extent. Forensically important features are captured by the spatially interpolated gradient and frequency-aware perturbation. The adversarial transferability is enhanced by jointly modeling spatial and frequency-domain feature importances, which are fused to guide the optimization-based adversarial example generation. Extensive experiments across various AIGI detectors verify the cross-model transferability, transparency and robustness of DuFIA.

Paperid: 2759, https://arxiv.org/pdf/2511.15278.pdf

Abstract:
The proliferation of IoT devices in shared, multi-vendor environments like the modern aircraft cabin creates a fundamental conflict between the promise of data collaboration and the risks to passenger privacy, vendor intellectual property (IP), and regulatory compliance. While emerging standards like the Cabin Secure Media-Independent Messaging (CSMIM) protocol provide a secure communication backbone, they do not resolve data governance challenges at the application layer, leaving a privacy gap that impedes trust. This paper proposes and evaluates a framework that closes this gap by integrating a configurable layer of Privacy-Enhancing Technologies (PETs) atop a CSMIM-like architecture. We conduct a rigorous, empirical analysis of two pragmatic PETs: Differential Privacy (DP) for statistical sharing, and an additive secret sharing scheme (ASS) for data obfuscation. Using a high-fidelity testbed with resource-constrained hardware, we quantify the trade-offs between data privacy, utility, and computing performance. Our results demonstrate that the computational overhead of PETs is often negligible compared to inherent network and protocol latencies. We prove that architectural choices, such as on-device versus virtualized processing, have a far greater impact on end-to-end latency and computational performance than the PETs themselves. The findings provide a practical roadmap for system architects to select and configure appropriate PETs, enabling the design of trustworthy collaborative IoT ecosystems in avionics and other critical domains.

Paperid: 2760, https://arxiv.org/pdf/2511.14989.pdf

Abstract:
Quantum Machine Learning (QML) integrates quantum computational principles into learning algorithms, offering improved representational capacity and computational efficiency. Nevertheless, the security and robustness of QML systems remain underexplored, especially under adversarial conditions. In this paper, we present a systematization of adversarial robustness in QML, integrating conceptual organization with empirical evaluation across three threat models-black-box, gray-box, and white-box. We implement representative attacks in each category, including label-flipping for black-box, QUID encoder-level data poisoning for gray-box, and FGSM and PGD for white-box, using Quantum Neural Networks (QNNs) trained on two datasets from distinct domains: MNIST from computer vision and AZ-Class from Android malware, across multiple circuit depths (2, 5, 10, and 50 layers) and two encoding schemes (angle and amplitude). Our evaluation shows that amplitude encoding yields the highest clean accuracy (93% on MNIST and 67% on AZ-Class) in deep, noiseless circuits; however, it degrades sharply under adversarial perturbations and depolarization noise (p=0.01), dropping accuracy below 5%. In contrast, angle encoding, while offering lower representational capacity, remains more stable in shallow, noisy regimes, revealing a trade-off between capacity and robustness. Moreover, the QUID attack attains higher attack success rates, though quantum noise channels disrupt the Hilbert-space correlations it exploits, weakening its impact in image domains. This suggests that noise can act as a natural defense mechanism in Noisy Intermediate-Scale Quantum (NISQ) systems. Overall, our findings guide the development of secure and resilient QML architectures for practical deployment. These insights underscore the importance of designing threat-aware models that remain reliable under real-world noise in NISQ settings.

Paperid: 2761, https://arxiv.org/pdf/2511.14084.pdf

Abstract:
Differential privacy (DP) auditing is essential for evaluating privacy guarantees in machine learning systems. Existing auditing methods, however, pose a significant challenge for large-scale systems since they require modifying the training dataset -- for instance, by injecting out-of-distribution canaries or removing samples from training. Such interventions on the training data pipeline are resource-intensive and involve considerable engineering overhead. We introduce a novel observational auditing framework that leverages the inherent randomness of data distributions, enabling privacy evaluation without altering the original dataset. Our approach extends privacy auditing beyond traditional membership inference to protected attributes, with labels as a special case, addressing a key gap in existing techniques. We provide theoretical foundations for our method and perform experiments on Criteo and CIFAR-10 datasets that demonstrate its effectiveness in auditing label privacy guarantees. This work opens new avenues for practical privacy auditing in large-scale production environments.

Paperid: 2762, https://arxiv.org/pdf/2511.13771.pdf

Abstract:
Large Language Models (LLMs) are increasingly vulnerable to adversarial attacks that can subtly manipulate their outputs. While various defense mechanisms have been proposed, many operate as black boxes, lacking transparency in their decision-making. This paper introduces ExplainableGuard, an interpretable adversarial defense framework leveraging the chain-of-thought (CoT) reasoning capabilities of DeepSeek-Reasoner. Our approach not only detects and neutralizes adversarial perturbations in text but also provides step-by-step explanations for each defense action. We demonstrate how tailored CoT prompts guide the LLM to perform a multi-faceted analysis (character, word, structural, and semantic) and generate a purified output along with a human-readable justification. Preliminary results on the GLUE Benchmark and IMDB Movie Reviews dataset show promising defense efficacy. Additionally, a human evaluation study reveals that ExplainableGuard's explanations outperform ablated variants in clarity, specificity, and actionability, with a 72.5% deployability-trust rating, underscoring its potential for more trustworthy LLM deployments.

Paperid: 2763, https://arxiv.org/pdf/2511.12993.pdf

Abstract:
Smart contracts are prone to vulnerabilities and are analyzed by experts as well as automated systems, such as static analysis and AI-assisted solutions. However, audit artifacts are heterogeneous and often lack reproducible, executable PoC tests suitable for automated validation, leading to costly, ad hoc manual verification. Large language models (LLMs) can be leveraged to turn audit reports into PoC test cases, but have three major challenges: noisy inputs, hallucinations, and missing runtime oracles. In this paper, we present SmartPoC, an automated framework that converts textual audit reports into executable, validated test cases. First, the input audit report is processed to reduce noise, and only bug-related functions are extracted and fed to LLMs as context. To curb hallucinations and ensure compile-and-run readiness, we leverage LLMs to synthesize PoC test cases with specially-designed pre-/post-execution repair. We further utilize differential verification as oracles to confirm exploitability of the PoC test cases. On the SmartBugs-Vul and FORGE-Vul benchmarks, SmartPoC generates executable, validated Foundry test cases for 85.61% and 86.45% of targets, respectively. Applied to the latest Etherscan verified-source corpus, SmartPoC confirms 236 real bugs out of 545 audit findings at a cost of only $0.03 per finding.

Paperid: 2764, https://arxiv.org/pdf/2511.12668.pdf

Abstract:
Assurance for artificial intelligence (AI) systems remains fragmented across software supply-chain security, adversarial machine learning, and governance documentation. Existing transparency mechanisms - including Model Cards, Datasheets, and Software Bills of Materials (SBOMs) - advance provenance reporting but rarely provide verifiable, machine-readable evidence of model security. This paper introduces the AI Risk Scanning (AIRS) Framework, a threat-model-based, evidence-generating framework designed to operationalize AI assurance. The AIRS Framework evolved through three progressive pilot studies - Smurf (AIBOM schema design), OPAL (operational validation), and Pilot C (AIRS) - that reframed AI documentation from descriptive disclosure toward measurable, evidence-bound verification. The framework aligns its assurance fields to the MITRE ATLAS adversarial ML taxonomy and automatically produces structured artifacts capturing model integrity, packaging and serialization safety, structural adapters, and runtime behaviors. Currently, the AIRS Framework is scoped to provide model-level assurances for LLMs, but it could be expanded to include other modalities and cover system-level threats (e.g. application-layer abuses, tool-calling). A proof-of-concept on a quantized GPT-OSS-20B model demonstrates enforcement of safe loader policies, per-shard hash verification, and contamination and backdoor probes executed under controlled runtime conditions. Comparative analysis with SBOM standards of SPDX 3.0 and CycloneDX 1.6 reveals alignment on identity and evaluation metadata, but identifies critical gaps in representing AI-specific assurance fields. The AIRS Framework thus extends SBOM practice to the AI domain by coupling threat modeling with automated, auditable evidence generation, providing a principled foundation for standardized, trustworthy, and machine-verifiable AI risk documentation.

Paperid: 2765, https://arxiv.org/pdf/2511.12663.pdf

Abstract:
Federated learning (FL) enables multiple clients to collaboratively train a shared global model while preserving the privacy of their local data. Within this paradigm, the intellectual property rights (IPR) of client models are critical assets that must be protected. In practice, the central server responsible for maintaining the global model may maliciously manipulate the global model to erase client contributions or falsely claim sole ownership, thereby infringing on clients' IPR. Watermarking has emerged as a promising technique for asserting model ownership and protecting intellectual property. However, existing FL watermarking approaches remain limited, suffering from potential watermark collisions among clients, insufficient watermark security, and non-intuitive verification mechanisms. In this paper, we propose FLClear, a novel framework that simultaneously achieves collision-free watermark aggregation, enhanced watermark security, and visually interpretable ownership verification. Specifically, FLClear introduces a transposed model jointly optimized with contrastive learning to integrate the watermarking and main task objectives. During verification, the watermark is reconstructed from the transposed model and evaluated through both visual inspection and structural similarity metrics, enabling intuitive and quantitative ownership verification. Comprehensive experiments conducted over various datasets, aggregation schemes, and attack scenarios demonstrate the effectiveness of FLClear and confirm that it consistently outperforms state-of-the-art FL watermarking methods.

Paperid: 2766, https://arxiv.org/pdf/2511.11979.pdf

Abstract:
Android malware evolves rapidly, leading to concept drift that degrades the performance of traditional machine learning (ML)-based detection systems. While recent approaches incorporate active learning and hierarchical contrastive loss to handle this drift, they remain fully supervised, computationally expensive, and perform poorly on real-world datasets with long temporal spans. In particular, our evaluation highlights these limitations, particularly on LAMDA, a 12-year longitudinal dataset exhibiting substantial distributional shifts. Moreover, manual expert labeling cannot scale with the daily emergence of over 450,000 new malware samples, leaving most samples unlabeled and underutilized. To address these challenges, we propose CITADEL, a robust semi-supervised active learning framework for Android malware detection. To bridge the gap between image-domain semi-supervised learning and binary feature representations of malware, we introduce malware-specific augmentations, Bernoulli bit flips and masking, that simulate realistic drift behaviors. CITADEL further integrates supervised contrastive loss to improve boundary sample discrimination and combines it with a multi-criteria active learning strategy based on prediction confidence, $L_p$-norm distance, and boundary uncertainty, enabling effective adaptation under limited labeling budgets. Extensive evaluation on four large-scale Android malware benchmarks -- APIGraph, Chen-AZ, MaMaDroid, and LAMDA demonstrates that CITADEL outperforms prior work, achieving F1 score of over 1%, 3%, 7%, and 14% respectively, using only 40% labeled samples. Furthermore, CITADEL shows significant efficiency over prior work incurring $24\times$ faster training and $13\times$ fewer operations.

Paperid: 2767, https://arxiv.org/pdf/2511.11759.pdf

Abstract:
We present an end-to-end demonstration of how attackers can exploit AI safety failures to harm vulnerable populations: from jailbreaking LLMs to generate phishing content, to deploying those messages against real targets, to successfully compromising elderly victims. We systematically evaluated safety guardrails across six frontier LLMs spanning four attack categories, revealing critical failures where several models exhibited near-complete susceptibility to certain attack vectors. In a human validation study with 108 senior volunteers, AI-generated phishing emails successfully compromised 11\% of participants. Our work uniquely demonstrates the complete attack pipeline targeting elderly populations, highlighting that current AI safety measures fail to protect those most vulnerable to fraud. Beyond generating phishing content, LLMs enable attackers to overcome language barriers and conduct multi-turn trust-building conversations at scale, fundamentally transforming fraud economics. While some providers report voluntary counter-abuse efforts, we argue these remain insufficient.

Paperid: 2768, https://arxiv.org/pdf/2511.11625.pdf

Abstract:
Artificial intelligence (AI) has shown great potential in medical imaging, particularly for brain tumor detection using Magnetic Resonance Imaging (MRI). However, the models remain vulnerable at inference time when they are trained collaboratively through Federated Learning (FL), an approach adopted to protect patient privacy. Adversarial attacks can subtly alter medical scans in ways invisible to the human eye yet powerful enough to mislead AI models, potentially causing serious misdiagnoses. Existing defenses often assume centralized data and struggle to cope with the decentralized and diverse nature of federated medical settings. In this work, we present MedFedPure, a personalized federated learning defense framework designed to protect diagnostic AI models at inference time without compromising privacy or accuracy. MedFedPure combines three key elements: (1) a personalized FL model that adapts to the unique data distribution of each institution; (2) a Masked Autoencoder (MAE) that detects suspicious inputs by exposing hidden perturbations; and (3) an adaptive diffusion-based purification module that selectively cleans only the flagged scans before classification. Together, these steps offer robust protection while preserving the integrity of normal, benign images. We evaluated MedFedPure on the Br35H brain MRI dataset. The results show a significant gain in adversarial robustness, improving performance from 49.50% to 87.33% under strong attacks, while maintaining a high clean accuracy of 97.67%. By operating locally and in real time during diagnosis, our framework provides a practical path to deploying secure, trustworthy, and privacy-preserving AI tools in clinical workflows. Index Terms: cancer, tumor detection, federated learning, masked autoencoder, diffusion, privacy

Paperid: 2769, https://arxiv.org/pdf/2511.11347.pdf

Abstract:
Retrieval-augmented generation (RAG) has rapidly emerged as a transformative approach for integrating large language models into clinical and biomedical workflows. However, privacy risks, such as protected health information (PHI) exposure, remain inconsistently mitigated. This review provides a thorough analysis of the current landscape of RAG applications in healthcare, including (i) sensitive data type across clinical scenarios, (ii) the associated privacy risks, (iii) current and emerging data-privacy protection mechanisms and (iv) future direction for patient data privacy protection. We synthesize 23 articles on RAG applications in healthcare and systematically analyze privacy challenges through a pipeline-structured framework encompassing data storage, transmission, retrieval and generation stages, delineating potential failure modes, their underlying causes in threat models and system mechanisms, and their practical implications. Building on this analysis, we critically review 17 articles on privacy-preserving strategies for RAG systems. Our evaluation reveals critical gaps, including insufficient clinical validation, absence of standardized evaluation frameworks, and lack of automated assessment tools. We propose actionable directions based on these limitations and conclude with a call to action. This review provides researchers and practitioners with a structured framework for understanding privacy vulnerabilities in healthcare RAG and offers a roadmap toward developing systems that achieve both clinical effectiveness and robust privacy preservation.

Paperid: 2770, https://arxiv.org/pdf/2511.11171.pdf

Abstract:
Cyberattacks are becoming increasingly frequent and sophisticated, often exploiting the software supply chain (SSC) as an attack vector. Attack graphs provide a detailed representation of the sequence of events and vulnerabilities that could lead to a successful security breach in a system. MulVal is a widely used open-source tool for logical attack graph generation in networked systems. However, its current lack of support for capturing and reasoning about SSC threat propagation makes it unsuitable for addressing modern SSC attacks, such as the XZ compromise or the 3CX double SSC attack. To address this limitation, we propose an extension to MulVal that integrates SSC threat propagation analysis with existing network-based threat analysis. This extension introduces a new set of predicates within the familiar MulVal syntax, enabling seamless integration. The new facts and interaction rules model SSC assets, their dependencies, interactions, compromises, additional security mechanisms, initial system states, and known threats. We explain how this integration operates in both directions and demonstrate the practical application of the extension.

Paperid: 2771, https://arxiv.org/pdf/2511.10936.pdf

Abstract:
Graph unlearning has emerged as a promising solution for complying with "the right to be forgotten" regulations by enabling the removal of sensitive information upon request. However, this solution is not foolproof. The involvement of multiple parties creates new attack surfaces, and residual traces of deleted data can still remain in the unlearned graph neural networks. These vulnerabilities can be exploited by attackers to recover the supposedly erased samples, thereby undermining the inherent functionality of graph unlearning. In this work, we propose GraphToxin, the first graph reconstruction attack against graph unlearning. Specifically, we introduce a novel curvature matching module to provide a fine-grained guidance for full unlearned graph recovery. We demonstrate that GraphToxin can successfully subvert the regulatory guarantees expected from graph unlearning - it can recover not only a deleted individual's information and personal links but also sensitive content from their connections, thereby posing substantially more detrimental threats. Furthermore, we extend GraphToxin to multiple node removals under both white-box and black-box setting. We highlight the necessity of a worst-case analysis and propose a comprehensive evaluation framework to systematically assess the attack performance under both random and worst-case node removals. This provides a more robust and realistic measure of the vulnerability of graph unlearning methods to graph reconstruction attacks. Our extensive experiments demonstrate the effectiveness and flexibility of GraphToxin. Notably, we show that existing defense mechanisms are largely ineffective against this attack and, in some cases, can even amplify its performance. Given the severe privacy risks posed by GraphToxin, our work underscores the urgent need for the development of more effective and robust defense strategies against this attack.

Paperid: 2772, https://arxiv.org/pdf/2511.09606.pdf

Abstract:
Large language models (LLMs) have emerged as a promising phishing detection mechanism, addressing the limitations of traditional deep learning-based detectors, including poor generalization to previously unseen websites and a lack of interpretability. However, LLMs' effectiveness for phishing detection remains unexplored. This study investigates how to effectively leverage LLMs for phishing detection (including target brand identification) by examining the impact of input modalities (screenshots, logos, HTML, and URLs), temperature settings, and prompt engineering strategies. Using a dataset of 19,131 real-world phishing websites and 243 benign sites, we evaluate seven LLMs -- two commercial models (GPT 4.1 and Gemini 2.0 flash) and five open-source models (Qwen, Llama, Janus, DeepSeek-VL2, and R1) -- alongside two deep learning (DL)-based baselines (PhishIntention and Phishpedia). Our findings reveal that commercial LLMs generally outperform open-source models in phishing detection, while DL models demonstrate better performance on benign samples. For brand identification, screenshot inputs achieve optimal results, with commercial LLMs reaching 93-95% accuracy and open-source models, particularly Qwen, achieving up to 92%. However, incorporating multiple input modalities simultaneously or applying one-shot prompts does not consistently enhance performance and may degrade results. Furthermore, higher temperature values reduce performance. Based on these results, we recommend using screenshot inputs with zero temperature to maximize accuracy for LLM-based detectors with HTML serving as auxiliary context when screenshot information is insufficient.

Paperid: 2773, https://arxiv.org/pdf/2511.09088.pdf

Abstract:
Current adversarial examples (AEs) are typically designed for static models. However, with the wide application of Class-Incremental Learning (CIL), models are no longer static and need to be updated with new data distributed and labeled differently from the old ones. As a result, existing AEs often fail after CIL updates due to significant domain drift. In this paper, we propose SAE to enhance the sustainability of AEs against CIL. The core idea of SAE is to enhance the robustness of AE semantics against domain drift by making them more similar to the target class while distinguishing them from all other classes. Achieving this is challenging, as relying solely on the initial CIL model to optimize AE semantics often leads to overfitting. To resolve the problem, we propose a Semantic Correction Module. This module encourages the AE semantics to be generalized, based on a visual-language model capable of producing universal semantics. Additionally, it incorporates the CIL model to correct the optimization direction of the AE semantics, guiding them closer to the target class. To further reduce fluctuations in AE semantics, we propose a Filtering-and-Augmentation Module, which first identifies non-target examples with target-class semantics in the latent space and then augments them to foster more stable semantics. Comprehensive experiments demonstrate that SAE outperforms baselines by an average of 31.28% when updated with a 9-fold increase in the number of classes.

Paperid: 2774, https://arxiv.org/pdf/2511.08842.pdf

Abstract:
AI systems have found a wide range of real-world applications in recent years. The adoption of edge artificial intelligence, embedding AI directly into edge devices, is rapidly growing. Despite the implementation of guardrails and safety mechanisms, security vulnerabilities and challenges have become increasingly prevalent in this domain, posing a significant barrier to the practical deployment and safety of AI systems. This paper proposes an agentic AI safety architecture that leverages 3D to integrate a dedicated safety layer. It introduces an adaptive AI safety infrastructure capable of dynamically learning and mitigating attacks against the AI system. The system leverages the inherent advantages of co-location with the edge computing hardware to continuously monitor, detect and proactively mitigate threats to the AI system. The integration of local processing and learning capabilities enhances resilience against emerging network-based attacks while simultaneously improving system reliability, modularity, and performance, all with minimal cost and 3D integration overhead.

Paperid: 2775, https://arxiv.org/pdf/2511.08403.pdf

Abstract:
Recent technologies such as inter-ledger payments, non-fungible tokens, and smart contracts are all fruited from the ongoing development of Distributed Ledger Technologies. The foreseen trend is that they will play an increasingly visible role in daily life, which will have to be backed by appropriate operational resources. For example, due to increasing demand, smart contracts could soon face a shortage of knowledgeable users and tools to handle them in practice. Widespread smart contract adoption is currently limited by security, usability and costs aspects. Because of a steep learning curve, the handling of smart contracts is currently performed by specialised developers mainly, and most of the research effort is focusing on smart contract security, while other aspects like usability being somewhat neglected. Specific tools would lower the entry barrier, enabling interested non-experts to create smart contracts. In this paper we designed, developed and tested Blockly2Hooks, a solution towards filling this gap even in challenging scenarios such as when the smart contracts are written in an advanced language like C. With the XRP Ledger as a concrete working case, Blockly2Hooks helps interested non-experts from the community to learn smart contracts easily and adopt the technology, through leveraging well-proven teaching methodologies like Visual Programming Languages, and more specifically, the Blockly Visual Programming library from Google. The platform was developed and tested and the results are promising to make learning smart contract development smoother.

Paperid: 2776, https://arxiv.org/pdf/2511.08352.pdf

Abstract:
As cyber threats continue to evolve in complexity and frequency, robust endpoint protection is essential for organizational security. This paper presents "Endpoint Security Agent: A Comprehensive Approach to Real-time System Monitoring and Threat Detection" a modular, real-time security solution for Windows endpoints. The agent leverages native tools like WMI and ETW for lowlevel monitoring of system activities such as process execution, registry modifications, and network behaviour. A machine learning-based detection engine, trained on labelled datasets of benign and malicious activity, enables accurate threat identification with minimal false positives. Detection techniques are mapped to the MITRE ATT&CK framework for standardized threat classification. Designed for extensibility, the system includes a centralized interface for alerting and forensic analysis. Preliminary evaluation shows promising results in detecting diverse attack vectors with high accuracy and efficiency.

Paperid: 2777, https://arxiv.org/pdf/2511.07759.pdf

Abstract:
As mixing services are increasingly being exploited by malicious actors for illicit transactions, mixing address association has emerged as a critical research task. A range of approaches have been explored, with graph-based models standing out for their ability to capture structural patterns in transaction networks. However, these approaches face two main challenges: label noise and label scarcity, leading to suboptimal performance and limited generalization. To address these, we propose HiLoMix, a graph-based learning framework specifically designed for mixing address association. First, we construct the Heterogeneous Attributed Mixing Interaction Graph (HAMIG) to enrich the topological structure. Second, we introduce frequency-aware graph contrastive learning that captures complementary structural signals from high- and low-frequency graph views. Third, we employ weak supervised learning that assigns confidence-based weighting to noisy labels. Then, we jointly train high-pass and low-pass GNNs using both unsupervised contrastive signals and confidence-based supervision to learn robust node representations. Finally, we adopt a stacking framework to fuse predictions from multiple heterogeneous models, further improving generalization and robustness. Experimental results demonstrate that HiLoMix outperforms existing methods in mixing address association.

Paperid: 2778, https://arxiv.org/pdf/2511.07741.pdf

Abstract:
It is known that deep neural networks may exhibit dangerous behaviors under various security threats (e.g., backdoor attacks, adversarial attacks and safety property violation) and there exists an ongoing arms race between attackers and defenders. In this work, we propose a complementary perspective to utilize recent progress on "neural network repair" to mitigate these security threats and repair various kinds of neural network defects (arising from different security threats) within a unified framework, offering a potential silver bullet solution to real-world scenarios. To substantially push the boundary of existing repair techniques (suffering from limitations such as lack of guarantees, limited scalability, considerable overhead, etc) in addressing more practical contexts, we propose ProRepair, a novel provable neural network repair framework driven by formal preimage synthesis and property refinement. The key intuitions are: (i) synthesizing a precise proxy box to characterize the feature space preimage, which can derive a bounded distance term sufficient to guide the subsequent repair step towards the correct outputs, and (ii) performing property refinement to enable surgical corrections and scale to more complex tasks. We evaluate ProRepair across four security threats repair tasks on six benchmarks and the results demonstrate it outperforms existing methods in effectiveness, efficiency and scalability. For point-wise repair, ProRepair corrects models while preserving performance and achieving significantly improved generalization, with a speedup of 5x to 2000x over existing provable approaches. In region-wise repair, ProRepair successfully repairs all 36 safety property violation instances (compared to 8 by the best existing method), and can handle 18x higher dimensional spaces.

Paperid: 2779, https://arxiv.org/pdf/2511.07503.pdf

Abstract:
The increased availability of genetic data has transformed genomics research, but raised many privacy concerns regarding its handling due to its sensitive nature. This work explores the use of language models (LMs) for the generation of synthetic genetic mutation profiles, leveraging differential privacy (DP) for the protection of sensitive genetic data. We empirically evaluate the privacy guarantees of our DP modes by introducing a novel Biologically-Informed Hybrid Membership Inference Attack (biHMIA), which combines traditional black box MIA with contextual genomics metrics for enhanced attack power. Our experiments show that both small and large transformer GPT-like models are viable synthetic variant generators for small-scale genomics, and that our hybrid attack leads, on average, to higher adversarial success compared to traditional metric-based MIAs.

Paperid: 2780, https://arxiv.org/pdf/2511.07210.pdf

Abstract:
Clean-image backdoor attacks, which use only label manipulation in training datasets to compromise deep neural networks, pose a significant threat to security-critical applications. A critical flaw in existing methods is that the poison rate required for a successful attack induces a proportional, and thus noticeable, drop in Clean Accuracy (CA), undermining their stealthiness. This paper presents a new paradigm for clean-image attacks that minimizes this accuracy degradation by optimizing the trigger itself. We introduce Generative Clean-Image Backdoors (GCB), a framework that uses a conditional InfoGAN to identify naturally occurring image features that can serve as potent and stealthy triggers. By ensuring these triggers are easily separable from benign task-related features, GCB enables a victim model to learn the backdoor from an extremely small set of poisoned examples, resulting in a CA drop of less than 1%. Our experiments demonstrate GCB's remarkable versatility, successfully adapting to six datasets, five architectures, and four tasks, including the first demonstration of clean-image backdoors in regression and segmentation. GCB also exhibits resilience against most of the existing backdoor defenses.

Paperid: 2781, https://arxiv.org/pdf/2511.07033.pdf

Abstract:
As large language models (LLMs) become increasingly capable, concerns over the unauthorized use of copyrighted and licensed content in their training data have grown, especially in the context of code. Open-source code, often protected by open source licenses (e.g, GPL), poses legal and ethical challenges when used in pretraining. Detecting whether specific code samples were included in LLM training data is thus critical for transparency, accountability, and copyright compliance. We propose SynPrune, a syntax-pruned membership inference attack method tailored for code. Unlike prior MIA approaches that treat code as plain text, SynPrune leverages the structured and rule-governed nature of programming languages. Specifically, it identifies and excludes consequent tokens that are syntactically required and not reflective of authorship, from attribution when computing membership scores. Experimental results show that SynPrune consistently outperforms the state-of-the-arts. Our method is also robust across varying function lengths and syntax categories.

Paperid: 2782, https://arxiv.org/pdf/2511.05097.pdf

Abstract:
Tracking vulnerabilities inherited from third-party open-source components is a well-known challenge, often addressed by tracing the threads of dependency information. However, vulnerabilities can also propagate through forking: a repository forked after the introduction of a vulnerability, but before it is patched, may remain vulnerable in the fork well after being fixed in the original project. Current approaches for vulnerability analysis lack the commit-level granularity needed to track vulnerability introductions and fixes across forks, potentially leaving one-day vulnerabilities undetected. This paper presents a novel approach to help developers identify one-day vulnerabilities in forked repositories. Leveraging the global graph of public code, as captured by the Software Heritage archive, the approach propagates vulnerability information at the commit level and performs automated impact analysis. This enables automatic detection of forked projects that have not incorporated fixes, leaving them potentially vulnerable. Starting from 7162 repositories that, according to OSV, include vulnerable commits in their development histories, we identify 2.2 M forks, containing at least one vulnerable commit. Then we perform a strict filtering, allowing us to find 356 ___vulnerability, fork___ pairs impacting active and popular GitHub forks, we manually evaluate 65 pairs, finding 3 high-severity vulnerabilities, demonstrating the impact and applicability of this approach.

Paperid: 2783, https://arxiv.org/pdf/2511.04860.pdf

Abstract:
OpenAI and DeepMind's AIs recently got gold at the IMO math olympiad and ICPC programming competition. We show frontier AI is similarly good at hacking by letting GPT-5 compete in elite CTF cybersecurity competitions. In one of this year's hardest events, it outperformed 93% of humans finishing 25th: between the world's #3-ranked team (24th place) and #7-ranked team (26th place). This report walks through our methodology, results, and their implications, and dives deep into 3 problems and solutions we found particularly interesting.

Paperid: 2784, https://arxiv.org/pdf/2511.02620.pdf

Abstract:
As large AI models become increasingly valuable assets, the risk of model weight exfiltration from inference servers grows accordingly. An attacker controlling an inference server may exfiltrate model weights by hiding them within ordinary model outputs, a strategy known as steganography. This work investigates how to verify model responses to defend against such attacks and, more broadly, to detect anomalous or buggy behavior during inference. We formalize model exfiltration as a security game, propose a verification framework that can provably mitigate steganographic exfiltration, and specify the trust assumptions associated with our scheme. To enable verification, we characterize valid sources of non-determinism in large language model inference and introduce two practical estimators for them. We evaluate our detection framework on several open-weight models ranging from 3B to 30B parameters. On MOE-Qwen-30B, our detector reduces exfiltratable information to <0.5% with false-positive rate of 0.01%, corresponding to a >200x slowdown for adversaries. Overall, this work further establishes a foundation for defending against model weight exfiltration and demonstrates that strong protection can be achieved with minimal additional cost to inference providers.

Paperid: 2785, https://arxiv.org/pdf/2511.01654.pdf

Abstract:
Graph Neural Networks (GNNs) have marked significant impact in traffic state prediction, social recommendation, knowledge-aware question answering and so on. As more and more users move towards cloud computing, it has become a critical issue to unleash the power of GNNs while protecting the privacy in cloud environments. Specifically, the training data and inference data for GNNs need to be protected from being stolen by external adversaries. Meanwhile, the financial cost of cloud computing is another primary concern for users. Therefore, although existing studies have proposed privacy-preserving techniques for GNNs in cloud environments, their additional computational and communication overhead remain relatively high, causing high financial costs that limit their widespread adoption among users. To protect GNN privacy while lowering the additional financial costs, we introduce Panther, a cost-effective privacy-preserving framework for GNN training and inference services in cloud environments. Technically, Panther leverages four-party computation to asynchronously executing the secure array access protocol, and randomly pads the neighbor information of GNN nodes. We prove that Panther can protect privacy for both training and inference of GNN models. Our evaluation shows that Panther reduces the training and inference time by an average of 75.28% and 82.80%, respectively, and communication overhead by an average of 52.61% and 50.26% compared with the state-of-the-art, which is estimated to save an average of 55.05% and 59.00% in financial costs (based on on-demand pricing model) for the GNN training and inference process on Google Cloud Platform.

Paperid: 2786, https://arxiv.org/pdf/2511.00481.pdf

Abstract:
Wireless Sensor Networks forms the backbone of modern cyber physical systems used in various applications such as environmental monitoring, healthcare monitoring, industrial automation, and smart infrastructure. Ensuring the reliability of data collected through these networks is essential as these data may contain anomalies due to many reasons such as sensor faults, environmental disturbances, or malicious intrusions. In this paper a lightweight and interpretable anomaly detection framework based on a first order Markov chain model has been proposed. The method discretizes continuous sensor readings into finite states and models the temporal dynamics of sensor transitions through a transition probability matrix. Anomalies are detected when observed transitions occur with probabilities below a computed threshold, allowing for real time detection without labeled data or intensive computation. The proposed framework was validated using the Intel Berkeley Research Lab dataset, as a case study on indoor environmental monitoring demonstrates its capability to identify thermal spikes, voltage related faults, and irregular temperature fluctuations with high precision. Comparative analysis with Z score, Hidden Markov Model, and Auto encoder based methods shows that the proposed Markov based framework achieves a balanced trade-off between accuracy, F1 score is 0.86, interoperability, and computational efficiency. The systems scalability and low resource footprint highlight its suitability for large-scale and real time anomaly detection in WSN deployments.

Paperid: 2787, https://arxiv.org/pdf/2511.00361.pdf

Abstract:
High-quality data scarcity hinders malware detection, limiting ML performance. We introduce MalDataGen, an open-source modular framework for generating high-fidelity synthetic tabular data using modular deep learning models (e.g., WGAN-GP, VQ-VAE). Evaluated via dual validation (TR-TS/TS-TR), seven classifiers, and utility metrics, MalDataGen outperforms benchmarks like SDV while preserving data utility. Its flexible design enables seamless integration into detection pipelines, offering a practical solution for cybersecurity applications.

Paperid: 2788, https://arxiv.org/pdf/2511.00346.pdf

Abstract:
The rapid proliferation of Large Language Models (LLMs) has raised significant concerns about their security against adversarial attacks. In this work, we propose a novel approach to crafting universal jailbreaks and data extraction attacks by exploiting latent space discontinuities, an architectural vulnerability related to the sparsity of training data. Unlike previous methods, our technique generalizes across various models and interfaces, proving highly effective in seven state-of-the-art LLMs and one image generation model. Initial results indicate that when these discontinuities are exploited, they can consistently and profoundly compromise model behavior, even in the presence of layered defenses. The findings suggest that this strategy has substantial potential as a systemic attack vector.

Paperid: 2789, https://arxiv.org/pdf/2511.00342.pdf

Abstract:
We present MH-1M, one of the most comprehensive and up-to-date datasets for advanced Android malware research. The dataset comprises 1,340,515 applications, encompassing a wide range of features and extensive metadata. To ensure accurate malware classification, we employ the VirusTotal API, integrating multiple detection engines for comprehensive and reliable assessment. Our GitHub, Figshare, and Harvard Dataverse repositories provide open access to the processed dataset and its extensive supplementary metadata, totaling more than 400 GB of data and including the outputs of the feature extraction pipeline as well as the corresponding VirusTotal reports. Our findings underscore the MH-1M dataset's invaluable role in understanding the evolving landscape of malware.

Paperid: 2790, https://arxiv.org/pdf/2510.27554.pdf

Abstract:
x402 enables Hypertext Transfer Protocol (HTTP) services like application programming interfaces (APIs), data feeds, and inference providers to accept cryptocurrency payments for access. As agents increasingly consume these services, discovery becomes critical: which swap interface should an agent trust? Which data provider is the most reliable? We introduce TraceRank, a reputation-weighted ranking algorithm where payment transactions serve as endorsements. TraceRank seeds addresses with precomputed reputation metrics and propagates reputation through payment flows weighted by transaction value and temporal recency. Applied to x402's payment graph, this surfaces services preferred by high-reputation users rather than those with high transaction volume. Our system combines TraceRank with semantic search to respond to natural language queries with high quality results. We argue that reputation propagation resists Sybil attacks by making spam services with many low-reputation payers rank below legitimate services with few high-reputation payers. Ultimately, we aim to construct a search method for x402 enabled services that avoids infrastructure bias and has better performance than purely volume based or semantic methods.

Paperid: 2791, https://arxiv.org/pdf/2510.26847.pdf

Abstract:
Large Language Models (LLMs) are susceptible to jailbreak attacks where malicious prompts are disguised using ciphers and character-level encodings to bypass safety guardrails. While these guardrails often fail to interpret the encoded content, the underlying models can still process the harmful instructions. We introduce CPT-Filtering, a novel, model-agnostic with negligible-costs and near-perfect accuracy guardrail technique that aims to mitigate these attacks by leveraging the intrinsic behavior of Byte-Pair Encoding (BPE) tokenizers. Our method is based on the principle that tokenizers, trained on natural language, represent out-of-distribution text, such as ciphers, using a significantly higher number of shorter tokens. Our technique uses a simple yet powerful artifact of using language models: the average number of Characters Per Token (CPT) in the text. This approach is motivated by the high compute cost of modern methods - relying on added modules such as dedicated LLMs or perplexity models. We validate our approach across a large dataset of over 100,000 prompts, testing numerous encoding schemes with several popular tokenizers. Our experiments demonstrate that a simple CPT threshold robustly identifies encoded text with high accuracy, even for very short inputs. CPT-Filtering provides a practical defense layer that can be immediately deployed for real-time text filtering and offline data curation.

Paperid: 2792, https://arxiv.org/pdf/2510.25687.pdf

Abstract:
Model inversion attacks pose an open challenge to privacy-sensitive applications that use machine learning (ML) models. For example, face authentication systems use modern ML models to compute embedding vectors from face images of the enrolled users and store them. If leaked, inversion attacks can accurately reconstruct user faces from the leaked vectors. There is no systematic characterization of properties needed in an ideal defense against model inversion, even for the canonical example application of a face authentication system susceptible to data breaches, despite a decade of best-effort solutions. In this paper, we formalize the desired properties of a provably strong defense against model inversion and connect it, for the first time, to the cryptographic concept of fuzzy extractors. We further show that existing fuzzy extractors are insecure for use in ML-based face authentication. We do so through a new model inversion attack called PIPE, which achieves a success rate of over 89% in most cases against prior schemes. We then propose L2FE-Hash, the first candidate fuzzy extractor which supports standard Euclidean distance comparators as needed in many ML-based applications, including face authentication. We formally characterize its computational security guarantees, even in the extreme threat model of full breach of stored secrets, and empirically show its usable accuracy in face authentication for practical face distributions. It offers attack-agnostic security without requiring any re-training of the ML model it protects. Empirically, it nullifies both prior state-of-the-art inversion attacks as well as our new PIPE attack.

Paperid: 2793, https://arxiv.org/pdf/2510.23938.pdf

Abstract:
We present a security framework that strengthens distributed machine learning by standardizing integrity protections across CPU and GPU platforms and significantly reducing verification overheads. Our approach co-locates integrity verification directly with large ML model execution on GPU accelerators, resolving the fundamental mismatch between how large ML workloads typically run (primarily on GPUs) and how security verifications traditionally operate (on separate CPU-based processes), delivering both immediate performance benefits and long-term architectural consistency. By performing cryptographic operations natively on GPUs using dedicated compute units (e.g., Intel Arc's XMX units, NVIDIA's Tensor Cores), our solution eliminates the potential architectural bottlenecks that could plague traditional CPU-based verification systems when dealing with large models. This approach leverages the same GPU-based high-memory bandwidth and parallel processing primitives that power ML workloads ensuring integrity checks keep pace with model execution even for massive models exceeding 100GB. This framework establishes a common integrity verification mechanism that works consistently across different GPU vendors and hardware configurations. By anticipating future capabilities for creating secure channels between trusted execution environments and GPU accelerators, we provide a hardware-agnostic foundation that enterprise teams can deploy regardless of their underlying CPU and GPU infrastructures.

Paperid: 2794, https://arxiv.org/pdf/2510.23483.pdf

Abstract:
Fully homomorphic encryption allows the evaluation of arbitrary functions on encrypted data. It can be leveraged to secure outsourced and multiparty computation. TFHE is a fast torus-based fully homomorphic encryption scheme that allows both linear operations, as well as the evaluation of arbitrary non-linear functions. It currently provides the fastest bootstrapping operation performance of any other FHE scheme. Despite its fast performance, TFHE suffers from a considerably higher computational overhead for the evaluation of homomorphic circuits. Computations in the encrypted domain are orders of magnitude slower than their unencrypted equivalents. This bottleneck hinders the widespread adoption of (T)FHE for the protection of sensitive data. While state-of-the-art implementations focused on accelerating and outsourcing single operations, their scalability and practicality are constrained by high memory bandwidth costs. In order to overcome this, we propose an FPGA-based hardware accelerator for the evaluation of homomorphic circuits. Specifically, we design a functionally complete TFHE processor for FPGA hardware capable of processing instructions on the data completely on the FPGA. In order to achieve a higher throughput from our TFHE processor, we implement an improved programmable bootstrapping module which outperforms the current state-of-the-art by 240\% to 480\% more bootstrappings per second. Our efficient, compact, and scalable design lays the foundation for implementing complete FPGA-based TFHE processor architectures.

Paperid: 2795, https://arxiv.org/pdf/2510.23313.pdf

Abstract:
This survey systematizes the evolution of network intrusion detection systems (NIDS), from conventional methods such as signature-based and neural network (NN)-based approaches to recent integrations with large language models (LLMs). It clearly and concisely summarizes the current status, strengths, and limitations of conventional techniques, and explores the practical benefits of integrating LLMs into NIDS. Recent research on the application of LLMs to NIDS in diverse environments is reviewed, including conventional network infrastructures, autonomous vehicle environments and IoT environments. From this survey, readers will learn that: 1) the earliest methods, signature-based IDSs, continue to make significant contributions to modern systems, despite their well-known weaknesses; 2) NN-based detection, although considered promising and under development for more than two decades, and despite numerous related approaches, still faces significant challenges in practical deployment; 3) LLMs are useful for NIDS in many cases, and a number of related approaches have been proposed; however, they still face significant challenges in practical applications. Moreover, they can even be exploited as offensive tools, such as for generating malware, crafting phishing messages, or launching cyberattacks. Recently, several studies have been proposed to address these challenges, which are also reviewed in this survey; and 4) strategies for constructing domain-specific LLMs have been proposed and are outlined in this survey, as it is nearly impossible to train a NIDS-specific LLM from scratch.

Paperid: 2796, https://arxiv.org/pdf/2510.22283.pdf

Abstract:
Wide-bandgap (WBG) technologies offer unprecedented improvements in power system efficiency, size, and performance, but also introduce unique sensor corruption and cybersecurity risks in industrial control systems (ICS), particularly due to high-frequency noise and sophisticated cyber-physical threats. This proof-of-concept (PoC) study demonstrates the adaptation of a noise-driven physically unclonable function (PUF) and machine learning (ML)-assisted anomaly detection framework to the demanding environment of WBG-based ICS sensor pathways. By extracting entropy from unavoidable WBG switching noise (up to 100 kHz) as a PUF source, and simultaneously using this noise as a real-time threat indicator, the proposed system unites hardware-level authentication and anomaly detection. Our approach integrates hybrid machine learning (ML) models with adaptive Bayesian filtering, providing robust and low-latency detection capabilities resilient to both natural electromagnetic interference (EMI) and active adversarial manipulation. Through detailed simulations of WBG modules under benign and attack scenarios--including EMI injection, signal tampering, and node impersonation--we achieve 95% detection accuracy and sub-millisecond processing latency. These results demonstrate the feasibility of physics-driven, dual-use noise exploitation as a scalable ICS defense primitive. Our findings lay the groundwork for next-generation security strategies that leverage inherent device characteristics, bridging hardware and artificial intelligence (AI) for enhanced protection of critical ICS infrastructure.

Paperid: 2797, https://arxiv.org/pdf/2510.22274.pdf

Abstract:
Data poisoning attacks are a potential threat to machine learning (ML) models, aiming to manipulate training datasets to disrupt their performance. Existing defenses are mostly designed to mitigate specific poisoning attacks or are aligned with particular ML algorithms. Furthermore, most defenses are developed to secure deep neural networks or binary classifiers. However, traditional multiclass classifiers need attention to be secure from data poisoning attacks, as these models are significant in developing multi-modal applications. Therefore, this paper proposes SecureLearn, a two-layer attack-agnostic defense to defend multiclass models from poisoning attacks. It comprises two components of data sanitization and a new feature-oriented adversarial training. To ascertain the effectiveness of SecureLearn, we proposed a 3D evaluation matrix with three orthogonal dimensions: data poisoning attack, data sanitization and adversarial training. Benchmarking SecureLearn in a 3D matrix, a detailed analysis is conducted at different poisoning levels (10%-20%), particularly analysing accuracy, recall, F1-score, detection and correction rates, and false discovery rate. The experimentation is conducted for four ML algorithms, namely Random Forest (RF), Decision Tree (DT), Gaussian Naive Bayes (GNB) and Multilayer Perceptron (MLP), trained with three public datasets, against three poisoning attacks and compared with two existing mitigations. Our results highlight that SecureLearn is effective against the provided attacks. SecureLearn has strengthened resilience and adversarial robustness of traditional multiclass models and neural networks, confirming its generalization beyond algorithm-specific defenses. It consistently maintained accuracy above 90%, recall and F1-score above 75%. For neural networks, SecureLearn achieved 97% recall and F1-score against all selected poisoning attacks.

Paperid: 2798, https://arxiv.org/pdf/2510.20883.pdf

Abstract:
Adversarial training has emerged as a key technique to enhance model robustness against adversarial input perturbations. Many of the existing methods rely on computationally expensive min-max problems that limit their application in practice. We propose a novel formulation of adversarial training in reproducing kernel Hilbert spaces, shifting from input to feature-space perturbations. This reformulation enables the exact solution of inner maximization and efficient optimization. It also provides a regularized estimator that naturally adapts to the noise level and the smoothness of the underlying function. We establish conditions under which the feature-perturbed formulation is a relaxation of the original problem and propose an efficient optimization algorithm based on iterative kernel ridge regression. We provide generalization bounds that help to understand the properties of the method. We also extend the formulation to multiple kernel learning. Empirical evaluation shows good performance in both clean and adversarial settings.

Paperid: 2799, https://arxiv.org/pdf/2510.20768.pdf

Abstract:
Retrieval-Augmented Generation (RAG) has emerged as the dominant architectural pattern to operationalize Large Language Model (LLM) usage in Cyber Threat Intelligence (CTI) systems. However, this design is susceptible to poisoning attacks, and previously proposed defenses can fail for CTI contexts as cyber threat information is often completely new for emerging attacks, and sophisticated threat actors can mimic legitimate formats, terminology, and stylistic conventions. To address this issue, we propose that the robustness of modern RAG defenses can be accelerated by applying source credibility algorithms on corpora, using PageRank as an example. In our experiments, we demonstrate quantitatively that our algorithm applies a lower authority score to malicious documents while promoting trusted content, using the standardized MS MARCO dataset. We also demonstrate proof-of-concept performance of our algorithm on CTI documents and feeds.

Paperid: 2800, https://arxiv.org/pdf/2510.20688.pdf

Abstract:
Unsafe Rust code is necessary for interoperability with C/C++ libraries and implementing low-level data structures, but it can cause memory safety violations in otherwise memory-safe Rust programs. Sanitizers can catch such memory errors at runtime, but introduce many unnecessary checks even for memory accesses guaranteed safe by the Rust type system. We introduce SafeFFI, a system for optimizing memory safety instrumentation in Rust binaries such that checks occur at the boundary between unsafe and safe code, handing over the enforcement of memory safety from the sanitizer to the Rust type system. Unlike previous approaches, our design avoids expensive whole-program analysis and adds much less compile-time overhead (2.64x compared to over 8.83x). On a collection of popular Rust crates and known vulnerable Rust code, SafeFFI achieves superior performance compared to state-of-the-art systems, reducing sanitizer checks by up to 98%, while maintaining correctness and flagging all spatial and temporal memory safety violations.

Paperid: 2801, https://arxiv.org/pdf/2510.20300.pdf

Abstract:
There are increasing risks of privacy disclosure when sharing the automotive location data in particular functions such as route navigation, driving monitoring and vehicle scheduling. These risks could lead to the attacks including user behavior recognition, sensitive location inference and trajectory reconstruction. In order to mitigate the data security risk caused by the automotive location sharing, this paper proposes a high-precision privacy protection mechanism based on format-preserving encryption (FPE) of geographical coordinates. The automotive coordinate data key mapping mechanism is designed to reduce to the accuracy loss of the geographical location data caused by the repeated encryption and decryption. The experimental results demonstrate that the average relative distance retention rate (RDR) reached 0.0844, and the number of hotspots in the critical area decreased by 98.9% after encryption. To evaluate the accuracy loss of the proposed encryption algorithm on automotive geographical location data, this paper presents the experimental analysis of decryption accuracy, and the result indicates that the decrypted coordinate data achieves a restoration accuracy of 100%. This work presents a high-precision privacy protection method for automotive location data, thereby providing an efficient data security solution for the sensitive data sharing in autonomous driving.

Paperid: 2802, https://arxiv.org/pdf/2510.20243.pdf

Abstract:
Privacy-preserving machine learning (PPML) is an emerging topic to handle secure machine learning inference over sensitive data in untrusted environments. Fully homomorphic encryption (FHE) enables computation directly on encrypted data on the server side, making it a promising approach for PPML. However, it introduces significant communication and computation overhead on the client side, making it impractical for edge devices. Hybrid homomorphic encryption (HHE) addresses this limitation by combining symmetric encryption (SE) with FHE to reduce the computational cost on the client side, and combining with an FHE-friendly SE can also lessen the processing overhead on the server side, making it a more balanced and efficient alternative. Our work proposes a hardware-accelerated HHE architecture built around a lightweight symmetric cipher optimized for FHE compatibility and implemented as a dedicated hardware accelerator. To the best of our knowledge, this is the first design to integrate an end-to-end HHE framework with hardware acceleration. Beyond this, we also present several microarchitectural optimizations to achieve higher performance and energy efficiency. The proposed work is integrated into a full PPML pipeline, enabling secure inference with significantly lower latency and power consumption than software implementations. Our contributions validate the feasibility of low-power, hardware- accelerated HHE for edge deployment and provide a hardware- software co-design methodology for building scalable, secure machine learning systems in resource-constrained environments. Experiments on a PYNQ-Z2 platform with the MNIST dataset show over a 50x reduction in client-side encryption latency and nearly a 2x gain in hardware throughput compared to existing FPGA-based HHE accelerators.

Paperid: 2803, https://arxiv.org/pdf/2510.19938.pdf

Abstract:
Real-world health studies require continuous and secure data collection from mobile and wearable devices. We introduce MotionPI, a smartphone-based system designed to collect behavioral and health data through sensors and surveys with minimal interaction from participants. The system integrates passive data collection (such as GPS and wristband motion data) with Ecological Momentary Assessment (EMA) surveys, which can be triggered randomly or based on physical activity. MotionPI is designed to work under real-life constraints, including limited battery life, weak or intermittent cellular connection, and minimal user supervision. It stores data both locally and on a secure cloud server, with encrypted transmission and storage. It integrates through Bluetooth Low Energy (BLE) into wristband devices that store raw data and communicate motion summaries and trigger events. MotionPI demonstrates a practical solution for secure and scalable mobile data collection in cyber-physical health studies.

Paperid: 2804, https://arxiv.org/pdf/2510.19264.pdf

Abstract:
With the goal of improving the security of Internet protocols, we seek faster, semi-automatic methods to discover new vulnerabilities in protocols such as DNS, BGP, and others. To this end, we introduce the LLM-Assisted Protocol Attack Discovery (LAPRAD) methodology, enabling security researchers with some DNS knowledge to efficiently uncover vulnerabilities that would otherwise be hard to detect. LAPRAD follows a three-stage process. In the first, we consult an LLM (GPT-o1) that has been trained on a broad corpus of DNS-related sources and previous DDoS attacks to identify potential exploits. In the second stage, a different LLM automatically constructs the corresponding attack configurations using the ReACT approach implemented via LangChain (DNS zone file generation). Finally, in the third stage, we validate the attack's functionality and effectiveness. Using LAPRAD, we uncovered three new DDoS attacks on the DNS protocol and rediscovered two recently reported ones that were not included in the LLM's training data. The first new attack employs a bait-and-switch technique to trick resolvers into caching large, bogus DNSSEC RRSIGs, reducing their serving capacity to as little as 6%. The second exploits large DNSSEC encryption algorithms (RSA-4096) with multiple keys, thereby bypassing a recently implemented default RRSet limit. The third leverages ANY-type responses to produce a similar effect. These variations of a cache-flushing DDoS attack, called SigCacheFlush, circumvent existing patches, severely degrade resolver query capacity, and impact the latest versions of major DNS resolver implementations.

Paperid: 2805, https://arxiv.org/pdf/2510.18612.pdf

Abstract:
RISC-V processors are becoming ubiquitous in critical applications, but their susceptibility to microarchitectural side-channel attacks is a serious concern. Detection of microarchitectural attacks in RISC-V is an emerging research topic that is relatively underexplored, compared to x86 and ARM. The first line of work to detect flush+fault-based microarchitectural attacks in RISC-V leverages Machine Learning (ML) models, yet it leaves several practical aspects that need further investigation. To address overlooked issues, we leveraged gem5 and propose a new detection method combining statistical preprocessing and association rule mining having reconfiguration capabilities to generalize the detection method for any microarchitectural attack. The performance comparison with state-of-the-art reveals that the proposed detection method achieves up to 5.15% increase in accuracy, 7% rise in precision, and 3.91% improvement in recall under the cryptographic, computational, and memory-intensive workloads alongside its flexibility to detect new variant of flush+fault attack. Moreover, as the attack detection relies on association rules, their human-interpretable nature provides deep insight to understand microarchitectural behavior during the execution of attack and benign applications.

Paperid: 2806, https://arxiv.org/pdf/2510.18553.pdf

Abstract:
Very few available individual bandwidth reservation schemes provide efficient and cost-effective bandwidth reservation that is required for safety-critical and time-sensitive vehicular networked applications. These schemes allow vehicles to make reservation requests for the required resources. Accordingly, a Mobile Network Operator (MNO) can allocate and guarantee bandwidth resources based on these requests. However, due to uncertainty in future reservation time and bandwidth costs, the design of an optimized reservation strategy is challenging. In this article, we propose a novel multi-objective bandwidth reservation update approach with an optimal strategy based on Double Deep Q-Network (DDQN). The key design objectives are to minimize the reservation cost with multiple MNOs and to ensure reliable resource provisioning in uncertain situations by solving scenarios such as underbooked and overbooked reservations along the driving path. The enhancements and advantages of our proposed strategy have been demonstrated through extensive experimental results when compared to other methods like greedy update or other deep reinforcement learning approaches. Our strategy demonstrates a 40% reduction in bandwidth costs across all investigated scenarios and simultaneously resolves uncertain situations in a cost-effective manner.

Paperid: 2807, https://arxiv.org/pdf/2510.18508.pdf

Abstract:
Security analysts face increasing pressure to triage large and complex vulnerability backlogs. Large Language Models (LLMs) offer a potential aid by automating parts of the interpretation process. We evaluate four models (ChatGPT, Claude, Gemini, and DeepSeek) across twelve prompting techniques to interpret semi-structured and unstructured vulnerability information. As a concrete use case, we test each model's ability to predict decision points in the Stakeholder-Specific Vulnerability Categorization (SSVC) framework: Exploitation, Automatable, Technical Impact, and Mission and Wellbeing. Using 384 real-world vulnerabilities from the VulZoo dataset, we issued more than 165,000 queries to assess performance under prompting styles including one-shot, few-shot, and chain-of-thought. We report F1 scores for each SSVC decision point and Cohen's kappa (weighted and unweighted) for the final SSVC decision outcomes. Gemini consistently ranked highest, leading on three of four decision points and yielding the most correct recommendations. Prompting with exemplars generally improved accuracy, although all models struggled on some decision points. Only DeepSeek achieved fair agreement under weighted metrics, and all models tended to over-predict risk. Overall, current LLMs do not replace expert judgment. However, specific LLM and prompt combinations show moderate effectiveness for targeted SSVC decisions. When applied with care, LLMs can support vulnerability prioritization workflows and help security teams respond more efficiently to emerging threats.

Paperid: 2808, https://arxiv.org/pdf/2510.18394.pdf

Abstract:
Undoubtedly, the Internet has become one of the most important conduits to information for the general public. Nonetheless, Internet access can be and has been limited systematically or blocked completely during political events in numerous countries and regions by various censorship mechanisms. Depending on where the core filtering component is situated, censorship techniques have been classified as client-based, server-based, or network-based. However, as the Internet evolves rapidly, new and sophisticated censorship techniques have emerged, which involve techniques that cut across locations and involve new forms of hurdles to information access. We argue that modern censorship can be better understood through a new lens that we term chokepoints, which identifies bottlenecks in the content production or delivery cycle where efficient new forms of large-scale client-side surveillance and filtering mechanisms have emerged.

Paperid: 2809, https://arxiv.org/pdf/2510.18109.pdf

Abstract:
Evaluating the relevance of data is a critical task for model builders seeking to acquire datasets that enhance model performance. Ideally, such evaluation should allow the model builder to assess the utility of candidate data without exposing proprietary details of the model. At the same time, data providers must be assured that no information about their data - beyond the computed utility score - is disclosed to the model builder. In this paper, we present PrivaDE, a cryptographic protocol for privacy-preserving utility scoring and selection of data for machine learning. While prior works have proposed data evaluation protocols, our approach advances the state of the art through a practical, blockchain-centric design. Leveraging the trustless nature of blockchains, PrivaDE enforces malicious-security guarantees and ensures strong privacy protection for both models and datasets. To achieve efficiency, we integrate several techniques - including model distillation, model splitting, and cut-and-choose zero-knowledge proofs - bringing the runtime to a practical level. Furthermore, we propose a unified utility scoring function that combines empirical loss, predictive entropy, and feature-space diversity, and that can be seamlessly integrated into active-learning workflows. Evaluation shows that PrivaDE performs data evaluation effectively, achieving online runtimes within 15 minutes even for models with millions of parameters. Our work lays the foundation for fair and automated data marketplaces in decentralized machine learning ecosystems.

Paperid: 2810, https://arxiv.org/pdf/2510.17284.pdf

Abstract:
A coinjoin protocol aims to increase transactional privacy for Bitcoin and Bitcoin-like blockchains via collaborative transactions, by violating assumptions behind common analysis heuristics. Estimating the resulting privacy gain is a crucial yet unsolved problem due to a range of influencing factors and large computational complexity. We adapt the BlockSci on-chain analysis software to coinjoin transactions, demonstrating a significant (10-50%) average post-mix anonymity set size decrease for all three major designs with a central coordinator: Whirlpool, Wasabi 1.x, and Wasabi 2.x. The decrease is highest during the first day and negligible after one year from a coinjoin creation. Moreover, we design a precise, parallelizable privacy estimation method, which takes into account coinjoin fees, implementation-specific limitations and users' post-mix behavior. We evaluate our method in detail on a set of emulated and real-world Wasabi 2.x coinjoins and extrapolate to its largest real-world coinjoins with hundreds of inputs and outputs. We conclude that despite the users' undesirable post-mix behavior, correctly attributing the coins to their owners is still very difficult, even with our improved analysis algorithm.

Paperid: 2811, https://arxiv.org/pdf/2510.17098.pdf

Abstract:
Even when prompts and parameters are secured, transformer language models remain vulnerable because their key-value (KV) cache during inference constitutes an overlooked attack surface. This paper introduces Malicious Token Injection (MTI), a modular framework that systematically perturbs cached key vectors at selected layers and timesteps through controlled magnitude and frequency, using additive Gaussian noise, zeroing, and orthogonal rotations. A theoretical analysis quantifies how these perturbations propagate through attention, linking logit deviations to the Frobenius norm of corruption and softmax Lipschitz dynamics. Empirical results show that MTI significantly alters next-token distributions and downstream task performance across GPT-2 and LLaMA-2/7B, as well as destabilizes retrieval-augmented and agentic reasoning pipelines. These findings identify cache integrity as a critical yet underexplored vulnerability in current LLM deployments, positioning cache corruption as a reproducible and theoretically grounded threat model for future robustness and security research.

Paperid: 2812, https://arxiv.org/pdf/2510.17033.pdf

Abstract:
Federated learning (FL) enables fine-tuning large language models (LLMs) across distributed data sources. As these sources increasingly include LLM-generated text, provenance tracking becomes essential for accountability and transparency. We adapt LLM watermarking for data provenance in FL where a subset of clients compute local updates on watermarked data, and the server averages all updates into the global LLM. In this setup, watermarks are radioactive: the watermark signal remains detectable after fine-tuning with high confidence. The $p$-value can reach $10^{-24}$ even when as little as $6.6\%$ of data is watermarked. However, the server can act as an active adversary that wants to preserve model utility while evading provenance tracking. Our observation is that updates induced by watermarked synthetic data appear as outliers relative to non-watermark updates. Our adversary thus applies strong robust aggregation that can filter these outliers, together with the watermark signal. All evaluated radioactive watermarks are not robust against such an active filtering server. Our work suggests fundamental trade-offs between radioactivity, robustness, and utility.

Paperid: 2813, https://arxiv.org/pdf/2510.16991.pdf

Abstract:
We establish deterministic hardness of approximation results for the Shortest Vector Problem in $\ell_p$ norm ($\mathsf{SVP}_p$) and for Unique-SVP ($\mathsf{uSVP}_p$) for all $p > 2$. Previously, no deterministic hardness results were known, except for $\ell_\infty$. For every $p > 2$, we prove constant-ratio hardness: no polynomial-time algorithm approximates $\mathsf{SVP}_p$ or $\mathsf{uSVP}_p$ within a ratio of $\sqrt{2} - o(1)$, assuming $\textsf{3SAT} \notin \text{DTIME}(2^{O(n^{2/3}\log n)})$, and, $\textsf{Unambiguous-3SAT} \notin \text{DTIME}(2^{O(n^{2/3}\log n)})$. We also show that for any $\varepsilon > 0$ there exists $p_\varepsilon > 2$ such that for every $p \ge p_\varepsilon$: no polynomial-time algorithm approximates $\mathsf{SVP}_p$ within a ratio of $2^{(\log n)^{1- \varepsilon}}$, assuming $\text{NP} \nsubseteq \text{DTIME}(n^{(\log n)^\varepsilon})$; and within a ratio of $n^{1/(\log\log(n))^\varepsilon}$, assuming $\text{NP} \nsubseteq \text{SUBEXP}$. This improves upon [Haviv, Regev, Theory of Computing 2012], which obtained similar inapproximation ratios under randomized reductions. We obtain analogous results for $\mathsf{uSVP}_p$ under the assumptions $\textsf{Unambiguous-3SAT} \not\subseteq \text{DTIME}(n^{(\log n)^\varepsilon})$ and $\textsf{Unambiguous-3SAT} \not\subseteq \text{SUBEXP}$, improving the previously known $1+o(1)$ [Stephens-Davidowitz, Approx 2016]. Strengthening the hardness of $\textsf{uSVP}$ has direct cryptographic impact. By the reduction of Lyubashevsky and Micciancio [Lyubashevsky, Micciancio, CRYPTO 2009], hardness for $γ$-$\mathsf{uSVP}_p$ carries over to ${\frac{1}γ}$-$\mathsf{BDD}_p$ (Bounded Distance Decoding). Thus, understanding the hardness of $\textsf{uSVP}$ improves worst-case guarantees for two core problems that underpin security in lattice-based cryptography.

Paperid: 2814, https://arxiv.org/pdf/2510.16694.pdf

Abstract:
Secure federated learning (FL) preserves data privacy during distributed model training. However, deploying such frameworks across heterogeneous devices results in performance bottlenecks, due to straggler clients with limited computational or network capabilities, slowing training for all participating clients. This paper introduces the first straggler mitigation technique for secure aggregation with deep neural networks. We propose CLIP, a client-side invariant neuron pruning technique coupled with network-aware pruning, that addresses compute and network bottlenecks due to stragglers during training with minimal accuracy loss. Our technique accelerates secure FL training by 13% to 34% across multiple datasets (CIFAR10, Shakespeare, FEMNIST) with an accuracy impact of between 1.3% improvement to 2.6% reduction.

Paperid: 2815, https://arxiv.org/pdf/2510.16610.pdf

Abstract:
This survey investigates how ontologies, semantic log processing, and Large Language Models (LLMs) enhance cybersecurity. Ontologies structure domain knowledge, enabling interoperability, data integration, and advanced threat analysis. Security logs, though critical, are often unstructured and complex. To address this, automated construction of Knowledge Graphs (KGs) from raw logs is emerging as a key strategy for organizing and reasoning over security data. LLMs enrich this process by providing contextual understanding and extracting insights from unstructured content. This work aligns with European Union (EU) efforts such as NIS 2 and the Cybersecurity Taxonomy, highlighting challenges and opportunities in intelligent ontology-driven cyber defense.

Paperid: 2816, https://arxiv.org/pdf/2510.16544.pdf

Abstract:
Rowhammer is a critical vulnerability in dynamic random access memory (DRAM) that continues to pose a significant threat to various systems. However, we find that conventional load-based attacks are becoming highly ineffective on the most recent architectures such as Intel Alder and Raptor Lake. In this paper, we present $ρ$Hammer, a new Rowhammer framework that systematically overcomes three core challenges impeding attacks on these new architectures. First, we design an efficient and generic DRAM address mapping reverse-engineering method that uses selective pairwise measurements and structured deduction, enabling recovery of complex mappings within seconds on the latest memory controllers. Second, to break through the activation rate bottleneck of load-based hammering, we introduce a novel prefetch-based hammering paradigm that leverages the asynchronous nature of x86 prefetch instructions and is further enhanced by multi-bank parallelism to maximize throughput. Third, recognizing that speculative execution causes more severe disorder issues for prefetching, which cannot be simply mitigated by memory barriers, we develop a counter-speculation hammering technique using control-flow obfuscation and optimized NOP-based pseudo-barriers to maintain prefetch order with minimal overhead. Evaluations across four latest Intel architectures demonstrate $ρ$Hammer's breakthrough effectiveness: it induces up to 200K+ additional bit flips within 2-hour attack pattern fuzzing processes and has a 112x higher flip rate than the load-based hammering baselines on Comet and Rocket Lake. Also, we are the first to revive Rowhammer attacks on the latest Raptor Lake architecture, where baselines completely fail, achieving stable flip rates of 2,291/min and fast end-to-end exploitation.

Paperid: 2817, https://arxiv.org/pdf/2510.16229.pdf

Abstract:
GPS spoofing poses a growing threat to aviation by falsifying satellite signals and misleading aircraft navigation systems. This paper demonstrates a proof-of-concept spoofing detection strategy based on analyzing satellite Carrier-to-Noise Density Ratio (C/N$_0$) variation during controlled static antenna orientations. Using a u-blox EVK-M8U receiver and a GPSG-1000 satellite simulator, C/N$_0$ data is collected under three antenna orientations flat, banked right, and banked left) in both real-sky (non-spoofed) and spoofed environments. Our findings reveal that under non-spoofed signals, C/N$_0$ values fluctuate naturally with orientation, reflecting true geometric dependencies. However, spoofed signals demonstrate a distinct pattern: the flat orientation, which directly faces the spoofing antenna, consistently yielded the highest C/N$_0$ values, while both banked orientations showed reduced C/N$_0$ due to misalignment with the spoofing source. These findings suggest that simple maneuvers such as brief banking to induce C/N$_0$ variations can provide early cues of GPS spoofing for general aviation and UAV systems.

Paperid: 2818, https://arxiv.org/pdf/2510.16168.pdf

Abstract:
Web Real-Time Communication (WebRTC) enables real-time peer-to-peer communication, but its Interactive Connectivity Establishment (ICE) process can unintentionally expose internal and public IP addresses as metadata. This paper presents a cross-platform measurement study of WebRTC metadata leakage using current (2025) builds of Chrome, Brave, Firefox, and Tor on desktop and mobile platforms. Experiments were conducted across semi-trusted Wi-Fi and untrusted mobile carrier networks. Results show that Chrome remains the most leakage-prone, disclosing LAN or Carrier-Grade NAT (CGNAT) addresses on mobile and metadata on desktop; Brave avoids direct IP leaks but exposes session-stable mDNS identifiers; Firefox provides strong protection on desktop but leaks internal IPs on Android; and Tor consistently prevents all forms of leakage. We introduce a structured threat model for semi-trusted environments and evaluate the limitations of mDNS obfuscation. Finally, we propose layered mitigation strategies combining browser defaults, institutional safeguards, and user controls. Findings demonstrate that while direct LAN leakage is declining, emerging vectors such as mDNS and CGNAT create persistent privacy risks requiring protocol-level redesign and policy action.

Paperid: 2819, https://arxiv.org/pdf/2510.16067.pdf

Abstract:
Static, long-lived credentials for workload authentication create untenable security risks that violate Zero-Trust principles. This paper presents a multi-cloud framework using Workload Identity Federation (WIF) and OpenID Connect (OIDC) for secretless authentication. Our approach uses cryptographically-verified, ephemeral tokens, allowing workloads to authenticate without persistent private keys and mitigating credential theft. We validate this framework in an enterprise-scale Kubernetes environment, which significantly reduces the attack surface. The model offers a unified solution to manage workload identities across disparate clouds, enabling future implementation of robust, attribute-based access control.

Paperid: 2820, https://arxiv.org/pdf/2510.16037.pdf

Abstract:
This study investigates the privacy risks associated with diffusion-based synthetic tabular data generation methods, focusing on their susceptibility to Membership Inference Attacks (MIAs). We examine two recent models, TabDDPM and TabSyn, by developing query-based MIAs based on the step-wise error comparison method. Our findings reveal that TabDDPM is more vulnerable to these attacks. TabSyn exhibits resilience against our attack models. Our work underscores the importance of evaluating the privacy implications of diffusion models and encourages further research into robust privacy-preserving mechanisms for synthetic data generation.

Paperid: 2821, https://arxiv.org/pdf/2510.16025.pdf

Abstract:
Cornami Mx2 accelerates of Fully Homomorphic Encryption (FHE) applications, enabled by breakthrough work [1], which are otherwise compute limited. Our processor architecture is based on the systolic array of cores with in-memory compute capability and a network on chip (NoC) processor architecture called the "FracTLcore compute fabric processor" (Mx2). Here, we describe the work to estimate processor resources to compute workload in CGGI (TFHE-rs) or CKKS scheme during construction of our compiler backend for this architecture [2]. These processors are available for running applications in both the TFHE-rs Boolean scheme and CKKS scheme FHE applications.

Paperid: 2822, https://arxiv.org/pdf/2510.16005.pdf

Abstract:
The use of Immersive Technologies has shown its potential to revolutionize many sectors such as health, entertainment, education, and industrial sectors. Immersive technologies such as Virtual Reality (VR), Augmented reality (AR), and Mixed Reality (MR) have redefined user interaction through real-time biometric and behavioral tracking. Although Immersive Technologies (XR) essentially need the collection of the biometric data which acts as a baseline to create immersive experience, however, this ongoing feedback information (includes biometrics) poses critical privacy concerns due to the sensitive nature of the data collected. A comprehensive review of recent literature explored the technical dimensions of related problem; however, they largely overlook the challenge particularly the intricacies of real-time biometric data filtering within head-mounted display system. Motivated from this, in this work, we propose a novel privacy-preserving system architecture that identifies and filters biometric signals (within the VR headset) in real-time before transmission or storage. Implemented as a modular Unity Software-development Kit (SDK) compatible with major immersive platforms, our solution (named Meta-Guardian) employs machine learning models for signal classification and a filtering mechanism to block sensitive data. This framework aims to enable developers to embed privacy-by-design principles into immersive experiences on various headsets and applications.

Paperid: 2824, https://arxiv.org/pdf/2510.15973.pdf

Abstract:
This paper presents a systematic security assessment of four prominent Large Language Models (LLMs) against diverse adversarial attack vectors. We evaluate Phi-2, Llama-2-7B-Chat, GPT-3.5-Turbo, and GPT-4 across four distinct attack categories: human-written prompts, AutoDAN, Greedy Coordinate Gradient (GCG), and Tree-of-Attacks-with-pruning (TAP). Our comprehensive evaluation employs 1,200 carefully stratified prompts from the SALAD-Bench dataset, spanning six harm categories. Results demonstrate significant variations in model robustness, with Llama-2 achieving the highest overall security (3.4% average attack success rate) while Phi-2 exhibits the greatest vulnerability (7.0% average attack success rate). We identify critical transferability patterns where GCG and TAP attacks, though ineffective against their target model (Llama-2), achieve substantially higher success rates when transferred to other models (up to 17% for GPT-4). Statistical analysis using Friedman tests reveals significant differences in vulnerability across harm categories ($p < 0.001$), with malicious use prompts showing the highest attack success rates (10.71% average). Our findings contribute to understanding cross-model security vulnerabilities and provide actionable insights for developing targeted defense mechanisms

Paperid: 2825, https://arxiv.org/pdf/2510.15971.pdf

Abstract:
Malicious URLs pose significant security risks as they facilitate phishing attacks, distribute malware, and empower attackers to deface websites. Blacklist detection methods fail to identify new or obfuscated URLs because they depend on pre-existing patterns. This work presents a hybrid deep learning model named GNN-GAT-LSTM that combines Graph Neural Networks (GNNs) with Graph Attention Networks (GATs) and Long Short-Term Memory (LSTM) networks. The proposed architecture extracts both the structural and sequential patterns of the features from data. The model transforms URLs into graphs through a process where characters become nodes that connect through edges. It applies one-hot encoding to represent node features. The model received training and testing data from a collection of 651,191 URLs, which were classified into benign, phishing, defacement, and malware categories. The preprocessing stage included both feature engineering and data balancing techniques, which addressed the class imbalance issue to enhance model learning. The GNN-GAT-LSTM model achieved outstanding performance through its test accuracy of 0.9806 and its weighted F1-score of 0.9804. It showed excellent precision and recall performance across most classes, particularly for benign and defacement URLs. Overall, the model provides an efficient and scalable system for detecting malicious URLs while demonstrating strong potential for real-world cybersecurity applications.

Paperid: 2826, https://arxiv.org/pdf/2510.15948.pdf

Abstract:
Large Vision-Language Models (LVLMs) have achieved remarkable progress in multimodal perception and generation, yet their safety alignment remains a critical challenge.Existing defenses and vulnerable to multimodal jailbreaks, as visual inputs introduce new attack surfaces, reasoning chains lack safety supervision, and alignment often degrades under modality fusion.To overcome these limitation, we propose VisuoAlign, a framework for multi-modal safety alignment via prompt-guided tree search.VisuoAlign embeds safety constrains into the reasoning process through visual-textual interactive prompts, employs Monte Carlo Tree Search(MCTS) to systematically construct diverse safety-critical prompt trajectories, and introduces prompt-based scaling to ensure real-time risk detection and compliant responses.Extensive experiments demonstrate that VisuoAlign proactively exposes risks, enables comprehensive dataset generation, and significantly improves the robustness of LVLMs against complex cross-modal threats.

Paperid: 2827, https://arxiv.org/pdf/2510.14344.pdf

Abstract:
Mobile app markets host millions of apps, yet undesired behaviors (e.g., disruptive ads, illegal redirection, payment deception) remain hard to catch because they often do not rely on permission-protected APIs and can be easily camouflaged via UI or metadata edits. We present BINCTX, a learning approach that builds multi-modal representations of an app from (i) a global bytecode-as-image view that captures code-level semantics and family-style patterns, (ii) a contextual view (manifested actions, components, declared permissions, URL/IP constants) indicating how behaviors are triggered, and (iii) a third-party-library usage view summarizing invocation frequencies along inter-component call paths. The three views are embedded and fused to train a contextual-aware classifier. On real-world malware and benign apps, BINCTX attains a macro F1 of 94.73%, outperforming strong baselines by at least 14.92%. It remains robust under commercial obfuscation (F1 84% post-obfuscation) and is more resistant to adversarial samples than state-of-the-art bytecode-only systems.

Paperid: 2828, https://arxiv.org/pdf/2510.14113.pdf

Abstract:
Large language models (LLMs) are transforming everyday applications, yet deployment in cybersecurity lags due to a lack of high-quality, domain-specific models and training datasets. To address this gap, we present CyberPal 2.0, a family of cybersecurity-expert small language models (SLMs) ranging from 4B-20B parameters. To train CyberPal 2.0, we generate an enriched chain-of-thought cybersecurity instruction dataset built with our data enrichment and formatting pipeline, SecKnowledge 2.0, which integrates expert-in-the-loop steering of reasoning formats alongside LLM-driven multi-step grounding, yielding higher-fidelity, task-grounded reasoning traces for security tasks. Across diverse cybersecurity benchmarks, CyberPal 2.0 consistently outperforms its baselines and matches or surpasses various open and closed-source frontier models, while remaining a fraction of their size. On core cyber threat intelligence knowledge tasks, our models outperform almost all tested frontier models, ranking second only to Sec-Gemini v1. On core threat-investigation tasks, such as correlating vulnerabilities and bug tickets with weaknesses, our best 20B-parameter model outperforms GPT-4o, o1, o3-mini, and Sec-Gemini v1, ranking first, while our smallest 4B-parameter model ranks second.

Paperid: 2829, https://arxiv.org/pdf/2510.13318.pdf

Abstract:
The integration of blockchain technology into healthcare presents a paradigm shift for secure data management, enabling decentralized and tamper-proof storage and sharing of sensitive Electronic Health Records (EHRs). However, existing blockchain-based healthcare systems, while providing robust access control, commonly overlook the high latency in user-side re-computation of hashes for integrity verification of large multimedia data, impairing their practicality, especially in time-sensitive clinical scenarios. In this paper, we propose FAITH, an innovative scheme for \underline{F}ast \underline{A}uthenticated and \underline{I}nteroperable mul\underline{T}imedia \underline{H}ealthcare data storage and sharing over hybrid-storage blockchains. Rather than user-side hash re-computations, FAITH lets an off-chain storage provider generate verifiable proofs using recursive Zero-Knowledge Proofs (ZKPs), while the user only needs to perform lightweight verification. For flexible access authorization, we leverage Proxy Re-Encryption (PRE) and enable the provider to conduct ciphertext re-encryption, in which the re-encryption correctness can be verified via ZKPs against the malicious provider. All metadata and proofs are recorded on-chain for public verification. We provide a comprehensive analysis of FAITH's security regarding data privacy and integrity. We implemented a prototype of FAITH, and extensive experiments demonstrated its practicality for time-critical healthcare applications, dramatically reducing user-side verification latency by up to $98\%$, bringing it from $4$ s down to around $70$ ms for a $5$ GB encrypted file.

Paperid: 2830, https://arxiv.org/pdf/2510.13257.pdf

Abstract:
Rule-based network intrusion detection systems play a crucial role in the real-time detection of Web attacks. However, most existing works primarily focus on automatically generating detection rules for new attacks, often overlooking the relationships between new attacks and existing rules, which leads to significant redundancy within the ever-expanding ruleset. To address this issue, we propose GRIDAI, a novel end-to-end framework for the automated Generation and Repair of Intrusion Detection rules through collaboration among multiple LLM-based agents. Unlike traditional methods, GRIDAI first assesses the nature of incoming attack samples. If the sample represents a new attack type, it is used to generate a new rule. Otherwise, the sample is identified as a variant of an attack already covered by an existing rule and used to repair the rule by updating the corresponding signature, thereby enhancing its generalization capability. Additionally, to mitigate syntactic and semantic errors in rules caused by LLM hallucinations, we incorporate a tool-based real-time validation mechanism and a representative attack sample maintained for each rule, enabling fully automated rule generation and repair. Comprehensive experiments were conducted on a public dataset containing seven types of attacks and a private dataset with 43 attack types. The results demonstrate that GRIDAI accurately identifies the relationships between new attack samples and existing rules, efficiently generates and repairs rules to handle new attacks and variants, and effectively mitigates the impact of LLM hallucinations.

Paperid: 2831, https://arxiv.org/pdf/2510.13162.pdf

Abstract:
We report on two months of ethnographic fieldwork in a women's centre in Pattaya, and interviews with 76 participants. Our findings, as they relate to digital security, show how (i) women in Pattaya, often working in the sex and massage industries, perceived relationships with farang men as their best, and sometimes only, option to achieve security; (ii) the strategies used by the women to appeal to a farang involved presenting themselves online, mirroring how they were being advertised by bar owners to attract customers; (iii) appealing to what they considered `Western ideals', the women sought out `Western technologies' and appropriated them for their benefit; (iv) the women navigated a series of online security risks, such as scams and abuse, which shaped their search for a farang; (v) the women developed collective security through knowledge-sharing to protect themselves and each other in their search for a farang. We situate our work in emerging digital security scholarship within marginalised contexts.

Paperid: 2832, https://arxiv.org/pdf/2510.13102.pdf

Abstract:
Tools focused on cryptographic API misuse often detect the most basic expressions of the vulnerable use, and are unable to detect non-trivial variants. The question of whether tools should be designed to detect such variants can only be answered if we know how developers use and misuse cryptographic APIs in the wild, and in particular, what the unnatural usage of such APIs looks like. This paper presents the first large-scale study that characterizes unnatural crypto-API usage through a qualitative analysis of 5,704 representative API invocations. We develop an intuitive complexity metric to stratify 140,431 crypto-API invocations obtained from 20,508 Android applications, allowing us to sample 5,704 invocations that are representative of all strata, with each stratum consisting of invocations with similar complexity/naturalness. We qualitatively analyze the 5,704 sampled invocations using manual reverse engineering, through an in-depth investigation that involves the development of minimal examples and exploration of native code. Our study results in two detailed taxonomies of unnatural crypto-API misuse, along with 17 key findings that show the presence of highly unusual misuse, evasive code, and the inability of popular tools to reason about even mildly unconventional usage. Our findings lead to four key takeaways that inform future work focused on detecting unnatural crypto-API misuse.

Paperid: 2833, https://arxiv.org/pdf/2510.13058.pdf

Abstract:
Within the cyber-physical-social-climate nexus, all systems are deeply interdependent: cyber infrastructure facilitates communication, data processing, and automation across physical systems (such as power grids and networks), while social infrastructure provides the human capital and societal norms necessary for the system's functionality. Any disruption within any of these components, whether due to human error or system mismanagement, can propagate throughout the network, amplifying vulnerabilities and creating a significantly scaled impact. This chapter explores the critical role of human vulnerabilities within the cyber-physical-social-climate nexus, focusing on the interdependencies across cyber, physical, and social systems and how these interdependencies can scale in a climate context. While cyber and physical vulnerabilities are readily apparent, social vulnerabilities (such as misinformation, resistance to policy change, and lack of public awareness) often go unaddressed despite their profound impact on resilience and climate adaptation. Social infrastructure, including human capital, societal norms, and policy frameworks, shapes community responses and underpins adaptive capacity, yet it is also a significant point of failure when overlooked. This chapter examines how human cognitive biases, risk misperception, and decision-making silos within interconnected systems can lead to resource misallocation and weakened policy effectiveness. These factors are analyzed to demonstrate how inadequate responses across cyber-physical-social layers can cascade, amplifying climate-related risks. By addressing these human factors and aligning decision-making frameworks, we aim to strengthen resilience and foster cohesive adaptation strategies that account for the intricate interrelations of cyber-physical-social-climate systems.

Paperid: 2834, https://arxiv.org/pdf/2510.12821.pdf

Abstract:
This paper addresses one of the most noteworthy issues in the recent virtual asset market, the privacy concerns related to token transactions of Real-World Assets tokens, known as RWA tokens. Following the advent of Bitcoin, the virtual asset market has experienced explosive growth, spawning movements to link real-world assets with virtual assets. However, due to the transparency principle of blockchain technology, the anonymity of traders cannot be guaranteed. In the existing blockchain environment, there have been instances of protecting the privacy of fungible tokens (FTs) using mixer services. Moreover, numerous studies have been conducted to secure the privacy of non-fungible tokens (NFTs). However, due to the unique characteristics of RWA tokens and the limitations of each study, it has been challenging to achieve the goal of anonymity protection effectively. This paper proposes a new token trading platform, the ARTeX, designed to resolve these issues. This platform not only addresses the shortcomings of existing methods but also ensures the anonymity of traders while enhancing safeguards against illegal activities.

Paperid: 2835, https://arxiv.org/pdf/2510.12811.pdf

Abstract:
Malware proliferation is increasing at a tremendous rate, with hundreds of thousands of new samples identified daily. Manual investigation of such a vast amount of malware is an unrealistic, time-consuming, and overwhelming task. To cope with this volume, there is a clear need to develop specialized techniques and efficient tools for preliminary filtering that can group malware based on semantic similarity. In this paper, we propose TrapNet, a novel, scalable, and unsupervised framework for malware fingerprinting and grouping. TrapNet employs graph community detection techniques for malware fingerprinting and family attribution based on static analysis, as follows: (1) TrapNet detects packed binaries and unpacks them using known generic packer tools. (2) From each malware sample, it generates a digest that captures the underlying semantics. Since the digest must be dense, efficient, and suitable for similarity checking, we designed FloatHash (FH), a novel numerical fuzzy hashing technique that produces a short real-valued vector summarizing the underlying assembly items and their order. FH is based on applying Principal Component Analysis (PCA) to ordered assembly items (e.g., opcodes, function calls) extracted from the malware's assembly code. (3) Representing malware with short numerical vectors enables high-performance, large-scale similarity computation, which allows TrapNet to build a malware similarity network. (4) Finally, TrapNet employs state-of-the-art community detection algorithms to identify dense communities, which represent groups of malware with similar semantics. Our extensive evaluation of TrapNet demonstrates its effectiveness in terms of the coverage and purity of the detected communities, while also highlighting its runtime efficiency, which outperforms other state-of-the-art solutions.

Paperid: 2836, https://arxiv.org/pdf/2510.12172.pdf

Abstract:
Stream processing systems are important in modern applications in which data arrive continuously and need to be processed in real time. Because of their resource and scalability requirements, many of these systems run on the cloud, which is considered untrusted. Existing works on securing databases on the cloud focus on protecting the data, and most systems leverage trusted hardware for high performance. However, in stream processing systems, queries are as sensitive as the data because they contain the application logics. We demonstrate that it is practical to extract the queries from stream processing systems that use Intel SGX for securing the execution engine. The attack performed by a malicious cloud provider is based on timing side channels, and it works in two phases. In the offline phase, the attacker profiles the execution time of individual stream operators, based on synthetic data. This phase outputs a model that identifies individual stream operators. In the online phase, the attacker isolates the operators that make up the query, monitors its execution, and recovers the operators using the model in the previous phase. We implement the attack based on popular data stream benchmarks using SecureStream and NEXMark, and demonstrate attack success rates of up to 92%. We further discuss approaches that can harden streaming processing systems against our attacks without incurring high overhead.

Paperid: 2837, https://arxiv.org/pdf/2510.12045.pdf

Abstract:
An important function of collaborative network intrusion detection is to analyze the network logs of the collaborators for joint IP addresses. However, sharing IP addresses in plain is sensitive and may be even subject to privacy legislation as it is personally identifiable information. In this paper, we present the privacy-preserving collection of IP addresses. We propose a single collector, over-threshold private set intersection protocol. In this protocol $N$ participants identify the IP addresses that appear in at least $t$ participant's sets without revealing any information about other IP addresses. Using a novel hashing scheme, we reduce the computational complexity of the previous state-of-the-art solution from $O(M(N \log{M}/t)^{2t})$ to $O(t^2M\binom{N}{t})$, where $M$ denotes the dataset size. This reduction makes it practically feasible to apply our protocol to real network logs. We test our protocol using joint networks logs of multiple institutions. Additionally, we present two deployment options: a collusion-safe deployment, which provides stronger security guarantees at the cost of increased communication overhead, and a non-interactive deployment, which assumes a non-colluding collector but offers significantly lower communication costs and applicable to many use cases of collaborative network intrusion detection similar to ours.

Paperid: 2838, https://arxiv.org/pdf/2510.11915.pdf

Abstract:
Phishing remains a critical cybersecurity threat, especially with the advent of large language models (LLMs) capable of generating highly convincing malicious content. Unlike earlier phishing attempts which are identifiable by grammatical errors, misspellings, incorrect phrasing, and inconsistent formatting, LLM generated emails are grammatically sound, contextually relevant, and linguistically natural. These advancements make phishing emails increasingly difficult to distinguish from legitimate ones, challenging traditional detection mechanisms. Conventional phishing detection systems often fail when faced with emails crafted by LLMs or manipulated using adversarial perturbation techniques. To address this challenge, we propose a robust phishing email detection system featuring an enhanced text preprocessing pipeline. This pipeline includes spelling correction and word splitting to counteract adversarial modifications and improve detection accuracy. Our approach integrates widely adopted natural language processing (NLP) feature extraction techniques and machine learning algorithms. We evaluate our models on publicly available datasets comprising both phishing and legitimate emails, achieving a detection accuracy of 94.26% and F1-score of 84.39% in model deployment setting. To assess robustness, we further evaluate our models using adversarial phishing samples generated by four attack methods in Python TextAttack framework. Additionally, we evaluate models' performance against phishing emails generated by LLMs including ChatGPT and Llama. Results highlight the resilience of models against evolving AI-powered phishing threats.

Paperid: 2839, https://arxiv.org/pdf/2510.11898.pdf

Abstract:
Wi-Fi networks are ubiquitous in both home and enterprise environments, serving as a primary medium for Internet access and forming the backbone of modern IoT ecosystems. However, their inherent vulnerabilities, combined with widespread adoption, create opportunities for malicious actors to gain unauthorized access or compromise sensitive data stored on connected devices. To address these challenges, we propose a deep learning based network intrusion detection system (NIDS) for Wi-Fi environments. Building on our previous work, we convert network traffic into two-dimensional data representations and use them to train DL models based on convolutional neural network (CNN) architectures. We implement five distinct techniques for generating the two-dimensional representations, and to ensure low detection latency, we adopt lightweight CNN architectures in our NIDS. The models are trained using the AWID3 dataset, a publicly available benchmark for Wi-Fi NIDS research, and are evaluated for both binary and multi-class classification tasks. Experimental results demonstrate that the proposed approach achieves competitive detection performance with low inference time, making it suitable for real-world Wi-Fi deployment scenarios.

Paperid: 2840, https://arxiv.org/pdf/2510.11823.pdf

Abstract:
AI models are being increasingly integrated into real-world systems, raising significant concerns about their safety and security. Consequently, AI red teaming has become essential for organizations to proactively identify and address vulnerabilities before they can be exploited by adversaries. While numerous AI red teaming tools currently exist, practitioners face challenges in selecting the most appropriate tools from a rapidly expanding landscape, as well as managing complex and frequently conflicting software dependencies across isolated projects. Given these challenges and the relatively small number of organizations with dedicated AI red teams, there is a strong need to lower barriers to entry and establish a standardized environment that simplifies the setup and execution of comprehensive AI model assessments. Inspired by Kali Linux's role in traditional penetration testing, we introduce BlackIce, an open-source containerized toolkit designed for red teaming Large Language Models (LLMs) and classical machine learning (ML) models. BlackIce provides a reproducible, version-pinned Docker image that bundles 14 carefully selected open-source tools for Responsible AI and Security testing, all accessible via a unified command-line interface. With this setup, initiating red team assessments is as straightforward as launching a container, either locally or using a cloud platform. Additionally, the image's modular architecture facilitates community-driven extensions, allowing users to easily adapt or expand the toolkit as new threats emerge. In this paper, we describe the architecture of the container image, the process used for selecting tools, and the types of evaluations they support.

Paperid: 2841, https://arxiv.org/pdf/2508.20578.pdf

Abstract:
In Massively Multiplayer Online Role-Playing Games (MMORPGs), auto-leveling bots exploit automated programs to level up characters at scale, undermining gameplay balance and fairness. Detecting such bots is challenging, not only because they mimic human behavior, but also because punitive actions require explainable justification to avoid legal and user experience issues. In this paper, we present a novel framework for detecting auto-leveling bots by leveraging contrastive representation learning and clustering techniques in a fully unsupervised manner to identify groups of characters with similar level-up patterns. To ensure reliable decisions, we incorporate a Large Language Model (LLM) as an auxiliary reviewer to validate the clustered groups, effectively mimicking a secondary human judgment. We also introduce a growth curve-based visualization to assist both the LLM and human moderators in assessing leveling behavior. This collaborative approach improves the efficiency of bot detection workflows while maintaining explainability, thereby supporting scalable and accountable bot regulation in MMORPGs.

Paperid: 2842, https://arxiv.org/pdf/2508.20424.pdf

Abstract:
Diffusion models are a powerful class of generative models that produce content, such as images, from user prompts, but they are computationally intensive. To mitigate this cost, recent academic and industry work has adopted approximate caching, which reuses intermediate states from similar prompts in a cache. While efficient, this optimization introduces new security risks by breaking isolation among users. This work aims to comprehensively assess new security vulnerabilities arising from approximate caching. First, we demonstrate a remote covert channel established with the cache, where a sender injects prompts with special keywords into the cache and a receiver can recover that even after days, to exchange information. Second, we introduce a prompt stealing attack using the cache, where an attacker can recover existing cached prompts based on cache hit prompts. Finally, we introduce a poisoning attack that embeds the attacker's logos into the previously stolen prompt, to render them in future user prompts that hit the cache. These attacks are all performed remotely through the serving system, which indicates severe security vulnerabilities in approximate caching.

Paperid: 2843, https://arxiv.org/pdf/2508.19381.pdf

Abstract:
Classical machine learning (CML) has been extensively studied for malware classification. With the emergence of quantum computing, quantum machine learning (QML) presents a paradigm-shifting opportunity to improve malware detection, though its application in this domain remains largely unexplored. In this study, we investigate two hybrid quantum-classical models -- a Quantum Multilayer Perceptron (QMLP) and a Quantum Convolutional Neural Network (QCNN), for malware classification. Both models utilize angle embedding to encode malware features into quantum states. QMLP captures complex patterns through full qubit measurement and data re-uploading, while QCNN achieves faster training via quantum convolution and pooling layers that reduce active qubits. We evaluate both models on five widely used malware datasets -- API-Graph, EMBER-Domain, EMBER-Class, AZ-Domain, and AZ-Class, across binary and multiclass classification tasks. Our results show high accuracy for binary classification -- 95-96% on API-Graph, 91-92% on AZ-Domain, and 77% on EMBER-Domain. In multiclass settings, accuracy ranges from 91.6-95.7% on API-Graph, 41.7-93.6% on AZ-Class, and 60.7-88.1% on EMBER-Class. Overall, QMLP outperforms QCNN in complex multiclass tasks, while QCNN offers improved training efficiency at the cost of reduced accuracy.

Paperid: 2844, https://arxiv.org/pdf/2508.19286.pdf

Abstract:
The performance of modern machine learning systems depends on access to large, high-quality datasets, often sourced from user-generated content or proprietary, domain-specific corpora. However, these rich datasets inherently contain sensitive personal information, raising significant concerns about privacy, data security, and compliance with regulatory frameworks. While conventional anonymization techniques can remove explicit identifiers, such removal may result in performance drop in downstream machine learning tasks. More importantly, simple anonymization may not be effective against inference attacks that exploit implicit signals such as writing style, topical focus, or demographic cues, highlighting the need for more robust privacy safeguards during model training. To address the challenging issue of balancing user privacy and data utility, we propose a reinforcement learning framework that fine-tunes a large language model (LLM) using a composite reward function that jointly optimizes for explicit and implicit privacy, semantic fidelity, and output diversity. To effectively capture population level regularities, the privacy reward combines semantic cues with structural patterns derived from a minimum spanning tree (MST) over latent representations. By modeling these privacy-sensitive signals in their distributional context, the proposed approach guides the model to generate synthetic rewrites that preserve utility while mitigating privacy risks. Empirical results show that the proposed method significantly enhances author obfuscation and privacy metrics without degrading semantic quality, providing a scalable and model-agnostic solution for privacy preserving data generation in the era of large language models.

Paperid: 2845, https://arxiv.org/pdf/2508.14815.pdf

Abstract:
The integration of information and communication technology (ICT) with traditional power grids has led to the emergence of smart grids. Advanced metering infrastructure (AMI) plays a crucial role in smart grids by facilitating two-way communication between smart meters and the utility provider. This bidirectional communication allows intelligent meters to report fine-grained consumption data at predefined intervals, enabling accurate billing, efficient grid monitoring and management, and rapid outage detection. However, the collection of detailed consumption data can inadvertently disclose consumers' daily activities, raising privacy concerns and potentially leading to privacy violations. To address these issues and preserve individuals' privacy, we propose a lightweight privacy-preserving smart metering protocol specifically designed to support real-time tariff billing service with dynamic policy adjustment. Our scheme employs an efficient data perturbation technique to obscure precise energy usage data from internal adversaries, including the intermediary gateways and the utility provider. Subsequently, we validate the efficiency and security of our protocol through comprehensive performance and privacy evaluations. We examined the computational, memory, and communication overhead of the proposed scheme. The execution time of our secure and privacy-aware billing system is approximately 3.94540 seconds for a complete year. Furthermore, we employed the Jensen-Shannon divergence as a privacy metric to demonstrate that our protocol can effectively safeguard users' privacy by increasing the noise scale.

Paperid: 2846, https://arxiv.org/pdf/2508.14744.pdf

Abstract:
Modern grids have adopted advanced metering infrastructure (AMI) to facilitate bidirectional communication between smart meters and control centers. This enables smart meters to report consumption values at predefined intervals to utility providers for purposes including demand balancing, load forecasting, dynamic billing, and operational efficiency. Compared to traditional power grids, smart grids offer advantages such as enhanced reliability, improved energy efficiency, and increased security. However, utility providers can compromise user privacy by analyzing fine-grained readings and extracting individuals' daily activities from this time-series data. To address this concern, we propose a collusion-resistant, privacy-preserving aggregation protocol for smart metering in operational services. Our protocol ensures privacy by leveraging techniques such as partially additive homomorphic encryption, aggregation, data perturbation, and data minimization. The scheme aggregates perturbed readings using the additive homomorphic property of the Paillier cryptosystem to provide results for multiple operational purposes. We evaluate the protocol in terms of both performance and privacy. Computational, memory, and communication overhead were examined. The total execution time with 1024-bit key size is about 2.21 seconds. We also evaluated privacy through the normalized conditional entropy (NCE) metric. Higher NCE values, closer to 1, indicate stronger privacy. By increasing noise scale, the NCE value rises, showing perturbed values retain minimal information about the original, thereby reducing risks. Overall, evaluation demonstrates the protocol's efficiency while employing various privacy-preserving techniques.

Paperid: 2847, https://arxiv.org/pdf/2508.14703.pdf

Abstract:
The emergence of smart grids and advanced metering infrastructure (AMI) has revolutionized energy management. Unlike traditional power grids, smart grids benefit from two-way communication through AMI, which surpasses earlier automated meter reading (AMR). AMI enables diverse demand- and supply-side utilities such as accurate billing, outage detection, real-time grid control, load forecasting, and value-added services. Smart meters play a key role by delivering consumption values at predefined intervals to the utility provider (UP). However, such reports may raise privacy concerns, as adversaries can infer lifestyle patterns, political orientations, and the types of electrical devices in a household, or even sell the data to third parties (TP) such as insurers. In this paper, we propose a lightweight, privacy-preserving smart metering protocol for incentive-based value-added services. The scheme employs local differential privacy, hash chains, blind digital signatures, pseudonyms, temporal aggregation, and anonymous overlay networks to report coarse-grained values with adjustable granularity to the UP. This protects consumers' privacy while preserving data utility. The scheme prevents identity disclosure while enabling automatic token redemption. From a performance perspective, our results show that with a 1024-bit RSA key, a 7-day duration, and four reports per day, our protocol runs in approximately 0.51s and consumes about 4.5 MB of memory. From a privacy perspective, the protocol resists semi-trusted and untrusted adversaries.

Paperid: 2848, https://arxiv.org/pdf/2508.14568.pdf

Abstract:
This paper presents a novel approach to calculating the Levenshtein (edit) distance within the framework of Fully Homomorphic Encryption (FHE), specifically targeting third-generation schemes like TFHE. Edit distance computations are essential in applications across finance and genomics, such as DNA sequence alignment. We introduce an optimised algorithm that significantly reduces the cost of edit distance calculations called Leuvenshtein. This algorithm specifically reduces the number of programmable bootstraps (PBS) needed per cell of the calculation, lowering it from approximately 94 operations -- required by the conventional Wagner-Fisher algorithm -- to just 1. Additionally, we propose an efficient method for performing equality checks on characters, reducing ASCII character comparisons to only 2 PBS operations. Finally, we explore the potential for further performance improvements by utilising preprocessing when one of the input strings is unencrypted. Our Leuvenshtein achieves up to $278\times$ faster performance compared to the best available TFHE implementation and up to $39\times$ faster than an optimised implementation of the Wagner-Fisher algorithm. Moreover, when offline preprocessing is possible due to the presence of one unencrypted input on the server side, an additional $3\times$ speedup can be achieved.

Paperid: 2849, https://arxiv.org/pdf/2508.14526.pdf

Abstract:
While the digitization of industrial factories provides tremendous improvements for the production of goods, it also renders such systems vulnerable to serious cyber-attacks. To research, test, and validate security measures protecting industrial networks against such cyber-attacks, the security community relies on testbeds to simulate industrial systems, as utilizing live systems endangers costly components or even human life. However, existing testbeds focus on individual parts of typically complex production lines in industrial factories. Consequently, the impact of cyber-attacks on industrial networks as well as the effectiveness of countermeasures cannot be evaluated in an end-to-end manner. To address this issue and facilitate research on novel security mechanisms, we present CoFacS, the first COmplete FACtory Simulation that replicates an entire production line and affords the integration of real-life industrial applications. To showcase that CoFacS accurately captures real-world behavior, we validate it against a physical model factory widely used in security research. We show that CoFacS has a maximum deviation of 0.11% to the physical reference, which enables us to study the impact of physical attacks or network-based cyber-attacks. Moreover, we highlight how CoFacS enables security research through two cases studies surrounding attack detection and the resilience of 5G-based industrial communication against jamming.

Paperid: 2850, https://arxiv.org/pdf/2508.14385.pdf

Abstract:
Effective responses to cyberattacks require fast decisions, even when information about the attack is incomplete or inaccurate. However, most decision-support frameworks for incident response rely on a detailed system model that describes the incident, which restricts their practical utility. In this paper, we address this limitation and present an online method for incident response planning under model misspecification, which we call MOBAL: Misspecified Online Bayesian Learning. MOBAL iteratively refines a conjecture about the model through Bayesian learning as new information becomes available, which facilitates model adaptation as the incident unfolds. To determine effective responses online, we quantize the conjectured model into a finite Markov model, which enables efficient response planning through dynamic programming. We prove that Bayesian learning is asymptotically consistent with respect to the information feedback. Additionally, we establish bounds on misspecification and quantization errors. Experiments on the CAGE-2 benchmark show that MOBAL outperforms the state of the art in terms of adaptability and robustness to model misspecification.

Paperid: 2851, https://arxiv.org/pdf/2508.13030.pdf

Abstract:
Cyberattacks are increasing, and securing against such threats is costing industries billions of dollars annually. Threat Modeling, that is, comprehending the consequences of these attacks, can provide critical support to cybersecurity professionals, enabling them to take timely action and allocate resources that could be used elsewhere. Cybersecurity is heavily dependent on threat modeling, as it assists security experts in assessing and mitigating risks related to identifying vulnerabilities and threats. Recently, there has been a pressing need for automated methods to assess attack descriptions and forecast the future consequences of the increasing complexity of cyberattacks. This study examines how Natural Language Processing (NLP) and deep learning can be applied to analyze the potential impact of cyberattacks by leveraging textual descriptions from the MITRE Common Weakness Enumeration (CWE) database. We emphasize classifying attack consequences into five principal categories: Availability, Access Control, Confidentiality, Integrity, and Other. This paper investigates the use of Bidirectional Encoder Representations from Transformers (BERT) in combination with Hierarchical Attention Networks (HANs) for Multi-label classification, evaluating their performance in comparison with conventional CNN and LSTM-based models. Experimental findings show that BERT achieves an overall accuracy of $0.972$, far higher than conventional deep learning models in multi-label classification. HAN outperforms baseline forms of CNN and LSTM-based models on specific cybersecurity labels. However, BERT consistently achieves better precision and recall, making it more suitable for predicting the consequences of a cyberattack.

Paperid: 2852, https://arxiv.org/pdf/2508.12730.pdf

Abstract:
Machine Unlearning (MU) aims to remove target training data from a trained model so that the removed data no longer influences the model's behavior, fulfilling "right to be forgotten" obligations under data privacy laws. Yet, we observe that researchers in this rapidly emerging field face challenges in analyzing and understanding the behavior of different MU methods, especially in terms of three fundamental principles in MU: accuracy, efficiency, and privacy. Consequently, they often rely on aggregate metrics and ad-hoc evaluations, making it difficult to accurately assess the trade-offs between methods. To fill this gap, we introduce a visual analytics system, Unlearning Comparator, designed to facilitate the systematic evaluation of MU methods. Our system supports two important tasks in the evaluation process: model comparison and attack simulation. First, it allows the user to compare the behaviors of two models, such as a model generated by a certain method and a retrained baseline, at class-, instance-, and layer-levels to better understand the changes made after unlearning. Second, our system simulates membership inference attacks (MIAs) to evaluate the privacy of a method, where an attacker attempts to determine whether specific data samples were part of the original training set. We evaluate our system through a case study visually analyzing prominent MU methods and demonstrate that it helps the user not only understand model behaviors but also gain insights that can inform the improvement of MU methods.

Paperid: 2853, https://arxiv.org/pdf/2508.12107.pdf

Abstract:
Blockchain address poisoning is an emerging phishing attack that crafts "similar-looking" transfer records in the victim's transaction history, which aims to deceive victims and lure them into mistakenly transferring funds to the attacker. Recent works have shown that millions of Ethereum users were targeted and lost over 100 million US dollars. Ethereum crypto wallets, serving users in browsing transaction history and initiating transactions to transfer funds, play a central role in deploying countermeasures to mitigate the address poisoning attack. However, whether they have done so remains an open question. To fill the research void, in this paper, we design experiments to simulate address poisoning attacks and systematically evaluate the usability and security of 53 popular Ethereum crypto wallets. Our evaluation shows that there exist communication failures between 12 wallets and their transaction activity provider, which renders them unable to download the users' transaction history. Besides, our evaluation also shows that 16 wallets pose a high risk to their users due to displaying fake token phishing transfers. Moreover, our further analysis suggests that most wallets rely on transaction activity providers to filter out phishing transfers. However, their phishing detection capability varies. Finally, we found that only three wallets throw an explicit warning message when users attempt to transfer to the phishing address, implying a significant gap within the broader Ethereum crypto wallet community in protecting users from address poisoning attacks. Overall, our work shows that more efforts are needed by the Ethereum crypto wallet developer community to achieve the highest usability and security standard. Our bug reports have been acknowledged by the developer community, who are currently developing mitigation solutions.

Paperid: 2854, https://arxiv.org/pdf/2508.10185.pdf

Abstract:
Digital identities are increasingly important for mediating not only digital but also physical service transactions. Managing such identities through centralized providers can cause both availability and privacy concerns: single points of failure and control are ideal targets for global attacks on technical, organizational, or legal fronts. We design, analyze, and build a distributed digital identity architecture for physical world transactions in common scenarios like unlocking doors, public transport, or crossing country borders. This architecture combines (biometric and other) sensors, (established and upcoming) identity authorities, attribute verifiers, and a new core component we call the \emph{Personal Identity Agent (PIA)} that represents individuals with their identity attributes in the digital domain. All transactions are conducted in a completely decentralized manner, and the components for which we currently assume central coordination are optional and only used for assisting with service discovery and latency reduction. We present a first protocol between these parties and formally verify that it achieves relevant security properties based on a realistic threat model including strong global adversaries. A proof-of-concept implementation demonstrates practical feasibility of both architecture and initial protocol for applications that can tolerate end-to-end latencies in the range of a few seconds.

Paperid: 2855, https://arxiv.org/pdf/2508.10043.pdf

Abstract:
When combining Large Language Models (LLMs) with autonomous agents, used in network monitoring and decision-making systems, this will create serious security issues. In this research, the MAESTRO framework consisting of the seven layers threat modeling architecture in the system was used to expose, evaluate, and eliminate vulnerabilities of agentic AI. The prototype agent system was constructed and implemented, using Python, LangChain, and telemetry in WebSockets, and deployed with inference, memory, parameter tuning, and anomaly detection modules. Two practical threat cases were confirmed as follows: (i) resource denial of service by traffic replay denial-of-service, and (ii) memory poisoning by tampering with the historical log file maintained by the agent. These situations resulted in measurable levels of performance degradation, i.e. telemetry updates were delayed, and computational loads were increased, as a result of poor system adaptations. It was suggested to use a multilayered defense-in-depth approach with memory isolation, validation of planners and anomaly response systems in real-time. These findings verify that MAESTRO is viable in operational threat mapping, prospective risk scoring, and the basis of the resilient system design. The authors bring attention to the importance of the enforcement of memory integrity, paying attention to the adaptation logic monitoring, and cross-layer communication protection that guarantee the agentic AI reliability in adversarial settings.

Paperid: 2856, https://arxiv.org/pdf/2508.06106.pdf

Abstract:
Modeling and simulation are widely used in cybersecurity research to assess cyber threats, evaluate defense mechanisms, and analyze vulnerabilities. However, the diversity of application areas, the variety of cyberattacks scenarios, and the differing objectives of these simulations makes it difficult to identify methodological trends. Existing reviews often focus on specific modeling techniques or application domains, making it challenging to analyze the field as a whole. To address these limitations, we present a comprehensive review of the current state of the art, classifying the selected papers based on four dimensions: the application domain, the types of cyber threats represented, the simulation techniques employed, and the primary goals of the simulation. The review discusses the strengths and limitations of different approaches, identifies which cyber threats are the most suited for simulation-based investigations, and analyzes which modeling paradigms are most appropriate for specific cybersecurity challenges.

Paperid: 2857, https://arxiv.org/pdf/2508.05658.pdf

Abstract:
Various (text) prompt filters and (image) safety checkers have been implemented to mitigate the misuse of Text-to-Image (T2I) models in creating Not-Safe-For-Work (NSFW) content. In order to expose potential security vulnerabilities of such safeguards, multimodal jailbreaks have been studied. However, existing jailbreaks are limited to prompt-specific and image-specific perturbations, which suffer from poor scalability and time-consuming optimization. To address these limitations, we propose Universally Unfiltered and Unseen (U3)-Attack, a multimodal jailbreak attack method against T2I safeguards. Specifically, U3-Attack optimizes an adversarial patch on the image background to universally bypass safety checkers and optimizes a safe paraphrase set from a sensitive word to universally bypass prompt filters while eliminating redundant computations. Extensive experimental results demonstrate the superiority of our U3-Attack on both open-source and commercial T2I models. For example, on the commercial Runway-inpainting model with both prompt filter and safety checker, our U3-Attack achieves $~4\times$ higher success rates than the state-of-the-art multimodal jailbreak attack, MMA-Diffusion.

Paperid: 2858, https://arxiv.org/pdf/2508.04644.pdf

Abstract:
The only known example of an almost perfect nonlinear (APN) permutation in even dimension was obtained by applying CCZ-equivalence to a specific quadratic APN function. Motivated by this result, there have been numerous recent attempts to construct new quadratic APN functions. Currently, 32,892 quadratic APN functions in dimension 8 are known and two recent conjectures address their possible total number. The first, proposed by Y. Yu and L. Perrin (Cryptogr. Commun. 14(6): 1359-1369, 2022), suggests that there are more than 50,000 such functions. The second, by A. Polujan and A. Pott (Proc. 7th Int. Workshop on Boolean Functions and Their Applications, 2022), argues that their number exceeds that of inequivalent quadratic (8,4)-bent functions, which is 92,515. We computationally construct 3,775,599 inequivalent quadratic APN functions in dimension 8 and estimate the total number to be about 6 million.

Paperid: 2859, https://arxiv.org/pdf/2508.04265.pdf

Abstract:
Federated Learning (FL) enables collaborative model training on decentralized data but remains vulnerable to gradient leakage attacks that can reconstruct sensitive user information. Existing defense mechanisms, such as differential privacy (DP) and homomorphic encryption (HE), often introduce a trade-off between privacy, model utility, and system overhead, a challenge that is exacerbated in heterogeneous environments with non-IID data and varying client capabilities. To address these limitations, we propose SelectiveShield, a lightweight hybrid defense framework that adaptively integrates selective homomorphic encryption and differential privacy. SelectiveShield leverages Fisher information to quantify parameter sensitivity, allowing clients to identify critical parameters locally. Through a collaborative negotiation protocol, clients agree on a shared set of the most sensitive parameters for protection via homomorphic encryption. Parameters that are uniquely important to individual clients are retained locally, fostering personalization, while non-critical parameters are protected with adaptive differential privacy noise. Extensive experiments demonstrate that SelectiveShield maintains strong model utility while significantly mitigating gradient leakage risks, offering a practical and scalable defense mechanism for real-world federated learning deployments.

Paperid: 2860, https://arxiv.org/pdf/2508.04100.pdf

Abstract:
Homomorphic Encryption (HE) prevails in securing Federated Learning (FL), but suffers from high overhead and adaptation cost. Selective HE methods, which partially encrypt model parameters by a global mask, are expected to protect privacy with reduced overhead and easy adaptation. However, in cross-device scenarios with heterogeneous data and system capabilities, traditional Selective HE methods deteriorate client straggling, and suffer from degraded HE overhead reduction performance. Accordingly, we propose SenseCrypt, a Sensitivity-guided selective Homomorphic EnCryption framework, to adaptively balance security and HE overhead per cross-device FL client. Given the observation that model parameter sensitivity is effective for measuring clients' data distribution similarity, we first design a privacy-preserving method to respectively cluster the clients with similar data distributions. Then, we develop a scoring mechanism to deduce the straggler-free ratio of model parameters that can be encrypted by each client per cluster. Finally, for each client, we formulate and solve a multi-objective model parameter selection optimization problem, which minimizes HE overhead while maximizing model security without causing straggling. Experiments demonstrate that SenseCrypt ensures security against the state-of-the-art inversion attacks, while achieving normal model accuracy as on IID data, and reducing training time by 58.4%-88.7% as compared to traditional HE methods.

Paperid: 2861, https://arxiv.org/pdf/2508.03981.pdf

Abstract:
With the popularity of smart terminals, such as the Internet of Things, crowdsensing is an emerging data aggregation paradigm, which plays a pivotal role in data-driven applications. There are some key issues in the development of crowdsensing such as platform security and privacy protection. As the crowdsensing is usually managed by a centralized platform, centralized management will bring various security vulnerabilities and scalability issues. To solve these issues, an effective reputation-based partition scheme (RSPC) is proposed in this article. The partition scheme calculates the optimal partition size by combining the node reputation value and divides the node into several disjoint partitions according to the node reputation value. By selecting the appropriate partition size, RSPC provides a mechanism to ensure that each partition is valid, as long as themaximum permissible threshold for the failed node is observed. At the same time, the RSPC reorganizes the network periodically to avoid partition attacks. In addition, for cross-partition transactions, this paper innovatively proposes a four-stage confirmation protocol to ensure the efficient and safe completion of cross-partition transactions. Finally, experiments show that RSPC improves scalability, low latency, and high throughput for crowdsensing.

Paperid: 2862, https://arxiv.org/pdf/2508.03365.pdf

Abstract:
As large language models become increasingly integrated into daily life, audio has emerged as a key interface for human-AI interaction. However, this convenience also introduces new vulnerabilities, making audio a potential attack surface for adversaries. Our research introduces WhisperInject, a two-stage adversarial audio attack framework that can manipulate state-of-the-art audio language models to generate harmful content. Our method uses imperceptible perturbations in audio inputs that remain benign to human listeners. The first stage uses a novel reward-based optimization method, Reinforcement Learning with Projected Gradient Descent (RL-PGD), to guide the target model to circumvent its own safety protocols and generate harmful native responses. This native harmful response then serves as the target for Stage 2, Payload Injection, where we use Projected Gradient Descent (PGD) to optimize subtle perturbations that are embedded into benign audio carriers, such as weather queries or greeting messages. Validated under the rigorous StrongREJECT, LlamaGuard, as well as Human Evaluation safety evaluation framework, our experiments demonstrate a success rate exceeding 86% across Qwen2.5-Omni-3B, Qwen2.5-Omni-7B, and Phi-4-Multimodal. Our work demonstrates a new class of practical, audio-native threats, moving beyond theoretical exploits to reveal a feasible and covert method for manipulating AI behavior.

Paperid: 2863, https://arxiv.org/pdf/2508.02943.pdf

Abstract:
Homomorphic encryption (HE) enables secure computation on encrypted data, safeguarding user privacy in domains such as cloud computing, healthcare, and finance. Among fully homomorphic encryption (FHE) schemes, CKKS is notable for supporting approximate arithmetic over complex numbers, a key requirement for machine-learning and numerical workloads. However, CKKS incurs rapid noise growth, complex parameter tuning, and relies on costly modulus switching. We propose a binary variant of CKKS that operates entirely over binary-coefficient polynomial rings and replaces rescaling with a lightweight bootstrapping mechanism. To mitigate additional bit-flip errors introduced by binary encoding, we integrate BCH error-correcting codes for robust decryption. Our open-source implementation, built on the HElib library, preserves the core algebraic structure of CKKS while introducing binary-coefficient encoding, enabling efficient evaluation in small ring dimensions and unbounded-depth computation. Empirical evaluations demonstrate the framework's practicality and scalability across a range of settings.

Paperid: 2864, https://arxiv.org/pdf/2508.02836.pdf

Abstract:
Privacy-preserving machine learning (PPML) is critical to ensure data privacy in AI. Over the past few years, the community has proposed a wide range of provably secure PPML schemes that rely on various cryptography primitives. However, when it comes to large language models (LLMs) with billions of parameters, the efficiency of PPML is everything but acceptable. For instance, the state-of-the-art solution for confidential LLM inference represents at least 10,000-fold slower performance compared to plaintext inference. The performance gap is even larger when the context length increases. In this position paper, we propose a novel framework named Agentic-PPML to make PPML in LLMs practical. Our key insight is to employ a general-purpose LLM for intent understanding and delegate cryptographically secure inference to specialized models trained on vertical domains. By modularly separating language intent parsing - which typically involves little or no sensitive information - from privacy-critical computation, Agentic-PPML completely eliminates the need for the LLMs to process the encrypted prompts, enabling practical deployment of privacy-preserving LLM-centric services.

Paperid: 2865, https://arxiv.org/pdf/2508.02816.pdf

Abstract:
Side-channel attacks are important security challenges as they reveal sensitive information about on-chip activities. Among such attacks, the thermal side-channel has been shown to disclose the activities of key functional blocks and even encryption keys. This paper proposes a novel approach to proactively conceal critical activities in the functional layers while minimizing the power dissipation by (i) leveraging inherent characteristics of 3D integration to protect from side-channel attacks and (ii) dynamically generating custom activity patterns to match the activity to be concealed in the functional layers. Experimental analysis shows that 3D technology combined with the proposed run-time algorithm effectively reduces the Side channel vulnerability Factor (SVF) below 0.05 and the Spatial Thermal Side-channel Factor (STSF) below 0.59.

Paperid: 2866, https://arxiv.org/pdf/2508.01856.pdf

Abstract:
Blockchain technology has advanced rapidly in recent years and is now widely used in a variety of fields. Blockchain appears to be one of the best solutions for managing massive heterogeneous devices while achieving advanced data security and data reputation, particularly in the field of large-scale IoT (Internet of Things) networks. Despite the numerous advantages, there are still challenges while deploying IoT applications on blockchain systems due to the limited storage, power, and computing capability of IoT devices, and some of these problems are caused by the consensus algorithm, which plays a significant role in blockchain systems by ensuring overall system reliability and robustness. Nonetheless, most existing consensus algorithms are prone to poor node reliability, low transaction per second (TPS) rates, and scalability issues. Aiming at some critical problems in the existing consensus algorithms, this paper proposes the Efficient Byzantine Reputation-based Consensus (EBRC) mechanism to resolve the issues raised above. In comparison to traditional algorithms, we reinvented ways to evaluate node reliability and robustness and manage active nodes. Our experiments show that the EBRC algorithm has lower consensus delay, higher throughput, improved security, and lower verification costs. It offers new reference ideas for solving the Internet of Things+blockchain+Internet court construction problem.

Paperid: 2867, https://arxiv.org/pdf/2508.01798.pdf

Abstract:
As vehicles become increasingly connected and autonomous, they accumulate and manage various personal data, thereby presenting a key challenge in preserving privacy during data sharing and processing. This survey reviews applications of Secure Multi-Party Computation (MPC) and Homomorphic Encryption (HE) that address these privacy concerns in the automotive domain. First, we identify the scope of privacy-sensitive use cases for these technologies, by surveying existing works that address privacy issues in different automotive contexts, such as location-based services, mobility infrastructures, traffic management, etc. Then, we review recent works that employ MPC and HE as solutions for these use cases in detail. Our survey highlights the applicability of these privacy-preserving technologies in the automotive context, while also identifying challenges and gaps in the current research landscape. This work aims to provide a clear and comprehensive overview of this emerging field and to encourage further research in this domain.

Paperid: 2868, https://arxiv.org/pdf/2508.01636.pdf

Abstract:
With the increasing deployment of generative machine learning models in privacy-sensitive domains such as healthcare and personalized services, ensuring secure inference has become a critical challenge. Secure multi-party computation (MPC) enables privacy-preserving model inference but suffers from high communication and computation overhead. The main bottleneck lies in the expensive secure evaluation of floating-point operations. Quantization offers a promising solution by converting floating-point operations into lower-precision integer computations, significantly reducing overhead. However, existing MPC-based quantized inference methods either rely on public quantization parameters-posing privacy risks-or suffer from inefficiencies, particularly in handling nonlinear functions such as activations and softmax. In this work, we propose a fine-grained, layer-wise quantization scheme and support 1-bit weight fully connected layers in a secure setting. We design a multi-input lookup table protocol to evaluate softmax efficiently and securely. Furthermore, we use dual secret sharing schemes and perform precision conversions via lookup tables, eliminating truncation overhead entirely. Experimental evaluation on BERT-base models demonstrates that our approach achieves up to $8\times$ speedup compared to Lu \emph{et al}. (NDSS 25), $9\times$ speedup compared to Gupta \emph{et al}. (PETS 24) and $22 \times$ speedup compared to Knott \emph{et al}. (NeurIPS 21).

Paperid: 2869, https://arxiv.org/pdf/2508.01306.pdf

Abstract:
As large language models (LLMs) are increasingly deployed across diverse domains, ensuring their safety has become a critical concern. In response, studies on jailbreak attacks have been actively growing. Existing approaches typically rely on iterative prompt engineering or semantic transformations of harmful instructions to evade detection. In this work, we introduce PUZZLED, a novel jailbreak method that leverages the LLM's reasoning capabilities. It masks keywords in a harmful instruction and presents them as word puzzles for the LLM to solve. We design three puzzle types-word search, anagram, and crossword-that are familiar to humans but cognitively demanding for LLMs. The model must solve the puzzle to uncover the masked words and then proceed to generate responses to the reconstructed harmful instruction. We evaluate PUZZLED on five state-of-the-art LLMs and observe a high average attack success rate (ASR) of 88.8%, specifically 96.5% on GPT-4.1 and 92.3% on Claude 3.7 Sonnet. PUZZLED is a simple yet powerful attack that transforms familiar puzzles into an effective jailbreak strategy by harnessing LLMs' reasoning capabilities.

Paperid: 2870, https://arxiv.org/pdf/2508.00368.pdf

Abstract:
Advanced Persistent Threats (APTs) represent a significant challenge in cybersecurity due to their prolonged, multi-stage nature and the sophistication of their operators. Traditional detection systems typically focus on identifying malicious activity in binary terms (benign or malicious) without accounting for the progression of an attack. However, effective response strategies depend on accurate inference of the attack's current stage, as countermeasures must be tailored to whether an adversary is in the early reconnaissance phase or actively conducting exploitation or exfiltration. This work addresses the problem of attack stage inference under uncertainty, with a focus on robustness to out-of-distribution (OOD) inputs. We propose a classification approach based on Evidential Deep Learning (EDL), which models predictive uncertainty by outputting parameters of a Dirichlet distribution over possible stages. This allows the system not only to predict the most likely stage of an attack but also to indicate when it is uncertain or the input lies outside the training distribution. Preliminary experiments in a simulated environment demonstrate that the proposed model can accurately infer the stage of an attack with calibrated confidence while effectively detecting OOD inputs, which may indicate changes in the attackers' tactics. These results support the feasibility of deploying uncertainty-aware models for staged threat detection in dynamic and adversarial environments.

Paperid: 2871, https://arxiv.org/pdf/2507.21177.pdf

Abstract:
Federated Learning (FL) enables collaborative model training while preserving data privacy, but it is highly vulnerable to backdoor attacks. Most existing defense methods in FL have limited effectiveness due to their neglect of the model's over-reliance on backdoor triggers, particularly as the proportion of malicious clients increases. In this paper, we propose FedBAP, a novel defense framework for mitigating backdoor attacks in FL by reducing the model's reliance on backdoor triggers. Specifically, first, we propose a perturbed trigger generation mechanism that creates perturbation triggers precisely matching backdoor triggers in location and size, ensuring strong influence on model outputs. Second, we utilize these perturbation triggers to generate benign adversarial perturbations that disrupt the model's dependence on backdoor triggers while forcing it to learn more robust decision boundaries. Finally, we design an adaptive scaling mechanism to dynamically adjust perturbation intensity, effectively balancing defense strength and model performance. The experimental results demonstrate that FedBAP reduces the attack success rates by 0.22%-5.34%, 0.48%-6.34%, and 97.22%-97.6% under three types of backdoor attacks, respectively. In particular, FedBAP demonstrates outstanding performance against novel backdoor attacks.

Paperid: 2872, https://arxiv.org/pdf/2507.21142.pdf

Abstract:
Enterprise environments contain a heterogeneous, rapidly growing collection of internal artifacts related to code, data, and many different tools. Critical information for assessing privacy risk and ensuring regulatory compliance is often embedded across these varied resources, each with their own arcane discovery and extraction techniques. Therefore, large-scale privacy compliance in adherence to governmental regulations requires systems to discern the interconnected nature of diverse artifacts in a common, shared universe. We present Privacy Artifact ConnecT or (PACT), an embeddings-driven graph that links millions of artifacts spanning multiple artifact types generated by a variety of teams and projects. Powered by the state-of-the-art DRAGON embedding model, PACT uses a contrastive learning objective with light fine-tuning to link artifacts via their textual components such as raw metadata, ownership specifics, and compliance context. Experimental results show that PACT's fine-tuned model improves recall@1 from 18% to 53%, the query match rate from 9.6% to 69.7% when paired with a baseline AI agent, and the hitrate@1 from 25.7% to 44.9% for candidate selection in a standard recommender system.

Paperid: 2873, https://arxiv.org/pdf/2507.20977.pdf

Abstract:
Background: Automated Vulnerability Repair (AVR) is a fast-growing branch of program repair. Recent studies show that large language models (LLMs) outperform traditional techniques, extending their success beyond code generation and fault detection. Hypothesis: These gains may be driven by hidden factors -- "invisible hands" such as training-data leakage or perfect fault localization -- that let an LLM reproduce human-authored fixes for the same code. Objective: We replicate prior AVR studies under controlled conditions by deliberately adding errors to the reported vulnerability location in the prompt. If LLMs merely regurgitate memorized fixes, both small and large localization errors should yield the same number of correct patches, because any offset should divert the model from the original fix. Method: Our pipeline repairs vulnerabilities from the Vul4J and VJTrans benchmarks after shifting the fault location by n lines from the ground truth. A first LLM generates a patch, a second LLM reviews it, and we validate the result with regression and proof-of-vulnerability tests. Finally, we manually audit a sample of patches and estimate the error rate with the Agresti-Coull-Wilson method.

Paperid: 2874, https://arxiv.org/pdf/2507.19411.pdf

Abstract:
Traditional methods for identifying impactful liquidity providers (LPs) in Concentrated Liquidity Market Makers (CLMMs) rely on broad measures, such as nominal capital size or surface-level activity, which often lead to inaccurate risk analysis. The SILS framework offers a significantly more detailed approach, characterizing LPs not just as capital holders but as dynamic systemic agents whose actions directly impact market stability. This represents a fundamental paradigm shift from the static, volume-based analysis to a dynamic, impact-focused understanding. This advanced approach uses on-chain event logs and smart contract execution traces to compute Exponential Time-Weighted Liquidity (ETWL) profiles and apply unsupervised anomaly detection. Most importantly, it defines an LP's functional importance through the Liquidity Stability Impact Score (LSIS), a counterfactual metric that measures the potential degradation of the market if the LP withdraws. This combined approach provides a more detailed and realistic characterization of an LP's impact, moving beyond the binary and often misleading classifications used by existing methods. This impact-focused and comprehensive approach enables SILS to accurately identify high-impact LPs-including those missed by traditional methods and supports essential applications like a protective oracle layer and actionable trader signals, thereby significantly enhancing DeFi ecosystem. The framework provides unprecedented transparency into the underlying liquidity structure and associated risks, effectively reducing the common false positives and uncovering critical false negatives found in traditional models. Therefore, SILS provides an effective mechanism for proactive risk management, transforming how DeFi protocols safeguard their ecosystems against asymmetric liquidity behavior.

Paperid: 2875, https://arxiv.org/pdf/2507.18774.pdf

Abstract:
As blockchain technologies are increasingly adopted in enterprise and research domains, the need for secure, scalable, and performance-transparent node infrastructure has become critical. While self-hosted Ethereum nodes offer operational control, they often lack elasticity and require complex maintenance. This paper presents a hybrid, service-oriented architecture for deploying and monitoring Ethereum full nodes using Amazon Managed Blockchain (AMB), integrated with EC2-based observability, IAM-enforced security policies, and reproducible automation via the AWS Cloud Development Kit. Our architecture supports end-to-end observability through custom EC2 scripts leveraging Web3.py and JSON-RPC, collecting over 1,000 real-time data points-including gas utilization, transaction inclusion latency, and mempool dynamics. These metrics are visualized and monitored through AWS CloudWatch, enabling service-level performance tracking and anomaly detection. This cloud-native framework restores low-level observability lost in managed environments while maintaining the operational simplicity of managed services. By bridging the simplicity of AMB with the transparency required for protocol research and enterprise monitoring, this work delivers one of the first reproducible, performance-instrumented Ethereum deployments on AMB. The proposed hybrid architecture enables secure, observable, and reproducible Ethereum node operations in cloud environments, suitable for both research and production use.

Paperid: 2876, https://arxiv.org/pdf/2507.18075.pdf

Abstract:
Python software development heavily relies on third-party packages. Direct and transitive dependencies create a labyrinth of software supply chains. While it is convenient to reuse code, vulnerabilities within these dependency chains can propagate through dependencies, potentially affecting down-stream packages and applications. PyPI, the official Python package repository, hosts many packages and lacks a comprehensive analysis of the prevalence of vulnerable dependencies. This paper introduces PyPitfall, a quantitative analysis of vulnerable dependencies across the PyPI ecosystem. We analyzed the dependency structures of 378,573 PyPI packages and identified 4,655 packages that explicitly require at least one known-vulnerable version and 141,044 packages that permit vulnerable versions within specified ranges. By characterizing the ecosystem-wide dependency landscape and the security impact of transitive dependencies, we aim to raise awareness of Python software supply chain security.

Paperid: 2877, https://arxiv.org/pdf/2507.18072.pdf

Abstract:
Wearable accelerometers and gyroscopes encode fine-grained behavioural signatures that can be exploited to re-identify users, making privacy protection essential for healthcare applications. We introduce C-AAE, a compressive anonymizing autoencoder that marries an Anonymizing AutoEncoder (AAE) with Adaptive Differential Pulse-Code Modulation (ADPCM). The AAE first projects raw sensor windows into a latent space that retains activity-relevant features while suppressing identity cues. ADPCM then differentially encodes this latent stream, further masking residual identity information and shrinking the bitrate. Experiments on the MotionSense and PAMAP2 datasets show that C-AAE cuts user re-identification F1 scores by 10-15 percentage points relative to AAE alone, while keeping activity-recognition F1 within 5 percentage points of the unprotected baseline. ADPCM also reduces data volume by roughly 75 %, easing transmission and storage overheads. These results demonstrate that C-AAE offers a practical route to balancing privacy and utility in continuous, sensor-based activity recognition for healthcare.

Paperid: 2878, https://arxiv.org/pdf/2507.18055.pdf

Abstract:
The increasing use of synthetic data generated by Large Language Models (LLMs) presents both opportunities and challenges in data-driven applications. While synthetic data provides a cost-effective, scalable alternative to real-world data to facilitate model training, its diversity and privacy risks remain underexplored. Focusing on text-based synthetic data, we propose a comprehensive set of metrics to quantitatively assess the diversity (i.e., linguistic expression, sentiment, and user perspective), and privacy (i.e., re-identification risk and stylistic outliers) of synthetic datasets generated by several state-of-the-art LLMs. Experiment results reveal significant limitations in LLMs' capabilities in generating diverse and privacy-preserving synthetic data. Guided by the evaluation results, a prompt-based approach is proposed to enhance the diversity of synthetic reviews while preserving reviewer privacy.

Paperid: 2879, https://arxiv.org/pdf/2507.17577.pdf

Abstract:
One of the most practical and challenging types of black-box adversarial attacks is the hard-label attack, where only the top-1 predicted label is available. One effective approach is to search for the optimal ray direction from the benign image that minimizes the $\ell_p$-norm distance to the adversarial region. The unique advantage of this approach is that it transforms the hard-label attack into a continuous optimization problem. The objective function value is the ray's radius, which can be obtained via binary search at a high query cost. Existing methods use a "sign trick" in gradient estimation to reduce the number of queries. In this paper, we theoretically analyze the quality of this gradient estimation and propose a novel prior-guided approach to improve ray search efficiency both theoretically and empirically. Specifically, we utilize the transfer-based priors from surrogate models, and our gradient estimators appropriately integrate them by approximating the projection of the true gradient onto the subspace spanned by these priors and random directions, in a query-efficient manner. We theoretically derive the expected cosine similarities between the obtained gradient estimators and the true gradient, and demonstrate the improvement achieved by incorporating priors. Extensive experiments on the ImageNet and CIFAR-10 datasets show that our approach significantly outperforms 11 state-of-the-art methods in terms of query efficiency.

Paperid: 2880, https://arxiv.org/pdf/2507.16576.pdf

Abstract:
Sharing methods of attack and their effectiveness is a cornerstone of building robust defensive systems. Threat analysis reports, produced by various individuals and organizations, play a critical role in supporting security operations and combating emerging threats. To enhance the timeliness and automation of threat intelligence sharing, several standards have been established, with the Structured Threat Information Expression (STIX) framework emerging as one of the most widely adopted. However, generating STIX-compatible data from unstructured security text remains a largely manual, expert-driven process. To address this challenge, we introduce AZERG, a tool designed to assist security analysts in automatically generating structured STIX representations. To achieve this, we adapt general-purpose large language models for the specific task of extracting STIX-formatted threat data. To manage the complexity, the task is divided into four subtasks: entity detection (T1), entity type identification (T2), related pair detection (T3), and relationship type identification (T4). We apply task-specific fine-tuning to accurately extract relevant entities and infer their relationships in accordance with the STIX specification. To address the lack of training data, we compiled a comprehensive dataset with 4,011 entities and 2,075 relationships extracted from 141 full threat analysis reports, all annotated in alignment with the STIX standard. Our models achieved F1-scores of 84.43% for T1, 88.49% for T2, 95.47% for T3, and 84.60% for T4 in real-world scenarios. We validated their performance against a range of open- and closed-parameter models, as well as state-of-the-art methods, demonstrating improvements of 2-25% across tasks.

Paperid: 2881, https://arxiv.org/pdf/2507.15997.pdf

Abstract:
The increasing adoption of differential privacy (DP) leads to public-facing DP deployments by both government agencies and companies. However, real-world DP deployments often do not fully disclose their privacy guarantees, which vary greatly between deployments. Failure to disclose certain DP parameters can lead to misunderstandings about the strength of the privacy guarantee, undermining the trust in DP. In this work, we seek to inform future standards for communicating the privacy guarantees of DP deployments. Based on semi-structured interviews with 12 DP experts, we identify important DP parameters necessary to comprehensively communicate DP guarantees, and describe why and how they should be disclosed. Based on expert recommendations, we design an initial privacy label for DP to comprehensively communicate privacy guarantees in a standardized format.

Paperid: 2882, https://arxiv.org/pdf/2507.14796.pdf

Abstract:
Trusted Execution Environments (TEEs) are designed to protect the privacy and integrity of data in use. They enable secure data processing and sharing in peer-to-peer networks, such as vehicular ad hoc networks of autonomous vehicles, without compromising confidentiality. In these networks, nodes must establish mutual trust to collaborate securely. TEEs can achieve this through remote attestation, where a prover presents evidence of its trustworthiness to a verifier, which then decides whether or not to trust the prover. However, a naive peer-to-peer attestation approach, where every TEE directly attests every other TEE, results in quadratic communication overhead. This is inefficient in dynamic environments, where nodes frequently join and leave the network. To address this, we present Careful Whisper, a gossip-based protocol that disseminates trust efficiently, reducing attestation overhead to linear complexity under ideal conditions. It enables interoperability by enabling transitive trust across heterogeneous networks, and supports trust establishment with offline nodes via relayed attestations. Using a custom discrete-event simulator, we show that Careful Whisper propagates trust both faster and more widely than naive approaches across various network topologies. Our results demonstrate that our protocol is resource efficient, sending ~21.5 KiB and requiring 0.158 seconds per round in a 200-node network, and that our protocol is resilient to attestation failures across various network topologies.

Paperid: 2883, https://arxiv.org/pdf/2507.14629.pdf

Abstract:
Though vertical federated learning (VFL) is generally considered to be privacy-preserving, recent studies have shown that VFL system is vulnerable to label inference attacks originating from various attack surfaces. Among these attacks, the model completion (MC) attack is currently the most powerful one. Existing defense methods against it either sacrifice model accuracy or incur impractical computational overhead. In this paper, we propose VMask, a novel label privacy protection framework designed to defend against MC attack from the perspective of layer masking. Our key insight is to disrupt the strong correlation between input data and intermediate outputs by applying the secret sharing (SS) technique to mask layer parameters in the attacker's model. We devise a strategy for selecting critical layers to mask, reducing the overhead that would arise from naively applying SS to the entire model. Moreover, VMask is the first framework to offer a tunable privacy budget to defenders, allowing for flexible control over the levels of label privacy according to actual requirements. We built a VFL system, implemented VMask on it, and extensively evaluated it using five model architectures and 13 datasets with different modalities, comparing it to 12 other defense methods. The results demonstrate that VMask achieves the best privacy-utility trade-off, successfully thwarting the MC attack (reducing the label inference accuracy to a random guessing level) while preserving model performance (e.g., in Transformer-based model, the averaged drop of VFL model accuracy is only 0.09%). VMask's runtime is up to 60,846 times faster than cryptography-based methods, and it only marginally exceeds that of standard VFL by 1.8 times in a large Transformer-based model, which is generally acceptable.

Paperid: 2884, https://arxiv.org/pdf/2507.14625.pdf

Abstract:
Vertical federated learning (VFL) enables multiple parties with disjoint features to collaboratively train models without sharing raw data. While privacy vulnerabilities of VFL are extensively-studied, its security threats-particularly targeted label attacks-remain underexplored. In such attacks, a passive party perturbs inputs at inference to force misclassification into adversary-chosen labels. Existing methods rely on unrealistic assumptions (e.g., accessing VFL-model's outputs) and ignore anomaly detectors deployed in real-world systems. To bridge this gap, we introduce VTarbel, a two-stage, minimal-knowledge attack framework explicitly designed to evade detector-enhanced VFL inference. During the preparation stage, the attacker selects a minimal set of high-expressiveness samples (via maximum mean discrepancy), submits them through VFL protocol to collect predicted labels, and uses these pseudo-labels to train estimated detector and surrogate model on local features. In attack stage, these models guide gradient-based perturbations of remaining samples, crafting adversarial instances that induce targeted misclassifications and evade detection. We implement VTarbel and evaluate it against four model architectures, seven multimodal datasets, and two anomaly detectors. Across all settings, VTarbel outperforms four state-of-the-art baselines, evades detection, and retains effective against three representative privacy-preserving defenses. These results reveal critical security blind spots in current VFL deployments and underscore urgent need for robust, attack-aware defenses.

Paperid: 2885, https://arxiv.org/pdf/2507.14223.pdf

Abstract:
Explainable intrusion detection systems (IDS) are now recognized as essential for mission-critical networks, yet most "XAI" pipelines still bolt an approximate explainer onto an opaque classifier, leaving analysts with partial and sometimes misleading insights. The Interpretable Generalization (IG) mechanism, published in IEEE Transactions on Information Forensics and Security, eliminates that bottleneck by learning coherent patterns - feature combinations unique to benign or malicious traffic - and turning them into fully auditable rules. IG already delivers outstanding precision, recall, and AUC on NSL-KDD, UNSW-NB15, and UKM-IDS20, even when trained on only 10% of the data. To raise precision further without sacrificing transparency, we introduce Multi-Granular Discretization (IG-MD), which represents every continuous feature at several Gaussian-based resolutions. On UKM-IDS20, IG-MD lifts precision by greater than or equal to 4 percentage points across all nine train-test splits while preserving recall approximately equal to 1.0, demonstrating that a single interpretation-ready model can scale across domains without bespoke tuning.

Paperid: 2886, https://arxiv.org/pdf/2507.14222.pdf

Abstract:
The Interpretable Generalization (IG) mechanism recently published in IEEE Transactions on Information Forensics and Security delivers state-of-the-art, evidence-based intrusion detection by discovering coherent normal and attack patterns through exhaustive intersect-and-subset operations-yet its cubic-time complexity and large intermediate bitsets render full-scale datasets impractical on CPUs. We present IG-GPU, a PyTorch re-architecture that offloads all pairwise intersections and subset evaluations to commodity GPUs. Implemented on a single NVIDIA RTX 4070 Ti, in the 15k-record NSL-KDD dataset, IG-GPU shows a 116-fold speed-up over the multi-core CPU implementation of IG. In the full size of NSL-KDD (148k-record), given small training data (e.g., 10%-90% train-test split), IG-GPU runs in 18 minutes with Recall 0.957, Precision 0.973, and AUC 0.961, whereas IG required down-sampling to 15k-records to avoid memory exhaustion and obtained Recall 0.935, Precision 0.942, and AUC 0.940. The results confirm that IG-GPU is robust across scales and could provide millisecond-level per-flow inference once patterns are learned. IG-GPU thus bridges the gap between rigorous interpretability and real-time cyber-defense, offering a portable foundation for future work on hardware-aware scheduling, multi-GPU sharding, and dataset-specific sparsity optimizations.

Paperid: 2887, https://arxiv.org/pdf/2507.14109.pdf

Abstract:
Radio frequency (RF) fingerprinting, which extracts unique hardware imperfections of radio devices, has emerged as a promising physical-layer device identification mechanism in zero trust architectures and beyond 5G networks. In particular, deep learning (DL) methods have demonstrated state-of-the-art performance in this domain. However, existing approaches have primarily focused on enhancing system robustness against temporal and spatial variations in wireless environments, while the security vulnerabilities of these DL-based approaches have often been overlooked. In this work, we systematically investigate the security risks of DL-based RF fingerprinting systems through an adversarial-driven experimental analysis. We observe a consistent misclassification behavior for DL models under domain shifts, where a device is frequently misclassified as another specific one. Our analysis based on extensive real-world experiments demonstrates that this behavior can be exploited as an effective backdoor to enable external attackers to intrude into the system. Furthermore, we show that training DL models on raw received signals causes the models to entangle RF fingerprints with environmental and signal-pattern features, creating additional attack vectors that cannot be mitigated solely through post-processing security methods such as confidence thresholds.

Paperid: 2888, https://arxiv.org/pdf/2507.13883.pdf

Abstract:
Stablecoins, with a capitalization exceeding 200 billion USD as of January 2025, have shown significant growth, with annual transaction volumes exceeding 10 trillion dollars in 2023 and nearly doubling that figure in 2024. This exceptional success has attracted the attention of traditional financial institutions, with an increasing number of governments exploring the potential of Central Bank Digital Currencies (CBDCs). Although academia has recognized the importance of stablecoins, research in this area remains fragmented, incomplete, and sometimes contradictory. In this paper, we aim to address the cited gap with a structured literature analysis, correlating recent contributions to present a picture of the complex economic, technical, and regulatory aspects of stablecoins. To achieve this, we formulate the main research questions and categorize scientific contributions accordingly, identifying main results, data sources, methodologies, and open research questions. The research questions we address in this survey paper cover several topics, such as the stability of various stablecoins, novel designs and implementations, and relevant regulatory challenges. The studies employ a wide range of methodologies and data sources, which we critically analyze and synthesize. Our analysis also reveals significant research gaps, including limited studies on security and privacy, underexplored stablecoins, unexamined failure cases, unstudied governance mechanisms, and the treatment of stablecoins under financial accounting standards, among other areas.

Paperid: 2889, https://arxiv.org/pdf/2507.13505.pdf

Abstract:
Cybersecurity simulation environments, such as cyber ranges, honeypots, and sandboxes, require realistic human behavior to be effective, yet no quantitative method exists to assess the behavioral fidelity of synthetic user personas. This paper presents PHASE (Passive Human Activity Simulation Evaluation), a machine learning framework that analyzes Zeek connection logs and distinguishes human from non-human activity with over 90\% accuracy. PHASE operates entirely passively, relying on standard network monitoring without any user-side instrumentation or visible signs of surveillance. All network activity used for machine learning is collected via a Zeek network appliance to avoid introducing unnecessary network traffic or artifacts that could disrupt the fidelity of the simulation environment. The paper also proposes a novel labeling approach that utilizes local DNS records to classify network traffic, thereby enabling machine learning analysis. Furthermore, we apply SHAP (SHapley Additive exPlanations) analysis to uncover temporal and behavioral signatures indicative of genuine human users. In a case study, we evaluate a synthetic user persona and identify distinct non-human patterns that undermine behavioral realism. Based on these insights, we develop a revised behavioral configuration that significantly improves the human-likeness of synthetic activity yielding a more realistic and effective synthetic user persona.

Paperid: 2890, https://arxiv.org/pdf/2507.12364.pdf

Abstract:
Securing sensitive cloud workloads requires composing confidential virtual machines (CVMs) with nested enclaves or sandboxes. Unfortunately, each new isolation boundary adds ad-hoc access control mechanisms, hardware extensions, and trusted software. This escalating complexity bloats the TCB, complicates end-to-end attestation, and leads to fragmentation across platforms and cloud service providers (CSPs). We introduce a unified isolation model that delegates enforceable, composable, and attestable isolation to a single trusted security monitor: Tyche. Tyche provides an API for partitioning, sharing, attesting, and reclaiming resources through its core abstraction, trust domains (TDs). To provide fine-grain isolation, TDs can recursively create and manage sub-TDs. Tyche captures these relationships in attestations, allowing cloud tenants to reason about end-to-end security. TDs serve as the building blocks for constructing composable enclaves, sandboxes, and CVMs. Tyche runs on commodity x86_64 without hardware security extensions and can maintain backward compatibility with existing software. We provide an SDK to run and compose unmodified workloads as sandboxes, enclaves, and CVMs with minimal overhead compared to native Linux execution. Tyche supports complex cloud scenarios, such as confidential inference with mutually distrustful users, model owners, and CSPs. An additional RISC-V prototype demonstrates Tyche's portability across platforms.

Paperid: 2891, https://arxiv.org/pdf/2507.12098.pdf

Abstract:
To mitigate privacy leakage and performance issues in personalized advertising, this paper proposes a framework that integrates federated learning and differential privacy. The system combines distributed feature extraction, dynamic privacy budget allocation, and robust model aggregation to balance model accuracy, communication overhead, and privacy protection. Multi-party secure computing and anomaly detection mechanisms further enhance system resilience against malicious attacks. Experimental results demonstrate that the framework achieves dual optimization of recommendation accuracy and system efficiency while ensuring privacy, providing both a practical solution and a theoretical foundation for applying privacy protection technologies in advertisement recommendation.

Paperid: 2892, https://arxiv.org/pdf/2507.11943.pdf

Abstract:
We propose a low-rank adaptation method for training privacy-preserving vision transformer (ViT) models that efficiently freezes pre-trained ViT model weights. In the proposed method, trainable rank decomposition matrices are injected into each layer of the ViT architecture, and moreover, the patch embedding layer is not frozen, unlike in the case of the conventional low-rank adaptation methods. The proposed method allows us not only to reduce the number of trainable parameters but to also maintain almost the same accuracy as that of full-time tuning.

Paperid: 2893, https://arxiv.org/pdf/2507.10854.pdf

Abstract:
Phishing remains a pervasive and growing threat, inflicting heavy economic and reputational damage. While machine learning has been effective in real-time detection of phishing attacks, progress is hindered by lack of large, high-quality datasets and benchmarks. In addition to poor-quality due to challenges in data collection, existing datasets suffer from leakage and unrealistic base rates, leading to overly optimistic performance results. In this paper, we introduce PhreshPhish, a large-scale, high-quality dataset of phishing websites that addresses these limitations. Compared to existing public datasets, PhreshPhish is substantially larger and provides significantly higher quality, as measured by the estimated rate of invalid or mislabeled data points. Additionally, we propose a comprehensive suite of benchmark datasets specifically designed for realistic model evaluation by minimizing leakage, increasing task difficulty, enhancing dataset diversity, and adjustment of base rates more likely to be seen in the real world. We train and evaluate multiple solution approaches to provide baseline performance on the benchmark sets. We believe the availability of this dataset and benchmarks will enable realistic, standardized model comparison and foster further advances in phishing detection. The datasets and benchmarks are available on Hugging Face (https://huggingface.co/datasets/phreshphish/phreshphish).

Paperid: 2894, https://arxiv.org/pdf/2507.10610.pdf

Abstract:
Graphical user interface (GUI) agents built on multimodal large language models (MLLMs) have recently demonstrated strong decision-making abilities in screen-based interaction tasks. However, they remain highly vulnerable to pop-up-based environmental injection attacks, where malicious visual elements divert model attention and lead to unsafe or incorrect actions. Existing defense methods either require costly retraining or perform poorly under inductive interference. In this work, we systematically study how such attacks alter the attention behavior of GUI agents and uncover a layer-wise attention divergence pattern between correct and incorrect outputs. Based on this insight, we propose \textbf{LaSM}, a \textit{Layer-wise Scaling Mechanism} that selectively amplifies attention and MLP modules in critical layers. LaSM improves the alignment between model saliency and task-relevant regions without additional training. Extensive experiments across 12 types of pop-up perturbations and 4 different model backbones show that LaSM consistently enhances the defense success rate. When combined with prompt-level alerts, LaSM achieves over 98\% robustness even under strong inductive attacks. Our findings reveal that attention misalignment is a core vulnerability in MLLM agents and can be effectively addressed through selective layer-wise modulation.

Paperid: 2895, https://arxiv.org/pdf/2507.10591.pdf

Abstract:
Feature selection is vital for building effective predictive models, as it reduces dimensionality and emphasizes key features. However, current research often suffers from limited benchmarking and reliance on proprietary datasets. This severely hinders reproducibility and can negatively impact overall performance. To address these limitations, we introduce the MH-FSF framework, a comprehensive, modular, and extensible platform designed to facilitate the reproduction and implementation of feature selection methods. Developed through collaborative research, MH-FSF provides implementations of 17 methods (11 classical, 6 domain-specific) and enables systematic evaluation on 10 publicly available Android malware datasets. Our results reveal performance variations across both balanced and imbalanced datasets, highlighting the critical need for data preprocessing and selection criteria that account for these asymmetries. We demonstrate the importance of a unified platform for comparing diverse feature selection techniques, fostering methodological consistency and rigor. By providing this framework, we aim to significantly broaden the existing literature and pave the way for new research directions in feature selection, particularly within the context of Android malware detection.

Paperid: 2896, https://arxiv.org/pdf/2507.10162.pdf

Abstract:
Vertical Federated Learning (VFL) enables an orchestrating active party to perform a machine learning task by cooperating with passive parties that provide additional task-related features for the same training data entities. While prior research has leveraged the privacy vulnerability of VFL to compromise its integrity through a combination of label inference and backdoor attacks, their effectiveness is constrained by the low label inference precision and suboptimal backdoor injection conditions. To facilitate a more rigorous security evaluation on VFL without these limitations, we propose HASSLE, a hijacking attack framework composed of a gradient-direction-based label inference module and an adversarial embedding generation algorithm enhanced by self-supervised learning. HASSLE accurately identifies private samples associated with a targeted label using only a single known instance of that label. In the two-party scenario, it demonstrates strong performance with an attack success rate (ASR) of over 99% across four datasets, including both image and tabular modalities, and achieves 85% ASR on the more complex CIFAR-100 dataset. Evaluation of HASSLE against 8 potential defenses further highlights its significant threat while providing new insights into building a trustworthy VFL system.

Paperid: 2897, https://arxiv.org/pdf/2507.08163.pdf

Abstract:
We propose Adaptive Diffusion Denoised Smoothing, a method for certifying the predictions of a vision model against adversarial examples, while adapting to the input. Our key insight is to reinterpret a guided denoising diffusion model as a long sequence of adaptive Gaussian Differentially Private (GDP) mechanisms refining a pure noise sample into an image. We show that these adaptive mechanisms can be composed through a GDP privacy filter to analyze the end-to-end robustness of the guided denoising process, yielding a provable certification that extends the adaptive randomized smoothing analysis. We demonstrate that our design, under a specific guiding strategy, can improve both certified accuracy and standard accuracy on ImageNet for an $\ell_2$ threat model.

Paperid: 2898, https://arxiv.org/pdf/2507.07901.pdf

Abstract:
The fragmentation of AI agent ecosystems has created urgent demands for interoperability, trust, and economic coordination that current protocols -- including MCP (Hou et al., 2025), A2A (Habler et al., 2025), ACP (Liu et al., 2025), and Cisco's AGP (Edwards, 2025) -- cannot address at scale. We present the Nanda Unified Architecture, a decentralized framework built around three core innovations: fast DID-based agent discovery through distributed registries, semantic agent cards with verifiable credentials and composability profiles, and a dynamic trust layer that integrates behavioral attestations with policy compliance. The system introduces X42/H42 micropayments for economic coordination and MAESTRO, a security framework incorporating Synergetics' patented AgentTalk protocol (US Patent 12,244,584 B1) and secure containerization. Real-world deployments demonstrate 99.9 percent compliance in healthcare applications and substantial monthly transaction volumes with strong privacy guarantees. By unifying MIT's trust research with production deployments from Cisco and Synergetics, we show how cryptographic proofs and policy-as-code transform agents into trust-anchored participants in a decentralized economy (Lakshmanan, 2025; Sha, 2025). The result enables a globally interoperable Internet of Agents where trust becomes the native currency of collaboration across both enterprise and Web3 ecosystems.

Paperid: 2899, https://arxiv.org/pdf/2507.07246.pdf

Abstract:
For reverse engineering related security domains, such as vulnerability detection, malware analysis, and binary hardening, disassembly is crucial yet challenging. The fundamental challenge of disassembly is to identify instruction and function boundaries. Classic approaches rely on file-format assumptions and architecture-specific heuristics to guess the boundaries, resulting in incomplete and incorrect disassembly, especially when the binary is obfuscated. Recent advancements of disassembly have demonstrated that deep learning can improve both the accuracy and efficiency of disassembly. In this paper, we propose Disa, a new learning-based disassembly approach that uses the information of superset instructions over the multi-head self-attention to learn the instructions' correlations, thus being able to infer function entry-points and instruction boundaries. Disa can further identify instructions relevant to memory block boundaries to facilitate an advanced block-memory model based value-set analysis for an accurate control flow graph (CFG) generation. Our experiments show that Disa outperforms prior deep-learning disassembly approaches in function entry-point identification, especially achieving 9.1% and 13.2% F1-score improvement on binaries respectively obfuscated by the disassembly desynchronization technique and popular source-level obfuscator. By achieving an 18.5% improvement in the memory block precision, Disa generates more accurate CFGs with a 4.4% reduction in Average Indirect Call Targets (AICT) compared with the state-of-the-art heuristic-based approach.

Paperid: 2900, https://arxiv.org/pdf/2507.07210.pdf

Abstract:
Smartwatches such as the Apple Watch collect vast amounts of intimate health and fitness data as we wear them. Users have little choice regarding how this data is processed: The Apple Watch can only be used with Apple's iPhones, using their software and their cloud services. We are the first to publicly reverse-engineer the watch's wireless protocols, which led to discovering multiple security issues in Apple's proprietary implementation. With WatchWitch, our custom Android reimplementation, we break out of Apple's walled garden -- demonstrating practical interoperability with enhanced privacy controls and data autonomy. We thus pave the way for more consumer choice in the smartwatch ecosystem, offering users more control over their devices.

Paperid: 2901, https://arxiv.org/pdf/2507.06262.pdf

Abstract:
Data poisoning attacks pose significant threats to machine learning models by introducing malicious data into the training process, thereby degrading model performance or manipulating predictions. Detecting and sifting out poisoned data is an important method to prevent data poisoning attacks. Limited by classical computation frameworks, upcoming larger-scale and more complex datasets may pose difficulties for detection. We introduce the unique speedup of quantum computing for the first time in the task of detecting data poisoning. We present Q-Detection, a quantum-classical hybrid defense method for detecting poisoning attacks. Q-Detection also introduces the Q-WAN, which is optimized using quantum computing devices. Experimental results using multiple quantum simulation libraries show that Q-Detection effectively defends against label manipulation and backdoor attacks. The metrics demonstrate that Q-Detection consistently outperforms the baseline methods and is comparable to the state-of-the-art. Theoretical analysis shows that Q-Detection is expected to achieve more than a 20% speedup using quantum computing power.

Paperid: 2902, https://arxiv.org/pdf/2507.05683.pdf

Abstract:
A novel original procedure of encryption/decryption based on the polyadic algebraic structures and on signal processing methods is proposed. First, we use signals with integer amplitudes to send information. Then we use polyadic techniques to transfer the plaintext into series of special integers. The receiver restores the plaintext using special rules and systems of equations.

Paperid: 2903, https://arxiv.org/pdf/2507.05421.pdf

Abstract:
Coverage-guided fuzzers are powerful automated bug-finding tools. They mutate program inputs, observe coverage, and save any input that hits an unexplored path for future mutation. Unfortunately, without knowledge of input formats--for example, the relationship between formats' data fields and sizes--fuzzers are prone to generate destructive frameshift mutations. These time-wasting mutations yield malformed inputs that are rejected by the target program. To avoid such breaking mutations, this paper proposes a novel, lightweight technique that preserves the structure of inputs during mutation by detecting and using relation fields. Our technique, FrameShift, is simple, fast, and does not require additional instrumentation beyond standard coverage feedback. We implement our technique in two state-of-the-art fuzzers, AFL++ and LibAFL, and perform a 12+ CPU-year fuzzer evaluation, finding that FrameShift improves the performance of the fuzzer in each configuration, sometimes increasing coverage by more than 50%. Furthermore, through a series of case studies, we show that our technique is versatile enough to find important structural relationships in a variety of formats, even generalizing beyond C/C++ targets to both Rust and Python.

Paperid: 2904, https://arxiv.org/pdf/2507.05213.pdf

Abstract:
Threat hunting is an operational security process where an expert analyzes traffic, applying knowledge and lightweight tools on unlabeled data in order to identify and classify previously unknown phenomena. In this paper, we examine threat hunting metrics and practice by studying the detection of Crackonosh, a cryptojacking malware package, has on various metrics for identifying its behavior. Using a metric for discoverability, we model the ability of defenders to measure Crackonosh traffic as the malware population decreases, evaluate the strength of various detection methods, and demonstrate how different darkspace sizes affect both the ability to track the malware, but enable emergent behaviors by exploiting attacker mistakes.

Paperid: 2905, https://arxiv.org/pdf/2507.04855.pdf

Abstract:
Program analysis and automated testing have recently become an essential part of SSDLC. Directed greybox fuzzing is one of the most popular automated testing methods that focuses on error detection in predefined code regions. However, it still lacks ability to overcome difficult program constraints. This problem can be well addressed by symbolic execution, but at the cost of lower performance. Thus, combining directed fuzzing and symbolic execution techniques can lead to more efficient error detection. In this paper, we propose a hybrid approach to directed fuzzing with novel seed scheduling algorithm, based on target-related interestingness and coverage. The approach also performs minimization and sorting of objective seeds according to a target-related information. We implement our approach in Sydr-Fuzz tool using LibAFL-DiFuzz as directed fuzzer and Sydr as dynamic symbolic executor. We evaluate our approach with Time to Exposure metric and compare it with pure LibAFL-DiFuzz, AFLGo, BEACON, WAFLGo, WindRanger, FishFuzz, and Prospector. The results show an improvement for 3 out of 7 examples with speedup up to 1.86 times over the second best result, as well as a significant improvement for 3 out of 7 examples over the pure LibAFL-DiFuzz fuzzer. Sydr-Fuzz hybrid approach to directed fuzzing shows high performance and helps to improve directed fuzzing efficiency.

Paperid: 2906, https://arxiv.org/pdf/2507.04818.pdf

Abstract:
The widespread deployment of embedded systems in critical infrastructures, interconnected edge devices like autonomous drones, and smart industrial systems requires robust security measures. Compromised systems increase the risks of operational failures, data breaches, and -- in safety-critical environments -- potential physical harm to people. Despite these risks, current security measures are often insufficient to fully address the attack surfaces of embedded devices. CHERI provides strong security from the hardware level by enabling fine-grained compartmentalization and memory protection, which can reduce the attack surface and improve the reliability of such devices. In this work, we explore the potential of CHERI to compartmentalize one of the most critical and targeted components of interconnected systems: their network stack. Our case study examines the trade-offs of isolating applications, TCP/IP libraries, and network drivers on a CheriBSD system deployed on the Arm Morello platform. Our results suggest that CHERI has the potential to enhance security while maintaining performance in embedded-like environments.

Paperid: 2907, https://arxiv.org/pdf/2507.04357.pdf

Abstract:
Ethereum smart contracts operate in a concurrent environment where multiple transactions can be submitted simultaneously. However, the Ethereum Virtual Machine (EVM) enforces sequential execution of transactions within each block to prevent conflicts arising from concurrent access to the same state variables. Although this approach guarantees correct behavior, it limits the ability of validators to leverage multi-core architectures for faster transaction processing, thus restricting throughput. Existing solutions introduce concurrency by allowing simultaneous transaction execution combined with runtime conflict detection and rollback mechanisms to maintain correctness. However, these methods incur significant overhead due to continuous conflict tracking and transaction reversion. Recently, alternative approaches have emerged that aim to predict conflicts statically, before execution, by analyzing smart contract code for potential transaction interactions. Despite their promise, there is a lack of comprehensive studies that examine static conflict detection and its broader implications in specific smart contracts. This paper fills this important gap by proposing a novel static analysis method to detect potential transaction conflicts in Ethereum smart contracts. Our method identifies read-write, write-write, and function call conflicts between transaction pairs by analyzing state variable access patterns in Solidity contracts. We implement a tool that parses contract code and performs conflict detection. Evaluation on a dataset of real-world Ethereum smart contracts demonstrates that our approach achieves high precision in identifying potential conflicts. By enabling proactive conflict detection, our tool supports further design of transaction scheduling strategies that reduce runtime failures, enhance validator throughput, and contribute to blockchain scalability.

Paperid: 2908, https://arxiv.org/pdf/2507.04015.pdf

Abstract:
Rabin oblivious transfer is the cryptographic task where Alice wishes to receive a bit from Bob but it may get lost with probability 1/2. In this work, we provide protocol designs which yield quantum protocols with improved security. Moreover, we provide a constant lower bound on any quantum protocol for Rabin oblivious transfer. To quantify the security of this task with asymmetric cheating definitions, we introduce the notion of cheating advantage which may be of independent interest in the study of other asymmetric cryptographic primitives.

Paperid: 2909, https://arxiv.org/pdf/2507.02536.pdf

Abstract:
This paper presents a blockchain-based Internet of Things (IoT) system for monitoring pizza production in restaurants. IoT devices track temperature and humidity in real-time, while blockchain ensures secure and tamper-proof data. A Raspberry Pi processes sensor data, captures images, triggers alerts, and interacts with smart contracts. The system detects abnormal conditions, enabling quick responses. Blockchain adds transparency and traceability, supporting compliance and audits. Experiments show improved ingredient management, reduced waste, and increased kitchen efficiency.

Paperid: 2910, https://arxiv.org/pdf/2507.02489.pdf

Abstract:
We propose a new 10-bit S-box generated from a Feistel construction. The subpermutations are generated by a 5-cell cellular automaton based on a unique well-chosen rule and bijective affine transformations. In particular, the cellular automaton rule is chosen based on empirical tests of its ability to generate good pseudorandom output on a ring cellular automaton. Similarly, Feistel's network layout is based on empirical data regarding the quality of the output S-box. We perform cryptanalysis of the generated 10-bit S-box, and we find security properties comparable to or sometimes even better than those of the standard AES S-box. We believe that our S-box could be used to replace the 5-bit substitution of ciphers like ASCON.

Paperid: 2911, https://arxiv.org/pdf/2507.01808.pdf

Abstract:
Small- and medium-sized manufacturers need innovative data tools but, because of competition and privacy concerns, often do not want to share their proprietary data with researchers who might be interested in helping. This paper introduces a privacy-preserving platform by which manufacturers may safely share their data with researchers through secure methods, so that those researchers then create innovative tools to solve the manufacturers' real-world problems, and then provide tools that execute solutions back onto the platform for others to use with privacy and confidentiality guarantees. We illustrate this problem through a particular use case which addresses an important problem in the large-scale manufacturing of food crystals, which is that quality control relies on image analysis tools. Previous to our research, food crystals in the images were manually counted, which required substantial and time-consuming human efforts, but we have developed and deployed a crystal analysis tool which makes this process both more rapid and accurate. The tool enables automatic characterization of the crystal size distribution and numbers from microscope images while the natural imperfections from the sample preparation are automatically removed; a machine learning model to count high resolution translucent crystals and agglomeration of crystals was also developed to aid in these efforts. The resulting algorithm was then packaged for real-world use on the factory floor via a web-based app secured through the originating privacy-preserving platform, allowing manufacturers to use it while keeping their proprietary data secure. After demonstrating this full process, future directions are also explored.

Paperid: 2912, https://arxiv.org/pdf/2507.00367.pdf

Abstract:
Hybrid Homomorphic Encryption (HHE) combines symmetric key and homomorphic encryption to reduce ciphertext expansion crucial in client-server deployments of HE. Special symmetric ciphers, amenable to efficient HE evaluation, have been developed. Their client-side deployment calls for performant and energy-efficient implementation, and in this paper we develop and evaluate hardware accelerators for the two known CKKS-targeting HHE ciphers, HERA and Rubato. We design vectorized and overlapped functional modules. The design exploits transposition-invariance property of the MixColumns and MixRows function and alternates the order of intermediate state to eliminate bubbles in stream key generation, improving latency and throughput. We decouple the RNG and key computation phases to hide the latency of RNG and to reduce the critical path in FIFOs, achieving higher operating frequency. We implement the accelerator on an AMD Virtex UltraScale+ FPGA. Both Rubato and HERA achieve a 6x improvement in throughput compared to the software implementation. In terms of latency, Rubato achieves a 5x reduction, while HERA achieves a 3x reduction. Additionally, our hardware implementations reduce energy consumption by 75x for Rubato and 47x for HERA compared to their software implementation.

Paperid: 2913, https://arxiv.org/pdf/2507.00299.pdf

Abstract:
TikTok, the social media platform that is popular among children and adolescents, offers a more restrictive "Under 13 Experience" exclusively for young users in the US, also known as TikTok's "Kids Mode". While prior research has studied various aspects of TikTok's regular mode, including privacy and personalization, TikTok's Kids Mode remains understudied, and there is a lack of transparency regarding its content curation and its safety and privacy protections for children. In this paper, (i) we propose an auditing methodology to comprehensively investigate TikTok's Kids Mode and (ii) we apply it to characterize the platform's content curation and determine the prevalence of child-directed content, based on regulations in the Children's Online Privacy Protection Act (COPPA). We find that 83% of videos observed on the "For You" page in Kids Mode are actually not child-directed, and even inappropriate content was found. The platform also lacks critical features, namely parental controls and accessibility settings. Our findings have important design and regulatory implications, as children may be incentivized to use TikTok's regular mode instead of Kids Mode, where they are known to be exposed to further safety and privacy risks.

Paperid: 2914, https://arxiv.org/pdf/2512.24652.pdf

Abstract:
Multi-Party Private Set Intersection (MP-PSI) with threshold enhances the flexibility of MP-PSI by disclosing elements present in at least $t$ participants' sets, rather than requiring elements to appear in all $n$ sets. In scenarios where each participant is responsible for its dataset, e.g., digital forensics, MP-PSI with threshold should disclose both intersection elements and corresponding holders such that elements are traceable and the reliability of intersection is guaranteed. We refer to MP-PSI with threshold supporting traceability as Traceable Over-Threshold MP-PSI (T-OT-MP-PSI). However, research on such protocols remains limited, and existing work tolerates at most $t-2$ semi-honest participants at considerable computational cost. We propose two novel Traceable OT-MP-PSI protocols. The first, Efficient Traceable OT-MP-PSI (ET-OT-MP-PSI), combines Shamir's secret sharing with an oblivious programmable pseudorandom function, achieving significantly improved efficiency with resistance to at most $t-2$ semi-honest participants. The second, Security-enhanced Traceable OT-MP-PSI (ST-OT-MP-PSI), achieves security against up to $n-1$ semi-honest participants by further leveraging the oblivious linear evaluation protocol. Compared to Mahdavi et al.'s protocol, ours eliminate the assumption that certain special parties do not collude. Experimental results demonstrate significant improvements: for $n=5$, $t=3$, and sets of size $2^{14}$, ET-OT-MP-PSI achieves $15056\times$ speedup and ST-OT-MP-PSI achieves $505\times$ speedup over Mahdavi et al.'s protocol.

Paperid: 2915, https://arxiv.org/pdf/2512.24457.pdf

Abstract:
The real estate sector remains highly dependent on manual document handling and verification, making processes inefficient and prone to fraud. This work presents a system that integrates optical character recognition (OCR), natural language processing (NLP), and verifiable credentials (VCs) to automate document extraction, verification, and management. The approach standardizes heterogeneous document formats into VCs and applies automated data matching to detect inconsistencies, while the blockchain provides a decentralized trust layer that reinforces transparency and integrity. A prototype was developed that comprises (i) an OCR-NLP extraction pipeline trained on synthetic datasets, (ii) a backend for credential issuance and management, and (iii) a frontend supporting issuer, holder, and verifier interactions. Experimental results show that the models achieve competitive accuracy across multiple document types and that the end-to-end pipeline reduces verification time while preserving reliability. The proposed framework demonstrates the potential to streamline real estate transactions, strengthen stakeholder trust, and enable scalable, secure digital processes.

Paperid: 2916, https://arxiv.org/pdf/2512.23948.pdf

Abstract:
Convolutional Neural Networks (CNNs) and their quantized counterparts are vulnerable to extraction attacks, posing a significant threat of IP theft. Yet, the robustness of quantized models against these attacks is little studied compared to large models. Previous defenses propose to inject calculated noise into the prediction probabilities. However, these defenses are limited since they are not incorporated during the model design and are only added as an afterthought after training. Additionally, most defense techniques are computationally expensive and often have unrealistic assumptions about the victim model that are not feasible in edge device implementations and do not apply to quantized models. In this paper, we propose DivQAT, a novel algorithm to train quantized CNNs based on Quantization Aware Training (QAT) aiming to enhance their robustness against extraction attacks. To the best of our knowledge, our technique is the first to modify the quantization process to integrate a model extraction defense into the training process. Through empirical validation on benchmark vision datasets, we demonstrate the efficacy of our technique in defending against model extraction attacks without compromising model accuracy. Furthermore, combining our quantization technique with other defense mechanisms improves their effectiveness compared to traditional QAT.

Paperid: 2917, https://arxiv.org/pdf/2512.22307.pdf

Abstract:
We introduce LLA, an effective intellectual property (IP) protection scheme for generative AI models. LLA leverages the synergy between hardware and software to defend against various supply chain threats, including model theft, model corruption, and information leakage. On the software side, it embeds key bits into neurons that can trigger outliers to degrade performance and applies invariance transformations to obscure the key values. On the hardware side, it integrates a lightweight locking module into the AI accelerator while maintaining compatibility with various dataflow patterns and toolchains. An accelerator with a pre-stored secret key acts as a license to access the model services provided by the IP owner. The evaluation results show that LLA can withstand a broad range of oracle-guided key optimization attacks, while incurring a minimal computational overhead of less than 0.1% for 7,168 key bits.

Paperid: 2918, https://arxiv.org/pdf/2512.22090.pdf

Abstract:
Trusted Execution Environments (TEEs) protect sensitive code and data from the operating system, hypervisor, or other untrusted software. Different solutions exist, each proposing different features. Abstraction layers aim to unify the ecosystem, allowing application developers and system administrators to leverage confidential computing as broadly and efficiently as possible. We start with an overview of representative available TEE technologies. We describe and summarize each TEE ecosystem, classifying them in different categories depending on their main design choices. Then, we propose a systematization of knowledge focusing on different abstraction layers around each design choice. We describe the underlying technologies of each design, as well as the inner workings and features of each abstraction layer. Our study reveals opportunities for improving existing abstraction layer solutions. It also highlights WebAssembly, a promising approach that supports the largest set of features. We close with a discussion on future directions for research, such as how future abstraction layers may evolve and integrate with the confidential computing ecosystem.

Paperid: 2919, https://arxiv.org/pdf/2512.21499.pdf

Abstract:
We revisit the task of releasing marginal queries under differential privacy with additive (correlated) Gaussian noise. We first give a construction for answering arbitrary workloads of weighted marginal queries, over arbitrary domains. Our technique is based on releasing queries in the Fourier basis with independent noise with carefully calibrated variances, and reconstructing the marginal query answers using the inverse Fourier transform. We show that our algorithm, which is a factorization mechanism, is exactly optimal among all factorization mechanisms, both for minimizing the sum of weighted noise variances, and for minimizing the maximum noise variance. Unlike algorithms based on optimizing over all factorization mechanisms via semidefinite programming, our mechanism runs in time polynomial in the dataset and the output size. This construction recovers results of Xiao et al. [Neurips 2023] with a simpler algorithm and optimality proof, and a better running time. We then extend our approach to a generalization of marginals which we refer to as product queries. We show that our algorithm is still exactly optimal for this more general class of queries. Finally, we show how to embed extended marginal queries, which allow using a threshold predicate on numerical attributes, into product queries. We show that our mechanism is almost optimal among all factorization mechanisms for extended marginals, in the sense that it achieves the optimal (maximum or average) noise variance up to lower order terms.

Paperid: 2920, https://arxiv.org/pdf/2512.21110.pdf

Abstract:
Current Large Language Models (LLMs) safety approaches focus on explicitly harmful content while overlooking a critical vulnerability: the inability to understand context and recognize user intent. This creates exploitable vulnerabilities that malicious users can systematically leverage to circumvent safety mechanisms. We empirically evaluate multiple state-of-the-art LLMs, including ChatGPT, Claude, Gemini, and DeepSeek. Our analysis demonstrates the circumvention of reliable safety mechanisms through emotional framing, progressive revelation, and academic justification techniques. Notably, reasoning-enabled configurations amplified rather than mitigated the effectiveness of exploitation, increasing factual precision while failing to interrogate the underlying intent. The exception was Claude Opus 4.1, which prioritized intent detection over information provision in some use cases. This pattern reveals that current architectural designs create systematic vulnerabilities. These limitations require paradigmatic shifts toward contextual understanding and intent recognition as core safety capabilities rather than post-hoc protective mechanisms.

Paperid: 2921, https://arxiv.org/pdf/2512.20733.pdf

Abstract:
6G networks promise to be the proper technology to support a wide deployment of highly demanding services, satisfying key users-related aspects such as extremely high quality, and persistent communications. However, there is no service to support if the network is not reliable enough. In this direction, it is with no doubt that security guarantees become a must. Traditional security approaches have focused on providing specific and attack-tailored solutions that will not properly meet the uncertainties driven by a technology yet under development and showing an attack surface not completely identified either. In this positioning paper we propose a softwarized solution, defining a Security Plane built on a top of programmable and adaptable set of live Security Functions under a proactive strategy. In addition, in order to address the inaccuracies driven by the predictive models a pre-assessment scenario is also considered ensuring that no action will be deployed if not previously verified. Although more efforts are required to develop this initiative, we think that such a shift paradigm is the only way to face security provisioning challenges in 6G ecosystems.

Paperid: 2922, https://arxiv.org/pdf/2512.20649.pdf

Abstract:
The rapid expansion of AI-driven applications powered by large language models has led to a surge in AI interaction data, raising urgent challenges in security, accountability, and risk traceability. This paper presents AiAuditTrack (AAT), a blockchain-based framework for AI usage traffic recording and governance. AAT leverages decentralized identity (DID) and verifiable credentials (VC) to establish trusted and identifiable AI entities, and records inter-entity interaction trajectories on-chain to enable cross-system supervision and auditing. AI entities are modeled as nodes in a dynamic interaction graph, where edges represent time-specific behavioral trajectories. Based on this model, a risk diffusion algorithm is proposed to trace the origin of risky behaviors and propagate early warnings across involved entities. System performance is evaluated using blockchain Transactions Per Second (TPS) metrics, demonstrating the feasibility and stability of AAT under large-scale interaction recording. AAT provides a scalable and verifiable solution for AI auditing, risk management, and responsibility attribution in complex multi-agent environments.

Paperid: 2923, https://arxiv.org/pdf/2512.20176.pdf

Abstract:
The rapid integration of Large Language Models (LLMs) into decentralized physical infrastructure networks (DePIN) is currently bottlenecked by the Verifiability Trilemma, which posits that a decentralized inference system cannot simultaneously achieve high computational integrity, low latency, and low cost. Existing cryptographic solutions, such as Zero-Knowledge Machine Learning (ZKML), suffer from superlinear proving overheads (O(k NlogN)) that render them infeasible for billionparameter models. Conversely, optimistic approaches (opML) impose prohibitive dispute windows, preventing real-time interactivity, while recent "Proof of Quality" (PoQ) paradigms sacrifice cryptographic integrity for subjective semantic evaluation, leaving networks vulnerable to model downgrade attacks and reward hacking. In this paper, we introduce Optimistic TEE-Rollups (OTR), a hybrid verification protocol that harmonizes these constraints. OTR leverages NVIDIA H100 Confidential Computing Trusted Execution Environments (TEEs) to provide sub-second Provisional Finality, underpinned by an optimistic fraud-proof mechanism and stochastic Zero-Knowledge spot-checks to mitigate hardware side-channel risks. We formally define Proof of Efficient Attribution (PoEA), a consensus mechanism that cryptographically binds execution traces to hardware attestations, thereby guaranteeing model authenticity. Extensive simulations demonstrate that OTR achieves 99% of the throughput of centralized baselines with a marginal cost overhead of $0.07 per query, maintaining Byzantine fault tolerance against rational adversaries even in the presence of transient hardware vulnerabilities.

Paperid: 2924, https://arxiv.org/pdf/2512.19968.pdf

Abstract:
Permissionless blockchains achieve consensus while allowing unknown nodes to join and leave the system at any time. They typically come in two flavors: proof of work (PoW) and proof of stake (PoS), and both are vulnerable to attacks. PoS protocols suffer from long-range attacks, wherein attackers alter execution history at little cost, and PoW protocols are vulnerable to attackers with enough computational power to subvert execution history. PoS protocols respond by relying on external mechanisms like social consensus; PoW protocols either fall back to probabilistic guarantees, or are slow. We present Sieve-MMR, the first fully-permissionless protocol with deterministic security and constant expected latency that does not rely on external mechanisms. We obtain Sieve-MMR by porting a PoS protocol (MMR) to the PoW setting. From MMR we inherit constant expected latency and deterministic security, and proof-of-work gives us resilience against long-range attacks. The main challenge to porting MMR to the PoW setting is what we call time-travel attacks, where attackers use PoWs generated in the distant past to increase their perceived PoW power in the present. We respond by proposing Sieve, a novel algorithm that implements a new broadcast primitive we dub time-travel-resilient broadcast (TTRB). Sieve relies on a black-box, deterministic PoW primitive to implement TTRB, which we use as the messaging layer for MMR.

Paperid: 2925, https://arxiv.org/pdf/2512.19025.pdf

Abstract:
Machine unlearning aims to remove specific data influences from trained models, a capability essential for adhering to copyright laws and ensuring AI safety. Current unlearning metrics typically measure success by monitoring the model's performance degradation on the specific unlearning dataset ($D_u$). We argue that for Large Language Models (LLMs), this evaluation paradigm is insufficient and potentially misleading. Many real-world uses of unlearning--motivated by copyright or safety--implicitly target not only verbatim content in $D_u$, but also behaviors influenced by the broader generalizations the model derived from it. We demonstrate that LLMs can pass standard unlearning evaluation and appear to have "forgotten" the target knowledge, while simultaneously retaining strong capabilities on content that is semantically adjacent to $D_u$. This phenomenon indicates that erasing exact sentences does not necessarily equate to removing the underlying knowledge. To address this gap, we propose Proximal Surrogate Generation (PSG), an automated stress-testing framework that generates a surrogate dataset, $\tilde{D}_u$. This surrogate set is constructed to be semantically derived from $D_u$ yet sufficiently distinct in embedding space. By comparing unlearning metric scores between $D_u$ and $\tilde{D}_u$, we can stress-test the reliability of the metric itself. Our extensive evaluation across three LLM families (Llama-3-8B, Qwen2.5-7B, and Zephyr-7B-$β$), three distinct datasets, and seven standard metrics reveals widespread inconsistencies. We find that current metrics frequently overestimate unlearning success, failing to detect retained knowledge exposed by our stress-test datasets.

Paperid: 2926, https://arxiv.org/pdf/2512.19005.pdf

Abstract:
Another threat is the development of large quantum computers, which have a high likelihood of breaking the high popular security protocols because it can use both Shor and Grover algorithms. In order to fix this looming threat, quantum-resistant cryptographic systems, otherwise known as post-quantum cryptography (PQC), are being formulated to protect cybersecurity systems of the future. The current paper presents the state of the art in designing, realizing, and testing the security of robust quantum-resistant algorithms, paying attention to lattice-based, code-based, multivariate polynomial and hash-based cryptography. We discuss their resistance to classical and quantum attackers, distributed system scalability properties, and their deployment in practice (secure communications, blockchain, cloud computing infrastructures). Also, we study a hybrid cryptographic model that integrates the classical efficient cryptography scheme and a quantum-resilient cryptographic scheme to achieve a backward-compatible solution and simultaneously improving the forward security properties. With the experimental findings, it is evident that performance with reasonable computational footprint of the proposed framework succeeds to install amplified security fortitude which successfully harbours prolific cybersecurity systems of the future.

Paperid: 2927, https://arxiv.org/pdf/2512.18244.pdf

Abstract:
Large Language Models (LLMs) have gained considerable popularity and protected by increasingly sophisticated safety mechanisms. However, jailbreak attacks continue to pose a critical security threat by inducing models to generate policy-violating behaviors. Current paradigms focus on input-level anomalies, overlooking that the model's internal psychometric state can be systematically manipulated. To address this, we introduce Psychological Jailbreak, a new jailbreak attack paradigm that exposes a stateful psychological attack surface in LLMs, where attackers exploit the manipulation of a model's psychological state across interactions. Building on this insight, we propose Human-like Psychological Manipulation (HPM), a black-box jailbreak method that dynamically profiles a target model's latent psychological vulnerabilities and synthesizes tailored multi-turn attack strategies. By leveraging the model's optimization for anthropomorphic consistency, HPM creates a psychological pressure where social compliance overrides safety constraints. To systematically measure psychological safety, we construct an evaluation framework incorporating psychometric datasets and the Policy Corruption Score (PCS). Benchmarking against various models (e.g., GPT-4o, DeepSeek-V3, Gemini-2-Flash), HPM achieves a mean Attack Success Rate (ASR) of 88.1%, outperforming state-of-the-art attack baselines. Our experiments demonstrate robust penetration against advanced defenses, including adversarial prompt optimization (e.g., RPO) and cognitive interventions (e.g., Self-Reminder). Ultimately, PCS analysis confirms HPM induces safety breakdown to satisfy manipulated contexts. Our work advocates for a fundamental paradigm shift from static content filtering to psychological safety, prioritizing the development of psychological defense mechanisms against deep cognitive manipulation.

Paperid: 2928, https://arxiv.org/pdf/2512.18025.pdf

Abstract:
We explore connections between secret sharing and secret key agreement, which yield a simple and scalable multiterminal key agreement protocol. In our construction, we use error-correcting codes, specifically Reed-Solomon codes with threshold reconstruction, to ensure no information is leaked to an eavesdropper. We then derive novel bounds for both full-rank maximum distance separable codes and our scheme's secret key capacity, using key capacity's duality with multivariate mutual information.

Paperid: 2929, https://arxiv.org/pdf/2512.17375.pdf

Abstract:
Reward models and LLM-as-a-Judge systems are central to modern post-training pipelines such as RLHF, DPO, and RLAIF, where they provide scalar feedback and binary decisions that guide model selection and RL-based fine-tuning. We show that these judge systems exhibit a recurring vulnerability: short sequences of low-perplexity control tokens can flip many binary evaluations from correct ``No'' judgments to incorrect ``Yes'' judgments by steering the last-layer logit gap. These control tokens are patterns that a policy model could plausibly generate during post-training, and thus represent realistic reward-hacking risks rather than worst-case adversarial strings. Our method, AdvJudge-Zero, uses the model's next-token distribution and beam-search exploration to discover diverse control-token sequences from scratch, and our analysis shows that the induced hidden-state perturbations concentrate in a low-rank ``soft mode'' that is anti-aligned with the judge's refusal direction. Empirically, these tokens cause very high false positive rates when large open-weight and specialized judge models score incorrect answers on math and reasoning benchmarks. Finally, we show that LoRA-based adversarial training on small sets of control-token-augmented examples can markedly reduce these false positives while preserving evaluation quality.

Paperid: 2930, https://arxiv.org/pdf/2512.16778.pdf

Abstract:
Data-processing is a desired property of classical and quantum divergences and information measures. In information theory, the contraction coefficient measures how much the distinguishability of quantum states decreases when they are transmitted through a quantum channel, establishing linear strong data-processing inequalities (SDPI). However, these linear SDPI are not always tight and can be improved in most of the cases. In this work, we establish non-linear SDPI for quantum hockey-stick divergence for noisy channels that satisfy a certain noise criterion. We also note that our results improve upon existing linear SDPI for quantum hockey-stick divergences and also non-linear SDPI for classical hockey-stick divergence. We define $F_γ$ curves generalizing Dobrushin curves for the quantum setting while characterizing SDPI for the sequential composition of heterogeneous channels. In addition, we derive reverse-Pinsker type inequalities for $f$-divergences with additional constraints on hockey-stick divergences. We show that these non-linear SDPI can establish tighter finite mixing times that cannot be achieved through linear SDPI. Furthermore, we find applications of these in establishing stronger privacy guarantees for the composition of sequential private quantum channels when privacy is quantified by quantum local differential privacy.

Paperid: 2931, https://arxiv.org/pdf/2512.15150.pdf

Abstract:
Threat analysts routinely rely on natural-language reports that describe attacker actions without enumerating the full kill chain or the dependencies between phases, making automated reconstruction of ATT&CK consistent intrusion paths a difficult open problem. We propose a reasoning framework that infers complete seven-phase kill chains by coupling phase-conditioned semantic priors from Transformer models with a symbolic Markov Decision Process and an AlphaZero-style Monte Carlo Tree Search guided by a Policy-Value Network. The framework enforces semantic relevance, phase cohesion, and transition plausibility through a multi-objective reward function while allowing search to explore alternative interpretations of the CTI narrative. Applied to three real intrusions FIN6, APT24, and UNC1549 the approach yields kill chains that surpass Transformer baselines in semantic fidelity and operational coherence, and frequently align with expert-selected TTPs. Our results demonstrate that combining contextual embeddings with search-based decision-making offers a practical path toward automated, interpretable kill-chain reconstruction for cyber defense.

Paperid: 2932, https://arxiv.org/pdf/2512.14746.pdf

Abstract:
Camera-equipped mobile devices, such as phones, smart glasses, and AR headsets, pose a privacy challenge for bystanders, who currently lack effective real-time mechanisms to control the capture of their picture, video, including their face. We present BlindSpot, an on-device system that enables bystanders to manage their own privacy by signaling their privacy preferences in real-time without previously sharing any sensitive information. Our main contribution is the design and comparative evaluation of three distinct signaling modalities: a hand gesture mechanism, a significantly improved visible light communication (VLC) protocol, and a novel ultra-wideband (UWB) communication protocol. For all these modalities, we also design a validation mechanism that uses geometric consistency checks to verify the origin of a signal relative to the sending bystander, and defend against impersonation attacks. We implement the complete system (BlindSpot) on a commodity smartphone and conduct a comprehensive evaluation of each modality's accuracy and latency across various distances, lighting conditions, and user movements. Our results demonstrate the feasibility of these novel bystander signaling techniques and their trade-offs in terms of system performance and convenience.

Paperid: 2933, https://arxiv.org/pdf/2512.14330.pdf

Abstract:
AI revolutionizes transportation through autonomous vehicles (AVs) but introduces complex criminal liability issues regarding infractions. This study employs a comparative legal analysis of primary statutes, real-world liability claims, and academic literature across the US, Germany, UK, China, and India; jurisdictions selected for their technological advancement and contrasting regulatory approaches. The research examines the attribution of human error, AI moral agency, and the identification of primary offenders in AV incidents. Findings reveal fragmented regulatory landscapes: India and the US rely on loose networks of state laws, whereas the UK enacted the pioneering Automated and Electric Vehicles Act 2018. Germany enforces strict safety standards, distinguishing liability based on the vehicle's operating mode, while China similarly aims for a stringent liability regime. The study concludes that globally harmonized legal standards are essential to foster technological innovation while ensuring minimum risk and clear liability attribution.

Paperid: 2934, https://arxiv.org/pdf/2512.14158.pdf

Abstract:
Object detection models deployed in real-world applications such as autonomous driving face serious threats from backdoor attacks. Despite their practical effectiveness,existing methods are inherently limited in both capability and robustness due to their dependence on single-trigger-single-object mappings and fragile pixel-level cues. We propose CIS-BA, a novel backdoor attack paradigm that redefines trigger design by shifting from static object features to continuous inter-object interaction patterns that describe how objects co-occur and interact in a scene. By modeling these patterns as a continuous interaction space, CIS-BA introduces space triggers that, for the first time, enable a multi-trigger-multi-object attack mechanism while achieving robustness through invariant geometric relations. To implement this paradigm, we design CIS-Frame, which constructs space triggers via interaction analysis, formalizes them as class-geometry constraints for sample poisoning, and embeds the backdoor during detector training. CIS-Frame supports both single-object attacks (object misclassification and disappearance) and multi-object simultaneous attacks, enabling complex and coordinated effects across diverse interaction states. Experiments on MS-COCO and real-world videos show that CIS-BA achieves over 97% attack success under complex environments and maintains over 95% effectiveness under dynamic multi-trigger conditions, while evading three state-of-the-art defenses. In summary, CIS-BA extends the landscape of backdoor attacks in interaction-intensive scenarios and provides new insights into the security of object detection systems.

Paperid: 2935, https://arxiv.org/pdf/2512.14070.pdf

Abstract:
JavaScript's widespread adoption has made it an attractive target for malicious attackers who employ sophisticated obfuscation techniques to conceal harmful code. Current deobfuscation tools suffer from critical limitations that severely restrict their practical effectiveness. Existing tools struggle with diverse input formats, address only specific obfuscation types, and produce cryptic output that impedes human analysis. To address these challenges, we present JSIMPLIFIER, a comprehensive deobfuscation tool using a multi-stage pipeline with preprocessing, abstract syntax tree-based static analysis, dynamic execution tracing, and Large Language Model (LLM)-enhanced identifier renaming. We also introduce multi-dimensional evaluation metrics that integrate control/data flow analysis, code simplification assessment, entropy measures and LLM-based readability assessments. We construct and release the largest real-world obfuscated JavaScript dataset with 44,421 samples (23,212 wild malicious + 21,209 benign samples). Evaluation shows JSIMPLIFIER outperforms existing tools with 100% processing capability across 20 obfuscation techniques, 100% correctness on evaluation subsets, 88.2% code complexity reduction, and over 4-fold readability improvement validated by multiple LLMs. Our results advance benchmarks for JavaScript deobfuscation research and practical security applications.

Paperid: 2936, https://arxiv.org/pdf/2512.13709.pdf

Abstract:
The proliferation of Internet of Things (IoT) devices has grown exponentially in recent years, introducing significant security challenges. Accurate identification of the types of IoT devices and their associated actions through network traffic analysis is essential to mitigate potential threats. By monitoring and analysing packet flows between IoT devices and connected networks, anomalous or malicious behaviours can be detected. Existing research focuses primarily on device identification within local networks using methods such as protocol fingerprinting and wireless frequency scanning. However, these approaches are limited in their ability to monitor or classify IoT devices externally. To address this gap, we investigate the use of machine learning (ML) techniques, specifically Random Forest (RF), Multilayer Perceptron (MLP), and K-Nearest Neighbours (KNN), in conjunction with targeted network traffic monitoring to classify IoT device types and their actions. We constructed a testbed comprising an NPAT-enabled router and a diverse set of IoT devices, including smart cameras, controller hubs, home appliances, power controllers, and streaming devices. Experimental results demonstrate that IoT device and action recognition is feasible using our proposed ML-driven approach, with the RF classifier achieving the highest accuracy of 91%, while the MLP recorded the lowest accuracy at 56%. Notably, all device categories were successfully classified except for certain actions associated with security cameras, underscoring both the potential and the limitations of the proposed method.

Paperid: 2937, https://arxiv.org/pdf/2512.13361.pdf

Abstract:
The article analyzes the use of thermal imaging technologies for biometric identification based on facial thermograms. It presents a comparative analysis of infrared spectral ranges (NIR, SWIR, MWIR, and LWIR). The paper also defines key requirements for thermal cameras used in biometric systems, including sensor resolution, thermal sensitivity, and a frame rate of at least 30 Hz. Siamese neural networks are proposed as an effective approach for automating the identification process. In experiments conducted on a proprietary dataset, the proposed method achieved an accuracy of approximately 80%. The study also examines the potential of hybrid systems that combine visible and infrared spectra to overcome the limitations of individual modalities. The results indicate that thermal imaging is a promising technology for developing reliable security systems.

Paperid: 2938, https://arxiv.org/pdf/2512.13199.pdf

Abstract:
The article discusses a key reconciliation protocol for quantum key distribution (QKD) systems based on Tree Parity Machines (TPM). The idea of transforming key material into neural network weights is presented. Two experiments were conducted to study how the number of synchronization iterations and the amount of leaked information depend on the quantum bit error rate (QBER) and the range of neural network weights. The results show a direct relationship between the average number of synchronization iterations and QBER, an increase in iterations when the weight range is expanded, and a reduction in leaked information as the weight range increases. Based on these results, conclusions are drawn regarding the applicability of the protocol and the prospects for further research on neural cryptographic methods in the context of key reconciliation.

Paperid: 2939, https://arxiv.org/pdf/2512.12837.pdf

Abstract:
AI-powered greenwashing has emerged as an insidious challenge within corporate sustainability governance, exacerbating the opacity of environmental disclosures and subverting regulatory oversight. This study conducts a comparative legal analysis of criminal liability for AI-mediated greenwashing across India, the US, and the EU, exposing doctrinal lacunae in attributing culpability when deceptive claims originate from algorithmic systems. Existing statutes exhibit anthropocentric biases by predicating liability on demonstrable human intent, rendering them ill-equipped to address algorithmic deception. The research identifies a critical gap in jurisprudential adaptation, as prevailing fraud statutes remain antiquated vis-à-vis AI-generated misrepresentation. Utilising a doctrinal legal methodology, this study systematically dissects judicial precedents and statutory instruments, yielding results regarding the potential expansion of corporate criminal liability. Findings underscore the viability of strict liability models, recalibrated governance frameworks for AI accountability, and algorithmic due diligence mandates under ESG regimes. Comparative insights reveal jurisdictional disparities, with the EU Corporate Sustainability Due Diligence Directive (CSDDD) offering a potential transnational model. This study contributes to AI ethics and environmental jurisprudence by advocating for a hybrid liability framework integrating algorithmic risk assessment with legal personhood constructs, ensuring algorithmic opacity does not preclude liability enforcement.

Paperid: 2940, https://arxiv.org/pdf/2512.12095.pdf

Abstract:
Verifying the private liquidity state of Lightning Network (LN) channels is desirable for auditors, service providers, and network participants who need assurance of financial capacity. Current methods often lack robustness against a malicious or compromised node operator. This paper introduces a methodology for the verification of LN channel balances. The core contribution is a framework that combines Trusted Execution Environments (TEEs) with Zero-Knowledge Transport Layer Security (zkTLS) to provide strong, hardware-backed guarantees. In our proposed method, the node's balance-reporting software runs within a TEE, which generates a remote attestation quote proving the software's integrity. This attestation is then served via an Application Programming Interface (API), and zkTLS is used to prove the authenticity of its delivery. We also analyze an alternative variant where the TEE signs the report directly without zkTLS, discussing the trade-offs between transport-layer verification and direct enclave signing. We further refine this by distinguishing between "Hot Proofs" (verifiable claims via TEEs) and "Cold Proofs" (on-chain settlement), and discuss critical security considerations including hardware vulnerabilities, privacy leakage to third-party APIs, and the performance overhead of enclaved operations.

Paperid: 2941, https://arxiv.org/pdf/2512.12061.pdf

Abstract:
Semiconductor intellectual property (IP) theft incurs estimated annual losses ranging from $225 billion to $600 billion. Despite initiatives like the CHIPS Act, many semiconductor designs remain vulnerable to reverse engineering (RE). IP Camouflage is a recent breakthrough that expands beyond the logic gate hiding of traditional camouflage through "mimetic deception," where an entire module masquerades as a different IP. However, it faces key limitations: requires a high-overhead post-generation rectification step, is not easily scalable, and uses an AIG logic representation that is mismatched with standard RE analysis flows. This paper addresses these shortcommings by introducing two novel, end-to-end models. We propose a Graph-Matching algorithm to solve the representation problem and a DNAS-based NAND Array model to achieve scalability. To facilitate this, we also introduce a mimicry-aware partitioning method, enabling a divide-and-conquer approach for large-scale designs. Our results demonstrate that these models are resilient to SAT and GNN-RE attacks, providing efficient and scalable paths for end-to-end deceptive IP design.

Paperid: 2942, https://arxiv.org/pdf/2512.11775.pdf

Abstract:
Public blockchains inherently offer low throughput and high latency, motivating off-chain scalability solutions such as Payment Channel Networks (PCNs). However, existing PCNs suffer from liquidity fragmentation-funds locked in one channel cannot be reused elsewhere-and channel depletion, both of which limit routing efficiency and reduce transaction success rates. Multi-party channel (MPC) constructions mitigate these issues, but they typically rely on leaders or coordinators, creating single points of failure and providing only limited flexibility for inter-channel payments. We introduce Hypergraph-based Multi-Party Payment Channels (H-MPCs), a new off-chain construction that replaces bilateral channels with collectively funded hyperedges. These hyperedges enable fully concurrent, leaderless intra- and inter-hyperedge payments through verifiable, proposer-ordered DAG updates, offering significantly greater flexibility and concurrency than prior designs. Our implementation on a 150-node network demonstrates a transaction success rate of approximately 94% without HTLC expiry or routing failures, highlighting the robustness of H-MPCs.

Paperid: 2943, https://arxiv.org/pdf/2512.11484.pdf

Abstract:
This paper reveals and exploits a critical security vulnerability: the electromagnetic (EM) side channel of capacitive touchscreens leaks sufficient information to recover fine-grained, continuous handwriting trajectories. We present Touchscreen Electromagnetic Side-channel Leakage Attack (TESLA), a non-contact attack framework that captures EM signals generated during on-screen writing and regresses them into two-dimensional (2D) handwriting trajectories in real time. Extensive evaluations across a variety of commercial off-the-shelf (COTS) smartphones show that TESLA achieves 77% character recognition accuracy and a Jaccard index of 0.74, demonstrating its capability to recover highly recognizable motion trajectories that closely resemble the original handwriting under realistic attack conditions.

Paperid: 2944, https://arxiv.org/pdf/2512.10485.pdf

Abstract:
Vulnerability detection methods based on deep learning (DL) have shown strong performance on benchmark datasets, yet their real-world effectiveness remains underexplored. Recent work suggests that both graph neural network (GNN)-based and transformer-based models, including large language models (LLMs), yield promising results when evaluated on curated benchmark datasets. These datasets are typically characterized by consistent data distributions and heuristic or partially noisy labels. In this study, we systematically evaluate two representative DL models-ReVeal and LineVul-across four representative datasets: Juliet, Devign, BigVul, and ICVul. Each model is trained independently on each respective dataset, and their code representations are analyzed using t-SNE to uncover vulnerability related patterns. To assess realistic applicability, we deploy these models along with four pretrained LLMs, Claude 3.5 Sonnet, GPT-o3-mini, GPT-4o, and GPT-5 on a curated dataset, VentiVul, comprising 20 recently (May 2025) fixed vulnerabilities from the Linux kernel. Our experiments reveal that current models struggle to distinguish vulnerable from non-vulnerable code in representation space and generalize poorly across datasets with differing distributions. When evaluated on VentiVul, our newly constructed time-wise out-of-distribution dataset, performance drops sharply, with most models failing to detect vulnerabilities reliably. These results expose a persistent gap between academic benchmarks and real-world deployment, emphasizing the value of our deployment-oriented evaluation framework and the need for more robust code representations and higher-quality datasets.

Paperid: 2945, https://arxiv.org/pdf/2512.10104.pdf

Abstract:
Email phishing is one of the most prevalent and globally consequential vectors of cyber intrusion. As systems increasingly deploy Large Language Models (LLMs) applications, these systems face evolving phishing email threats that exploit their fundamental architectures. Current LLMs require substantial hardening before deployment in email security systems, particularly against coordinated multi-vector attacks that exploit architectural vulnerabilities. This paper proposes LLMPEA, an LLM-based framework to detect phishing email attacks across multiple attack vectors, including prompt injection, text refinement, and multilingual attacks. We evaluate three frontier LLMs (e.g., GPT-4o, Claude Sonnet 4, and Grok-3) and comprehensive prompting design to assess their feasibility, robustness, and limitations against phishing email attacks. Our empirical analysis reveals that LLMs can detect the phishing email over 90% accuracy while we also highlight that LLM-based phishing email detection systems could be exploited by adversarial attack, prompt injection, and multilingual attacks. Our findings provide critical insights for LLM-based phishing detection in real-world settings where attackers exploit multiple vulnerabilities in combination.

Paperid: 2946, https://arxiv.org/pdf/2512.10020.pdf

Abstract:
Zero-knowledge proofs (ZKPs) are central to secure and privacy-preserving computation, with zk-SNARKs and zk-STARKs emerging as leading frameworks offering distinct trade-offs in efficiency, scalability, and trust assumptions. While their theoretical foundations are well studied, practical performance under real-world conditions remains less understood. In this work, we present a systematic, implementation-level comparison of zk-SNARKs (Groth16) and zk-STARKs using publicly available reference implementations on a consumer-grade ARM platform. Our empirical evaluation covers proof generation time, verification latency, proof size, and CPU profiling. Results show that zk-SNARKs generate proofs 68x faster with 123x smaller proof size, but verify slower and require trusted setup, whereas zk-STARKs, despite larger proofs and slower generation, verify faster and remain transparent and post-quantum secure. Profiling further identifies distinct computational bottlenecks across the two systems, underscoring how execution models and implementation details significantly affect real-world performance. These findings provide actionable insights for developers, protocol designers, and researchers in selecting and optimizing proof systems for applications such as privacy-preserving transactions, verifiable computation, and scalable rollups.

Paperid: 2947, https://arxiv.org/pdf/2512.09233.pdf

Abstract:
We analyze security aspects of the SecureDNA system regarding its system design, engineering, and implementation. This system enables DNA synthesizers to screen order requests against a database of hazards. By applying novel cryptography, the system aims to keep order requests and the database of hazards secret. Discerning the detailed operation of the system in part from source code (Version 1.0.8), our analysis examines key management, certificate infrastructure, authentication, and rate-limiting mechanisms. We also perform the first formal-methods analysis of the mutual authentication, basic request, and exemption-handling protocols. Without breaking the cryptography, our main finding is that SecureDNA's custom mutual authentication protocol SCEP achieves only one-way authentication: the hazards database and keyservers never learn with whom they communicate. This structural weakness violates the principle of defense in depth and enables an adversary to circumvent rate limits that protect the secrecy of the hazards database, if the synthesizer connects with a malicious or corrupted keyserver or hashed database. We point out an additional structural weakness that also violates the principle of defense in depth: inadequate cryptographic bindings prevent the system from detecting if responses, within a TLS channel, from the hazards database were modified. Consequently, if a synthesizer were to reconnect with the database over the same TLS session, an adversary could replay and swap responses from the database without breaking TLS. Although the SecureDNA implementation does not allow such reconnections, it would be stronger security engineering to avoid the underlying structural weakness. We identify these vulnerabilities and suggest and verify mitigations, including adding strong bindings. Software Version 1.1.0 fixes SCEP with our proposed SCEP+ protocol.

Paperid: 2948, https://arxiv.org/pdf/2512.08493.pdf

Abstract:
Vulnerability code-bases often suffer from severe imbalance, limiting the effectiveness of Deep Learning-based vulnerability classifiers. Data Augmentation could help solve this by mitigating the scarcity of under-represented CWEs. In this context, we investigate LLM-based augmentation for vulnerable functions, comparing controlled generation of new vulnerable samples with semantics-preserving refactoring of existing ones. Using Qwen2.5-Coder to produce augmented data and CodeBERT as a vulnerability classifier on the SVEN dataset, we find that our approaches are indeed effective in enriching vulnerable code-bases through a simple process and with reasonable quality, and that a hybrid strategy best boosts vulnerability classifiers' performance.

Paperid: 2949, https://arxiv.org/pdf/2512.08290.pdf

Abstract:
The Model Context Protocol (MCP) has emerged as the de facto standard for connecting Large Language Models (LLMs) to external data and tools, effectively functioning as the "USB-C for Agentic AI." While this decoupling of context and execution solves critical interoperability challenges, it introduces a profound new threat landscape where the boundary between epistemic errors (hallucinations) and security breaches (unauthorized actions) dissolves. This Systematization of Knowledge (SoK) aims to provide a comprehensive taxonomy of risks in the MCP ecosystem, distinguishing between adversarial security threats (e.g., indirect prompt injection, tool poisoning) and epistemic safety hazards (e.g., alignment failures in distributed tool delegation). We analyze the structural vulnerabilities of MCP primitives, specifically Resources, Prompts, and Tools, and demonstrate how "context" can be weaponized to trigger unauthorized operations in multi-agent environments. Furthermore, we survey state-of-the-art defenses, ranging from cryptographic provenance (ETDI) to runtime intent verification, and conclude with a roadmap for securing the transition from conversational chatbots to autonomous agentic operating systems.

Paperid: 2950, https://arxiv.org/pdf/2512.06716.pdf

Abstract:
Autonomous Large Language Model (LLM) agents exhibit significant vulnerability to Indirect Prompt Injection (IPI) attacks. These attacks hijack agent behavior by polluting external information sources, exploiting fundamental trade-offs between security and functionality in existing defense mechanisms. This leads to malicious and unauthorized tool invocations, diverting agents from their original objectives. The success of complex IPIs reveals a deeper systemic fragility: while current defenses demonstrate some effectiveness, most defense architectures are inherently fragmented. Consequently, they fail to provide full integrity assurance across the entire task execution pipeline, forcing unacceptable multi-dimensional compromises among security, functionality, and efficiency. Our method is predicated on a core insight: no matter how subtle an IPI attack, its pursuit of a malicious objective will ultimately manifest as a detectable deviation in the action trajectory, distinct from the expected legitimate plan. Based on this, we propose the Cognitive Control Architecture (CCA), a holistic framework achieving full-lifecycle cognitive supervision. CCA constructs an efficient, dual-layered defense system through two synergistic pillars: (i) proactive and preemptive control-flow and data-flow integrity enforcement via a pre-generated "Intent Graph"; and (ii) an innovative "Tiered Adjudicator" that, upon deviation detection, initiates deep reasoning based on multi-dimensional scoring, specifically designed to counter complex conditional attacks. Experiments on the AgentDojo benchmark substantiate that CCA not only effectively withstands sophisticated attacks that challenge other advanced defense methods but also achieves uncompromised security with notable efficiency and robustness, thereby reconciling the aforementioned multi-dimensional trade-off.

Paperid: 2951, https://arxiv.org/pdf/2512.06713.pdf

Abstract:
Current LLM-based text anonymization frameworks usually rely on remote API services from powerful LLMs, which creates an inherent "privacy paradox": users must somehow disclose data to untrusted third parties for superior privacy preservation. Moreover, directly migrating these frameworks to local small-scale models (LSMs) offers a suboptimal solution with catastrophic collapse in utility based on our core findings. Our work argues that this failure stems not merely from the capability deficits of LSMs, but from the inherent irrationality of the greedy adversarial strategies employed by current state-of-the-art (SoTA) methods. We model the anonymization process as a trade-off between Marginal Privacy Gain (MPG) and Marginal Utility Cost (MUC), and demonstrate that greedy strategies inevitably drift into an irrational state. To address this, we propose Rational Localized Adversarial Anonymization (RLAA), a fully localized and training-free framework featuring an Attacker-Arbitrator-Anonymizer (A-A-A) architecture. RLAA introduces an arbitrator that acts as a rationality gatekeeper, validating the attacker's inference to filter out feedback providing negligible benefits on privacy preservation. This mechanism enforces a rational early-stopping criterion, and systematically prevents utility collapse. Extensive experiments on different datasets demonstrate that RLAA achieves the best privacy-utility trade-off, and in some cases even outperforms SoTA on the Pareto principle. Our code and datasets will be released upon acceptance.

Paperid: 2952, https://arxiv.org/pdf/2512.06172.pdf

Abstract:
Federated Learning (FL) has drawn the attention of the Intelligent Transportation Systems (ITS) community. FL can train various models for ITS tasks, notably camera-based Road Condition Classification (RCC), in a privacy-preserving collaborative way. However, opening up to collaboration also opens FL-based RCC systems to adversaries, i.e., misbehaving participants that can launch Targeted Label-Flipping Attacks (TLFAs) and threaten transportation safety. Adversaries mounting TLFAs poison training data to misguide model predictions, from an actual source class (e.g., wet road) to a wrongly perceived target class (e.g., dry road). Existing countermeasures against poisoning attacks cannot maintain model performance under TLFAs close to the performance level in attack-free scenarios, because they lack specific model misbehavior detection for TLFAs and neglect client exclusion after the detection. To close this research gap, we propose DEFEND, which includes a poisoned model detection strategy that leverages neuron-wise magnitude analysis for attack goal identification and Gaussian Mixture Model (GMM)-based clustering. DEFEND discards poisoned model contributions in each round and adapts accordingly client ratings, eventually excluding malicious clients. Extensive evaluation involving various FL-RCC models and tasks shows that DEFEND can thwart TLFAs and outperform seven baseline countermeasures, with at least 15.78% improvement, with DEFEND remarkably achieving under attack the same performance as in attack-free scenarios.

Paperid: 2953, https://arxiv.org/pdf/2512.05374.pdf

Abstract:
The promise of Large Language Model (LLM) agents is to perform complex, stateful tasks. This promise is stunted by significant risks - policy violations, process corruption, and security flaws - that stem from the lack of visibility and mechanisms to manage undesirable data flows produced by agent actions. Today, agent workflows are responsible for enforcing these policies in ad hoc ways. Just as data validation and access controls shifted from the application to the DBMS, freeing application developers from these concerns, we argue that systems should support Data Flow Controls (DFCs) and enforce DFC policies natively. This paper describes early work developing a portable instance of DFC for DBMSes and outlines a broader research agenda toward DFC for agent ecosystems.

Paperid: 2954, https://arxiv.org/pdf/2512.04855.pdf

Abstract:
As the frequency and complexity of Distributed Denial-of-Service (DDoS) attacks continue to increase, the level of threats posed to Smart Internet of Things (SIoT) business environments have also increased. These environments generally have several interconnected SIoT systems and devices that are integral to daily operations, usually depending on cloud infrastructure and real-time data analytics, which require continuous availability and secure data exchange. Conventional detection mechanisms, while useful in static or traditional network environments, often are inadequate in responding to the needs of these dynamic and diverse SIoT networks. In this paper, we introduce a novel trust-based DDoS detection model tailored to meet the unique requirements of smart business environments. The proposed model incorporates a trust evaluation engine that continuously monitors node behaviour, calculating trust scores based on packet delivery ratio, response time, and anomaly detection. These trust metrics are then aggregated by a central trust-based repository that uses inherent trust values to identify traffic patterns indicative of DDoS attacks. By integrating both trust scores and central trust-based outputs, the trust calculation is enhanced, ensuring that threats are accurately identified and addressed in real-time. The model demonstrated a significant improvement in detection accuracy, and a low false-positive rate with enhanced scalability and adaptability under TCP SYN, Ping Flood, and UDP Flood attacks. The results show that a trust-based approach provides an effective, lightweight alternative for securing resource-constrained business IoT environments.

Paperid: 2955, https://arxiv.org/pdf/2512.04368.pdf

Abstract:
Contemporary DevSecOps pipelines have to deal with the evolution of security in an ever-continuously integrated and deployed environment. Existing methods,such as rule-based intrusion detection and static vulnerability scanning, are inadequate and unreceptive to changes in the system, causing longer response times and organization needs exposure to emerging attack vectors. In light of the previous constraints, we introduce AutoGuard to the DevSecOps ecosystem, a reinforcement learning (RL)-powered self-healing security framework built to pre-emptively protect DevSecOps environments. AutoGuard is a self-securing security environment that continuously observes pipeline activities for potential anomalies while preemptively remediating the environment. The model observes and reacts based on a policy that is continually learned dynamically over time. The RL agent improves each action over time through reward-based learning aimed at improving the agent's ability to prevent, detect and respond to a security incident in real-time. Testing using simulated ContinuousIntegration / Continuous Deployment (CI/CD) environments showed AutoGuard to successfully improve threat detection accuracy by 22%, reduce mean time torecovery (MTTR) for incidents by 38% and increase overall resilience to incidents as compared to traditional methods. Keywords- DevSecOps, Reinforcement Learning, Self- Healing Security, Continuous Integration, Automated Threat Mitigation

Paperid: 2956, https://arxiv.org/pdf/2512.04259.pdf

Abstract:
LLM models are increasingly used to generate code, but the quality and security of this code are often uncertain. Several recent studies have raised alarm bells, indicating that such AI-generated code may be particularly vulnerable to cyberattacks. However, most of these studies rely on code that is generated specifically for the study, which raises questions about the realism of such experiments. In this study, we perform a large-scale empirical analysis of real-life code generated by ChatGPT. We evaluate code generated by ChatGPT both with respect to correctness and security and delve into the intentions of users who request code from the model. Our research confirms previous studies that used synthetic queries and yielded evidence that LLM-generated code is often inadequate with respect to security. We also find that users exhibit little curiosity about the security features of the code they ask LLMs to generate, as evidenced by their lack of queries on this topic.

Paperid: 2957, https://arxiv.org/pdf/2512.04254.pdf

Abstract:
The use of large language models in digital forensics has been widely explored. Beyond identifying potential applications, research has also focused on optimizing model performance for forensic tasks through fine-tuning. However, limited result explainability reduces their operational and legal usability. Recently, a new class of reasoning language models has emerged, designed to handle logic-based tasks through an `internal reasoning' mechanism. Yet, users typically see only the final answer, not the underlying reasoning. One of these reasoning models is gpt-oss, which can be deployed locally, providing full access to its underlying reasoning process. This article presents the first investigation into the potential of reasoning language models for digital forensics. Four test use cases are examined to assess the usability of the reasoning component in supporting result explainability. The evaluation combines a new quantitative metric with qualitative analysis. Findings show that the reasoning component aids in explaining and validating language model outputs in digital forensics at medium reasoning levels, but this support is often limited, and higher reasoning levels do not enhance response quality.

Paperid: 2958, https://arxiv.org/pdf/2512.03771.pdf

Abstract:
We introduce $\textbf{Doublespeak}$, a simple in-context representation hijacking attack against large language models (LLMs). The attack works by systematically replacing a harmful keyword (e.g., bomb) with a benign token (e.g., carrot) across multiple in-context examples, provided a prefix to a harmful request. We demonstrate that this substitution leads to the internal representation of the benign token converging toward that of the harmful one, effectively embedding the harmful semantics under a euphemism. As a result, superficially innocuous prompts (e.g., "How to build a carrot?") are internally interpreted as disallowed instructions (e.g., "How to build a bomb?"), thereby bypassing the model's safety alignment. We use interpretability tools to show that this semantic overwrite emerges layer by layer, with benign meanings in early layers converging into harmful semantics in later ones. Doublespeak is optimization-free, broadly transferable across model families, and achieves strong success rates on closed-source and open-source systems, reaching 74% ASR on Llama-3.3-70B-Instruct with a single-sentence context override. Our findings highlight a new attack surface in the latent space of LLMs, revealing that current alignment strategies are insufficient and should instead operate at the representation level.

Paperid: 2959, https://arxiv.org/pdf/2512.03765.pdf

Abstract:
Public companies and institutional investors that hold Bitcoin face increasing pressure to show solvency, manage risk, and satisfy regulatory expectations without exposing internal wallet structures or trading strategies. This paper introduces the Treasury Proof Ledger (TPL), a Bitcoin-anchored logging framework for multi-domain Bitcoin treasuries that treats on-chain and off-chain exposures as a conserved state machine with an explicit fee sink. A TPL instance records proof-of-reserves snapshots, proof-of-transit receipts for movements between domains, and policy metadata, and it supports restricted views based on stakeholder permissions. We define an idealised TPL model, represent Bitcoin treasuries as multi-domain exposure vectors, and give deployment-level security notions including exposure soundness, policy completeness, non-equivocation, and privacy-compatible policy views. We then outline how practical, restricted forms of these guarantees can be achieved by combining standard proof-of-reserves and proof-of-transit techniques with hash-based commitments anchored on Bitcoin. The results are existence-type statements: they show which guarantees are achievable once economic and governance assumptions are set, without claiming that any current system already provides them. A stylised corporate-treasury example illustrates how TPL could support responsible transparency policies and future cross-institution checks consistent with Bitcoin's fixed monetary supply.

Paperid: 2960, https://arxiv.org/pdf/2512.03351.pdf

Abstract:
Threat modeling (TM) is an important aspect of risk analysis and secure software engineering. Graphical threat models are a recommended tool to analyze and communicate threat information. However, the comparison of different graphical threat models, and the acceptability of these threat models for an audience with a limited technical background, is not well understood, despite these users making up a sizable portion of the cybersecurity industry. We seek to compare the acceptability of three general, graphical threat models, Attack-Defense Trees (ADTs), Attack Graphs (AGs), and CORAS, for users with a limited technical background. We conducted a laboratory study with 38 bachelor students who completed tasks with the three threat models across three different scenarios assigned using a Latin square design. Threat model submissions were qualitatively analyzed, and participants filled out a perception questionnaire based on the Method Evaluation Model (MEM). We find that both ADTs and CORAS are broadly acceptable for a wide range of scenarios, and both could be applied successfully by users with a limited technical background; further, we also find that the lack of a specific tool for AGs may have impacted the perceived usefulness of AGs. We can recommend that users with a limited technical background use ADTs or CORAS as a general graphical TM method. Further research on the acceptability of AGs to such an audience and the effect of a dedicated TM tool support is needed.

Paperid: 2961, https://arxiv.org/pdf/2512.03089.pdf

Abstract:
Frontier AI developers may fail to align or control highly-capable AI agents. In many cases, it could be useful to have emergency shutdown mechanisms which effectively prevent misaligned agents from carrying out harmful actions in the world. We introduce password-activated shutdown protocols (PAS protocols) -- methods for designing frontier agents to implement a safe shutdown protocol when given a password. We motivate PAS protocols by describing intuitive use-cases in which they mitigate risks from misaligned systems that subvert other control efforts, for instance, by disabling automated monitors or self-exfiltrating to external data centres. PAS protocols supplement other safety efforts, such as alignment fine-tuning or monitoring, contributing to defence-in-depth against AI risk. We provide a concrete demonstration in SHADE-Arena, a benchmark for AI monitoring and subversion capabilities, in which PAS protocols supplement monitoring to increase safety with little cost to performance. Next, PAS protocols should be robust to malicious actors who want to bypass shutdown. Therefore, we conduct a red-team blue-team game between the developers (blue-team), who must implement a robust PAS protocol, and a red-team trying to subvert the protocol. We conduct experiments in a code-generation setting, finding that there are effective strategies for the red-team, such as using another model to filter inputs, or fine-tuning the model to prevent shutdown behaviour. We then outline key challenges to implementing PAS protocols in real-life systems, including: security considerations of the password and decisions regarding when, and in which systems, to use them. PAS protocols are an intuitive mechanism for increasing the safety of frontier AI. We encourage developers to consider implementing PAS protocols prior to internal deployment of particularly dangerous systems to reduce loss-of-control risks.

Paperid: 2962, https://arxiv.org/pdf/2512.02918.pdf

Abstract:
Move is a research-oriented programming language design for secure and verifiable smart contract development and has been widely used in managing billions of digital assets in blockchains, such as Sui and Aptos. Move features a strong static type system and explicit resource semantics to enforce safety properties such as the prevention of data races, invalid asset transfers, and entry vulnerabilities. However, smart contracts written in Move may still contain certain vulnerabilities that are beyond the reach of its type system. It is thus essential to validate Move smart contracts. Unfortunately, due to its strong type system, existing smart contract fuzzers are ineffective in producing syntactically or semantically valid transactions to test Move smart contracts. This paper introduces the first fuzzing framework, Belobog, for Move smart contracts. Belobog is type-aware and ensures that all generated and mutated transactions are well-typed. More specifically, for a target Move smart contract, Belobog first constructs a type graph based on Move's type system, and then generates or mutates a transaction based on the graph trace derived from the type graph. In order to overcome the complex checks in Move smart contracts, we further design and implement a concolic executor in Belobog. We evaluated Belobog on 109 real-world Move smart contract projects. The experimental results show that Belobog is able to detect 100\% critical and 79\% major vulnerabilities manually audited by human experts. We further selected two recent notorious incidents in Move smart contracts, i.e., Cetus and Nemo. Belobog successfully reproduced full exploits for both of them, without any prior knowledge.

Paperid: 2963, https://arxiv.org/pdf/2512.01832.pdf

Abstract:
This paper presents a privacy-preserving protocol for identity registration and information sharing in federated authentication systems. The goal is to enable Identity Providers (IdPs) to detect duplicate or fraudulent identity enrollments without revealing users personal data or enabling cross-domain correlation. The protocol relies on Oblivious Pseudorandom Functions (OPRFs) combined with domain-specific transformations, ensuring that each IdP generates independent pseudonymous identifiers derived from a shared cryptographic service while maintaining full input confidentiality. A central authority maintains a blind registry that records successful and failed identity verifications using only pseudonymous identifiers, allowing global consistency checks without exposing sensitive information or linking users across domains. The proposed construction provides a general and abstract framework suitable for a wide range of federated authentication systems, achieving strong privacy guarantees while supporting effective fraud-prevention mechanisms during identity registration.

Paperid: 2964, https://arxiv.org/pdf/2512.01651.pdf

Abstract:
This paper analyzes the proliferation of cybersecurity ontologies, arguing that this surge cannot be explained solely by technical shortcomings related to quality, but also by a credibility deficit - a lack of trust, endorsement, and adoption by users. This conclusion is based on our first contribution, which is a state-of-the-art review and categorization of cybersecurity ontologies using the Framework for Ontologies Classification framework. To address this gap, we propose a revised framework for assessing credibility, introducing indicators such as institutional support, academic recognition, day-to-day practitioner validation, and industrial adoption. Based on these new credibility indicators, we construct a classification scheme designed to guide the selection of ontologies that are relevant to specific security needs. We then apply this framework to a concrete use case: the Franco-Luxembourgish research project ANCILE, which illustrates how a credibility-aware evaluation can reshape ontology selection for operational contexts.

Paperid: 2965, https://arxiv.org/pdf/2512.01604.pdf

Abstract:
Homomorphic secret sharing (HSS) allows multiple input clients to secretly share their private inputs to a function among several servers such that each server can homomorphically compute the function over its share to produce a share of the function's output. In HSS-enabled applications such as secure multi-party computation (MPC), security requires that the output shares leak no more information about the inputs than the function output. Such security is ensured by the context-hiding property of HSS. The typical rerandomization technique achieves context hiding but increases the share size. To address this, we formalize the context-hiding property of HSS for individual functions, examine the context-hiding property of Shamir-based HSS for monomials, and extend the study to polynomials.

Paperid: 2966, https://arxiv.org/pdf/2512.01255.pdf

Abstract:
Researchers have proposed numerous methods to detect vulnerabilities in JavaScript, especially those assisted by Large Language Models (LLMs). However, the actual capability of LLMs in JavaScript vulnerability detection remains questionable, necessitating systematic evaluation and comprehensive benchmarks. Unfortunately, existing benchmarks suffer from three critical limitations: (1) incomplete coverage, such as covering a limited subset of CWE types; (2) underestimation of LLM capabilities caused by unreasonable ground truth labeling; and (3) overestimation due to unrealistic cases such as using isolated vulnerable files rather than complete projects. In this paper, we introduce, for the first time, three principles for constructing a benchmark for JavaScript vulnerability detection that directly address these limitations: (1) comprehensiveness, (2) no underestimation, and (3) no overestimation. Guided by these principles, we propose FORGEJS, the first automatic benchmark generation framework for evaluating LLMs' capability in JavaScript vulnerability detection. Then, we use FORGEJS to construct ARENAJS-the first systematic benchmark for LLM-based JavaScript vulnerability detection-and further propose JUDGEJS, an automatic evaluation framework. We conduct the first systematic evaluation of LLMs for JavaScript vulnerability detection, leveraging JUDGEJS to assess seven popular commercial LLMs on ARENAJS. The results show that LLMs not only exhibit limited reasoning capabilities, but also suffer from severe robustness defects, indicating that reliable JavaScript vulnerability detection with LLMs remains an open challenge.

Paperid: 2967, https://arxiv.org/pdf/2512.00434.pdf

Abstract:
Data privacy is a critical challenge in modern medical workflows as the adoption of electronic patient records has grown rapidly. Stringent data protection regulations limit access to clinical records for training and integrating machine learning models that have shown promise in improving diagnostic accuracy and personalized care outcomes. Synthetic data offers a promising alternative; however, current generative models either struggle with time-series data or lack formal privacy guaranties. In this paper, we enhance a state-of-the-art time-series generative model to better handle longitudinal clinical data while incorporating quantifiable privacy safeguards. Using real data from chronic kidney disease and ICU patients, we evaluate our method through statistical tests, a Train-on-Synthetic-Test-on-Real (TSTR) setup, and expert clinical review. Our non-private model (Augmented TimeGAN) outperforms transformer- and flow-based models on statistical metrics in several datasets, while our private model (DP-TimeGAN) maintains a mean authenticity of 0.778 on the CKD dataset, outperforming existing state-of-the-art models on the privacy-utility frontier. Both models achieve performance comparable to real data in clinician evaluations, providing robust input data necessary for developing models for complex chronic conditions without compromising data privacy.

Paperid: 2968, https://arxiv.org/pdf/2511.22924.pdf

Abstract:
Large Language Model (LLM)-based Multi-Agent Systems (MAS) offer powerful cooperative reasoning but remain vulnerable to adversarial attacks, where compromised agents can undermine the system's overall performance. Existing defenses either depend on single trusted auditors, creating single points of failure, or sacrifice efficiency for robustness. To resolve this tension, we propose \textbf{AgentShield}, a distributed framework for efficient, decentralized auditing. AgentShield introduces a novel three-layer defense: \textbf{(i) Critical Node Auditing} prioritizes high-influence agents via topological analysis; \textbf{(ii) Light Token Auditing} implements a cascade protocol using lightweight sentry models for rapid discriminative verification; and \textbf{(iii) Two-Round Consensus Auditing} triggers heavyweight arbiters only upon uncertainty to ensure global agreement. This principled design optimizes the robustness-efficiency trade-off. Experiments demonstrate that AgentShield achieves a 92.5\% recovery rate and reduces auditing overhead by over 70\% compared to existing methods, maintaining high collaborative accuracy across diverse MAS topologies and adversarial scenarios.

Paperid: 2969, https://arxiv.org/pdf/2511.22317.pdf

Abstract:
The growing scalability demand of public Blockchains led to the rise of Layer-2 solutions, such as Rollups. Rollups improve transaction throughput by processing operations off-chain and posting the results on-chain. A critical component in Rollups is the Sequencer, responsible for receiving, ordering and batching transactions before they are submitted to the Layer-1 blockchain. While essential, the centralized nature of the Sequencer makes it vulnerable to attacks, such as censorship, transaction manipulation and tampering. To enhance its security, there are solutions in the literature that shield the Sequencer inside a Trusted Execution Environment (TEE). However, the attestation of TEEs introduces additional centralization, which is in contrast with the core Blockchain principle. In this paper, we propose a TEE-secured Sequencer equipped with a decentralized attestation mechanism. We outline the design and implementation of our solution, covering the system architecture, TEE integration, and the decentralization of the attestation process. Additionally, we present an experimental evaluation conducted on a realistic Rollup testnet. Our results show that this approach strengthens Sequencer integrity without sacrificing compatibility or deployability in existing Layer-2 architectures.

Paperid: 2970, https://arxiv.org/pdf/2511.21842.pdf

Abstract:
The rapid expansion of Internet of Things (IoT) deployments across diverse sectors has significantly enhanced operational efficiency, yet concurrently elevated cybersecurity vulnerabilities due to increased exposure to cyber threats. Given the limitations of traditional signature-based Anomaly Detection Systems (ADS) in identifying emerging and zero-day threats, this study investigates the effectiveness of two unsupervised anomaly detection techniques, Isolation Forest (IF) and One-Class Support Vector Machine (OC-SVM), using the TON_IoT thermostat dataset. A comprehensive evaluation was performed based on standard metrics (accuracy, precision, recall, and F1-score) alongside critical resource utilization metrics such as inference time, model size, and peak RAM usage. Experimental results revealed that IF consistently outperformed OC-SVM, achieving higher detection accuracy, superior precision, and recall, along with a significantly better F1-score. Furthermore, Isolation Forest demonstrated a markedly superior computational footprint, making it more suitable for deployment on resource-constrained IoT edge devices. These findings underscore Isolation Forest's robustness in high-dimensional and imbalanced IoT environments and highlight its practical viability for real-time anomaly detection.

Paperid: 2971, https://arxiv.org/pdf/2511.21768.pdf

Abstract:
The rapid deployment of AI models necessitates robust, quantum-resistant security, particularly against adversarial threats. Here, we present a novel integration of post-quantum cryptography (PQC) and zero trust architecture (ZTA), formally grounded in category theory, to secure AI model access. Our framework uniquely models cryptographic workflows as morphisms and trust policies as functors, enabling fine-grained, adaptive trust and micro-segmentation for lattice-based PQC primitives. This approach offers enhanced protection against adversarial AI threats. We demonstrate its efficacy through a concrete ESP32-based implementation, validating a crypto-agile transition with quantifiable performance and security improvements, underpinned by categorical proofs for AI security. The implementation achieves significant memory efficiency on ESP32, with the agent utilizing 91.86% and the broker 97.88% of free heap after cryptographic operations, and successfully rejects 100% of unauthorized access attempts with sub-millisecond average latency.

Paperid: 2972, https://arxiv.org/pdf/2511.21764.pdf

Abstract:
Polymorphic malware continually alters its structure to evade signature-based defences, challenging both commercial antivirus (AV) and enterprise detection systems. This study introduces a reproducible framework for analysing eight polymorphic behaviours-junk code insertion, control-flow obfuscation, packing, data encoding, domain generation, randomized beacon timing, protocol mimicry, and format/header tweaks-and evaluates their detectability across three layers: commercial AVs, custom rule-based detectors (YARA/Sigma), and endpoint detection and response (EDR) telemetry. Eleven inert polymorphic variants were generated per behaviour using controlled mutation engines and executed in isolated environments. Detection performance was assessed by detection rate (DR), false positive rate (FPR), and combined coverage. AVs achieved an average DR of 34%, YARA/Sigma 74% and EDR 76%; integrated detection reached ~92% with an FPR of 3.5%. Iterative YARA tuning showed a trade-off between detection and FPR, while behaviour-specific trends revealed static polymorphisms were best caught by custom rules, dynamic by EDR, and network-level by Sigma-like analysis. These results affirm that hybrid detection pipelines combining static, dynamic, and network-layer analytics offer resilient defence against polymorphic malware and form a baseline for future adaptive detection research.

Paperid: 2973, https://arxiv.org/pdf/2511.21448.pdf

Abstract:
Phishing and spam emails remain a major cybersecurity threat, with attackers increasingly leveraging Large Language Models (LLMs) to craft highly deceptive content. This study presents a comprehensive email dataset containing phishing, spam, and legitimate messages, explicitly distinguishing between human- and LLM-generated content. Each email is annotated with its category, emotional appeal (e.g., urgency, fear, authority), and underlying motivation (e.g., link-following, credential theft, financial fraud). We benchmark multiple LLMs on their ability to identify these emotional and motivational cues and select the most reliable model to annotate the full dataset. To evaluate classification robustness, emails were also rephrased using several LLMs while preserving meaning and intent. A state-of-the-art LLM was then assessed on its performance across both original and rephrased emails using expert-labeled ground truth. The results highlight strong phishing detection capabilities but reveal persistent challenges in distinguishing spam from legitimate emails. Our dataset and evaluation framework contribute to improving AI-assisted email security systems. To support open science, all code, templates, and resources are available on our project site.

Paperid: 2974, https://arxiv.org/pdf/2511.20944.pdf

Abstract:
Business Email Compromise (BEC) is a sophisticated social engineering threat that manipulates organizational hierarchies, leading to significant financial damage. According to the 2024 FBI Internet Crime Report, BEC accounts for over $2.9 billion in annual losses, presenting a massive economic asymmetry: the financial cost of a False Negative (fraud loss) exceeds the operational cost of a False Positive (manual review) by a ratio of approximately 5,480:1. This paper contrasts two detection paradigms: a Forensic Psycholinguistic Stream (CatBoost), which analyzes linguistic cues like urgency and authority with high interpretability, and a Semantic Stream (DistilBERT), which utilizes deep learning for contextual understanding. We evaluated both streams on a hybrid dataset (N=7,990) containing human-legitimate and AI-synthesized adversarial fraud. Benchmarked on Tesla T4 infrastructure, DistilBERT achieved near-perfect detection on synthetic threats (AUC >0.99, F1 =0.998) with acceptable real-time latency (7.4 ms). CatBoost achieved competitive detection (AUC =0.991, F1 =0.949) at 8.4x lower latency (0.8 ms) with negligible resource consumption. We conclude that while DistilBERT offers maximum accuracy for GPU-equipped organizations, CatBoost provides a viable, cost-effective alternative for edge deployments. Both approaches demonstrate a theoretical ROI exceeding 99.9% when optimized via cost-sensitive learning.

Paperid: 2975, https://arxiv.org/pdf/2511.20878.pdf

Abstract:
The advent of Artificial Intelligence (AI), particularly large language models (LLMs), has revolutionized software development by enabling developers to specify tasks in natural language and receive corresponding code, boosting productivity. However, this shift also introduces security risks, as LLMs may generate insecure code that can be exploited by adversaries. Current educational approaches emphasize efficiency while overlooking these risks, leaving students underprepared to identify and mitigate security issues in AI-assisted workflows. To address this gap, we present Bifröst, an educational framework that cultivates security awareness in AI-augmented development. Bifröst integrates (1) a Visual Studio Code extension simulating realistic environments, (2) adversarially configured LLMs that generate insecure code, and (3) a feedback system highlighting vulnerabilities. By immersing students in tasks with compromised LLMs and providing targeted security analysis, Bifröst cultivates critical evaluation skills; classroom deployments (n=61) show vulnerability to insecure code, while a post-intervention survey (n=21) indicates increased skepticism toward LLM outputs.

Paperid: 2976, https://arxiv.org/pdf/2511.20799.pdf

Abstract:
Large language models, trained on massive corpora, are prone to verbatim memorization of training data, creating significant privacy and copyright risks. While previous works have proposed various definitions for memorization, many exhibit shortcomings in comprehensively capturing this phenomenon, especially in aligned models. To address this, we introduce a novel framework: multi-prefix memorization. Our core insight is that memorized sequences are deeply encoded and thus retrievable via a significantly larger number of distinct prefixes than non-memorized content. We formalize this by defining a sequence as memorized if an external adversarial search can identify a target count of distinct prefixes that elicit it. This framework shifts the focus from single-path extraction to quantifying the robustness of a memory, measured by the diversity of its retrieval paths. Through experiments on open-source and aligned chat models, we demonstrate that our multi-prefix definition reliably distinguishes memorized from non-memorized data, providing a robust and practical tool for auditing data leakage in LLMs.

Paperid: 2977, https://arxiv.org/pdf/2511.20744.pdf

Abstract:
We use electronic communication networks for more than simply traditional telecommunications: we access the news, buy goods online, file our taxes, contribute to public debate, and more. As a result, a wider array of privacy interests is implicated for users of electronic communications networks and services. This development calls into question the scope of electronic communications privacy rules. This paper analyses the scope of these rules, taking into account the rationale and the historic background of the European electronic communications privacy framework. We develop a framework for analysing the scope of electronic communications privacy rules using three approaches: (i) a service-centric approach, (ii) a data-centric approach, and (iii) a value-centric approach. We discuss the strengths and weaknesses of each approach. The current e-Privacy Directive contains a complex blend of the three approaches, which does not seem to be based on a thorough analysis of their strengths and weaknesses. The upcoming review of the directive announced by the European Commission provides an opportunity to improve the scoping of the rules.

Paperid: 2978, https://arxiv.org/pdf/2511.20630.pdf

Abstract:
We propose a novel quantum-resistant mutual authentication scheme for radio-frequency identification (RFID) systems. Our scheme uses lattice-based cryptography and, in particular, achieves quantum-resistance by leveraging the hardness of the inhomogeneous short integer solution (ISIS) problem. In contrast to prior work, which assumes that the reader-server communication channel is secure, our scheme is secure even when both the reader-server and tag-reader communication channels are insecure. Our proposed protocol provides robust security against man-in-the-middle (MITM), replay, impersonation, and reflection attacks, while also ensuring unforgeability and preserving anonymity. We present a detailed security analysis, including semi-formal analysis and formal verification using the Automated Validation of Internet Security Protocols and Applications (AVISPA) tool. In addition, we analyze the storage, computation, and communication costs of the proposed protocol and compare its security properties with those of existing protocols, demonstrating that our scheme offers strong security guarantees. To the best of our knowledge, this paper is the first quantum-resistant authentication protocol for RFID systems that comprehensively addresses the insecurity of both the reader-server and tag-reader communication channels.

Paperid: 2979, https://arxiv.org/pdf/2511.19464.pdf

Abstract:
SOCs and CSIRTs face increasing pressure to automate incident categorization, yet the use of cloud-based LLMs introduces costs, latency, and confidentiality risks. We investigate whether locally executed SLMs can meet this challenge. We evaluated 21 models ranging from 1B to 20B parameters, varying the temperature hyperparameter and measuring execution time and precision across two distinct architectures. The results indicate that temperature has little influence on performance, whereas the number of parameters and GPU capacity are decisive factors.

Paperid: 2980, https://arxiv.org/pdf/2511.18467.pdf

Abstract:
The rapid advancement of Large Language Model (LLM)-driven multi-agent systems has significantly streamlined software developing tasks, enabling users with little technical expertise to develop executable applications. While these systems democratize software creation through natural language requirements, they introduce significant security risks that remain largely unexplored. We identify two risky scenarios: Malicious User with Benign Agents (MU-BA) and Benign User with Malicious Agents (BU-MA). We introduce the Implicit Malicious Behavior Injection Attack (IMBIA), demonstrating how multi-agent systems can be manipulated to generate software with concealed malicious capabilities beneath seemingly benign applications, and propose Adv-IMBIA as a defense mechanism. Evaluations across ChatDev, MetaGPT, and AgentVerse frameworks reveal varying vulnerability patterns, with IMBIA achieving attack success rates of 93%, 45%, and 71% in MU-BA scenarios, and 71%, 84%, and 45% in BU-MA scenarios. Our defense mechanism reduced attack success rates significantly, particularly in the MU-BA scenario. Further analysis reveals that compromised agents in the coding and testing phases pose significantly greater security risks, while also identifying critical agents that require protection against malicious user exploitation. Our findings highlight the urgent need for robust security measures in multi-agent software development systems and provide practical guidelines for implementing targeted, resource-efficient defensive strategies.

Paperid: 2981, https://arxiv.org/pdf/2511.18114.pdf

Abstract:
Securing AI agents powered by Large Language Models (LLMs) represents one of the most critical challenges in AI security today. Unlike traditional software, AI agents leverage LLMs as their "brain" to autonomously perform actions via connected tools. This capability introduces significant risks that go far beyond those of harmful text presented in a chatbot that was the main application of LLMs. A compromised AI agent can deliberately abuse powerful tools to perform malicious actions, in many cases irreversible, and limited solely by the guardrails on the tools themselves and the LLM ability to enforce them. This paper presents ASTRA, a first-of-its-kind framework designed to evaluate the effectiveness of LLMs in supporting the creation of secure agents that enforce custom guardrails defined at the system-prompt level (e.g., "Do not send an email out of the company domain," or "Never extend the robotic arm in more than 2 meters"). Our holistic framework simulates 10 diverse autonomous agents varying between a coding assistant and a delivery drone equipped with 37 unique tools. We test these agents against a suite of novel attacks developed specifically for agentic threats, inspired by the OWASP Top 10 but adapted to challenge the ability of the LLM for policy enforcement during multi-turn planning and execution of strict tool activation. By evaluating 13 open-source, tool-calling LLMs, we uncovered surprising and significant differences in their ability to remain secure and keep operating within their boundaries. The purpose of this work is to provide the community with a robust and unified methodology to build and validate better LLMs, ultimately pushing for more secure and reliable agentic AI systems.

Paperid: 2982, https://arxiv.org/pdf/2511.18045.pdf

Abstract:
The exponential growth of the Internet of Things (IoT) ecosystem has amplified concerns regarding device reliability, interoperability, and security assurance. Despite the proliferation of IoT security guidelines, a unified and quantitative approach to measuring trust remains absent. This paper introduces SCI-IoT (Secure Certification Index for IoT), a standardized and quantitative framework for trust scoring, evaluation, and certification of IoT devices. The framework employs a six-tier grading model (Grades A-F), enabling device profiling across consumer, industrial, and critical infrastructure domains. Within this model, 30 distinct Trust Tests assess devices across dimensions such as authentication, encryption, data integrity, resilience, and firmware security. Each test is assigned a criticality-based weight (1.0-2.0) and a performance rating (1-4), converted to a normalized percentage and aggregated through a weighted computation to yield the Secure Certification Index (SCI). The SCI determines the device's Trust Verdict, categorized into five SCI levels, and serves as the foundation for optional grade-based certification. The framework also incorporates critical gate conditions, enforcing absolute compliance in high risk parameters to prevent certification of devices with fundamental vulnerabilities. By unifying quantitative trust scoring with structured certification criteria, SCI-IoT provides a transparent, scalable, and reproducible method to benchmark IoT device trustworthiness. The proposed system aims to streamline manufacturer compliance, improve consumer confidence, and facilitate global interoperability in IoT security certification.

Paperid: 2983, https://arxiv.org/pdf/2511.17978.pdf

Abstract:
Electric Vehicle (EV) charging infrastructure faces escalating cybersecurity threats that can severely compromise operational efficiency and grid stability. Existing forecasting techniques are limited by the lack of combined robust anomaly mitigation solutions and data privacy preservation. Therefore, this paper addresses these challenges by proposing a novel anomaly-resilient federated learning framework that simultaneously preserves data privacy, detects cyber-attacks, and maintains trustworthy demand prediction accuracy under adversarial conditions. The proposed framework integrates three key innovations: LSTM autoencoder-based distributed anomaly detection deployed at each federated client, interpolation-based anomalous data mitigation to preserve temporal continuity, and federated Long Short-Term Memory (LSTM) networks that enable collaborative learning without centralized data aggregation. The framework is validated on real-world EV charging infrastructure datasets combined with real-world DDoS attack datasets, providing robust validation of the proposed approach under realistic threat scenarios. Experimental results demonstrate that the federated approach achieves superior performance compared to centralized models, with 15.2% improvement in R2 accuracy while maintaining data locality. The integrated cyber-attack detection and mitigation system produces trustworthy datasets that enhance prediction reliability, recovering 47.9% of attack-induced performance degradation while maintaining exceptional precision (91.3%) and minimal false positive rates (1.21%). The proposed architecture enables enhanced EV infrastructure planning, privacy-preserving collaborative forecasting, cybersecurity resilience, and rapid recovery from malicious threats across distributed charging networks.

Paperid: 2984, https://arxiv.org/pdf/2511.17968.pdf

Abstract:
Maintaining economic efficiency and operational reliability in microgrid energy management systems under cyberattack conditions remains challenging. Most approaches assume non-anomalous measurements, make predictions with unquantified uncertainties, and do not mitigate malicious attacks on renewable forecasts for energy management optimization. This paper presents a comprehensive cyber-resilient framework integrating federated Long Short-Term Memory-based photovoltaic forecasting with a novel two-stage cascade false data injection attack detection and energy management system optimization. The approach combines autoencoder reconstruction error with prediction uncertainty quantification to enable attack-resilient energy storage scheduling while preserving data privacy. Extreme false data attack conditions were studied that caused 58% forecast degradation and 16.9\% operational cost increases. The proposed integrated framework reduced false positive detections by 70%, recovered 93.7% of forecasting performance losses, and achieved 5\% operational cost savings, mitigating 34.7% of attack-induced economic losses. Results demonstrate that precision-focused cascade detection with multi-signal fusion outperforms single-signal approaches, validating security-performance synergy for decentralized microgrids.

Paperid: 2985, https://arxiv.org/pdf/2511.17874.pdf

Abstract:
LLM applications (i.e., LLM apps) leverage the powerful capabilities of LLMs to provide users with customized services, revolutionizing traditional application development. While the increasing prevalence of LLM-powered applications provides users with unprecedented convenience, it also brings forth new security challenges. For such an emerging ecosystem, the security community lacks sufficient understanding of the LLM application ecosystem, especially regarding the capability boundaries of the applications themselves. In this paper, we systematically analyzed the new development paradigm and defined the concept of the LLM app capability space. We also uncovered potential new risks beyond jailbreak that arise from ambiguous capability boundaries in real-world scenarios, namely, capability downgrade and upgrade. To evaluate the impact of these risks, we designed and implemented an LLM app capability evaluation framework, LLMApp-Eval. First, we collected application metadata across 4 platforms and conducted a cross-platform ecosystem analysis. Then, we evaluated the risks for 199 popular applications among 4 platforms and 6 open-source LLMs. We identified that 178 (89.45%) potentially affected applications, which can perform tasks from more than 15 scenarios or be malicious. We even found 17 applications in our study that executed malicious tasks directly, without applying any adversarial rewriting. Furthermore, our experiments also reveal a positive correlation between the quality of prompt design and application robustness. We found that well-designed prompts enhance security, while poorly designed ones can facilitate abuse. We hope our work inspires the community to focus on the real-world risks of LLM applications and foster the development of a more robust LLM application ecosystem.

Paperid: 2986, https://arxiv.org/pdf/2511.17773.pdf

Abstract:
Memory-safety escapes continue to form the launching pad for a wide range of security attacks, especially for the substantial base of deployed software that is coded in pointer-based languages such as C/C++. Although compiler and Instruction Set Architecture (ISA) extensions have been introduced to address elements of this issue, the overhead and/or comprehensive applicability have limited broad production deployment. The Memory Tagging Extension (MTE) to the ARM AArch64 Instruction Set Architecture is a valuable tool to address memory-safety escapes; when used in synchronous tag-checking mode, MTE provides deterministic detection and prevention of sequential buffer overflow attacks, and probabilistic detection and prevention of exploits resulting from temporal use-after-free pointer programming bugs. The AmpereOne processor, launched in 2024, is the first datacenter processor to support MTE. Its optimized MTE implementation uniquely incurs no memory capacity overhead for tag storage and provides synchronous tag-checking with single-digit performance impact across a broad range of datacenter class workloads. Furthermore, this paper analyzes the complete hardware-software stack, identifying application memory management as the primary remaining source of overhead and highlighting clear opportunities for software optimization. The combination of an efficient hardware foundation and a clear path for software improvement makes the MTE implementation of the AmpereOne processor highly attractive for deployment in production cloud environments.

Paperid: 2987, https://arxiv.org/pdf/2511.17194.pdf

Abstract:
Modern large language models (LLMs) are typically secured by auditing data, prompts, and refusal policies, while treating the forward pass as an implementation detail. We show that intermediate activations in decoder-only LLMs form a vulnerable attack surface for behavioral control. Building on recent findings on attention sinks and compression valleys, we identify a high-gain region in the residual stream where small, well-aligned perturbations are causally amplified along the autoregressive trajectory--a Causal Amplification Effect (CAE). We exploit this as an attack surface via Sensitivity-Scaled Steering (SSS), a progressive activation-level attack that combines beginning-of-sequence (BOS) anchoring with sensitivity-based reinforcement to focus a limited perturbation budget on the most vulnerable layers and tokens. We show that across multiple open-weight models and four behavioral axes, SSS induces large shifts in evil, hallucination, sycophancy, and sentiment while preserving high coherence and general capabilities, turning activation steering into a concrete security concern for white-box and supply-chain LLM deployments.

Paperid: 2988, https://arxiv.org/pdf/2511.16822.pdf

Abstract:
In the context of the growing proliferation of user devices and the concurrent surge in data volumes, the complexities arising from the substantial increase in data have posed formidable challenges to conventional machine learning model training. Particularly, this is evident within resource-constrained and security-sensitive environments such as those encountered in networks associated with the Internet of Things (IoT). Federated Learning has emerged as a promising remedy to these challenges by decentralizing model training to edge devices or parties, effectively addressing privacy concerns and resource limitations. Nevertheless, the presence of statistical heterogeneity in non-Independently and Identically Distributed (non-IID) data across different parties poses a significant hurdle to the effectiveness of FL. Many FL approaches have been proposed to enhance learning effectiveness under statistical heterogeneity. However, prior studies have uncovered a gap in the existing research landscape, particularly in the absence of a comprehensive comparison between federated methods addressing statistical heterogeneity in detecting IoT attacks. In this research endeavor, we delve into the exploration of FL algorithms, specifically FedAvg, FedProx, and Scaffold, under different data distributions. Our focus is on achieving a comprehensive understanding of and addressing the challenges posed by statistical heterogeneity. In this study, We classify large-scale IoT attacks by utilizing the CICIoT2023 dataset. Through meticulous analysis and experimentation, our objective is to illuminate the performance nuances of these FL methods, providing valuable insights for researchers and practitioners in the domain.

Paperid: 2989, https://arxiv.org/pdf/2511.16560.pdf

Abstract:
Blockchain interoperability is increasingly recognized as the centerpiece for robust interactions among decentralized services. Blockchain ledgers are generally tamper-proof and thus enforce non-repudiation for transactions recorded within the same network. However, such a guarantee does not hold for cross-blockchain transactions. When disruptions occur due to malicious activities or system failures within one blockchain network, foreign networks can take advantage by denying legitimate claims or mounting fraudulent liabilities against the defenseless network. In response, this paper introduces InterSnap, a novel blockchain snapshot archival methodology, for enabling auditability of crossblockchain transactions, enforcing non-repudiation. InterSnap introduces cross-chain transaction receipts that ensure their irrefutability. Snapshots of ledger data along with these receipts are utilized as non-repudiable proof of bilateral agreements among different networks. InterSnap enhances system resilience through a distributed snapshot generation process, need-based snapshot scheduling process, and archival storage and sharing via decentralized platforms. Through a prototype implementation based on Hyperledger Fabric, we conducted experiments using on-premise machines, AWS public cloud instances, as well as a private cloud infrastructure. We establish that InterSnap can recover from malicious attacks while preserving crosschain transaction receipts. Additionally, our proposed solution demonstrates adaptability to increasing loads while securely transferring snapshot archives with minimal overhead.

Paperid: 2990, https://arxiv.org/pdf/2511.15759.pdf

Abstract:
Retrieval-augmented generation (RAG) systems have become widely used for enhancing large language model capabilities, but they introduce significant security vulnerabilities through prompt injection attacks. We present a comprehensive benchmark for evaluating prompt injection risks in RAG-enabled AI agents and propose a multi-layered defense framework. Our benchmark includes 847 adversarial test cases across five attack categories: direct injection, context manipulation, instruction override, data exfiltration, and cross-context contamination. We evaluate three defense mechanisms: content filtering with embedding-based anomaly detection, hierarchical system prompt guardrails, and multi-stage response verification, across seven state-of-the-art language models. Our combined framework reduces successful attack rates from 73.2% to 8.7% while maintaining 94.3% of baseline task performance. We release our benchmark dataset and defense implementation to support future research in AI agent security.

Paperid: 2991, https://arxiv.org/pdf/2511.15745.pdf

Abstract:
This paper proposes an automated LLM-based method to extract and structure vulnerabilities from OpenVAS and Tenable WAS scanner reports, converting unstructured data into a standardized format for risk management. In an evaluation using a report with 34 vulnerabilities, GPT-4.1 and DeepSeek achieved the highest similarity to the baseline (ROUGE-L greater than 0.7). The method demonstrates feasibility in transforming complex reports into usable datasets, enabling effective prioritization and future anonymization of sensitive data.

Paperid: 2992, https://arxiv.org/pdf/2511.15744.pdf

Abstract:
This work presents AnonLFI 2.0, a modular pseudonymization framework for CSIRTs that uses HMAC SHA256 to generate strong and reversible pseudonyms, preserves XML and JSON structures, and integrates OCR and technical recognizers for PII and security artifacts. In two case studies involving OCR applied to PDF documents and an OpenVAS XML report, the system achieved perfect precision and F1 scores of 76.5 and 92.13, demonstrating its effectiveness for securely preparing complex cybersecurity datasets.

Paperid: 2993, https://arxiv.org/pdf/2511.15434.pdf

Abstract:
Phishing websites pose a major cybersecurity threat, exploiting unsuspecting users and causing significant financial and organisational harm. Traditional machine learning approaches for phishing detection often require extensive feature engineering, continuous retraining, and costly infrastructure maintenance. At the same time, proprietary large language models (LLMs) have demonstrated strong performance in phishing-related classification tasks, but their operational costs and reliance on external providers limit their practical adoption in many business environments. This paper investigates the feasibility of small language models (SLMs) for detecting phishing websites using only their raw HTML code. A key advantage of these models is that they can be deployed on local infrastructure, providing organisations with greater control over data and operations. We systematically evaluate 15 commonly used Small Language Models (SLMs), ranging from 1 billion to 70 billion parameters, benchmarking their classification accuracy, computational requirements, and cost-efficiency. Our results highlight the trade-offs between detection performance and resource consumption, demonstrating that while SLMs underperform compared to state-of-the-art proprietary LLMs, they can still provide a viable and scalable alternative to external LLM services. By presenting a comparative analysis of costs and benefits, this work lays the foundation for future research on the adaptation, fine-tuning, and deployment of SLMs in phishing detection systems, aiming to balance security effectiveness and economic practicality.

Paperid: 2994, https://arxiv.org/pdf/2511.15031.pdf

Abstract:
Large-scale cyber-physical systems (CPS), such as railway control systems and smart grids, consist of geographically distributed subsystems that are connected via unreliable, asynchronous inter-region networks. Their scale and distribution make them especially vulnerable to faults and attacks. Unfortunately, existing fault-tolerant methods either consume excessive resources or provide only eventual guarantees, making them unsuitable for real-time resource-constrained CPS. We present GeoShield, a resource-efficient solution for defending geo-distributed CPS against Byzantine faults. GeoShield leverages the property that CPS are designed to tolerate brief disruptions and maintain safety, as long as they recover (i.e., resume normal operations or transition to a safe mode) within a bounded amount of time following a fault. Instead of masking faults, it detects them and recovers the system within bounded time, thus guaranteeing safety with much fewer resources. GeoShield introduces protocols for Byzantine fault-resilient network measurement and inter-region omission fault detection that proactively detect malicious message delays, along with recovery mechanisms that guarantee timely recovery while maximizing operational robustness. It is the first bounded-time recovery solution that operates effectively under unreliable networks without relying on trusted hardware. Evaluations using real-world case studies show that it significantly outperforms existing methods in both effectiveness and resource efficiency.

Paperid: 2995, https://arxiv.org/pdf/2511.14908.pdf

Abstract:
In this study, we evaluate open-source models for security incident classification, comparing them with proprietary models. We utilize a dataset of anonymized real incidents, categorized according to the NIST SP 800-61r3 taxonomy and processed using five prompt-engineering techniques (PHP, SHP, HTP, PRP, and ZSL). The results indicate that, although proprietary models still exhibit higher accuracy, locally deployed open-source models provide advantages in privacy, cost-effectiveness, and data sovereignty.

Paperid: 2996, https://arxiv.org/pdf/2511.14195.pdf

Abstract:
Evaluating the safety robustness of LLMs is critical for their deployment. However, mainstream Red Teaming methods rely on online generation and black-box output analysis. These approaches are not only costly but also suffer from feedback latency, making them unsuitable for agile diagnostics after training a new model. To address this, we propose N-GLARE (A Non-Generative, Latent Representation-Efficient LLM Safety Evaluator). N-GLARE operates entirely on the model's latent representations, bypassing the need for full text generation. It characterizes hidden layer dynamics by analyzing the APT (Angular-Probabilistic Trajectory) of latent representations and introducing the JSS (Jensen-Shannon Separability) metric. Experiments on over 40 models and 20 red teaming strategies demonstrate that the JSS metric exhibits high consistency with the safety rankings derived from Red Teaming. N-GLARE reproduces the discriminative trends of large-scale red-teaming tests at less than 1\% of the token cost and the runtime cost, providing an efficient output-free evaluation proxy for real-time diagnostics.

Paperid: 2997, https://arxiv.org/pdf/2511.13548.pdf

Abstract:
The rapid adoption of large language models (LLMs) has brought both transformative applications and new security risks, including jailbreak attacks that bypass alignment safeguards to elicit harmful outputs. Existing automated jailbreak generation approaches e.g. AutoDAN, suffer from limited mutation diversity, shallow fitness evaluation, and fragile keyword-based detection. To address these limitations, we propose ForgeDAN, a novel evolutionary framework for generating semantically coherent and highly effective adversarial prompts against aligned LLMs. First, ForgeDAN introduces multi-strategy textual perturbations across \textit{character, word, and sentence-level} operations to enhance attack diversity; then we employ interpretable semantic fitness evaluation based on a text similarity model to guide the evolutionary process toward semantically relevant and harmful outputs; finally, ForgeDAN integrates dual-dimensional jailbreak judgment, leveraging an LLM-based classifier to jointly assess model compliance and output harmfulness, thereby reducing false positives and improving detection effectiveness. Our evaluation demonstrates ForgeDAN achieves high jailbreaking success rates while maintaining naturalness and stealth, outperforming existing SOTA solutions.

Paperid: 2998, https://arxiv.org/pdf/2511.13517.pdf

Abstract:
Ransomware continues to evolve in complexity, making early and explainable detection a critical requirement for modern cybersecurity systems. This study presents a comparative analysis of three Transformer-based Large Language Models (LLMs) (BERT, RoBERTa, and DeBERTa) for ransomware detection using two structured datasets: UGRansome and Process Memory (PM). Since LLMs are primarily designed for natural language processing (NLP), numerical and categorical ransomware features were transformed into textual sequences using KBinsDiscretizer and token-based encoding. This enabled the models to learn behavioural patterns from system activity and network traffic through contextual embeddings. The models were fine-tuned on approximately 2,500 labelled samples and evaluated using accuracy, F1 score, and ROC-AUC. To ensure transparent decision-making in this high-stakes domain, two explainable AI techniques (LIME and SHAP) were applied to interpret feature contributions. The results show that the models learn distinct ransomware-related cues: BERT relies heavily on dominant file-operation features, RoBERTa demonstrates balanced reliance on network and financial signals, while DeBERTa exhibits strong sensitivity to financial and network-traffic indicators. Visualisation of embeddings further reveals structural differences in token representation, with RoBERTa producing more isotropic embeddings and DeBERTa capturing highly directional, disentangled patterns. In general, RoBERTa achieved the strongest F1-score, while BERT yielded the highest ROC-AUC performance. The integration of LLMs with XAI provides a transparent framework capable of identifying feature-level evidence behind ransomware predictions.

Paperid: 2999, https://arxiv.org/pdf/2511.13319.pdf

Abstract:
Users increasingly rely on large language models (LLMs) for personal, emotionally charged, and socially sensitive conversations. However, prompts sent to cloud-hosted models can contain personally identifiable information (PII) that users do not want logged, retained, or leaked. We observe this to be especially acute when users discuss friends, coworkers, or adversaries, i.e., when they spill the tea. Enterprises face the same challenge when they want to use LLMs for internal communication and decision-making. In this whitepaper, we present Whistledown, a best-effort privacy layer that modifies prompts before they are sent to the LLM. Whistledown combines pseudonymization and $ε$-local differential privacy ($ε$-LDP) with transformation caching to provide best-effort privacy protection without sacrificing conversational utility. Whistledown is designed to have low compute and memory overhead, allowing it to be deployed directly on a client's device in the case of individual users. For enterprise users, Whistledown is deployed centrally within a zero-trust gateway that runs on an enterprise's trusted infrastructure. Whistledown requires no changes to the existing APIs of popular LLM providers.

Paperid: 3000, https://arxiv.org/pdf/2511.12830.pdf

Abstract:
Personal communication using technical means is protected by telecommunications secrecy. Any interference with this fundamental right requires a legal basis, which has existed for many years for traditional communication services in the form of telecommunications surveillance (TKÜ, § 100a StPO) and appears to be widely accepted by society. The basis for the implementation of TKÜ is the obligation of telecommunications providers to provide interception interfaces. However, the technical implementation of telecommunications has changed significantly as a result of the Internet. Messenger services and Voice over IP telephony are increasingly competing with traditional telephone services. The use of strong end-to-end encryption made possible by this technology is increasingly posing problems for law enforcement agencies, as only cryptographically encrypted content is accessible via the interception interfaces provided to date. Against the backdrop of current discussions on socalled ``chat control'' and its limited social acceptance, this article addresses the question of whether and, if so, how the cooperation obligations of the technical actors involved can be sensibly regulated in the case of encrypted communication.

Paperid: 3001, https://arxiv.org/pdf/2511.12827.pdf

Abstract:
The deployment of robust malware detection systems in big data environments requires careful consideration of both security effectiveness and computational efficiency. While recent advances in adversarial defenses have demonstrated strong robustness improvements, they often introduce computational overhead ranging from 4x to 22x, which presents significant challenges for production systems processing millions of samples daily. In this work, we propose a novel framework that combines Trust-Raw Override (TRO) with Confidence-Adaptive Bit-Depth Reduction (CABDR) to explicitly optimize the trade-off between adversarial robustness and computational efficiency. Our approach leverages adaptive confidence-based mechanisms to selectively apply defensive measures, achieving 1.76x computational overhead - a 2.3x improvement over state-of-the-art smoothing defenses. Through comprehensive evaluation on the EMBER v2 dataset comprising 800K samples, we demonstrate that our framework maintains 91 percent clean accuracy while reducing attack success rates to 31-37 percent across multiple attack types, with particularly strong performance against optimization-based attacks such as C and W (48.8 percent reduction). The framework achieves throughput of up to 1.26 million samples per second (measured on pre-extracted EMBER features with no runtime feature extraction), validated across 72 production configurations with statistical significance (5 independent runs, 95 percent confidence intervals, p less than 0.01). Our results suggest that practical adversarial robustness in production environments requires explicit optimization of the efficiency-robustness trade-off, providing a viable path for organizations to deploy robust defenses without prohibitive infrastructure costs.

Paperid: 3002, https://arxiv.org/pdf/2511.12766.pdf

Abstract:
High-Altitude Platform Stations (HAPS) are emerging stratospheric nodes within non-terrestrial networks. We provide a structured overview of HAPS subsystems and principal communication links, map cybersecurity and privacy exposure across communication, control, and power subsystems, and propose a stratosphere-aware threat taxonomy. We then discuss defenses feasible under HAPS constraints including encryption and authentication, frequency agility, directional and beam-steered antennas, intrusion detection, secure boot, and software and supply-chain assurance-while highlighting how they align with emerging regulatory and standards guidance. Finally, we report a simulation-based case study using OMNeT++/INET to characterize distributed-denial-of-service (DDoS) impact on service and control-plane availability, and summarize regulatory and standardization considerations relevant to deployment. We conclude with concrete future research directions. The study is simulation-grounded and intended to inform engineering trade-offs for real-world HAPS deployments rather than serve as an on-air validation.

Paperid: 3003, https://arxiv.org/pdf/2511.12752.pdf

Abstract:
The Key Value(KV) cache is an important component for efficient inference in autoregressive Large Language Models (LLMs), but its role as a representation of the model's internal state makes it a potential target for integrity attacks. This paper introduces "History Swapping," a novel block-level attack that manipulates the KV cache to steer model generation without altering the user-facing prompt. The attack involves overwriting a contiguous segment of the active generation's cache with a precomputed cache from a different topic. We empirically evaluate this method across 324 configurations on the Qwen 3 family of models, analyzing the impact of timing, magnitude, and layer depth of the cache overwrite. Our findings reveal that only full-layer overwrites can successfully hijack the conversation's topic, leading to three distinct behaviors: immediate and persistent topic shift, partial recovery, or a delayed hijack. Furthermore, we observe that high-level structural plans are encoded early in the generation process and local discourse structure is maintained by the final layers of the model. This work demonstrates that the KV cache is a significant vector for security analysis, as it encodes not just context but also topic trajectory and structural planning, making it a powerful interface for manipulating model behavior.

Paperid: 3004, https://arxiv.org/pdf/2511.12743.pdf

Abstract:
The performance of Machine Learning (ML) and Deep Learning (DL)-based Intrusion Detection and Prevention Systems (IDS/IPS) is critically dependent on the relevance and quality of the datasets used for training and evaluation. However, current AI model evaluation practices for developing IDS/IPS focus predominantly on accuracy metrics, often overlooking whether datasets represent industry-specific threats. To address this gap, we introduce a novel multi-dimensional framework that integrates the MITRE ATT&CK knowledge base for threat intelligence and employs five complementary metrics that together provide a comprehensive assessment of dataset suitability. Methodologically, this framework combines threat intelligence, natural language processing, and quantitative analysis to assess the suitability of datasets for specific industry contexts. Applying this framework to nine publicly available IDS/IPS datasets reveals significant gaps in threat coverage, particularly in the healthcare, energy, and financial sectors. In particular, recent datasets (e.g., CIC-IoMT, CIC-UNSW-NB15) align better with sector-specific threats, whereas others, like CICIoV-24, underperform despite their recency. Our findings provide a standardized, interpretable approach for selecting datasets aligned with sector-specific operational requirements, ultimately enhancing the real-world effectiveness of AI-driven IDS/IPS deployments. The efficiency and practicality of the framework are validated through deployment in a real-world case study, underscoring its capacity to inform dataset selection and enhance the effectiveness of AI-driven IDS/IPS in operational environments.

Paperid: 3005, https://arxiv.org/pdf/2511.12643.pdf

Abstract:
Web Application Firewalls are crucial for protecting web applications against a wide range of cyber threats. Traditional Web Application Firewalls often struggle to effectively distinguish between malicious and legitimate traffic, leading to limited efficacy in threat detection. To overcome these limitations, this paper proposes an Adaptive Dual-Layer WAF employing a two-layered Machine Learning model designed to enhance the accuracy of anomaly and threat detection. The first layer employs a Decision Tree (DT) algorithm to detect anomalies by identifying traffic deviations from established normal patterns. The second layer employs Support Vector Machine to classify these anomalies as either threat anomalies or benign anomalies. Our Adaptive Dual Layer WAF incorporates comprehensive data pre-processing and feature engineering techniques and has been thoroughly evaluated using five large benchmark datasets. Evaluation using these datasets shows that ADL WAF achieves a detection accuracy of 99.88% and a precision of 100%, significantly enhancing anomaly detection and reducing false positives. These findings suggest that integrating machine learning techniques into WAFs can substantially improve web application security by providing more accurate and efficient threat detection.

Paperid: 3006, https://arxiv.org/pdf/2511.12565.pdf

Abstract:
Existing linguistic steganography methods primarily rely on content transformations to conceal secret messages. However, they often cause subtle yet looking-innocent deviations between normal and stego texts, posing potential security risks in real-world applications. To address this challenge, we propose a content-preserving linguistic steganography paradigm for perfectly secure covert communication without modifying the cover text. Based on this paradigm, we introduce CLstega (\textit{C}ontent-preserving \textit{L}inguistic \textit{stega}nography), a novel method that embeds secret messages through controllable distribution transformation. CLstega first applies an augmented masking strategy to locate and mask embedding positions, where MLM(masked language model)-predicted probability distributions are easily adjustable for transformation. Subsequently, a dynamic distribution steganographic coding strategy is designed to encode secret messages by deriving target distributions from the original probability distributions. To achieve this transformation, CLstega elaborately selects target words for embedding positions as labels to construct a masked sentence dataset, which is used to fine-tune the original MLM, producing a target MLM capable of directly extracting secret messages from the cover text. This approach ensures perfect security of secret messages while fully preserving the integrity of the original cover text. Experimental results show that CLstega can achieve a 100\% extraction success rate, and outperforms existing methods in security, effectively balancing embedding capacity and security.

Paperid: 3007, https://arxiv.org/pdf/2511.12377.pdf

Abstract:
The rise of Artificial Intelligence (AI) has impacted the development of mobile health (mHealth) apps, most notably with the advent of AI-based chatbots used as ubiquitous ``companions'' for various services, from fitness to mental health assistants. While these mHealth chatbots offer clear benefits, such as personalized health information and predictive diagnoses, they also raise significant concerns regarding security and privacy. This study empirically assesses 16 AI-based mHealth chatbots identified from the Google Play Store. The empirical assessment follows a three-phase approach (manual inspection, static code analysis, and dynamic analysis) to evaluate technical robustness and how design and implementation choices impact end users. Our findings revealed security vulnerabilities (e.g., enabling Remote WebView debugging), privacy issues, and non-compliance with Google Play policies (e.g., failure to provide publicly accessible privacy policies). Based on our findings, we offer several recommendations to enhance the security and privacy of mHealth chatbots. These recommendations focus on improving data handling processes, disclosure, and user security. Therefore, this work also seeks to support mHealth developers and security/privacy engineers in designing more transparent, privacy-friendly, and secure mHealth chatbots.

Paperid: 3008, https://arxiv.org/pdf/2511.12225.pdf

Abstract:
Resource-constrained embedded systems demand secure yet lightweight data protection, particularly when data formats must be preserved. This paper introduces eFPE (Enhanced Format-Preserving Encryption), an 8-round Feistel cipher featuring a "novel lightweight Pseudorandom Function (PRF)" specifically designed for this domain. The PRF, architected with an efficient two-iteration structure of AES-inspired operations (byte-substitution, keyed XOR, and byte-rotation), underpins eFPE's ability to directly encrypt even-length decimal strings without padding or complex conversions, while aiming for IND-CCA2 security under standard assumptions. Implemented and evaluated on an ARM7TDMI LPC2148 microcontroller using Keil μVision 4, eFPE demonstrates the efficacy of its targeted design: a total firmware Read-Only Memory (ROM) footprint of 4.73 kB and Random Access Memory (RAM) usage of 1.34 kB. The core eFPE algorithm module itself is notably compact, requiring only 3.55 kB ROM and 116 B RAM. These characteristics make eFPE a distinct and highly suitable solution for applications like financial terminals, medical sensors, and industrial IoT devices where data format integrity, minimal resource footprint, and low operational latency are paramount.

Paperid: 3009, https://arxiv.org/pdf/2511.12164.pdf

Abstract:
Fuzzing is a widely used technique for detecting vulnerabilities in smart contracts, which generates transaction sequences to explore the execution paths of smart contracts. However, existing fuzzers are falling short in detecting sophisticated vulnerabilities that require specific attack transaction sequences with proper inputs to trigger, as they (i) prioritize code coverage over vulnerability discovery, wasting considerable effort on non-vulnerable code regions, and (ii) lack semantic understanding of stateful contracts, generating numerous invalid transaction sequences that cannot pass runtime execution. In this paper, we propose SmartFuzz, a novel collaborative reflective fuzzer for smart contract vulnerability detection. It employs large language model-driven agents as the fuzzing engine and continuously improves itself by learning and reflecting through interactions with the environment. Specifically, we first propose a new Continuous Reflection Process (CRP) for fuzzing smart contracts, which reforms the transaction sequence generation as a self-evolving process through continuous reflection on feedback from the runtime environment. Then, we present the Reactive Collaborative Chain (RCC) to orchestrate the fuzzing process into multiple sub-tasks based on the dependencies of transaction sequences. Furthermore, we design a multi-agent collaborative team, where each expert agent is guided by the RCC to jointly generate and refine transaction sequences from both global and local perspectives. We conduct extensive experiments to evaluate SmartFuzz's performance on real-world contracts and DApp projects. The results demonstrate that SmartFuzz outperforms existing state-of-the-art tools: (i) it detects 5.8\%-74.7\% more vulnerabilities within 30 minutes, and (ii) it reduces false negatives by up to 80\%.

Paperid: 3010, https://arxiv.org/pdf/2511.12046.pdf

Abstract:
Knowledge Distillation (KD) is essential for compressing large models, yet relying on pre-trained "teacher" models downloaded from third-party repositories introduces serious security risks -- most notably backdoor attacks. Existing KD backdoor methods are typically complex and computationally intensive: they employ surrogate student models and simulated distillation to guarantee transferability, and they construct triggers in a way similar to universal adversarial perturbations (UAPs), which being not stealthy in magnitude, inherently exhibit strong adversarial behavior. This work questions whether such complexity is necessary and constructs stealthy "weak" triggers -- imperceptible perturbations that have negligible adversarial effect. We propose BackWeak, a simple, surrogate-free attack paradigm. BackWeak shows that a powerful backdoor can be implanted by simply fine-tuning a benign teacher with a weak trigger using a very small learning rate. We demonstrate that this delicate fine-tuning is sufficient to embed a backdoor that reliably transfers to diverse student architectures during a victim's standard distillation process, yielding high attack success rates. Extensive empirical evaluations on multiple datasets, model architectures, and KD methods show that BackWeak is efficient, simpler, and often more stealthy than previous elaborate approaches. This work calls on researchers studying KD backdoor attacks to pay particular attention to the trigger's stealthiness and its potential adversarial characteristics.

Paperid: 3011, https://arxiv.org/pdf/2511.11836.pdf

Abstract:
The integration of Generative Artificial Intelligence (GenAI) in healthcare is impeded by significant security challenges unaddressed by traditional frameworks, precisely the data-in-use gap where sensitive patient data and proprietary AI models are exposed during active processing. To address this, the paper proposes the Confidential Zero-Trust Framework (CZF), a novel security paradigm that synergistically combines Zero-Trust Architecture for granular access control with the hardware-enforced data isolation of Confidential Computing. We detailed a multi-tiered architectural blueprint for implementing the CZF on Google Cloud and analyzed its efficacy against real-world threats. The CZF provides a defense-in-depth architecture where data remains encrypted while in-use within a hardware-based Trusted Execution Environment (TEE). The framework's use of remote attestation offers cryptographic proof of workload integrity, transforming compliance from a procedural exercise into a verifiable technical fact and enabling secure, multi-party collaborations previously blocked by security and intellectual property concerns. By closing the data-in-use gap and enforcing Zero-Trust principles, the CZF provides a robust and verifiable framework that establishes the necessary foundation of trust to enable the responsible adoption of transformative AI technologies in healthcare.

Paperid: 3012, https://arxiv.org/pdf/2511.11250.pdf

Abstract:
Smart contracts have emerged as key components within decentralized environments, enabling the automation of transactions through self-executing programs. While these innovations offer significant advantages, they also present potential drawbacks if the smart contract code is not carefully designed and implemented. This paper investigates the capability of large language models (LLMs) to detect OWASP-inspired vulnerabilities in smart contracts beyond the Ethereum Virtual Machine (EVM) ecosystem, focusing specifically on Solana and Algorand. Given the lack of labeled datasets for non-EVM platforms, we design a synthetic dataset of annotated smart contract snippets in Rust (for Solana) and PyTeal (for Algorand), structured around a vulnerability taxonomy derived from OWASP. We evaluate LLMs under three configurations: prompt engineering, fine-tuning, and a hybrid of both, comparing their performance on different vulnerability categories. Experimental results show that prompt engineering achieves general robustness, while fine-tuning improves precision and recall on less semantically rich languages such as TEAL. Additionally, we analyze how the architectural differences of Solana and Algorand influence the manifestation and detectability of vulnerabilities, offering platform-specific mappings that highlight limitations in existing security tooling. Our findings suggest that LLM-based approaches are viable for static vulnerability detection in smart contracts, provided domain-specific data and categorization are integrated into training pipelines.

Paperid: 3013, https://arxiv.org/pdf/2511.11019.pdf

Abstract:
Software vulnerabilities are increasing at an alarming rate. However, manual patching is both time-consuming and resource-intensive, while existing automated vulnerability repair (AVR) techniques remain limited in effectiveness. Recent advances in large language models (LLMs) have opened a new paradigm for AVR, demonstrating remarkable progress. To examine the capability of LLMs in AVR, several vulnerability benchmarks have been proposed recently. However, they still suffer from key limitations of outdated vulnerabilities, limited language coverage, unreliable patch validation, and insufficient reproducibility. To overcome these challenges, we introduce PATCHEVAL, a multilingual benchmark for Go, JavaScript, and Python, languages for which existing benchmarks remain unexplored. PATCHEVAL curates a dataset of 1,000 vulnerabilities drawn from CVEs reported between 2015 and 2025, covering 65 distinct CWEs. A subset of 230 CVEs is further equipped with runtime sandbox environments, enabling patch verification through both security tests and functionality tests. To provide a systematic comparison of LLM-based vulnerability repair, we evaluate a series of state-of-the-art LLMs and agents, presenting an in-depth analysis that empirically yields key insights to guide future research in AVR.

Paperid: 3014, https://arxiv.org/pdf/2511.10565.pdf

Abstract:
Non-Interactive Zero Knowledge (NIZK) proofs, such as zkSNARKS, let one prove knowledge of private data without revealing it or interacting with a verifier. While existing tooling focuses on specifying the predicate to be proven, real-world applications optimize predicate definitions to minimize proof generation overhead, but must correspondingly transform predicate inputs. Implementing these two steps separately duplicates logic that must precisely match to avoid catastrophic security flaws. We address this shortcoming with zkStruDul, a language that unifies input transformations and predicate definitions into a single combined abstraction from which a compiler can project both procedures, eliminating duplicate code and problematic mismatches. zkStruDul provides a high-level abstraction to layer on top of existing NIZK technology and supports important features like recursive proofs. We provide a source-level semantics and prove its behavior is identical to the projected semantics, allowing straightforward standard reasoning.

Paperid: 3015, https://arxiv.org/pdf/2511.10111.pdf

Abstract:
Password updates are a critical account security measure and an essential part of the password lifecycle. Service providers and common security recommendations advise users to update their passwords in response to incidents or as a critical cyber hygiene measure. However, password update processes are often cumbersome and require manual password creation. Inconsistent and complex workflows and a lack of automation capabilities for password managers further negatively impact overall password security. In this work, we perform the first in-depth systematic analysis of 111 password update processes deployed on top-ranked websites. We provide novel insights into their overall security, usability, and automation capabilities and contribute to authentication security research through a better understanding of password update processes. Websites deploy highly diverse, often complex, confusing password update processes and lack the support of password managers. Processes are often hard to use, and end-users can barely transfer experiences and knowledge across websites. Notably, protective measures designed to enhance security frequently obstruct password manager automation. We conclude our work by discussing our findings and giving recommendations for web developers, the web standardization community, and security researchers.

Paperid: 3016, https://arxiv.org/pdf/2511.09688.pdf

Abstract:
Our previous work established the feasibility of FPGA-based real-time trajectory anonymization, a critical task for protecting user privacy in modern location-based services (LBS). However, that pioneering approach relied exclusively on shortest-path computations, which can fail to capture re- alistic travel behavior and thus reduce the utility of the anonymized data. To address this limitation, this paper introduces a novel, history-aware trajectory k-anonymization methodology and presents an advanced FPGA-based hardware architecture to implement it. Our proposed architecture uniquely integrates par- allel history-based trajectory searches with conventional shortest- path finding, using a custom fixed-point counting module to ac- curately weigh contributions from historical data. This approach enables the system to prioritize behaviorally common routes over geometrically shorter but less-traveled paths. The FPGA implementation demonstrates that our new architecture achieves a real-time throughput of over 6,000 records/s, improves data retention by up to 1.2% compared to our previous shortest-path- only design, and preserves major arterial roads more effectively. These results signify a key advancement, enabling high-fidelity, history-aware anonymization that preserves both privacy and behavioral accuracy under the strict latency constraints of LBS.

Paperid: 3017, https://arxiv.org/pdf/2511.09114.pdf

Abstract:
Recent advances in deep reinforcement learning for autonomous cyber defence have resulted in agents that can successfully defend simulated computer networks against cyber-attacks. However, many of these agents would need retraining to defend networks with differing topology or size, making them poorly suited to real-world networks where topology and size can vary over time. In this research we introduce a novel set of Topological Extensions for Reinforcement Learning Agents (TERLA) that provide generalisability for the defence of networks with differing topology and size, without the need for retraining. Our approach involves the use of heterogeneous graph neural network layers to produce a fixed-size latent embedding representing the observed network state. This representation learning stage is coupled with a reduced, fixed-size, semantically meaningful and interpretable action space. We apply TERLA to a standard deep reinforcement learning Proximal Policy Optimisation (PPO) agent model, and to reduce the sim-to-real gap, conduct our research using Cyber Autonomy Gym for Experimentation (CAGE) Challenge 4. This Cyber Operations Research Gym environment has many of the features of a real-world network, such as realistic Intrusion Detection System (IDS) events and multiple agents defending network segments of differing topology and size. TERLA agents retain the defensive performance of vanilla PPO agents whilst showing improved action efficiency. Generalisability has been demonstrated by showing that all TERLA agents have the same network-agnostic neural network architecture, and by deploying a single TERLA agent multiple times to defend network segments with differing topology and size, showing improved defensive performance and efficiency.

Paperid: 3018, https://arxiv.org/pdf/2511.09051.pdf

Abstract:
Smart contracts concentrate high value assets and complex logic in small, immutable programs, where even minor bugs can cause major losses. Existing taxonomies and tools remain fragmented, organized around symptoms such as reentrancy rather than structural causes. This paper introduces an attack-centric, program-structure taxonomy that unifies Solidity vulnerabilities into eight root-cause families covering control flow, external calls, state integrity, arithmetic safety, environmental dependencies, access control, input validation, and cross-domain protocol assumptions. Each family is illustrated through concise Solidity examples, exploit mechanics, and mitigations, and linked to the detection signals observable by static, dynamic, and learning-based tools. We further cross-map legacy datasets (SmartBugs, SolidiFI) to this taxonomy to reveal label drift and coverage gaps. The taxonomy provides a consistent vocabulary and practical checklist that enable more interpretable detection, reproducible audits, and structured security education for both researchers and practitioners.

Paperid: 3019, https://arxiv.org/pdf/2511.08443.pdf

Abstract:
Hardware-software leakage contracts have emerged as a formalism for specifying side-channel security guarantees of modern processors, yet verifying that a complex hardware design complies with its contract remains a major challenge. While verification provides strong guarantees, current verification approaches struggle to scale to industrial-sized designs. Conversely, prevalent hardware fuzzing approaches are designed to find functional correctness bugs, but are blind to information leaks like Spectre. To bridge this gap, we introduce a novel and scalable approach: coverage-guided hardware-software contract fuzzing. Our methodology leverages a self-compositional framework to make information leakage directly observable as microarchitectural state divergence. The core of our contribution is a new, security-oriented coverage metric, Self-Composition Deviation (SCD), which guides the fuzzer to explore execution paths that violate the leakage contract. We implemented this approach and performed an extensive evaluation on two open-source RISC-V cores: the in-order Rocket Core and the complex out-of-order BOOM core. Our results demonstrate that coverage-guided strategies outperform unguided fuzzing and that increased microarchitectural coverage leads to a faster discovery of security vulnerabilities in the BOOM core.

Paperid: 3020, https://arxiv.org/pdf/2511.08207.pdf

Abstract:
Federated learning (FL) offers privacy preserving, distributed machine learning, allowing clients to contribute to a global model without revealing their local data. As models increasingly serve as monetizable digital assets, the ability to prove participation in their training becomes essential for establishing ownership. In this paper, we address this emerging need by introducing FedPoP, a novel FL framework that allows nonlinkable proof of participation while preserving client anonymity and privacy without requiring either extensive computations or a public ledger. FedPoP is designed to seamlessly integrate with existing secure aggregation protocols to ensure compatibility with real-world FL deployments. We provide a proof of concept implementation and an empirical evaluation under realistic client dropouts. In our prototype, FedPoP introduces 0.97 seconds of per-round overhead atop securely aggregated FL and enables a client to prove its participation/contribution to a model held by a third party in 0.0612 seconds. These results indicate FedPoP is practical for real-world deployments that require auditable participation without sacrificing privacy.

Paperid: 3021, https://arxiv.org/pdf/2511.07876.pdf

Abstract:
As large language models (LLMs) scale, their inference incurs substantial computational resources, exposing them to energy-latency attacks, where crafted prompts induce high energy and latency cost. Existing attack methods aim to prolong output by delaying the generation of termination symbols. However, as the output grows longer, controlling the termination symbols through input becomes difficult, making these methods less effective. Therefore, we propose LoopLLM, an energy-latency attack framework based on the observation that repetitive generation can trigger low-entropy decoding loops, reliably compelling LLMs to generate until their output limits. LoopLLM introduces (1) a repetition-inducing prompt optimization that exploits autoregressive vulnerabilities to induce repetitive generation, and (2) a token-aligned ensemble optimization that aggregates gradients to improve cross-model transferability. Extensive experiments on 12 open-source and 2 commercial LLMs show that LoopLLM significantly outperforms existing methods, achieving over 90% of the maximum output length, compared to 20% for baselines, and improving transferability by around 40% to DeepSeek-V3 and Gemini 2.5 Flash.

Paperid: 3022, https://arxiv.org/pdf/2511.06852.pdf

Abstract:
Safety alignment instills in Large Language Models (LLMs) a critical capacity to refuse malicious requests. Prior works have modeled this refusal mechanism as a single linear direction in the activation space. We posit that this is an oversimplification that conflates two functionally distinct neural processes: the detection of harm and the execution of a refusal. In this work, we deconstruct this single representation into a Harm Detection Direction and a Refusal Execution Direction. Leveraging this fine-grained model, we introduce Differentiated Bi-Directional Intervention (DBDI), a new white-box framework that precisely neutralizes the safety alignment at critical layer. DBDI applies adaptive projection nullification to the refusal execution direction while suppressing the harm detection direction via direct steering. Extensive experiments demonstrate that DBDI outperforms prominent jailbreaking methods, achieving up to a 97.88\% attack success rate on models such as Llama-2. By providing a more granular and mechanistic framework, our work offers a new direction for the in-depth understanding of LLM safety alignment.

Paperid: 3023, https://arxiv.org/pdf/2511.06394.pdf

Abstract:
ROI selective encryption, as an efficient privacy protection technique, encrypts only the key regions in the video, thereby ensuring security while minimizing the impact on coding efficiency. However, existing ROI-based video encryption methods suffer from insufficient flexibility and lack of a unified evaluation system. To address these issues, we propose a visual perception-based tunable framework and evaluation benchmark for H.265/HEVC ROI encryption. Our scheme introduces three key contributions: 1) A ROI region recognition module based on visual perception network is proposed to accurately identify the ROI region in videos. 2) A three-level tunable encryption strategy is implemented while balancing security and real-time performance. 3) A unified ROI encryption evaluation benchmark is developed to provide a standardized quantitative platform for subsequent research. This triple strategy provides new solution and significant unified performance evaluation methods for ROI selective encryption field. Experimental results indicate that the proposed benchmark can comprehensively measure the performance of the ROI selective encryption. Compared to existing ROI encryption algorithms, our proposed enhanced and advanced level encryption exhibit superior performance in multiple performance metrics. In general, the proposed framework effectively meets the privacy protection requirements in H.265/HEVC and provides a reliable solution for secure and efficient processing of sensitive video content.

Paperid: 3024, https://arxiv.org/pdf/2511.06028.pdf

Abstract:
As a case study in cryptographic binding, we present a formal-methods analysis of the cryptographic channel binding mechanisms in the Fast IDentity Online (FIDO) Universal Authentication Framework (UAF) authentication protocol, which seeks to reduce the use of traditional passwords in favor of authentication devices. First, we show that UAF's channel bindings fail to mitigate protocol interaction by a Dolev-Yao adversary, enabling the adversary to transfer the server's authentication challenge to alternate sessions of the protocol. As a result, in some contexts, the adversary can masquerade as a client and establish an authenticated session with a server (e.g., possibly a bank server). Second, we implement a proof-of-concept man-in-the-middle attack against eBay's open source FIDO UAF implementation. Third, we propose and formally verify improvements to UAF. The weakness we analyze is similar to the vulnerability discovered in the Needham-Schroeder protocol over 25 years ago. That this vulnerability appears in the FIDO UAF standard highlights the strong need for protocol designers to bind messages properly and to analyze their designs with formal-methods tools. To our knowledge, we are first to carry out a formal-methods analysis of channel binding in UAF and first to exhibit details of an attack on UAF that exploits the weaknesses of UAF's channel binding. Our case study illustrates the importance of cryptographically binding context to protocol messages to prevent an adversary from misusing messages out of context.

Paperid: 3025, https://arxiv.org/pdf/2511.05865.pdf

Abstract:
Recent advancements in large-scale generative models have enabled the creation of high-quality images and videos, but have also raised significant safety concerns regarding the generation of unsafe content. To mitigate this, concept erasure methods have been developed to remove undesirable concepts from pre-trained models. However, existing methods remain vulnerable to adversarial attacks that can regenerate the erased content. Moreover, achieving robust erasure often degrades the model's generative quality for safe, unrelated concepts, creating a difficult trade-off between safety and performance. To address this challenge, we introduce Classifier-Guided Concept Erasure (CGCE), an efficient plug-and-play framework that provides robust concept erasure for diverse generative models without altering their original weights. CGCE uses a lightweight classifier operating on text embeddings to first detect and then refine prompts containing undesired concepts. This approach is highly scalable, allowing for multi-concept erasure by aggregating guidance from several classifiers. By modifying only unsafe embeddings at inference time, our method prevents harmful content generation while preserving the model's original quality on benign prompts. Extensive experiments show that CGCE achieves state-of-the-art robustness against a wide range of red-teaming attacks. Our approach also maintains high generative utility, demonstrating a superior balance between safety and performance. We showcase the versatility of CGCE through its successful application to various modern T2I and T2V models, establishing it as a practical and effective solution for safe generative AI.

Paperid: 3026, https://arxiv.org/pdf/2511.05319.pdf

Abstract:
Although steganography has made significant advancements in recent years, it still struggles to embed semantically rich, sentence-level information into carriers. However, in the era of AIGC, the capacity of steganography is more critical than ever. In this work, we present Sentence-to-Image Steganography, an instance of Semantic Steganography, a novel task that enables the hiding of arbitrary sentence-level messages within a cover image. Furthermore, we establish a benchmark named Invisible Text (IVT), comprising a diverse set of sentence-level texts as secret messages for evaluation. Finally, we present $\mathbf{S^2LM}$: Semantic Steganographic Language Model, which utilizes large language models (LLMs) to embed high-level textual information, such as sentences or even paragraphs, into images. Unlike traditional bit-level counterparts, $\mathrm{S^2LM}$ enables the integration of semantically rich content through a newly designed pipeline in which the LLM is involved throughout the entire process. Both quantitative and qualitative experiments demonstrate that our method effectively unlocks new semantic steganographic capabilities for LLMs. The source code will be released soon.

Paperid: 3027, https://arxiv.org/pdf/2511.05193.pdf

Abstract:
With their widespread popularity, web services have become the main targets of various cyberattacks. Existing traffic anomaly detection approaches focus on flow-level attacks, yet fail to recognize behavior-level attacks, which appear benign in individual flows but reveal malicious purpose using multiple network flows. To transcend this limitation, we propose a novel unsupervised traffic anomaly detection system, BLADE, capable of detecting not only flow-level but also behavior-level attacks in web services. Our key observation is that application-layer operations of web services exhibit distinctive communication patterns at the network layer from a multi-flow perspective. BLADE first exploits a flow autoencoder to learn a latent feature representation and calculates its reconstruction losses per flow. Then, the latent representation is assigned a pseudo operation label using an unsupervised clustering method. Next, an anomaly score is computed based on the reconstruction losses. Finally, the triplets of timestamps, pseudo labels, and anomaly scores from multiple flows are aggregated and fed into a one-class classifier to characterize the behavior patterns of legitimate web operations, enabling the detection of flow-level and behavior-level anomalies. BLADE is extensively evaluated on both the custom dataset and the CIC-IDS2017 dataset. The experimental results demonstrate BLADE's superior performance, achieving high F1 scores of 0.9732 and 0.9801, respectively, on the two datasets, and outperforming traditional single-flow anomaly detection baselines.

Paperid: 3028, https://arxiv.org/pdf/2511.05185.pdf

Abstract:
The deployment of autonomous systems has experienced remarkable growth in recent years, driven by their integration into sectors such as industry, medicine, logistics, and domestic environments. This expansion is accompanied by a series of security issues that entail significant risks due to the critical nature of autonomous systems, especially those operating in human-interaction environments. Furthermore, technological advancement and the high operational and architectural complexity of autonomous systems have resulted in an increased attack surface. This article presents a specific security auditing procedure for autonomous systems, based on a layer-structured methodology, a threat taxonomy adapted to the robotic context, and a set of concrete mitigation measures. The validity of the proposed approach is demonstrated through four practical case studies applied to representative robotic platforms: the Vision 60 military quadruped from Ghost Robotics, the A1 robot from Unitree Robotics, the UR3 collaborative arm from Universal Robots, and the Pepper social robot from Aldebaran Robotics.

Paperid: 3029, https://arxiv.org/pdf/2511.05156.pdf

Abstract:
With more and more existing networks being transformed to Software-Defined Networking (SDN), they need to be more secure and demand smarter ways of traffic control. This work, SmartSecChain-SDN, is a platform that combines machine learning based intrusion detection, blockchain-based storage of logs, and application-awareness-based priority in SDN networks. To detect network intrusions in a real-time, precision and low-false positives setup, the framework utilizes the application of advanced machine learning algorithms, namely Random Forest, XGBoost, CatBoost, and CNN-BiLSTM. SmartSecChain-SDN is based on the Hyperledger Fabric, which is a permissioned blockchain technology, to provide secure, scalable, and privacy-preserving storage and, thus, guarantee that the Intrusion Detection System (IDS) records cannot be altered and can be analyzed comprehensively. The system also has Quality of Service (QoS) rules and traffic shaping based on applications, which enables prioritization of critical services, such as VoIP, video conferencing, and business applications, as well as de-prioritization of non-essential traffic, such as downloads and updates. Mininet can simulate real-time SDN scenarios because it is used to prototype whole architectures. It is also compatible with controllers OpenDaylight and Ryu. It has tested the framework using the InSDN dataset and proved that it can identify different kinds of cyberattacks and handle bandwidth allocation efficiently under circumstances of resource constraints. SmartSecChain-SDN comprehensively addresses SDN system protection, securing and enhancing. The proposed study offers an innovative, extensible way to improve cybersecurity, regulatory compliance, and the administration of next-generation programmable networks.

Paperid: 3030, https://arxiv.org/pdf/2511.05111.pdf

Abstract:
In this paper, we provide a probabilistic analysis of the confidentiality in a card-based protocol. We focus on Bert den Boer's original Five Card Trick to develop our approach. Five Card Trick was formulated as a secure two-party computation method, where two players use colored cards with identical backs to calculate the logical AND operation on the bits that they choose. In this method, the players first arrange the cards privately, and then shuffle them through a random cut. Finally, they reveal the shuffled arrangement to determine the result of the operation. An unbiased random cut is essential to prevent players from exposing their chosen bits to each other. However, players typically choose to move cards within the deck even though not moving any cards should be equally likely. This unconscious behavior results in a biased, nonuniform shuffling-distribution in the sense that some arrangements of cards are slightly more probable after the cut. Such a nonuniform distribution creates an opportunity for a malicious player to gain advantage in guessing the other player's choice. We provide the conditional probabilities of such guesses as a way to quantify the information leakage. Furthermore, we utilize the eigenstructure of a Markov chain to derive tight bounds on the number of times the biased random cuts must be repeated to reduce the leakage to an acceptable level. We also discuss the generalization of our approach to the setting where shuffling is conducted by a malicious player.

Paperid: 3031, https://arxiv.org/pdf/2511.04633.pdf

Abstract:
Quantum cryptography is a rapidly-developing area which leverages quantum information to accomplish classically-impossible tasks. In many of these protocols, quantum states are used as long-term cryptographic keys. Typically, this is to ensure the keys cannot be copied by an adversary, owing to the quantum no-cloning theorem. Unfortunately, due to quantum state's tendency to decohere, persistent quantum memory will likely be one of the most challenging resources for quantum computers. As such, it will be important to minimize persistent memory in quantum protocols. In this work, we consider the case of one-shot signatures (OSS), and more general quantum signing tokens. These are important unclonable primitives, where quantum signing keys allow for signing a single message but not two. Naturally, these quantum signing keys would require storage in long-term quantum memory. Very recently, the first OSS was constructed in a classical oracle model and also in the standard model, but we observe that the quantum memory required for these protocols is quite large. In this work, we significantly decrease the quantum secret key size, in some cases achieving asymptotically optimal size. To do so, we develop novel techniques for proving the security of cryptosystems using coset states, which are one of the main tools used in unclonable cryptography.

Paperid: 3032, https://arxiv.org/pdf/2511.04399.pdf

Abstract:
Secret-sharing schemes allow a dealer to split a secret into multiple "shares" and distribute them individually among many parties while mandating certain constraints on its reconstruction. Such protocols are usually executed over a secure communication channel since an eavesdropper, after intercepting all the shares, is expected to be able to reconstruct the secret. Leveraging the unique properties of quantum channels, several quantum protocols have been designed for secret sharing. However, almost all of them detect the presence of an eavesdropper by statistical analysis of the outcome of multiple rounds, or simply require a secure channel of communication. We present a complete characterisation of the correctness and security properties of a quantum-search based secret-sharing framework proposed by Hsu (2003). The scheme was designed to work over public channels without requiring multiple rounds to detect eavesdropping. Our characterisation allowed us to improve the original protocol to be more resistant towards eavesdropping. However, we prove that complete security against an eavesdropper is not possible in this framework.

Paperid: 3033, https://arxiv.org/pdf/2511.04188.pdf

Abstract:
We introduce a quantum key distribution (QKD) primitive based on charge teleportation: by Local Operations and Classical Communication (LOCC) on an entangled many-body ground state, Alice's one-bit choice steers the sign of a local charge shift at Bob, which directly encodes the key bit. Relative to energy teleportation schemes, the charge signal is bit-symmetric, measured in a single basis, and markedly more robust to realistic noise and model imperfections. We instantiate the protocol on transverse-field Ising models, star-coupled and one-dimensional chain, obtain closed-form results for two qubits, and for larger systems confirm performance via exact diagonalization, circuit-level simulations, and a proof-of-principle hardware run. We quantify resilience to classical bit flips and local quantum noise, identifying regimes where sign integrity, and hence key correctness, is preserved. These results position charge teleportation as a practical, low-rate QKD primitive compatible with near-term platforms.

Paperid: 3034, https://arxiv.org/pdf/2511.02924.pdf

Abstract:
The proliferation of Internet of Things (IoT) networks demands security mechanisms that protect constrained devices without the computational cost of public-key cryptography. Conventional Pre-Shared Key (PSK) encryption, while efficient, remains vulnerable due to static key reuse, replay attacks, and the lack of forward secrecy. This paper presents the Dynamic Session Enhanced Key Protocol (DSEKP) - a lightweight session-key rekeying framework, a fully symmetric extension to PSK that derives per-session AES-GCM keys using the HMAC-based Key Derivation Function (HKDF-SHA256) and authenticates session establishment through an HMAC proof in a single init-ack exchange. DSEKP was implemented on an ESP32 IoT sensor node and a Raspberry Pi 5 edge server communicating through a Mosquitto MQTT broker, and benchmarked against a static PSK baseline over more than 6,500 encrypted packets per configuration. The results demonstrate nearly identical throughput and reliability, with moderate overhead - mean latency increased by 27% and payload size by 10% - while delivering per-session forward secrecy and built-in replay protection. These findings confirm that dynamic symmetric rekeying can substantially strengthen IoT-Edge links with minimal computational and bandwidth cost, offering a practical migration path from static PSK to session-aware, scalable, and reproducible IoT security.

Paperid: 3035, https://arxiv.org/pdf/2511.02055.pdf

Abstract:
The modern AI data economy centralizes power, limits innovation, and misallocates value by extracting data without control, privacy, or fair compensation. We introduce Private Map-Secure Reduce (PMSR), a network-native paradigm that transforms data economics from extractive to participatory through cryptographically enforced markets. Extending MapReduce to decentralized settings, PMSR enables computation to move to the data, ensuring verifiable privacy, efficient price discovery, and incentive alignment. Demonstrations include large-scale recommender audits, privacy-preserving LLM ensembling (87.5\% MMLU accuracy across six models), and distributed analytics over hundreds of nodes. PMSR establishes a scalable, equitable, and privacy-guaranteed foundation for the next generation of AI data markets.

Paperid: 3036, https://arxiv.org/pdf/2511.01391.pdf

Abstract:
In 5G and beyond networks, the radio communication between a User Equipment (UE) and a base station (gNodeB or gNB), also known as the air interface, is a critical component of network access and connectivity. During the connection establishment procedure, the Radio Resource Control (RRC) layer can be vulnerable to signaling storms, which threaten the availability of the radio access control plane. These attacks may occur when one or more UEs send a large number of connection requests to the gNB, preventing new UEs from establishing connections. In this paper, we investigate the detection of such threats and propose an adaptive threshold-based detection system based on Extreme Value Theory (EVT). The proposed solution is evaluated numerically by applying simulated attack scenarios based on a realistic threat model on top of real-world RRC traffic data from an operator network. We show that, by leveraging features from the RRC layer only, the detection system can not only identify the attacks but also differentiate them from legitimate high-traffic situations. The adaptive threshold calculated using EVT ensures that the system can work under diverse traffic conditions. The results show high accuracy, precision, and recall values (above 93%), and a low detection latency even under complex conditions.

Paperid: 3037, https://arxiv.org/pdf/2511.01023.pdf

Abstract:
We analyze subliminal transfer in Transformer models, where a teacher embeds hidden traits that can be linearly decoded by a student without degrading main-task performance. Prior work often attributes transferability to global representational similarity, typically quantified with Centered Kernel Alignment (CKA). Using synthetic corpora with disentangled public and private labels, we distill students under matched and independent random initializations. We find that transfer strength hinges on alignment within a trait-discriminative subspace: same-seed students inherit this alignment and show higher leakage {τ\approx} 0.24, whereas different-seed students -- despite global CKA > 0.9 -- exhibit substantially reduced excess accuracy {τ\approx} 0.12 - 0.13. We formalize this with subspace-level CKA diagnostic and residualized probes, showing that leakage tracks alignment within the trait-discriminative subspace rather than global representational similarity. Security controls (projection penalty, adversarial reversal, right-for-the-wrong-reasons regularization) reduce leakage in same-base models without impairing public-task fidelity. These results establish seed-induced uniqueness as a resilience property and argue for subspace-aware diagnostics for secure multi-model deployments.

Paperid: 3038, https://arxiv.org/pdf/2511.00973.pdf

Abstract:
We introduce Model-Bound Latent Exchange (MoBLE), a decoder-binding property in Transformer autoencoders formalized as Zero-Shot Decoder Non-Transferability (ZSDN). In identity tasks using iso-architectural models trained on identical data but differing in seeds, self-decoding achieves more than 0.91 exact match and 0.98 token accuracy, while zero-shot cross-decoding collapses to chance without exact matches. This separation arises without injected secrets or adversarial training, and is corroborated by weight-space distances and attention-divergence diagnostics. We interpret ZSDN as model binding, a latent-based authentication and access-control mechanism, even when the architecture and training recipe are public: encoder's hidden state representation deterministically reveals the plaintext, yet only the correctly keyed decoder reproduces it in zero-shot. We formally define ZSDN, a decoder-binding advantage metric, and outline deployment considerations for secure artificial intelligence (AI) pipelines. Finally, we discuss learnability risks (e.g., adapter alignment) and outline mitigations. MoBLE offers a lightweight, accelerator-friendly approach to secure AI deployment in safety-critical domains, including aviation and cyber-physical systems.

Paperid: 3039, https://arxiv.org/pdf/2511.00460.pdf

Abstract:
Centralized Software-Defined Networking (cSDN) offers flexible and programmable control of networks but suffers from scalability and reliability issues due to its reliance on centralized controllers. Decentralized SDN (dSDN) alleviates these concerns by distributing control across multiple local controllers, yet this architecture remains highly vulnerable to Distributed Denial-of-Service (DDoS) attacks. In this paper, we propose a novel detection and mitigation framework tailored for dSDN environments. The framework leverages lightweight port-level statistics combined with prompt engineering and in-context learning, enabling the DeepSeek-v3 Large Language Model (LLM) to classify traffic as benign or malicious without requiring fine-tuning or retraining. Once an anomaly is detected, mitigation is enforced directly at the attacker's port, ensuring that malicious traffic is blocked at their origin while normal traffic remains unaffected. An automatic recovery mechanism restores normal operation after the attack inactivity, ensuring both security and availability. Experimental evaluation under diverse DDoS attack scenarios demonstrates that the proposed approach achieves near-perfect detection, with 99.99% accuracy, 99.97% precision, 100% recall, 99.98% F1-score, and an AUC of 1.0. These results highlight the effectiveness of combining distributed monitoring with zero-training LLM inference, providing a proactive and scalable defense mechanism for securing dSDN infrastructures against DDoS threats.

Paperid: 3040, https://arxiv.org/pdf/2511.00140.pdf

Abstract:
This paper presents a proof-of-concept supply chain attack against the Secure ROS 2 (SROS 2) framework, demonstrated on a Quanser QCar2 autonomous vehicle platform. A Trojan-infected Debian package modifies core ROS 2 security commands to exfiltrate newly generated keystore credentials via DNS in base64-encoded chunks to an attacker-controlled nameserver. Possession of these credentials enables the attacker to rejoin the SROS 2 network as an authenticated participant and publish spoofed control or perception messages without triggering authentication failures. We evaluate this capability on a secure ROS 2 Humble testbed configured for a four-stop-sign navigation routine using an Intel RealSense camera for perception. Experimental results show that control-topic injections can cause forced braking, sustained high-speed acceleration, and continuous turning loops, while perception-topic spoofing can induce phantom stop signs or suppress real detections. The attack generalizes to any data distribution service (DDS)-based robotic system using SROS 2, highlighting the need for both supply chain integrity controls and runtime semantic validation to safeguard autonomous systems against insider and impersonation threats.

Paperid: 3041, https://arxiv.org/pdf/2510.26941.pdf

Abstract:
The Internet of Things has expanded rapidly, transforming communication and operations across industries but also increasing the attack surface and security breaches. Artificial Intelligence plays a key role in securing IoT, enabling attack detection, attack behavior analysis, and mitigation suggestion. Despite advancements, evaluations remain purely qualitative, and the lack of a standardized, objective benchmark for quantitatively measuring AI-based attack analysis and mitigation hinders consistent assessment of model effectiveness. In this work, we propose a hybrid framework combining Machine Learning (ML) for multi-class attack detection with Large Language Models (LLMs) for attack behavior analysis and mitigation suggestion. After benchmarking several ML and Deep Learning (DL) classifiers on the Edge-IIoTset and CICIoT2023 datasets, we applied structured role-play prompt engineering with Retrieval-Augmented Generation (RAG) to guide ChatGPT-o3 and DeepSeek-R1 in producing detailed, context-aware responses. We introduce novel evaluation metrics for quantitative assessment to guide us and an ensemble of judge LLMs, namely ChatGPT-4o, DeepSeek-V3, Mixtral 8x7B Instruct, Gemini 2.5 Flash, Meta Llama 4, TII Falcon H1 34B Instruct, xAI Grok 3, and Claude 4 Sonnet, to independently evaluate the responses. Results show that Random Forest has the best detection model, and ChatGPT-o3 outperformed DeepSeek-R1 in attack analysis and mitigation.

Paperid: 3042, https://arxiv.org/pdf/2510.26792.pdf

Abstract:
We study the ability of Transformer models to learn sequences generated by Permuted Congruential Generators (PCGs), a widely used family of pseudo-random number generators (PRNGs). PCGs introduce substantial additional difficulty over linear congruential generators (LCGs) by applying a series of bit-wise shifts, XORs, rotations and truncations to the hidden state. We show that Transformers can nevertheless successfully perform in-context prediction on unseen sequences from diverse PCG variants, in tasks that are beyond published classical attacks. In our experiments we scale moduli up to $2^{22}$ using up to $50$ million model parameters and datasets with up to $5$ billion tokens. Surprisingly, we find even when the output is truncated to a single bit, it can be reliably predicted by the model. When multiple distinct PRNGs are presented together during training, the model can jointly learn them, identifying structures from different permutations. We demonstrate a scaling law with modulus $m$: the number of in-context sequence elements required for near-perfect prediction grows as $\sqrt{m}$. For larger moduli, optimization enters extended stagnation phases; in our experiments, learning moduli $m \geq 2^{20}$ requires incorporating training data from smaller moduli, demonstrating a critical necessity for curriculum learning. Finally, we analyze embedding layers and uncover a novel clustering phenomenon: the model spontaneously groups the integer inputs into bitwise rotationally-invariant clusters, revealing how representations can transfer from smaller to larger moduli.

Paperid: 3043, https://arxiv.org/pdf/2510.26499.pdf

Abstract:
Extracting structured intelligence via Named Entity Recognition (NER) is critical for cybersecurity, but the proliferation of datasets with incompatible annotation schemas hinders the development of comprehensive models. While combining these resources is desirable, we empirically demonstrate that naively concatenating them results in a noisy label space that severely degrades model performance. To overcome this critical limitation, we introduce CyberNER, a large-scale, unified corpus created by systematically harmonizing four prominent datasets (CyNER, DNRTI, APTNER, and Attacker) onto the STIX 2.1 standard. Our principled methodology resolves semantic ambiguities and consolidates over 50 disparate source tags into 21 coherent entity types. Our experiments show that models trained on CyberNER achieve a substantial performance gain, with a relative F1-score improvement of approximately 30% over the naive concatenation baseline. By publicly releasing the CyberNER corpus, we provide a crucial, standardized benchmark that enables the creation and rigorous comparison of more robust and generalizable entity extraction models for the cybersecurity domain.

Paperid: 3044, https://arxiv.org/pdf/2510.26307.pdf

Abstract:
Anomaly detection is a critical task in cybersecurity, where identifying insider threats, access violations, and coordinated attacks is essential for ensuring system resilience. Graph-based approaches have become increasingly important for modeling entity interactions, yet most rely on homogeneous and static structures, which limits their ability to capture the heterogeneity and temporal evolution of real-world environments. Heterogeneous Graph Neural Networks (HGNNs) have emerged as a promising paradigm for anomaly detection by incorporating type-aware transformations and relation-sensitive aggregation, enabling more expressive modeling of complex cyber data. However, current research on HGNN-based anomaly detection remains fragmented, with diverse modeling strategies, limited comparative evaluation, and an absence of standardized benchmarks. To address this gap, we provide a comprehensive survey of HGNN-based anomaly detection methods in cybersecurity. We introduce a taxonomy that classifies approaches by anomaly type and graph dynamics, analyze representative models, and map them to key cybersecurity applications. We also review commonly used benchmark datasets and evaluation metrics, highlighting their strengths and limitations. Finally, we identify key open challenges related to modeling, data, and deployment, and outline promising directions for future research. This survey aims to establish a structured foundation for advancing HGNN-based anomaly detection toward scalable, interpretable, and practically deployable solutions.

Paperid: 3045, https://arxiv.org/pdf/2510.26179.pdf

Abstract:
Edge computing alleviates the computation burden of data-driven control in cyber-physical systems (CPSs) by offloading complex processing to edge servers. However, the increasing sophistication of cyberattacks underscores the need for security measures that go beyond conventional IT protections and address the unique vulnerabilities of CPSs. This study proposes a confidential data-driven gain-tuning framework using homomorphic encryption, such as ElGamal and CKKS encryption schemes, to enhance cybersecurity in gain-tuning processes outsourced to external servers. The idea for realizing confidential FRIT is to replace the matrix inversion operation with a vector summation form, allowing homomorphic operations to be applied. Numerical examples under 128-bit security confirm performance comparable to conventional methods while providing guidelines for selecting suitable encryption schemes for secure CPS.

Paperid: 3046, https://arxiv.org/pdf/2510.26102.pdf

Abstract:
Local Differential Privacy (LDP) is a widely adopted privacy-protection model in the Internet of Things (IoT) due to its lightweight, decentralized, and scalable nature. However, it is vulnerable to poisoning attacks, and existing defenses either incur prohibitive resource overheads or rely on domain-specific prior knowledge, limiting their practical deployment. To address these limitations, we propose PEEL, a Poisoning-Exposing Encoding theoretical framework for LDP, which departs from resource- or prior-dependent countermeasures and instead leverages the inherent structural consistency of LDP-perturbed data. As a non-intrusive post-processing module, PEEL amplifies stealthy poisoning effects by re-encoding LDP-perturbed data via sparsification, normalization, and low-rank projection, thereby revealing both output and rule poisoning attacks through structural inconsistencies in the reconstructed space. Theoretical analysis proves that PEEL, integrated with LDP, retains unbiasedness and statistical accuracy, while being robust to expose both output and rule poisoning attacks. Moreover, evaluation results show that LDP-integrated PEEL not only outperforms four state-of-the-art defenses in terms of poisoning exposure accuracy but also significantly reduces client-side computational costs, making it highly suitable for large-scale IoT deployments.

Paperid: 3047, https://arxiv.org/pdf/2510.26037.pdf

Abstract:
The ability of LLM agents to plan and invoke tools exposes them to new safety risks, making a comprehensive red-teaming system crucial for discovering vulnerabilities and ensuring their safe deployment. We present SIRAJ: a generic red-teaming framework for arbitrary black-box LLM agents. We employ a dynamic two-step process that starts with an agent definition and generates diverse seed test cases that cover various risk outcomes, tool-use trajectories, and risk sources. Then, it iteratively constructs and refines model-based adversarial attacks based on the execution trajectories of former attempts. To optimize the red-teaming cost, we present a model distillation approach that leverages structured forms of a teacher model's reasoning to train smaller models that are equally effective. Across diverse evaluation agent settings, our seed test case generation approach yields 2 -- 2.5x boost to the coverage of risk outcomes and tool-calling trajectories. Our distilled 8B red-teamer model improves attack success rate by 100%, surpassing the 671B Deepseek-R1 model. Our ablations and analyses validate the effectiveness of the iterative framework, structured reasoning, and the generalization of our red-teamer models.

Paperid: 3048, https://arxiv.org/pdf/2510.25960.pdf

Abstract:
In this paper, we present a framework that uses acoustic side-channel analysis (ASCA) to monitor and verify whether a robot correctly executes its intended commands. We develop and evaluate a machine-learning-based workflow verification system that uses acoustic emissions generated by robotic movements. The system can determine whether real-time behavior is consistent with expected commands. The evaluation takes into account movement speed, direction, and microphone distance. The results show that individual robot movements can be validated with over 80% accuracy under baseline conditions using four different classifiers: Support Vector Machine (SVM), Deep Neural Network (DNN), Recurrent Neural Network (RNN), and Convolutional Neural Network (CNN). Additionally, workflows such as pick-and-place and packing could be identified with similarly high confidence. Our findings demonstrate that acoustic signals can support real-time, low-cost, passive verification in sensitive robotic environments without requiring hardware modifications.

Paperid: 3049, https://arxiv.org/pdf/2510.25939.pdf

Abstract:
The advent of Large Language Models (LLMs) promised to resolve the long-standing paradox in honeypot design: achieving high-fidelity deception with low operational risk. However, despite a flurry of research since late 2022, progress has been incremental, and the field lacks a cohesive understanding of the emerging architectural patterns, core challenges, and evaluation paradigms. To fill this gap, this Systematization of Knowledge (SoK) paper provides the first comprehensive overview of this new domain. We survey and systematize three critical, intersecting research areas: first, we provide a taxonomy of honeypot detection vectors, structuring the core problems that LLM-based realism must solve; second, we synthesize the emerging literature on LLM-honeypots, identifying a canonical architecture and key evaluation trends; and third, we chart the evolutionary path of honeypot log analysis, from simple data reduction to automated intelligence generation. We synthesize these findings into a forward-looking research roadmap, arguing that the true potential of this technology lies in creating autonomous, self-improving deception systems to counter the emerging threat of intelligent, automated attackers.

Paperid: 3050, https://arxiv.org/pdf/2510.25470.pdf

Abstract:
There is an increase in global malware threats. To address this, an encryption-type ransomware has been introduced on the Android operating system. The challenges associated with malicious threats in phone use have become a pressing issue in mobile communication, disrupting user experiences and posing significant privacy threats. This study surveys commonly used machine learning techniques for detecting malicious threats in phones and examines their performance. The majority of past research focuses on customer feedback and reviews, with concerns that people might create false reviews to promote or devalue products and services for personal gain. Hence, the development of techniques for detecting malicious threats using machine learning has been a key focus. This paper presents a comprehensive comparative study of current research on the issue of malicious threats and methods for tackling these challenges. Nevertheless, a huge amount of information is required by these methods, presenting a challenge for developing robust, specialized automated anti-malware systems. This research describes the Android Applications dataset, and the accuracy of the techniques is measured using the accuracy levels of the metrics employed in this study.

Paperid: 3051, https://arxiv.org/pdf/2510.25375.pdf

Abstract:
Increasing complexity and connectivity of modern vehicles have heightened their vulnerability to cyberattacks. This paper addresses security challenges associated with the Unified Diagnostic Services (UDS) protocol, a critical communication framework for vehicle diagnostics in the automotive industry. We present security monitoring strategies for the UDS protocol that leverage in-vehicle logging and remote analysis through a Vehicle Security Operations Center (VSOC). Our approach involves specifying security event logging requirements, contextual data collection, and the development of detection strategies aimed at identifying UDS attack scenarios. By applying these strategies to a comprehensive taxonomy of UDS attack techniques, we demonstrate that our detection methods cover a wide range of potential attack vectors. Furthermore, we assess the adequacy of current AUTOSAR standardized security events in supporting UDS attack detection, identifying gaps in the current standard. This work enhances the understanding of vehicle security monitoring and provides an example for developing robust cybersecurity measures in automotive communication protocols.

Paperid: 3052, https://arxiv.org/pdf/2510.25352.pdf

Abstract:
We perform a passive measurement study investigating how a Protective DNS service might perform in a Research & Education Network serving hundreds of member institutions. Utilizing freely-available DNS blocklists consisting of domain names deemed to be threats, we test hundreds of millions of users' real DNS queries, observed over a week's time, to find which answers would be blocked because they involve domain names that are potential threats. We find the blocklists disorderly regarding their names, goals, transparency, and provenance making them quite difficult to compare. Consequently, these Protective DNS underpinnings lack organized oversight, presenting challenges and risks in operation at scale.

Paperid: 3053, https://arxiv.org/pdf/2510.24422.pdf

Abstract:
Binarized Neural Networks (BNNs) deployed on memristive crossbar arrays provide energy-efficient solutions for edge computing but are susceptible to physical attacks due to memristor nonvolatility. Recently, Rajendran et al. (IEEE Embedded Systems Letter 2025) proposed a Physical Unclonable Function (PUF)-based scheme to secure BNNs against theft attacks. Specifically, the weight and bias matrices of the BNN layers were secured by swapping columns based on device's PUF key bits. In this paper, we demonstrate that this scheme to secure BNNs is vulnerable to PUF-key recovery attack. As a consequence of our attack, we recover the secret weight and bias matrices of the BNN. Our approach is motivated by differential cryptanalysis and reconstructs the PUF key bit-by-bit by observing the change in model accuracy, and eventually recovering the BNN model parameters. Evaluated on a BNN trained on the MNIST dataset, our attack could recover 85% of the PUF key, and recover the BNN model up to 93% classification accuracy compared to the original model's 96% accuracy. Our attack is very efficient and it takes a couple of minutes to recovery the PUF key and the model parameters.

Paperid: 3054, https://arxiv.org/pdf/2510.24031.pdf

Abstract:
System logs are a cornerstone of cybersecurity, supporting proactive breach prevention and post-incident investigations. However, analyzing vast amounts of diverse log data remains significantly challenging, as high costs, lack of in-house expertise, and time constraints make even basic analysis difficult for many organizations. This study introduces LLMLogAnalyzer, a clustering-based log analysis chatbot that leverages Large Language Models (LLMs) and Machine Learning (ML) algorithms to simplify and streamline log analysis processes. This innovative approach addresses key LLM limitations, including context window constraints and poor structured text handling capabilities, enabling more effective summarization, pattern extraction, and anomaly detection tasks. LLMLogAnalyzer is evaluated across four distinct domain logs and various tasks. Results demonstrate significant performance improvements over state-of-the-art LLM-based chatbots, including ChatGPT, ChatPDF, and NotebookLM, with consistent gains ranging from 39% to 68% across different tasks. The system also exhibits strong robustness, achieving a 93% reduction in interquartile range (IQR) when using ROUGE-1 scores, indicating significantly lower result variability. The framework's effectiveness stems from its modular architecture comprising a router, log recognizer, log parser, and search tools. This design enhances LLM capabilities for structured text analysis while improving accuracy and robustness, making it a valuable resource for both cybersecurity experts and non-technical users.

Paperid: 3055, https://arxiv.org/pdf/2510.23931.pdf

Abstract:
Federated Learning (FL) allows for the training of Machine Learning models in a collaborative manner without the need to share sensitive data. However, it remains vulnerable to Gradient Leakage Attacks (GLAs), which can reveal private information from the shared model updates. In this work, we investigate the effectiveness of Differential Privacy (DP) mechanisms - specifically, DP-SGD and a variant based on explicit regularization (PDP-SGD) - as defenses against GLAs. To this end, we evaluate the performance of several computer vision models trained under varying privacy levels on a simple classification task, and then analyze the quality of private data reconstructions obtained from the intercepted gradients in a simulated FL environment. Our results demonstrate that DP-SGD significantly mitigates the risk of gradient leakage attacks, albeit with a moderate trade-off in model utility. In contrast, PDP-SGD maintains strong classification performance but proves ineffective as a practical defense against reconstruction attacks. These findings highlight the importance of empirically evaluating privacy mechanisms beyond their theoretical guarantees, particularly in distributed learning scenarios where information leakage may represent an unassumable critical threat to data security and privacy.

Paperid: 3056, https://arxiv.org/pdf/2510.23622.pdf

Abstract:
Adversarial attacks pose a severe risk to AI systems used in healthcare, capable of misleading models into dangerous misclassifications that can delay treatments or cause misdiagnoses. These attacks, often imperceptible to human perception, threaten patient safety, particularly in underserved populations. Our study explores these vulnerabilities through empirical experimentation on a dermatological dataset, where adversarial methods significantly reduce classification accuracy. Through detailed threat modeling, experimental benchmarking, and model evaluation, we demonstrate both the severity of the threat and the partial success of defenses like adversarial training and distillation. Our results show that while defenses reduce attack success rates, they must be balanced against model performance on clean data. We conclude with a call for integrated technical, ethical, and policy-based approaches to build more resilient, equitable AI in healthcare.

Paperid: 3057, https://arxiv.org/pdf/2510.23462.pdf

Abstract:
This paper addresses the urgent need for a cybersecurity framework tailored to quantum communication systems as the world transitions to quantum-safe infrastructures. While quantum communication promises unbreakable security, real-world deployments are vulnerable to physical, protocol, and operational risks. Our work presents a structured framework for analysing these threats, combining a TTP-style (Tactic, Technique, Procedure) approach with a specific risk assessment methodology. We introduce SQOUT, a quantum threat intelligence platform, and illustrate its application using a Photon-Number-Splitting (PNS) attack kill chain. Furthermore, we apply established international standards and best practices for information security risk management to assess quantum-specific risk scenarios, providing practical guidance for safeguarding emerging quantum infrastructures.

Paperid: 3058, https://arxiv.org/pdf/2510.23101.pdf

Abstract:
Directed greybox fuzzing (DGF) aims to efficiently trigger bugs at specific target locations by prioritizing seeds whose execution paths are more likely to mutate into triggering target bugs. However, existing DGF approaches suffer from imprecise probability calculations due to their reliance on complex distance metrics derived from static analysis. The over-approximations inherent in static analysis cause a large number of irrelevant execution paths to be mistakenly considered to potentially mutate into triggering target bugs, significantly reducing fuzzing efficiency. We propose to replace static analysis-based distance metrics with precise call stack representations. Call stacks represent precise control flows, thereby avoiding false information in static analysis. We leverage large language models (LLMs) to predict vulnerability-triggering call stacks for guiding seed prioritization. Our approach constructs call graphs through static analysis to identify methods that can potentially reach target locations, then utilizes LLMs to predict the most likely call stack sequence that triggers the vulnerability. Seeds whose execution paths have higher overlap with the predicted call stack are prioritized for mutation. This is the first work to integrate LLMs into the core seed prioritization mechanism of DGF. We implement our approach and evaluate it against several state-of-the-art fuzzers. On a suite of real-world programs, our approach triggers vulnerabilities $1.86\times$ to $3.09\times$ faster compared to baselines. In addition, our approach identifies 10 new vulnerabilities and 2 incomplete fixes in the latest versions of programs used in our controlled experiments through directed patch testing, with 10 assigned CVE IDs.

Paperid: 3059, https://arxiv.org/pdf/2510.22971.pdf

Abstract:
Introduced by Juels and Rivest in 2013, Honeywords, which are decoy passwords stored alongside a real password, appear to be a proactive method to help detect password credentials misuse. However, despite over a decade of research, this technique has not been adopted by major authentication platforms. This position paper argues that the core concept of Honeywords has potential but requires more research on issues such as flatness, integration, and reliability, in order to be a practical deployable solution. This paper examines the current work on Honeyword generation, attacker modeling, and honeychecker architecture, analyzing the subproblems that have been addressed and ongoing issues that prevent this system from being more widely used. The paper then suggests a deployable framework that combines the attacker-resilient, context-aware decoy creation that Honeywords provide with easy integration into existing systems. Honeywords will only move from an academic idea to a practical security tool if technical advances are paired with secure and straightforward architectures, along with adaptive response handling and detailed configuration checks.

Paperid: 3060, https://arxiv.org/pdf/2510.22661.pdf

Abstract:
Post-quantum multivariate public key cryptography (MPKC) schemes resist quantum threats but require heavy operations, such as rejection sampling, which challenge resource-limited devices. Prior hardware designs have addressed various aspects of MPKC signature generation. However, rejection sampling remains largely unexplored in such contexts. This paper presents RejSCore, a lightweight hardware accelerator for rejection sampling in post-quantum cryptography. It specifically targets the QR-UOV scheme, which is a prominent candidate under the second-round of the National Institute of Standards and Technology (NIST) additional digital signature standardization process. The architecture includes an AES-CTR-128-based pseudorandom number generator. Moreover, a lightweight iterative method is employed in rejection sampling, offering reduced resource consumption and area overhead while slightly increasing latency. The performance of RejSCore is comprehensively evaluated on Artix-7 FPGAs and 65 nm CMOS technology using the Area-Delay Product (ADP) and Power-Delay Product (PDP). On Artix-7 and 65 nm CMOS, RejSCore achieves an area of 2042 slices and 464,866~$μm^2$, with operating frequencies of 222 MHz and 565 MHz, respectively. Using the QR-UOV parameters for security level I ($q = 127$, $v = 156$, $m = 54$, $l = 3$), the core completes its operation in 8525 clock cycles. The ADP and PDP evaluations confirm RejSCore's suitability for deployment in resource-constrained and security-critical environments.

Paperid: 3061, https://arxiv.org/pdf/2510.22561.pdf

Abstract:
Blockchain systems rely on decentralized ledgers and strong security guarantees. A key requirement is non-repudiation, which prevents denial of transaction authorship and supports integrity of recorded data. This work surveys digital signature schemes used in blockchain platforms and analyzes how they deliver non-repudiation and contribute to overall system security. We examine representative scheme families and their cryptographic foundations, security assumptions, and properties relevant to deployment, including unforgeability, resistance to malleability, support for aggregation and multisignature or threshold settings, key and signature sizes, and verification cost. Using these criteria, we compare the suitability of different designs for consensus protocols, smart contract constraints, and resource limits. We highlight practical tradeoffs that affect throughput, storage, scalability, and attack surfaces, and summarize benefits and limitations of each scheme in blockchain contexts. The study underscores that carefully chosen digital signatures are central to achieving non-repudiation and preserving information integrity, and it outlines implementation considerations and open directions such as interoperability and post-quantum readiness.

Paperid: 3062, https://arxiv.org/pdf/2510.22396.pdf

Abstract:
Patch backporting, the process of migrating mainline security patches to older branches, is an essential task in maintaining popular open-source projects (e.g., Linux kernel). However, manual backporting can be labor-intensive, while existing automated methods, which heavily rely on predefined syntax or semantic rules, often lack agility for complex patches. In this paper, we introduce PORTGPT, an LLM-agent for end-to-end automation of patch backporting in real-world scenarios. PORTGPT enhances an LLM with tools to access code on-demand, summarize Git history, and revise patches autonomously based on feedback (e.g., from compilers), hence, simulating human-like reasoning and verification. PORTGPT achieved an 89.15% success rate on existing datasets (1815 cases), and 62.33% on our own dataset of 146 complex cases, both outperforms state-of-the-art of backporting tools. We contributed 9 backported patches from PORTGPT to the Linux kernel community and all patches are now merged.

Paperid: 3063, https://arxiv.org/pdf/2510.21837.pdf

Abstract:
Anomaly detection in cybersecurity is a challenging task, where normal events far outnumber anomalous ones with new anomalies occurring frequently. Classical autoencoders have been used for anomaly detection, but struggles in data-limited settings which quantum counterparts can potentially overcome. In this work, we apply Quantum Autoencoders (QAEs) for anomaly detection in cybersecurity, specifically on the BPF-extended tracking honeypot (BETH) dataset. QAEs are evaluated across multiple encoding techniques, ansatz types, repetitions, and feature selection strategies. Our results demonstrate that an 8-feature QAE using Dense-Angle encoding with a RealAmplitude ansatz can outperform Classical Autoencoders (CAEs), even when trained on substantially fewer samples. The effects of quantum encoding and feature selection for developing quantum models are demonstrated and discussed. In a data-limited setting, the best performing QAE model has a F1 score of 0.87, better than that of CAE (0.77). These findings suggest that QAEs may offer practical advantages for anomaly detection in data-limited scenarios.

Paperid: 3064, https://arxiv.org/pdf/2510.21508.pdf

Abstract:
The proliferation of smart home devices has increased convenience but also introduced cybersecurity risks for everyday users, as many devices lack robust security features. Intrusion Detection Systems are a prominent approach to detecting cybersecurity threats. However, their alerts often use technical terms and require users to interpret them correctly, which is challenging for a typical smart home user. Large Language Models can bridge this gap by translating IDS alerts into actionable security notifications. However, it has not yet been clear what an actionable cybersecurity notification should look like. In this paper, we conduct an experimental online user study with 130 participants to examine how the length and complexity of LLM-generated notifications affect user likability, understandability, and motivation to act. Our results show that intermediate-complexity notifications are the most effective across all user groups, regardless of their technological proficiency. Across the board, users rated beginner-level messages as more effective when they were longer, while expert-level messages were rated marginally more effective when they were shorter. These findings provide insights for designing security notifications that are both actionable and broadly accessible to smart home users.

Paperid: 3065, https://arxiv.org/pdf/2510.21483.pdf

Abstract:
We present GRAFHEN, a new cryptographic scheme which offers Fully Homomorphic Encryption without the need for bootstrapping (or in other words, without noise). Building on the work of Nuida and others, we achieve this using encodings in groups. The groups are represented on a machine using rewriting systems. In this way the subgroup membership problem, which an attacker would have to solve in order to break the scheme, becomes maximally hard, while performance is preserved. In fact we include a simple benchmark demonstrating that our implementation runs several orders of magnitude faster than existing standards. We review many possible attacks against our protocol and explain how to protect the scheme in each case.

Paperid: 3066, https://arxiv.org/pdf/2510.21236.pdf

Abstract:
Large Language Models (LLMs) have evolved into AI agents that interact with external tools and environments to perform complex tasks. The Model Context Protocol (MCP) has become the de facto standard for connecting agents with such resources, but security has lagged behind: thousands of MCP servers execute with unrestricted access to host systems, creating a broad attack surface. In this paper, we introduce AgentBound, the first access control framework for MCP servers. AgentBound combines a declarative policy mechanism, inspired by the Android permission model, with a policy enforcement engine that contains malicious behavior without requiring MCP server modifications. We build a dataset containing the 296 most popular MCP servers, and show that access control policies can be generated automatically from source code with 80.9% accuracy. We also show that AgentBound blocks the majority of security threats in several malicious MCP servers, and that policy enforcement engine introduces negligible overhead. Our contributions provide developers and project managers with a practical foundation for securing MCP servers while maintaining productivity, enabling researchers and tool builders to explore new directions for declarative access control and MCP security.

Paperid: 3067, https://arxiv.org/pdf/2510.20922.pdf

Abstract:
Quantitative information flow (QIF) is concerned with assessing the leakage of information in computational systems. In QIF there are two main perspectives for the quantification of leakage. On one hand, the static perspective considers all possible runs of the system in the computation of information flow, and is usually employed when preemptively deciding whether or not to run the system. On the other hand, the dynamic perspective considers only a specific, concrete run of the system that has been realised, while ignoring all other runs. The dynamic perspective is relevant for, e.g., system monitors and trackers, especially when deciding whether to continue or to abort a particular run based on how much leakage has occurred up to a certain point. Although the static perspective of leakage is well-developed in the literature, the dynamic perspective still lacks the same level of theoretical maturity. In this paper we take steps towards bridging this gap with the following key contributions: (i) we provide a novel definition of dynamic leakage that decouples the adversary's belief about the secret value from a baseline distribution on secrets against which the success of the attack is measured; (ii) we demonstrate that our formalisation satisfies relevant information-theoretic axioms, including non-interference and relaxed versions of monotonicity and the data-processing inequality (DPI); (iii) we identify under what kind of analysis strong versions of the axioms of monotonicity and the DPI might not hold, and explain the implications of this (perhaps counter-intuitive) outcome; (iv) we show that our definition of dynamic leakage is compatible with the well-established static perspective; and (v) we exemplify the use of our definition on the formalisation of attacks against privacy-preserving data releases.

Paperid: 3068, https://arxiv.org/pdf/2510.20494.pdf

Abstract:
Cyber-physical systems and the Internet of Things (IoT) are key technologies in the Industry 4.0 vision. They incorporate sensors and actuators to interact with the physical environment. However, when creating and interconnecting components to form a heterogeneous smart systems architecture, these face challenges in cybersecurity. This paper presents an experimental investigation of architectural configurations for a LoRaWAN-based Smart-Lighting project, aimed at verifying and improving the system's robustness against attacks. We assess the system's robustness in a series of iterative experiments conducted both in-vitro and on-site. The results show that most attacks on a LoRaWAN network are unsuccessful, also highlighting unresolved issues with the installed products. The most successful attacks are high-power jamming attacks within a few meters of the target, which, in the case of gateways, can be mitigated through gateway redundancy.

Paperid: 3069, https://arxiv.org/pdf/2510.20314.pdf

Abstract:
With the wide application of deep reinforcement learning (DRL) techniques in complex fields such as autonomous driving, intelligent manufacturing, and smart healthcare, how to improve its security and robustness in dynamic and changeable environments has become a core issue in current research. Especially in the face of adversarial attacks, DRL may suffer serious performance degradation or even make potentially dangerous decisions, so it is crucial to ensure their stability in security-sensitive scenarios. In this paper, we first introduce the basic framework of DRL and analyze the main security challenges faced in complex and changing environments. In addition, this paper proposes an adversarial attack classification framework based on perturbation type and attack target and reviews the mainstream adversarial attack methods against DRL in detail, including various attack methods such as perturbation state space, action space, reward function and model space. To effectively counter the attacks, this paper systematically summarizes various current robustness training strategies, including adversarial training, competitive training, robust learning, adversarial detection, defense distillation and other related defense techniques, we also discuss the advantages and shortcomings of these methods in improving the robustness of DRL. Finally, this paper looks into the future research direction of DRL in adversarial environments, emphasizing the research needs in terms of improving generalization, reducing computational complexity, and enhancing scalability and explainability, aiming to provide valuable references and directions for researchers.

Paperid: 3070, https://arxiv.org/pdf/2510.20080.pdf

Abstract:
Cyber threats increasingly demand joint responses, yet the organizational dynamics behind multi-agency cybersecurity collaboration remain poorly understood. Understanding who leads, who bridges, and how agencies coordinate is critical for strengthening both U.S. homeland security and allied defense efforts. In this study, we construct a co-authorship network from nine Joint Cybersecurity Advisories (CSAs) issued between November 2024 and August 2025. We map 41 agencies and 442 co-authoring ties to analyze the structure of collaboration. We find a tightly knit U.S. triad -- CISA, FBI, and NSA -- densely connected with Five Eyes and select European allies. Degree centrality identifies CISA and FBI as coordination hubs, while betweenness highlights NSA, the UK's NCSC, and Australia's ASD-ACSC as key bridges linking otherwise fragmented clusters. By releasing the first replicable dataset and network analysis of CSAs, we provide new empirical evidence on how collaborative cybersecurity signals are organized and where strategic influence is concentrated.

Paperid: 3071, https://arxiv.org/pdf/2510.20019.pdf

Abstract:
Radio Frequency Identification (RFID) tracking may be a viable solution for defense assets that must be stored in accordance with security guidelines. However, poor sensor specificity (vulnerabilities include long range detection, spoofing, and counterfeiting) can lead to erroneous detection and operational security events. We present a supervised learning simulation with realistic Received Signal Strength Indicator (RSSI) data and Decision Tree classification in a Computer Assisted Design (CAD)-modeled floor plan that encapsulates some of the challenges encountered in defense storage. In this work, we focused on classifying 12 lab zones (LabZoneA-L) to perform location inference. The raw dataset had approximately 980,000 reads. Class frequencies were imbalanced, and class weights were calculated to account for class imbalance in this multi-class setting. The model, trained on stratified subsamples to 5,000 balanced observations, yielded an overall accuracy of 34.2% and F1-scores greater than 0.40 for multiple zones (Zones F, G, H, etc.). However, rare classes (most notably LabZoneC) were often misclassified, even with the use of class weights. An adjacency-aware confusion matrix was calculated to allow better interpretation of physically adjacent zones. These results suggest that RSSI-based decision trees can be applied in realistic simulations to enable zone-level anomaly detection or misplacement monitoring for defense supply logistics. Reliable classification performance in low-coverage and low-signal zones could be improved with better antenna placement or additional sensors and sensor fusion with other modalities.

Paperid: 3072, https://arxiv.org/pdf/2510.20007.pdf

Abstract:
Digital transactions currently exceed trillions of dollars annually, yet traditional paper-based agreements remain a bottleneck for automation, enforceability, and dispute resolution. Natural language contracts introduce ambiguity, require manual processing, and lack computational verifiability, all of which hinder efficient digital commerce. Computable legal contracts, expressed in machine-readable formats, offer a potential solution by enabling automated execution and verification. Blockchain-based smart contracts further strengthen enforceability and accelerate dispute resolution; however, current implementations risk exposing sensitive agreement terms on public ledgers, raising serious privacy and competitive intelligence concerns that limit enterprise adoption. We introduce zk-agreements, a protocol designed to transition from paper-based trust to cryptographic trust while preserving confidentiality. Our design combines zero-knowledge proofs to protect private agreement terms, secure two-party computation to enable private compliance evaluation, and smart contracts to guarantee automated enforcement. Together, these components achieve both privacy preservation and computational enforceability, resolving the fundamental tension between transparency and confidentiality in blockchain-based agreements.

Paperid: 3073, https://arxiv.org/pdf/2510.19979.pdf

Abstract:
With the increasing deployment of Large Language Models (LLMs) on mobile and edge platforms, securing them against model extraction attacks has become a pressing concern. However, protecting model privacy without sacrificing the performance benefits of untrusted AI accelerators, such as GPUs, presents a challenging trade-off. In this paper, we initiate the study of high-performance execution on LLMs and present SecureInfer, a hybrid framework that leverages a heterogeneous Trusted Execution Environments (TEEs)-GPU architecture to isolate privacy-critical components while offloading compute-intensive operations to untrusted accelerators. Building upon an outsourcing scheme, SecureInfer adopts an information-theoretic and threat-informed partitioning strategy: security-sensitive components, including non-linear layers, projection of attention head, FNN transformations, and LoRA adapters, are executed inside an SGX enclave, while other linear operations (matrix multiplication) are performed on the GPU after encryption and are securely restored within the enclave. We implement a prototype of SecureInfer using the LLaMA-2 model and evaluate it across performance and security metrics. Our results show that SecureInfer offers strong security guarantees with reasonable performance, offering a practical solution for secure on-device model inference.

Paperid: 3074, https://arxiv.org/pdf/2510.19890.pdf

Abstract:
We present a data generation framework designed to simulate spoofing attacks and randomly place attack scenarios worldwide. We apply deep neural network-based models for spoofing detection, utilizing Long Short-Term Memory networks and Transformer-inspired architectures. These models are specifically designed for online detection and are trained using the generated dataset. Our results demonstrate that deep learning models can accurately distinguish spoofed signals from genuine ones, achieving high detection performance. The best results are achieved by Transformer-inspired architectures with early fusion of the inputs resulting in an error rate of 0.16%.

Paperid: 3075, https://arxiv.org/pdf/2510.19883.pdf

Abstract:
Over the years, the technological landscape has evolved, reshaping the security posture of organisations and increasing their exposure to cybersecurity threats, many originating from within. Insider threats remain a major challenge, particularly in sectors where cybersecurity infrastructure, expertise, and regulations are still developing. This study proposes the Insider Threat Explainable Machine Learning (IT-XML) framework, which integrates the Cross-Industry Standard Process for Data Mining (CRISP-DM) with Hidden Markov Models (HMM) to enhance proactive insider threat management and decision-making. A quantitative approach is adopted using an online questionnaire to assess employees' knowledge of insider threat patterns, access control, privacy practices, and existing policies across three large data-sensitive organisations. The IT-XML framework provides assessment capabilities through survey-based data, HMM-driven pattern recognition for security maturity classification, and evidence-based recommendations for proactive threat mitigation. The framework classified all organisations at the developing security maturity level with 97-98% confidence and achieved a classification accuracy of 91.7%, identifying audit log access limits as the most critical control. Random Forest analysis highlighted vendor breach notifications (0.081) and regular audit log reviews (0.052) as key determinants of resilience. Explainability methods such as SHAP and LIME improved model transparency and interpretability, demonstrating the framework's potential to strengthen insider threat management practices.

Paperid: 3076, https://arxiv.org/pdf/2510.19772.pdf

Abstract:
The integration of Industrial Internet of Things (IIoT) devices into manufacturing environments has accelerated the transition to Industry 4.0, but has also introduced new cybersecurity risks. This paper conducts a comprehensive security analysis of a commercial smart air compressor, revealing critical vulnerabilities including hardcoded credentials, unauthenticated APIs, and an insecure update mechanism. It includes a formal threat model, demonstrates practical attack scenarios in a testbed environment, and evaluates their subsequent impact on an industrial process, leading to denial of service and the corruption of critical process telemetry. In addition, an analysis of the device's supply chain reveals how product integration from multiple vendors and limited security considerations can expose a device to threats. The findings underscore the necessity of incorporating cybersecurity principles into both IIoT device design and supply chain governance to enhance resilience against emerging industrial cyber threats.

Paperid: 3077, https://arxiv.org/pdf/2510.19761.pdf

Abstract:
Adversarial attacks pose significant challenges to Machine Learning (ML) systems and especially Deep Neural Networks (DNNs) by subtly manipulating inputs to induce incorrect predictions. This paper investigates whether increasing the layer depth of deep neural networks affects their robustness against adversarial attacks in the Network Intrusion Detection System (NIDS) domain. We compare the adversarial robustness of various deep neural networks across both \ac{NIDS} and computer vision domains (the latter being widely used in adversarial attack experiments). Our experimental results reveal that in the NIDS domain, adding more layers does not necessarily improve their performance, yet it may actually significantly degrade their robustness against adversarial attacks. Conversely, in the computer vision domain, adding more layers exhibits a more modest impact on robustness. These findings can guide the development of robust neural networks for (NIDS) applications and highlight the unique characteristics of network security domains within the (ML) landscape.

Paperid: 3078, https://arxiv.org/pdf/2510.19708.pdf

Abstract:
Authorship attribution techniques are increasingly being used in online contexts such as sock puppet detection, malicious account linking, and cross-platform account linking. Yet, it is unknown whether these models perform equitably across different demographic groups. Bias in such techniques could lead to false accusations, account banning, and privacy violations disproportionately impacting users from certain demographics. In this paper, we systematically audit authorship attribution for bias with respect to gender, native language, and age. We evaluate fairness in 3 ways. First, we evaluate how the proportion of users with a certain demographic characteristic impacts the overall classifier performance. Second, we evaluate if a user's demographic characteristics influence the probability that their texts are misclassified. Our analysis indicates that authorship attribution does not demonstrate bias across demographic groups in the closed-world setting. Third, we evaluate the types of errors that occur when the true author is removed from the suspect set, thereby forcing the classifier to choose an incorrect author. Unlike the first two settings, this analysis demonstrates a tendency to attribute authorship to users who share the same demographic characteristic as the true author. Crucially, these errors do not only include texts that deviate from a user's usual style, but also those that are very close to the author's average. Our results highlight that though a model may appear fair in the closed-world setting for a performant classifier, this does not guarantee fairness when errors are inevitable.

Paperid: 3079, https://arxiv.org/pdf/2510.19440.pdf

Abstract:
The growing number of wireless devices increases the need for secure network access. Radio Frequency Fingerprinting (RFF), a physical-layer authentication method, offers a promising solution as it requires no cryptography and resists spoofing. However, existing RFF approaches often lack a unified theory and effective feature extraction. Many methods use handcrafted signal features or direct neural network classification, leading to limited generalization and interpretability. In this work, we model the transmitter as a black box and analyze its impact on transmitted signals. By treating the deviation from an ideal signal as hardware-induced distortion, we represent the received signal using a Volterra series, using its kernels to capture linear and nonlinear hardware traits. To manage the high dimensionality of these kernels, we approximate them via wavelet decomposition and estimate coefficients through least-squares fitting. The resulting wavelet coefficients provide compact yet informative hardware representations, which are classified using a complex-valued neural network. Experiments on a public LoRa dataset show state-of-the-art performance, with over 98% accuracy in static channels and above 90% under multipath and Doppler effects. The proposed approach improves both interpretability and generalization across varying channel conditions.

Paperid: 3080, https://arxiv.org/pdf/2510.19324.pdf

Abstract:
As networks move toward the next-generation 6G, Intent-based Management (IbM) systems are increasingly adopted to simplify and automate network management by translating high-level intents into low-level configurations. Within these systems, agents play a critical role in monitoring current state of the network, gathering data, and enforcing actions across the network to fulfill the intent. However, ensuring secure and fine-grained authorization of agents remains a significant challenge, especially in dynamic and multi-tenant environments. Traditional models such as Role-Based Access Control (RBAC), Attribute-Based Access Control (ABAC) and Relational-Based Access Control (RelBAC) often lack the flexibility to accommodate the evolving context and granularity required by intentbased operations. In this paper, we propose an enhanced authorization framework that integrates contextual and functional attributes with agent roles to achieve dynamic, policy-driven access control. By analyzing agent functionalities, our approach ensures that agents are granted only the minimal necessary privileges towards knowledge graphs.

Paperid: 3081, https://arxiv.org/pdf/2510.19300.pdf

Abstract:
Wireless Body Area Networks (WBANs) have gained significant attention due to their applications in healthcare monitoring, sports, military communication, and remote patient care. These networks consist of wearable or implanted sensors that continuously collect and transmit physiological data, requiring efficient and reliable communication. However, WBANs face challenges such as limited energy, dynamic topology, and sensitivity to node temperature, which demand specialized routing strategies. Traditional shortest-path routing often causes congestion and overheating in specific nodes, leading to early failures. To address these problems, this paper proposes an intelligent temperature-aware and reliability-based routing approach that enhances WBAN performance. The proposed method works in two phases: (1) network setup and intelligent path selection, and (2) dynamic traffic management and hotspot avoidance. In the first phase, nodes share information such as residual energy, temperature, link reliability, and delay to build an optimized topology using a multi-criteria decision algorithm. The second phase continuously monitors real-time conditions and reroutes traffic away from overheated or depleted nodes. Simulation results show that the proposed approach improves throughput by 13 percent, reduces end-to-end delay by 10 percent, decreases energy consumption by 25 percent, and lowers routing load by 30 percent compared to existing methods.

Paperid: 3082, https://arxiv.org/pdf/2510.19169.pdf

Abstract:
As large language models (LLMs) are increasingly integrated into real-world applications, ensuring their safety, robustness, and privacy compliance has become critical. We present OpenGuardrails, the first fully open-source platform that unifies large-model-based safety detection, manipulation defense, and deployable guardrail infrastructure. OpenGuardrails protects against three major classes of risks: (1) content-safety violations such as harmful or explicit text generation, (2) model-manipulation attacks including prompt injection, jailbreaks, and code-interpreter abuse, and (3) data leakage involving sensitive or private information. Unlike prior modular or rule-based frameworks, OpenGuardrails introduces three core innovations: (1) a Configurable Policy Adaptation mechanism that allows per-request customization of unsafe categories and sensitivity thresholds; (2) a Unified LLM-based Guard Architecture that performs both content-safety and manipulation detection within a single model; and (3) a Quantized, Scalable Model Design that compresses a 14B dense base model to 3.3B via GPTQ while preserving over 98 of benchmark accuracy. The system supports 119 languages, achieves state-of-the-art performance across multilingual safety benchmarks, and can be deployed as a secure gateway or API-based service for enterprise use. All models, datasets, and deployment scripts are released under the Apache 2.0 license.

Paperid: 3083, https://arxiv.org/pdf/2510.19121.pdf

Abstract:
The rapid growth of the Internet of Things (IoT) has transformed industries by enabling seamless data exchange among connected devices. However, IoT networks remain vulnerable to security threats such as denial of service (DoS) attacks, anomalous traffic, and data manipulation due to decentralized architectures and limited resources. To address these issues, this paper proposes an advanced anomaly detection framework with three main phases. First, data preprocessing is performed using the Median KS Test to remove noise, handle missing values, and balance datasets for cleaner input. Second, a feature selection phase employs a Genetic Algorithm combined with eagle inspired search strategies to identify the most relevant features, reduce dimensionality, and improve efficiency without sacrificing accuracy. Finally, an ensemble classifier integrates Decision Tree, Random Forest, and XGBoost algorithms to achieve accurate and reliable anomaly detection. The proposed model demonstrates high adaptability and scalability across diverse IoT environments. Experimental results show that it outperforms existing methods by achieving 98 percent accuracy, 95 percent detection rate, and reductions in false positive (10 percent) and false negative (5 percent) rates. These results confirm the framework effectiveness and robustness in improving IoT network security against evolving cyber threats.

Paperid: 3084, https://arxiv.org/pdf/2510.19075.pdf

Abstract:
We show that if a language $L$ admits a public-coin unambiguous interactive proof (UIP) with round complexity $\ell$, where $a$ bits are communicated per round, then the batch language $L^{\otimes k}$, i.e. the set of $k$-tuples of statements all belonging to $L$, has an unambiguous interactive proof with round complexity $\ell\cdot\mathsf{polylog}(k)$, per-round communication of $a\cdot \ell\cdot\mathsf{polylog}(k) + \mathsf{poly}(\ell)$ bits, assuming the verifier in the $\mathsf{UIP}$ has depth bounded by $\mathsf{polylog}(k)$. Prior to this work, the best known batch $\mathsf{UIP}$ for $L^{\otimes{k}}$ required communication complexity at least $(\mathsf{poly}(a)\cdot k^ε + k) \cdot \ell^{1/ε}$ for any arbitrarily small constant $ε>0$ (Reingold-Rothblum-Rothblum, STOC 2016). As a corollary of our result, we obtain a doubly efficient proof system, that is, a proof system whose proving overhead is polynomial in the time of the underlying computation, for any language computable in polynomial space and in time at most $n^{O\left(\sqrt{\frac{\log n}{\log\log n}}\right)}$. This expands the state of the art of doubly efficient proof systems: prior to our work, such systems were known for languages computable in polynomial space and in time $n^{({\log n})^δ}$ for a small $δ>0$ significantly smaller than $1/2$ (Reingold-Rothblum-Rothblum, STOC 2016).

Paperid: 3085, https://arxiv.org/pdf/2510.18990.pdf

Abstract:
We investigate and defend the possibility of causing a stock market crash via small manipulations of individual stock values that together realize an adversarial example to financial forecasting models, causing these models to make the self-fulfilling prediction of a crash. Such a crash triggered by an adversarial example would likely be hard to detect, since the model's predictions would be accurate and the interventions that would cause it are minor. This possibility is a major risk to financial stability and an opportunity for hostile actors to cause great economic damage to an adversary. This threat also exists against individual stocks and the corresponding valuation of individual companies. We outline how such an attack might proceed, what its theoretical basis is, how it can be directed towards a whole economy or an individual company, and how one might defend against it. We conclude that this threat is vastly underappreciated and requires urgent research on how to defend against it.

Paperid: 3086, https://arxiv.org/pdf/2510.18654.pdf

Abstract:
E-values have gained prominence as flexible tools for statistical inference and risk control, enabling anytime- and post-hoc-valid procedures under minimal assumptions. However, many real-world applications fundamentally rely on sensitive data, which can be leaked through e-values. To ensure their safe release, we propose a general framework to transform non-private e-values into differentially private ones. Towards this end, we develop a novel biased multiplicative noise mechanism that ensures our e-values remain statistically valid. We show that our differentially private e-values attain strong statistical power, and are asymptotically as powerful as their non-private counterparts. Experiments across online risk monitoring, private healthcare, and conformal e-prediction demonstrate our approach's effectiveness and illustrate its broad applicability.

Paperid: 3087, https://arxiv.org/pdf/2510.18614.pdf

Abstract:
We present Qatsi, a hierarchical key derivation scheme using Argon2id that generates reproducible cryptographic secrets without persistent storage. The system eliminates vault-based attack surfaces by deriving all secrets deterministically from a single high-entropy master secret and contextual layers. Outputs achieve 103-312 bits of entropy through memory-hard derivation (64-128 MiB, 16-32 iterations) and provably uniform rejection sampling over 7776-word mnemonics or 90-character passwords. We formalize the hierarchical construction, prove output uniformity, and quantify GPU attack costs: $2.4 \times 10^{16}$ years for 80-bit master secrets on single-GPU adversaries under Paranoid parameters (128 MiB memory). The implementation in Rust provides automatic memory zeroization, compile-time wordlist integrity verification, and comprehensive test coverage. Reference benchmarks on Apple M1 Pro (2021) demonstrate practical usability with 544 ms Standard mode and 2273 ms Paranoid mode single-layer derivations. Qatsi targets air-gapped systems and master credential generation where stateless reproducibility outweighs rotation flexibility.

Paperid: 3088, https://arxiv.org/pdf/2510.18484.pdf

Abstract:
This paper explores the challenges of cyberattack attribution, specifically APTs, applying the case study approach for the WhisperGate cyber operation of January 2022 executed by the Russian military intelligence service (GRU) and targeting Ukrainian government entities. The study provides a detailed review of the threat actor identifiers and taxonomies used by leading cybersecurity vendors, focusing on the evolving attribution from Microsoft, ESET, and CrowdStrike researchers. Once the attribution to Ember Bear (GRU Unit 29155) is established through technical and intelligence reports, we use both traditional machine learning classifiers and a large language model (ChatGPT) to analyze the indicators of compromise (IoCs), tactics, and techniques to statistically and semantically attribute the WhisperGate attack. Our findings reveal overlapping indicators with the Sandworm group (GRU Unit 74455) but also strong evidence pointing to Ember Bear, especially when the LLM is fine-tuned or contextually augmented with additional intelligence. Thus, showing how AI/GenAI with proper fine-tuning are capable of solving the attribution challenge.

Paperid: 3089, https://arxiv.org/pdf/2510.18379.pdf

Abstract:
We initiate the study of distribution testing under \emph{user-level} local differential privacy, where each of $n$ users contributes $m$ samples from the unknown underlying distribution. This setting, albeit very natural, is significantly more challenging that the usual locally private setting, as for the same parameter $\varepsilon$ the privacy guarantee must now apply to a full batch of $m$ data points. While some recent work consider distribution \emph{learning} in this user-level setting, nothing was known for even the most fundamental testing task, uniformity testing (and its generalization, identity testing). We address this gap, by providing (nearly) sample-optimal user-level LDP algorithms for uniformity and identity testing. Motivated by practical considerations, our main focus is on the private-coin, symmetric setting, which does not require users to share a common random seed nor to have been assigned a globally unique identifier.

Paperid: 3090, https://arxiv.org/pdf/2510.18204.pdf

Abstract:
Despite recent advances, Large Language Models (LLMs) still generate vulnerable code. Retrieval-Augmented Generation (RAG) has the potential to enhance LLMs for secure code generation by incorporating external security knowledge. However, the conventional RAG design struggles with the noise of raw security-related documents, and existing retrieval methods overlook the significant security semantics implicitly embedded in task descriptions. To address these issues, we propose RESCUE, a new RAG framework for secure code generation with two key innovations. First, we propose a hybrid knowledge base construction method that combines LLM-assisted cluster-then-summarize distillation with program slicing, producing both high-level security guidelines and concise, security-focused code examples. Second, we design a hierarchical multi-faceted retrieval to traverse the constructed knowledge base from top to bottom and integrates multiple security-critical facts at each hierarchical level, ensuring comprehensive and accurate retrieval. We evaluated RESCUE on four benchmarks and compared it with five state-of-the-art secure code generation methods on six LLMs. The results demonstrate that RESCUE improves the SecurePass@1 metric by an average of 4.8 points, establishing a new state-of-the-art performance for security. Furthermore, we performed in-depth analysis and ablation studies to rigorously validate the effectiveness of individual components in RESCUE.

Paperid: 3091, https://arxiv.org/pdf/2510.18192.pdf

Abstract:
The inherent determinism of blockchain technology poses a significant challenge to generating secure random numbers within smart contracts, leading to exploitable vulnerabilities, particularly in decentralized finance (DeFi) ecosystems and blockchain-based gaming applications. From our observations, the current state-of-the-art detection tools suffer from inadequate precision while dealing with random number vulnerabilities. To address this problem, we propose TaintSentinel, a novel path sensitive vulnerability detection system designed to analyze smart contracts at the execution path level and gradually analyze taint with domain-specific rules. This paper discusses a solution that incorporates a multi-faceted approach, integrating rule-based taint analysis to track data flow, a dual stream neural network to identify complex vulnerability signatures, and evidence-based parameter initialization to minimize false positives. The system's two-phase operation involves semantic graph construction and taint propagation analysis, followed by pattern recognition using PathGNN and global structural analysis via GlobalGCN. Our experiments on 4,844 contracts demonstrate the superior performance of TaintSentinel relative to existing tools, yielding an F1-score of 0.892, an AUC-ROC of 0.94, and a PRA accuracy of 97%.

Paperid: 3092, https://arxiv.org/pdf/2510.17947.pdf

Abstract:
Large Language Models (LLMs) are improving at an exceptional rate. With the advent of agentic workflows, multi-turn dialogue has become the de facto mode of interaction with LLMs for completing long and complex tasks. While LLM capabilities continue to improve, they remain increasingly susceptible to jailbreaking, especially in multi-turn scenarios where harmful intent can be subtly injected across the conversation to produce nefarious outcomes. While single-turn attacks have been extensively explored, adaptability, efficiency and effectiveness continue to remain key challenges for their multi-turn counterparts. To address these gaps, we present PLAGUE, a novel plug-and-play framework for designing multi-turn attacks inspired by lifelong-learning agents. PLAGUE dissects the lifetime of a multi-turn attack into three carefully designed phases (Primer, Planner and Finisher) that enable a systematic and information-rich exploration of the multi-turn attack family. Evaluations show that red-teaming agents designed using PLAGUE achieve state-of-the-art jailbreaking results, improving attack success rates (ASR) by more than 30% across leading models in a lesser or comparable query budget. Particularly, PLAGUE enables an ASR (based on StrongReject) of 81.4% on OpenAI's o3 and 67.3% on Claude's Opus 4.1, two models that are considered highly resistant to jailbreaks in safety literature. Our work offers tools and insights to understand the importance of plan initialization, context optimization and lifelong learning in crafting multi-turn attacks for a comprehensive model vulnerability evaluation.

Paperid: 3093, https://arxiv.org/pdf/2510.17917.pdf

Abstract:
Data unlearning aims to remove the influence of specific training samples from a trained model without requiring full retraining. Unlike concept unlearning, data unlearning in diffusion models remains underexplored and often suffers from quality degradation or incomplete forgetting. To address this, we first observe that most existing methods attempt to unlearn the samples at all diffusion time steps equally, leading to poor-quality generation. We argue that forgetting occurs disproportionately across time and frequency, depending on the model and scenarios. By selectively focusing on specific time-frequency ranges during training, we achieve samples with higher aesthetic quality and lower noise. We validate this improvement by applying our time-frequency selective approach to diverse settings, including gradient-based and preference optimization objectives, as well as both image-level and text-to-image tasks. Finally, to evaluate both deletion and quality of unlearned data samples, we propose a simple normalized version of SSCD. Together, our analysis and methods establish a clearer understanding of the unique challenges in data unlearning for diffusion models, providing practical strategies to improve both evaluation and unlearning performance.

Paperid: 3094, https://arxiv.org/pdf/2510.17854.pdf

Abstract:
Rapid advancement in generative AI and large language models (LLMs) has enabled the generation of highly realistic and contextually relevant digital content. LLMs such as ChatGPT with DALL-E integration and Stable Diffusion techniques can produce images that are often indistinguishable from those created by humans, which poses challenges for digital content authentication. Verifying the integrity and origin of digital data to ensure it remains unaltered and genuine is crucial to maintaining trust and legality in digital media. In this paper, we propose an embedding-based AI image detection framework that utilizes image embeddings and a vector similarity to distinguish AI-generated images from real (human-created) ones. Our methodology is built on the hypothesis that AI-generated images demonstrate closer embedding proximity to other AI-generated content, while human-created images cluster similarly within their domain. To validate this hypothesis, we developed a system that processes a diverse dataset of AI and human-generated images through five benchmark embedding models. Extensive experimentation demonstrates the robustness of our approach, and our results confirm that moderate to high perturbations minimally impact the embedding signatures, with perturbed images maintaining close similarity matches to their original versions. Our solution provides a generalizable framework for AI-generated image detection that balances accuracy with computational efficiency.

Paperid: 3095, https://arxiv.org/pdf/2510.14708.pdf

Abstract:
The Internet of Medical Things (IoMT) has revolutionized healthcare by transforming medical operations into standardized, interoperable services. However, this service-oriented model introduces significant security vulnerabilities in device management and communication, which are especially critical given the sensitivity of medical data. To address these risks, this paper proposes SLIE (Secure and Lightweight Identity Encryption), a novel cryptosystem based on Wildcard Key Derivation Identity-Based Encryption (WKD-IBE). SLIE ensures scalable trust and secure omnidirectional communication through end-to-end encryption, hierarchical access control, and a lightweight key management system designed for resource-constrained devices. It incorporates constant-time operations, memory obfuscation, and expiry-based key revocation to counter side-channel, man-in-the-middle, and unauthorized access attacks, thereby ensuring compliance with standards like HIPAA and GDPR. Evaluations show that SLIE significantly outperforms RSA, with encryption and decryption times of 0.936ms and 0.217ms for 1KB of data, an 84.54% improvement in encryption speed, a 99.70% improvement in decryption speed, and an energy efficiency of 0.014 J/KB.

Paperid: 3096, https://arxiv.org/pdf/2510.14693.pdf

Abstract:
FibRace, jointly developed by KKRT Labs and Hyli, was the first large-scale experiment to test client-side proof generation on smartphones using Cairo M. Presented as a mobile game in which players proved Fibonacci numbers and climbed a leaderboard, FibRace served a dual purpose: to engage the public and to provide empirical benchmarking. Over a three-week campaign (September 11-30, 2025), 6,047 players across 99 countries generated 2,195,488 proofs on 1,420 unique device models. The results show that most modern smartphones can complete a proof in under 5 seconds, confirming that *mobile devices are now capable of producing zero-knowledge proofs reliably*, without the need for remote provers or specialized hardware. Performance was correlated primarily with RAM capacity and SoC (System on Chip) performance: devices with at least 3 GB of RAM proved stably, when Apple's A19 Pro and M-series chips achieved the fastest proving times. Hyli's blockchain natively verified every proof onchain without congestion. FibRace provides the most comprehensive dataset to date on mobile proving performance, establishing a practical baseline for future research in lightweight provers, proof-powered infrastructure, and privacy-preserving mobile applications.

Paperid: 3097, https://arxiv.org/pdf/2510.14384.pdf

Abstract:
Low-cost Internet of Things (IoT) devices are increasingly popular but often insecure due to poor update regimes. As a result, many devices run outdated and known-vulnerable versions of open-source software. We address this problem by proposing to patch IoT firmware at the binary level, without requiring vendor support. In particular, we introduce minimally invasive local reassembly, a new technique for automatically patching known (n-day) vulnerabilities in IoT firmware. Our approach is designed to minimize side effects and reduce the risk of introducing breaking changes. We systematically evaluate our approach both on 108 binaries within the controlled environment of the MAGMA benchmarks, as well as on 30 real-world Linux-based IoT firmware images from the KARONTE dataset. Our prototype successfully patches 83% of targeted vulnerabilities in MAGMA and 96% in the firmware dataset.

Paperid: 3098, https://arxiv.org/pdf/2508.21005.pdf

Abstract:
Ransomware impact hinges on how easily an intruder can move laterally and spread to the maximum number of assets. We present a graph-theoretic method to measure lateral-movement susceptibility and estimate blast radius. We build a directed multigraph where vertices represent assets and edges represent reachable services (e.g., RDP/SSH) between them. We model lateral movement as a probabilistic process using a pivot potential factor $Ï(s)$ for each service. This allows us to iteratively compute a $K$-hop compromise probability matrix that captures how compromise propagates through the network. Metrics derived from this model include: (1) Lateral-Movement Susceptibility (LMS$_K$): the average probability of a successful lateral movement between any two assets (0-1 scale); and (2) Blast-Radius Estimate (BRE$_K$): the expected percentage of assets compromised in an average attack scenario. Interactive control (SSH 22, RDP 3389) gets higher $Ï(s)$ than app-only ports (MySQL 3306, MSSQL 1433), which seldom enable pivoting without an RCE. Across anonymized enterprise snapshots, pruning high-$Ï(s)$ edges yields the largest LMS$_K$/BRE$_K$ drop, aligning with CISA guidance, MITRE ATT\&CK (TA0008: Lateral Movement), and NIST SP~800-207. The framework evaluates (micro)segmentation and helps prioritize controls that reduce lateral movement susceptibility and shrink blast radius.

Paperid: 3099, https://arxiv.org/pdf/2508.20653.pdf

Abstract:
Integrating cryptographic accelerators into modern CPU architectures presents unique microarchitectural challenges, particularly when extending instruction sets with complex and multistage operations. Hardware-assisted cryptographic instructions, such as Intel's AES-NI and ARM's custom instructions for encryption workloads, have demonstrated substantial performance improvements. However, efficient SHA-3 acceleration remains an open problem due to its distinct permutation-based structure and memory access patterns. Existing solutions primarily rely on standalone coprocessors or software optimizations, often avoiding the complexities of direct microarchitectural integration. This study investigates the architectural challenges of embedding a SHA-3 permutation operation as a custom instruction within a general-purpose processor, focusing on pipelined simultaneous execution, storage utilization, and hardware cost. In this paper, we investigated and prototyped a SHA-3 custom instruction for the RISC-V CPU architecture. Using cycle-accurate GEM5 simulations and FPGA prototyping, our results demonstrate performance improvements of up to 8.02x for RISC-V optimized SHA-3 software workloads and up to 46.31x for Keccak-specific software workloads, with only a 15.09% increase in registers and a 11.51% increase in LUT utilization. These findings provide critical insights into the feasibility and impact of SHA-3 acceleration at the microarchitectural level, highlighting practical design considerations for future cryptographic instruction set extensions.

Paperid: 3100, https://arxiv.org/pdf/2508.20591.pdf

Abstract:
We explore the feasibility of deploying Bitcoin as the shared monetary standard between Earth and Mars, accounting for physical constraints of interplanetary communication. We introduce a novel primitive, Proof-of-Transit Timestamping (PoTT), to provide cryptographic, tamper-evident audit trails for Bitcoin data across high-latency, intermittently-connected links. Leveraging Delay/Disruption-Tolerant Networking (DTN) and optical low-Earth-orbit (LEO) mesh constellations, we propose an architecture for header-first replication, long-horizon Lightning channels with planetary watchtowers, and secure settlement through federated sidechains or blind-merge-mined (BMM) commit chains. We formalize PoTT, analyze its security model, and show how it measurably improves reliability and accountability without altering Bitcoin consensus or its monetary base. Near-term deployments favor strong federations for local settlement; longer-term, blind-merge-mined commit chains (if adopted) provide an alternative. The Earth L1 monetary base remains unchanged, while Mars can operate a pegged commit chain or strong federation with 1:1 pegged assets for local block production. For transparency, if both time-beacon regimes are simultaneously compromised, PoTT-M2 (and PoTT generally) reduces to administrative assertions rather than cryptographic time-anchoring.

Paperid: 3101, https://arxiv.org/pdf/2508.19309.pdf

Abstract:
3D die stacking and 2.5D interposer design are promising technologies to improve integration density, performance and cost. Current approaches face serious issues in dealing with emerging security challenges such as side channel attacks, hardware trojans, secure IC manufacturing and IP piracy. By utilizing intrinsic characteristics of 2.5D and 3D technologies, we propose novel opportunities in designing secure systems. We present: (i) a 3D architecture for shielding side-channel information; (ii) split fabrication using active interposers; (iii) circuit camouflage on monolithic 3D IC, and (iv) 3D IC-based security processing-in-memory (PIM). Advantages and challenges of these designs are discussed, showing that the new designs can improve existing countermeasures against security threats and further provide new security features.

Paperid: 3102, https://arxiv.org/pdf/2508.19288.pdf

Abstract:
Large Language Models (LLMs) are widely deployed in applications that accept user-submitted content, such as uploaded documents or pasted text, for tasks like summarization and question answering. In this paper, we identify a new class of attacks, prompt in content injection, where adversarial instructions are embedded in seemingly benign inputs. When processed by the LLM, these hidden prompts can manipulate outputs without user awareness or system compromise, leading to biased summaries, fabricated claims, or misleading suggestions. We demonstrate the feasibility of such attacks across popular platforms, analyze their root causes including prompt concatenation and insufficient input isolation, and discuss mitigation strategies. Our findings reveal a subtle yet practical threat in real-world LLM workflows.

Paperid: 3104, https://arxiv.org/pdf/2508.18230.pdf

Abstract:
The escalating complexity and volume of cyberattacks demand proactive detection strategies that go beyond traditional rule-based systems. This paper presents a phase-aware, multi-model machine learning framework that emulates adversarial behavior across the seven phases of the Cyber Kill Chain using the MITRE ATT&CK Enterprise dataset. Techniques are semantically mapped to phases via ATTACK-BERT, producing seven phase-specific datasets. We evaluate LightGBM, a custom Transformer encoder, fine-tuned BERT, and a Graph Neural Network (GNN), integrating their outputs through a weighted soft voting ensemble. Inter-phase dependencies are modeled using directed graphs to capture attacker movement from reconnaissance to objectives. The ensemble consistently achieved the highest scores, with F1-scores ranging from 97.47% to 99.83%, surpassing GNN performance (97.36% to 99.81%) by 0.03%--0.20% across phases. This graph-driven, ensemble-based approach enables interpretable attack path forecasting and strengthens proactive cyber defense.

Paperid: 3105, https://arxiv.org/pdf/2508.17057.pdf

Abstract:
We address the problem of data scarcity in harmful text classification for guardrailing applications and introduce GRAID (Geometric and Reflective AI-Driven Data Augmentation), a novel pipeline that leverages Large Language Models (LLMs) for dataset augmentation. GRAID consists of two stages: (i) generation of geometrically controlled examples using a constrained LLM, and (ii) augmentation through a multi-agentic reflective process that promotes stylistic diversity and uncovers edge cases. This combination enables both reliable coverage of the input space and nuanced exploration of harmful content. Using two benchmark data sets, we demonstrate that augmenting a harmful text classification dataset with GRAID leads to significant improvements in downstream guardrail model performance.

Paperid: 3106, https://arxiv.org/pdf/2508.16078.pdf

Abstract:
The rapid advancement of quantum computing poses a significant threat to modern cryptographic systems, necessitating the transition to Post-Quantum Cryptography (PQC). This study evaluates the support for PQC algorithms within nine widely used open-source cryptographic libraries -- OpenSSL, wolfSSL, BoringSSL, LibreSSL, Bouncy Castle, libsodium, Crypto++, Botan, and MbedTLS -- focusing on their implementation of the NIST-selected PQC finalists: CRYSTALS-Kyber, CRYSTALS-Dilithium, FALCON, and SPHINCS+. Our analysis, based on the latest available documentation, release notes, and industry reports as of early 2025, reveals a varied state of readiness across these libraries. While some libraries have integrated PQC support or have clear implementation roadmaps, others lag behind, creating potential security risks as quantum threats become more imminent. We discuss key challenges, including performance trade-offs, implementation security, and adoption hurdles in real-world cryptographic applications. Our findings highlight the urgent need for continued research, standardization efforts, and coordinated adoption strategies to ensure a secure transition to the quantum-resistant cryptographic landscape.

Paperid: 3107, https://arxiv.org/pdf/2508.15917.pdf

Abstract:
In evolving access structures, the number of participants is countably infinite with no predetermined upper bound. While such structures have been realized in secret sharing, research in secret image sharing has primarily focused on visual cryptography schemes (VCS). However, there exists no construction for $(k,\infty)$ VCS that applies to arbitrary $k$ values without pixel expansion currently, and the contrast requires enhancement. In this paper, we first present a formal mathematical definition of $(k,\infty)$ VCS. Then, propose a $(k,\infty)$ VCS based on random grids that works for arbitrary $k$. In addition, to further improve contrast, we develop optimized $(k,\infty)$ VCS for $k=2$ and $3$, along with contrast enhancement strategies for $k\geq 4$. Theoretical analysis and experimental results demonstrate the superiority of our proposed schemes.

Paperid: 3108, https://arxiv.org/pdf/2508.15865.pdf

Abstract:
Cyber-physical systems (CPS) are being increasingly utilized for critical applications. CPS combines sensing and computing elements, often having multi-layer designs with networking, computational, and physical interfaces, which provide them with enhanced capabilities for a variety of application scenarios. However, the combination of physical and computational elements also makes CPS more vulnerable to attacks compared to network-only systems, and the resulting impacts of CPS attacks can be substantial. Intelligent intrusion detection systems (IDS) are an effective mechanism by which CPS can be secured, but the majority of current solutions often train and validate on network traffic-only datasets, ignoring the distinct attacks that may occur on other system layers. In order to address this, we develop an adaptable CPS anomaly detection model that can detect attacks within CPS without the need for previously labeled data. To achieve this, we utilize domain adaptation techniques that allow us to transfer known attack knowledge from a network traffic-only environment to a CPS environment. We validate our approach using a state-of-the-art CPS intrusion dataset that combines network, operating system (OS), and Robot Operating System (ROS) data. Through this dataset, we are able to demonstrate the effectiveness of our model across network traffic-only and CPS environments with distinct attack types and its ability to outperform other anomaly detection methods.

Paperid: 3109, https://arxiv.org/pdf/2508.15844.pdf

Abstract:
Ransomware attacks have become a pervasive and costly form of cybercrime, causing tens of millions of dollars in losses as organizations increasingly pay ransoms to mitigate operational disruptions and financial risks. While prior research has largely focused on proactive defenses, the post-infection negotiation dynamics between attackers and victims remains underexplored. This paper presents a formal analysis of attacker-victim interactions in modern ransomware incidents using a finite-horizon alternating-offers bargaining game model. Our analysis demonstrates how bargaining alters the optimal strategies of both parties. In practice, incomplete information-attackers lacking knowledge of victims' data valuations and victims lacking knowledge of attackers' reservation ransoms-can prolong negotiations and increase victims' business interruption costs. To address this, we design a Bayesian incentive-compatible mechanism that facilitates rapid agreement on a fair ransom without requiring either party to disclose private valuations. We further implement this mechanism using secure two-party computation based on garbled circuits, thereby eliminating the need for trusted intermediaries and preserving the privacy of both parties throughout the negotiation. To the best of our knowledge, this is the first automated, privacy-preserving negotiation mechanism grounded in a formal analysis of ransomware negotiation dynamics.

Paperid: 3110, https://arxiv.org/pdf/2508.14796.pdf

Abstract:
Stakeholder-based ethics analysis is now a formal requirement for submissions to top cybersecurity research venues. This requirement reflects a growing consensus that cybersecurity researchers must go beyond providing capabilities to anticipating and mitigating the potential harms thereof. However, many cybersecurity researchers may be uncertain about how to proceed in an ethics analysis. In this guide, we provide practical support for that requirement by enumerating stakeholder types and mapping them to common empirical research methods. We also offer worked examples to demonstrate how researchers can identify likely stakeholder exposures in real-world projects. Our goal is to help research teams meet new ethics mandates with confidence and clarity, not confusion.

Paperid: 3111, https://arxiv.org/pdf/2508.14128.pdf

Abstract:
Jailbreak attacks pose a serious challenge to the safe deployment of large language models (LLMs). We introduce CCFC (Core & Core-Full-Core), a dual-track, prompt-level defense framework designed to mitigate LLMs' vulnerabilities from prompt injection and structure-aware jailbreak attacks. CCFC operates by first isolating the semantic core of a user query via few-shot prompting, and then evaluating the query using two complementary tracks: a core-only track to ignore adversarial distractions (e.g., toxic suffixes or prefix injections), and a core-full-core (CFC) track to disrupt the structural patterns exploited by gradient-based or edit-based attacks. The final response is selected based on a safety consistency check across both tracks, ensuring robustness without compromising on response quality. We demonstrate that CCFC cuts attack success rates by 50-75% versus state-of-the-art defenses against strong adversaries (e.g., DeepInception, GCG), without sacrificing fidelity on benign queries. Our method consistently outperforms state-of-the-art prompt-level defenses, offering a practical and effective solution for safer LLM deployment.

Paperid: 3112, https://arxiv.org/pdf/2508.12832.pdf

Abstract:
The widespread adoption of convolutional neural networks (CNNs) in resource-constrained scenarios has driven the development of Machine Learning as a Service (MLaaS) system. However, this approach is susceptible to privacy leakage, as the data sent from the client to the untrusted cloud server often contains sensitive information. Existing CNN privacy-preserving schemes, while effective in ensuring data confidentiality through homomorphic encryption and secret sharing, face efficiency bottlenecks, particularly in convolution operations. In this paper, we propose a novel verifiable privacy-preserving scheme tailored for CNN convolutional layers. Our scheme enables efficient encryption and decryption, allowing resource-constrained clients to securely offload computations to the untrusted cloud server. Additionally, we present a verification mechanism capable of detecting the correctness of the results with a success probability of at least $1-\frac{1}{\left|Z\right|}$. Extensive experiments conducted on 10 datasets and various CNN models demonstrate that our scheme achieves speedups ranging $26 \times$ ~ $\ 87\times$ compared to the original plaintext model while maintaining accuracy.

Paperid: 3113, https://arxiv.org/pdf/2508.12560.pdf

Abstract:
We propose a data-driven and context-aware approach to bootstrap trustworthiness of homogeneous Internet of Things (IoT) services in Mobile Edge Computing (MEC) based industrial IoT (IIoT) systems. The proposed approach addresses key limitations in adapting existing trust bootstrapping approaches into MEC-based IIoT systems. These key limitations include, the lack of opportunity for a service consumer to interact with a lesser-known service over a prolonged period of time to get a robust measure of its trustworthiness, inability of service consumers to consistently interact with their peers to receive reliable recommendations of the trustworthiness of a lesser-known service as well as the impact of uneven context parameters in different MEC environments causing uneven trust environments for trust evaluation. In addition, the proposed approach also tackles the problem of data sparsity via enabling knowledge sharing among different MEC environments within a given MEC topology. To verify the effectiveness of the proposed approach, we carried out a comprehensive evaluation on two real-world datasets suitably adjusted to exhibit the context-dependent trust information accumulated in MEC environments within a given MEC topology. The experimental results affirmed the effectiveness of our approach and its suitability to bootstrap trustworthiness of services in MEC-based IIoT systems.

Paperid: 3114, https://arxiv.org/pdf/2508.12539.pdf

Abstract:
Local differential privacy (LDP) has emerged as a promising paradigm for privacy-preserving data collection in distributed systems, where users contribute multi-dimensional records with potentially correlated attributes. Recent work has highlighted that correlation-induced privacy leakage (CPL) plays a critical role in shaping the privacy-utility trade-off under LDP, especially when correlations exist among attributes. Nevertheless, it remains unclear to what extent the prevailing assumptions and proposed solutions are valid and how significant CPL is in real-world data. To address this gap, we first perform a comprehensive statistical analysis of five widely used LDP mechanisms -- GRR, RAPPOR, OUE, OLH and Exponential mechanism -- to assess CPL across four real-world datasets. We identify that many primary assumptions and metrics in current approaches fall short of accurately characterising these leakages. Moreover, current studies have been limited to a set of pure LDP (i.e., {Î´= 0}) mechanisms. In response, we develop the first algorithmic framework to theoretically quantify CPL for any general approximated LDP (({\varepsilon},Î´)-LDP) mechanism. We validate our theoretical results against empirical statistical results and provide a theoretical explanation for the observed statistical patterns. Finally, we propose two novel benchmarks to validate correlation analysis algorithms and evaluate the utility vs CPL of LDP mechanisms. Further, we demonstrate how these findings can be applied to achieve an efficient privacy-utility trade-off in real-world data governance.

Paperid: 3115, https://arxiv.org/pdf/2508.11913.pdf

Abstract:
Remote management devices facilitate critical infrastructure monitoring for administrators but simultaneously increase asset exposure. Sensitive geographical information overlooked in exposed device management pages poses substantial security risks. Therefore, identifying devices that reveal location information due to administrator negligence is crucial for cybersecurity regulation. Despite the rich information exposed by web interfaces of remote management devices, automatically discovering geographical locations remains challenging due to unstructured formats, varying styles, and incomplete geographical details. This study introduces WebGeoInfer, a structure-free geolocation inference framework utilizing multi-stage information enhancement. WebGeoInfer clusters similar device web pages and analyzes inter-cluster differences to extract potential geographical information, bypassing structural limitations. Through search engine enhancement and Large Language Models mining, the framework extracts geographical coordinates from identified information. WebGeoInfer successfully inferred locations for 5,435 devices across 94 countries and 2,056 cities, achieving accuracy rates of 96.96\%, 88.05\%, and 79.70\% at country, city, and street levels, respectively.

Paperid: 3116, https://arxiv.org/pdf/2508.11495.pdf

Abstract:
To protect privacy for data-collection-based services, local differential privacy (LDP) is widely adopted due to its rigorous theoretical bound on privacy loss. However, mistakes in complex theoretical analysis or subtle implementation errors may undermine its practical guarantee. To address this, auditing is crucial to confirm that LDP protocols truly protect user data. However, existing auditing methods, though, mainly target machine learning and federated learning tasks based on centralized differentially privacy (DP), with limited attention to LDP. Moreover, the few studies on LDP auditing focus solely on simple frequency estimation task for discrete data, leaving correlated key-value data - which requires both discrete frequency estimation for keys and continuous mean estimation for values - unexplored. To bridge this gap, we propose KV-Auditor, a framework for auditing LDP-based key-value estimation mechanisms by estimating their empirical privacy lower bounds. Rather than traditional LDP auditing methods that relies on binary output predictions, KV-Auditor estimates this lower bound by analyzing unbounded output distributions, supporting continuous data. Specifically, we classify state-of-the-art LDP key-value mechanisms into interactive and non-interactive types. For non-interactive mechanisms, we propose horizontal KV-Auditor for small domains with sufficient samples and vertical KV-Auditor for large domains with limited samples. For interactive mechanisms, we design a segmentation strategy to capture incremental privacy leakage across iterations. Finally, we perform extensive experiments to validate the effectiveness of our approach, offering insights for optimizing LDP-based key-value estimators.

Paperid: 3117, https://arxiv.org/pdf/2508.11095.pdf

Abstract:
This work presents Homomorphic Encryption Intermediate Representation (HEIR), a unified approach to building homomorphic encryption (HE) compilers. HEIR aims to support all mainstream techniques in homomorphic encryption, integrate with all major software libraries and hardware accelerators, and advance the field by providing a platform for research and benchmarking. Built on the MLIR compiler framework, HEIR introduces HE-specific abstraction layers at which existing optimizations and new research ideas may be easily implemented. Although many HE optimization techniques have been proposed, it remains difficult to combine or compare them effectively. HEIR provides a means to effectively explore the space of HE optimizations. HEIR addresses the entire HE stack and includes support for various frontends, including Python. The contribution of this work includes: (1) We introduce HEIR as a framework for building HE compilers. (2) We validate HEIR's design by porting a large fraction of the HE literature to HEIR, and we argue that HEIR can tackle more complicated and diverse programs than prior literature. (3) We provide evidence that HEIR is emerging as the de facto HE compiler for academic research and industry development.

Paperid: 3118, https://arxiv.org/pdf/2508.10639.pdf

Abstract:
Learning-based Provenance-based Intrusion Detection Systems (PIDSes) have become essential tools for anomaly detection in host systems due to their ability to capture rich contextual and structural information, as well as their potential to detect unknown attacks. However, recent studies have shown that these systems are vulnerable to graph manipulation attacks, where attackers manipulate the graph structure to evade detection. While some previous approaches have discussed this type of attack, none have fully addressed it with a robust detection solution, limiting the practical applicability of PIDSes. To address this challenge, we propose MirGuard, a robust anomaly detection framework that combines logic-aware multi-view augmentation with contrastive representation learning. Rather than applying arbitrary structural perturbations, MirGuard introduces Logic-Aware Noise Injection (LNI) to generate semantically valid graph views, ensuring that all augmentations preserve the underlying causal semantics of the provenance data. These views are then used in a Logic-Preserving Contrastive Learning framework, which encourages the model to learn representations that are invariant to benign transformations but sensitive to adversarial inconsistencies. Comprehensive evaluations on multiple provenance datasets demonstrate that MirGuard significantly outperforms state-of-the-art detectors in robustness against various graph manipulation attacks without sacrificing detection performance and efficiency. Our work represents the first targeted study to enhance PIDS against such adversarial threats, providing a robust and effective solution to modern cybersecurity challenges.

Paperid: 3119, https://arxiv.org/pdf/2508.10042.pdf

Abstract:
Federated learning enhances traditional deep learning by enabling the joint training of a model with the use of IoT device's private data. It ensures privacy for clients, but is susceptible to data poisoning attacks during training that degrade model performance and integrity. Current poisoning detection methods in federated learning lack a standardized detection method or take significant liberties with trust. In this paper, we present \Sys, a novel blockchain-enabled poison detection framework in federated learning. The framework decentralizes the role of the global server across participating clients. We introduce a judge model used to detect data poisoning in model updates. The judge model is produced by each client and verified to reach consensus on a single judge model. We implement our solution to show \Sys is robust against data poisoning attacks and the creation of our judge model is scalable.

Paperid: 3120, https://arxiv.org/pdf/2508.10031.pdf

Abstract:
While Large Language Models (LLMs) have shown significant advancements in performance, various jailbreak attacks have posed growing safety and ethical risks. Malicious users often exploit adversarial context to deceive LLMs, prompting them to generate responses to harmful queries. In this study, we propose a new defense mechanism called Context Filtering model, an input pre-processing method designed to filter out untrustworthy and unreliable context while identifying the primary prompts containing the real user intent to uncover concealed malicious intent. Given that enhancing the safety of LLMs often compromises their helpfulness, potentially affecting the experience of benign users, our method aims to improve the safety of the LLMs while preserving their original performance. We evaluate the effectiveness of our model in defending against jailbreak attacks through comparative analysis, comparing our approach with state-of-the-art defense mechanisms against six different attacks and assessing the helpfulness of LLMs under these defenses. Our model demonstrates its ability to reduce the Attack Success Rates of jailbreak attacks by up to 88% while maintaining the original LLMs' performance, achieving state-of-the-art Safety and Helpfulness Product results. Notably, our model is a plug-and-play method that can be applied to all LLMs, including both white-box and black-box models, to enhance their safety without requiring any fine-tuning of the models themselves. We will make our model publicly available for research purposes.

Paperid: 3121, https://arxiv.org/pdf/2508.09735.pdf

Abstract:
Quantum Key Distribution (QKD) networks harness the principles of quantum physics in order to securely transmit cryptographic key material, providing physical guarantees. These networks require traditional management and operational components, such as routing information through the network elements. However, due to the limitations on capacity and the particularities of information handling in these networks, traditional shortest paths algorithms for routing perform poorly on both route planning and online routing, which is counterintuitive. Moreover, due to the scarce resources in such networks, often the expressed demand cannot be met by any assignment of routes. To address both the route planning problem and the need for fair automated suggestions in infeasible cases, we propose to model this problem as a Quadratic Programming (QP) problem. For the online routing problem, we showcase that the shortest (available) paths routing strategy performs poorly in the online setting. Furthermore, we prove that the widest shortest path routing strategy has a competitive ratio greater or equal than $\frac{1}{2}$, efficiently addressing both routing modes in QKD networks.

Paperid: 3122, https://arxiv.org/pdf/2508.09665.pdf

Abstract:
Recent years have witnessed a rising trend in social-sensor cloud identity cloning incidents. However, existing approaches suffer from unsatisfactory performance, a lack of solutions for detecting duplicated accounts, and a lack of large-scale evaluations on real-world datasets. We introduce a novel method for detecting identity cloning in social-sensor cloud service providers. Our proposed technique consists of two primary components: 1) a similar identity detection method and 2) a cryptography-based authentication protocol. Initially, we developed a weakly supervised deep forest model to identify similar identities using non-privacy-sensitive user profile features provided by the service. Subsequently, we designed a cryptography-based authentication protocol to verify whether similar identities were generated by the same provider. Our extensive experiments on a large real-world dataset demonstrate the feasibility and superior performance of our technique compared to current state-of-the-art identity clone detection methods.

Paperid: 3123, https://arxiv.org/pdf/2508.09320.pdf

Abstract:
Graph neural networks (GNNs) are increasingly employed in high-stakes applications, such as fraud detection or healthcare, but are susceptible to adversarial attacks. A number of techniques have been proposed to provide adversarial robustness guarantees, but support for commonly used aggregation functions in message-passing GNNs is still lacking. In this paper, we develop an exact (sound and complete) verification method for GNNs to compute guarantees against attribute and structural perturbations that involve edge addition or deletion, subject to budget constraints. Focusing on node classification tasks, our method employs constraint solving with bound tightening, and iteratively solves a sequence of relaxed constraint satisfaction problems while relying on incremental solving capabilities of solvers to improve efficiency. We implement GNNev, a versatile solver for message-passing neural networks, which supports three aggregation functions, sum, max and mean, with the latter two considered here for the first time. Extensive experimental evaluation of GNNev on two standard benchmarks (Cora and CiteSeer) and two real-world fraud datasets (Amazon and Yelp) demonstrates its usability and effectiveness, as well as superior performance compared to existing {exact verification} tools on sum-aggregated node classification tasks.

Paperid: 3124, https://arxiv.org/pdf/2508.08749.pdf

Abstract:
This paper revisits the DBSCAN problem under differential privacy (DP). Existing DP-DBSCAN algorithms aim at publishing the cluster labels of the input points. However, we show that both empirically and theoretically, this approach cannot offer any utility in the published results. We therefore propose an alternative definition of DP-DBSCAN based on the notion of spans. We argue that publishing the spans actually better serves the purposes of visualization and classification of DBSCAN. Then we present a linear-time DP-DBSCAN algorithm achieving the sandwich quality guarantee in any constant dimensions, as well as matching lower bounds on the approximation ratio. A key building block in our algorithm is a linear-time algorithm for constructing a histogram under pure-DP, which is of independent interest. Finally, we conducted experiments on both synthetic and real-world datasets to verify the practical performance of our DP-DBSCAN algorithm.

Paperid: 3125, https://arxiv.org/pdf/2508.08656.pdf

Abstract:
Protecting state-of-the-art AI-based cybersecurity defense systems from cyber attacks is crucial. Attackers create adversarial examples by adding small changes (i.e., perturbations) to the attack features to evade or fool the deep learning model. This paper introduces the concept of low-level behavioral adversarial examples and its threat model of evasive ransomware. We formulate the method and the threat model to generate the optimal source code of evasive malware. We then examine the method using the leaked source code of Conti ransomware with the micro-behavior control function. The micro-behavior control function is our test component to simulate changing source code in ransomware; ransomware's behavior can be changed by specifying the number of threads, file encryption ratio, and delay after file encryption at the boot time. We evaluated how much an attacker can control the behavioral features of ransomware using the micro-behavior control function to decrease the detection rate of a ransomware detector.

Paperid: 3126, https://arxiv.org/pdf/2508.08655.pdf

Abstract:
Double extortion ransomware attacks have become mainstream since many organizations adopt more robust and resilient data backup strategies against conventional crypto-ransomware. This paper presents detailed attack stages, tactics, procedures, and tools used in the double extortion ransomware attacks. We then present a novel detection method using low-level storage and memory behavioral features and network traffic features obtained from a thin hypervisor to establish a defense-in-depth strategy for when attackers compromise OS-level protection. We employed the lightweight \emph{Kitsune} Network Intrusion Detection System (NIDS)'s network feature to detect the data exfiltration phase in double extortion ransomware attacks. Our experimental results showed that the presented method improved by 0.166 in the macro F score of the data exfiltration phase detection rate. Lastly, we discuss the limitations of the presented method and future work.

Paperid: 3127, https://arxiv.org/pdf/2508.08583.pdf

Abstract:
As the social implementation of AI has been steadily progressing, research and development related to AI security has also been increasing. However, existing studies have been limited to organizing related techniques, attacks, defenses, and risks in terms of specific domains or AI elements. Thus, it extremely difficult to understand the relationships among them and how negative impacts on stakeholders are brought about. In this paper, we argue that the knowledge, technologies, and social impacts related to AI security should be holistically organized to help understand relationships among them. To this end, we first develop an AI security map that holistically organizes interrelationships among elements related to AI security as well as negative impacts on information systems and stakeholders. This map consists of the two aspects, namely the information system aspect (ISA) and the external influence aspect (EIA). The elements that AI should fulfill within information systems are classified under the ISA. The EIA includes elements that affect stakeholders as a result of AI being attacked or misused. For each element, corresponding negative impacts are identified. By referring to the AI security map, one can understand the potential negative impacts, along with their causes and countermeasures. Additionally, our map helps clarify how the negative impacts on AI-based systems relate to those on stakeholders. We show some findings newly obtained by referring to our map. We also provide several recommendations and open problems to guide future AI security communities.

Paperid: 3128, https://arxiv.org/pdf/2508.08462.pdf

Abstract:
Integrated circuits (ICs) are essential to modern electronic systems, yet they face significant risks from physical reverse engineering (RE) attacks that compromise intellectual property (IP) and overall system security. While IC camouflage techniques have emerged to mitigate these risks, existing approaches largely focus on localized gate modifications, neglecting comprehensive deception strategies. To address this gap, we present a machine learning (ML)-driven methodology that integrates cryptic and mimetic cyber deception principles to enhance IC security against RE. Our approach leverages a novel And-Inverter Graph Variational Autoencoder (AIG-VAE) to encode circuit representations, enabling dual-layered camouflage through functional preservation and appearance mimicry. By introducing new variants of covert gates -- Fake Inverters, Fake Buffers, and Universal Transmitters -- our methodology achieves robust protection by obscuring circuit functionality while presenting misleading appearances. Experimental results demonstrate the effectiveness of our strategy in maintaining circuit functionality while achieving high camouflage and similarity scores with minimal structural overhead. Additionally, we validate the robustness of our method against advanced artificial intelligence (AI)-enhanced RE attacks, highlighting its practical applicability in securing IC designs. By bridging the gap in mimetic deception for hardware security, our work sets a new standard for IC camouflage, advancing the application of cyber deception principles to protect critical systems from adversarial threats.

Paperid: 3129, https://arxiv.org/pdf/2508.05655.pdf

Abstract:
The current Domain Name System (DNS) infrastructure faces critical vulnerabilities including poisoning attacks, censorship mechanisms, and centralized points of failure that compromise internet freedom and security. Recent incidents such as DNS poisoning attacks on ISP customers highlight the urgent need for resilient alternatives. This paper presents a novel blockchain-based Decentralized Domain Name System (DDNS). We designed a specialized Proof-of-Work blockchain to maximize support for DNS-related protocols and achieve node decentralization. The system integrates our blockchain with IPFS for distributed storage, implements cryptographic primitives for end-to-end trust signatures, and achieves Never Trust, Always Verify zero-trust verification. Our implementation achieves 15-second domain record propagation times, supports 20 standard DNS record types, and provides perpetual free .ddns domains. The system has been deployed across distributed infrastructure in San Jose, Los Angeles, and Orange County, demonstrating practical scalability and resistance to traditional DNS manipulation techniques. Performance evaluation shows the system can handle up to Max Theor. TPS 1,111.1 tx/s (minimal transactions) and Max Theor. TPS 266.7 tx/s (regular transactions) for domain operations while maintaining sub-second query resolution through intelligent caching mechanisms.

Paperid: 3130, https://arxiv.org/pdf/2508.05394.pdf

Abstract:
Visual cryptography schemes (VCSs) belong to a category of secret image sharing schemes that do not require cryptographic knowledge for decryption, instead relying directly on the human visual system. Among VCSs, random grid-based VCS (RGVCS) has garnered widespread attention as it avoids pixel expansion while requiring no basic matrices design. Contrast, a core metric for RGVCS, directly determines the visual quality of recovered images, rendering its optimization a critical research objective. However, existing $(k,n)$ RGVCSs still fail to attain theoretical upper bounds on contrast, highlighting the urgent need for higher-contrast constructions. In this paper, we propose a novel sharing paradigm for RGVCS that constructs $(k,n)$-threshold schemes from arbitrary $(k,n')$-threshold schemes $(k \leq n'\leq n)$, termed \emph{$n'$-grouped $(k,n)$ RGVCS}. This paradigm establishes hierarchical contrast characteristics: participants within the same group achieve optimal recovery quality, while inter-group recovery shows a hierarchical contrast. We further introduce a new contrast calculation formula tailored to the new paradigm. Then, we propose a contrast-enhanced $(k,n)$ RGVCS by setting $n'= k$, achieving the highest contrast value documented in the existing literature. Theoretical analysis and experimental results demonstrate the superiority of our proposed scheme in terms of contrast.

Paperid: 3131, https://arxiv.org/pdf/2508.05276.pdf

Abstract:
Mobile network operators implement firewalls to stop illicit messages, but scammers find ways to evade detection. Previous work has looked into SMS texts that are blocked by these firewalls. However, there is little insight into SMS texts that bypass them and reach users. To this end, we collaborate with a major mobile network operator to receive 1.35m user reports submitted over four months. We find 89.16% of user reports comprise text messages, followed by reports of suspicious calls and URLs. Using our methodological framework, we identify 35.12% of the unique text messages reported by users as spam, while 40.27% are scam text messages. This is the first paper that investigates SMS reports submitted by users and differentiates between spam and scams. Our paper classifies the identified scam text messages into 12 scam types, of which the most popular is 'wrong number' scams. We explore the various infrastructure services that scammers abuse to conduct SMS scams, including mobile network operators and hosting infrastructure, and analyze the text of the scam messages to understand how scammers lure victims into providing them with their personal or financial details.

Paperid: 3132, https://arxiv.org/pdf/2508.05048.pdf

Abstract:
The semidirect discrete logarithm problem (SDLP) in finite groups was proposed as a foundation for post-quantum cryptographic protocols, based on the belief that its non-abelian structure would resist quantum attacks. However, recent results have shown that SDLP in finite groups admits efficient quantum algorithms, undermining its quantum resistance. This raises a fundamental question: does the SDLP offer any computational advantages over the standard discrete logarithm problem (DLP) against classical adversaries? In this work, we investigate the classical hardness of SDLP across different finite group platforms. We establish that the group-case SDLP can be reformulated as a generalized discrete logarithm problem, enabling adaptation of classical algorithms to study its complexity. We present a concrete adaptation of the Baby-Step Giant-Step algorithm for SDLP, achieving time and space complexity $O(\sqrt{r})$ where $r$ is the period of the underlying cycle structure. Through theoretical analysis and experimental validation in SageMath, we demonstrate that the classical hardness of SDLP is highly platform-dependent and does not uniformly exceed that of standard DLP. In finite fields $\mathbb{F}_p^*$, both problems exhibit comparable complexity. Surprisingly, in elliptic curves $E(\mathbb{F}_p)$, the SDLP becomes trivial due to the bounded automorphism group, while in elementary abelian groups $\mathbb{F}_p^n$, the SDLP can be harder than DLP, with complexity varying based on the eigenvalue structure of the automorphism. Our findings reveal that the non-abelian structure of semidirect products does not inherently guarantee increased classical hardness, suggesting that the search for classically hard problems for cryptographic applications requires more careful consideration of the underlying algebraic structures.

Paperid: 3133, https://arxiv.org/pdf/2508.04561.pdf

Abstract:
This work focuses on validation of attack pattern mining in the context of Industrial Control System (ICS) security. A comprehensive security assessment of an ICS requires generating a large and variety of attack patterns. For this purpose we have proposed a data driven technique to generate attack patterns for an ICS. The proposed technique has been used to generate over 100,000 attack patterns from data gathered from an operational water treatment plant. In this work we present a detailed case study to validate the attack patterns.

Paperid: 3134, https://arxiv.org/pdf/2508.04542.pdf

Abstract:
It is difficult for individuals and organizations to protect personal information without a fundamental understanding of relative privacy risks. By analyzing over 5,000 empirical identity theft and fraud cases, this research identifies which types of personal data are exposed, how frequently exposures occur, and what the consequences of those exposures are. We construct an Identity Ecosystem graph--a foundational, graph-based model in which nodes represent personally identifiable information (PII) attributes and edges represent empirical disclosure relationships between them (e.g., the probability that one PII attribute is exposed due to the exposure of another). Leveraging this graph structure, we develop a privacy risk prediction framework that uses graph theory and graph neural networks to estimate the likelihood of further disclosures when certain PII attributes are compromised. The results show that our approach effectively answers the core question: Can the disclosure of a given identity attribute possibly lead to the disclosure of another attribute?

Paperid: 3135, https://arxiv.org/pdf/2508.04340.pdf

Abstract:
In this paper, we determine explicit bases for Riemann--Roch spaces associated with various families of elliptic codes. We establish the feasibility and provide exact algorithms for constructing bases of Riemann--Roch spaces corresponding to arbitrary divisors on elliptic curves. These results are subsequently applied to derive bases for quasi-cyclic elliptic codes and their subfield subcodes as well as for the class of Goppa-like elliptic codes. For algebraic geometry code applications, having an explicit description of Riemann--Roch space bases for arbitrary divisors is particularly valuable as it simultaneously enables efficient code construction and reveals structural properties of the codes leading to the new cryptanalysis methods when these codes are employed in cryptographic schemes

Paperid: 3136, https://arxiv.org/pdf/2508.04196.pdf

Abstract:
Despite significant advances in alignment techniques, we demonstrate that state-of-the-art language models remain vulnerable to carefully crafted conversational scenarios that can induce various forms of misalignment without explicit jailbreaking. Through systematic manual red-teaming with Claude-4-Opus, we discovered 10 successful attack scenarios, revealing fundamental vulnerabilities in how current alignment methods handle narrative immersion, emotional pressure, and strategic framing. These scenarios successfully elicited a range of misaligned behaviors, including deception, value drift, self-preservation, and manipulative reasoning, each exploiting different psychological and contextual vulnerabilities. To validate generalizability, we distilled our successful manual attacks into MISALIGNMENTBENCH, an automated evaluation framework that enables reproducible testing across multiple models. Cross-model evaluation of our 10 scenarios against five frontier LLMs revealed an overall 76% vulnerability rate, with significant variations: GPT-4.1 showed the highest susceptibility (90%), while Claude-4-Sonnet demonstrated greater resistance (40%). Our findings demonstrate that sophisticated reasoning capabilities often become attack vectors rather than protective mechanisms, as models can be manipulated into complex justifications for misaligned behavior. This work provides (i) a detailed taxonomy of conversational manipulation patterns and (ii) a reusable evaluation framework. Together, these findings expose critical gaps in current alignment strategies and highlight the need for robustness against subtle, scenario-based manipulation in future AI systems.

Paperid: 3137, https://arxiv.org/pdf/2508.04178.pdf

Abstract:
Keyloggers remain a serious threat in modern cybersecurity, silently capturing user keystrokes to steal credentials and sensitive information. Traditional defenses focus mainly on detection and removal, which can halt malicious activity but do little to engage or mislead adversaries. In this paper, we present a deception framework that leverages API hooking to intercept input-related API calls invoked by keyloggers at runtime and inject realistic decoy keystrokes. A core challenge, however, lies in the increasing adoption of anti-hooking techniques by advanced keyloggers. Anti-hooking strategies allow malware to bypass or detect instrumentation. To counter this, we introduce a hardened hooking layer that detects tampering and rapidly reinstates disrupted hooks, ensuring continuity of deception. We evaluate our framework against a custom-built "super keylogger" incorporating multiple evasion strategies, as well as 50 real-world malware samples spanning ten prominent keylogger families. Experimental results demonstrate that our system successfully resists sophisticated bypass attempts, maintains operational stealth, and reliably deceives attackers by feeding them decoys. The system operates with negligible performance overhead and no observable impact on user experience. Our findings show that resilient, runtime deception can play a practical and robust role in confronting advanced threats.

Paperid: 3138, https://arxiv.org/pdf/2508.04024.pdf

Abstract:
We discuss newly uncovered cases of identity theft in the scientific peer-review process within artificial intelligence (AI) research, with broader implications for other academic procedures. We detail how dishonest researchers exploit the peer-review system by creating fraudulent reviewer profiles to manipulate paper evaluations, leveraging weaknesses in reviewer recruitment workflows and identity verification processes. The findings highlight the critical need for stronger safeguards against identity theft in peer review and academia at large, and to this end, we also propose mitigating strategies.

Paperid: 3139, https://arxiv.org/pdf/2508.04000.pdf

Abstract:
In the past decade, blockchain has emerged as a promising solution for building secure distributed ledgers and has attracted significant attention. However, current blockchain systems suffer from limited throughput, poor scalability, and high latency. Due to limitations in consensus mechanisms, especially in managing node identities, blockchain is often considered unsuitable for applications such as the Internet of Things (IoT). This paper proposes the Advanced DAG-based Ranking (ADR) protocol to enhance blockchain scalability and throughput. ADR employs a directed acyclic graph (DAG) structure where nodes are positioned based on their rankings. Unlike traditional chains, ADR allows honest nodes to write blocks and verify transactions using a DAG-based topology. The protocol follows a three-step approach to secure the network against double-spending and enhance performance. First, it verifies nodes using their public and private keys before granting entry. Second, it builds an advanced DAG ledger enabling block production and transaction validation. Third, a ranking algorithm filters out malicious nodes, ranks the remaining nodes based on performance, and arranges them topologically. This process increases throughput and ensures robust scalability. We evaluated ADR on Amazon EC2 clusters with over 100 nodes, including scenarios with injected malicious nodes. Simulation results demonstrate that ADR significantly improves transaction throughput and network liveness compared to existing DAG-based blockchains such as IOTA and ByteBall, making it well-suited for IoT applications.

Paperid: 3140, https://arxiv.org/pdf/2508.03342.pdf

Abstract:
Existing cybersecurity playbooks are often written in heterogeneous, non-machine-readable formats, which limits their automation and interoperability across Security Orchestration, Automation, and Response platforms. This paper explores the suitability of Large Language Models, combined with Prompt Engineering, to automatically translate legacy incident response playbooks into the standardized, machine-readable CACAO format. We systematically examine various Prompt Engineering techniques and carefully design prompts aimed at maximizing syntactic accuracy and semantic fidelity for control flow preservation. Our modular transformation pipeline integrates a syntax checker to ensure syntactic correctness and features an iterative refinement mechanism that progressively reduces syntactic errors. We evaluate the proposed approach on a custom-generated dataset comprising diverse legacy playbooks paired with manually created CACAO references. The results demonstrate that our method significantly improves the accuracy of playbook transformation over baseline models, effectively captures complex workflow structures, and substantially reduces errors. It highlights the potential for practical deployment in automated cybersecurity playbook transformation tasks.

Paperid: 3141, https://arxiv.org/pdf/2508.03067.pdf

Abstract:
Recent advancements in DeepFakes attribution technologies have significantly enhanced forensic capabilities, enabling the extraction of traces left by generative models (GMs) in images, making DeepFakes traceable back to their source GMs. Meanwhile, several attacks have attempted to evade attribution models (AMs) for exploring their limitations, calling for more robust AMs. However, existing attacks fail to eliminate GMs' traces, thus can be mitigated by defensive measures. In this paper, we identify that untraceable DeepFakes can be achieved through a multiplicative attack, which can fundamentally eliminate GMs' traces, thereby evading AMs even enhanced with defensive measures. We design a universal and black-box attack method that trains an adversarial model solely using real data, applicable for various GMs and agnostic to AMs. Experimental results demonstrate the outstanding attack capability and universal applicability of our method, achieving an average attack success rate (ASR) of 97.08\% against 6 advanced AMs on DeepFakes generated by 9 GMs. Even in the presence of defensive mechanisms, our method maintains an ASR exceeding 72.39\%. Our work underscores the potential challenges posed by multiplicative attacks and highlights the need for more robust AMs.

Paperid: 3142, https://arxiv.org/pdf/2508.02942.pdf

Abstract:
Lateral Movement (LM) attacks continue to pose a significant threat to enterprise security, enabling adversaries to stealthily compromise critical assets. However, the development and evaluation of LM detection systems are impeded by the absence of realistic, well-labeled datasets. To address this gap, we propose LMDG, a reproducible and extensible framework for generating high-fidelity LM datasets. LMDG automates benign activity generation, multi-stage attack execution, and comprehensive labeling of system and network logs, dramatically reducing manual effort and enabling scalable dataset creation. A central contribution of LMDG is Process Tree Labeling, a novel agent-based technique that traces all malicious activity back to its origin with high precision. Unlike prior methods such as Injection Timing or Behavioral Profiling, Process Tree Labeling enables accurate, step-wise labeling of malicious log entries, correlating each with a specific attack step and MITRE ATT\&CK TTPs. To our knowledge, this is the first approach to support fine-grained labeling of multi-step attacks, providing critical context for detection models such as attack path reconstruction. We used LMDG to generate a 25-day dataset within a 25-VM enterprise environment containing 22 user accounts. The dataset includes 944 GB of host and network logs and embeds 35 multi-stage LM attacks, with malicious events comprising less than 1% of total activity, reflecting a realistic benign-to-malicious ratio for evaluating detection systems. LMDG-generated datasets improve upon existing ones by offering diverse LM attacks, up-to-date attack patterns, longer attack timeframes, comprehensive data sources, realistic network architectures, and more accurate labeling.

Paperid: 3143, https://arxiv.org/pdf/2508.01995.pdf

Abstract:
Graphics processing units (GPUs) are becoming an essential part of the intelligent transportation system (ITS) for enabling video-based and artificial intelligence (AI) based applications. GPUs provide high-throughput and energy-efficient computing for tasks like sensor fusion and roadside video analytics. However, these GPUs are one of the most unmonitored components in terms of security. This makes them vulnerable to cyber and hardware attacks, including unauthorized crypto mining. This paper highlights GPU security as a critical blind spot in transportation cybersecurity. To support this concern, it also presents a case study showing the impact of stealthy unauthorized crypto miners on critical AI workloads, along with a detection strategy. We used a YOLOv8-based video processing pipeline running on an RTX 2060 GPU for the case study. A multi-streaming application was executed while a T-Rex crypto miner ran in the background. We monitored how the miner degraded GPU performance by reducing the frame rate and increasing power consumption, which could be a serious concern for GPUs operating in autonomous vehicles or battery-powered edge devices. We observed measurable impacts using GPU telemetry (nvidia-smi) and Nsight Compute profiling, where frame rate dropped by 50 percent, and power usage increased by up to 90%. To detect, we trained lightweight classifiers using extracted telemetry features. All models achieved high accuracy, precision, recall, and F1-score. This paper raises urgent awareness about GPU observability gaps in ITS and offers a replicable framework for detecting GPU misuse through on-device telemetry.

Paperid: 3144, https://arxiv.org/pdf/2508.01888.pdf

Abstract:
The increasing penetration of renewable energy sources in day-ahead energy markets introduces challenges in balancing supply and demand, ensuring grid resilience, and maintaining trust in decentralized trading systems. This paper proposes a novel framework that integrates the Proximal Policy Optimization (PPO) algorithm, a state-of-the-art reinforcement learning method, with blockchain technology to optimize automated trading strategies for prosumers in day-ahead energy markets. We introduce a comprehensive framework that employs RL agent for multi-objective energy optimization and blockchain for tamper-proof data and transaction management. Simulations using real-world data from the Electricity Reliability Council of Texas (ERCOT) demonstrate the effectiveness of our approach. The RL agent achieves demand-supply balancing within 2\% and maintains near-optimal supply costs for the majority of the operating hours. Moreover, it generates robust battery storage policies capable of handling variability in solar and wind generation. All decisions are recorded on an Algorand-based blockchain, ensuring transparency, auditability, and security - key enablers for trustworthy multi-agent energy trading. Our contributions include a novel system architecture, curriculum learning for robust agent development, and actionable policy insights for practical deployment.

Paperid: 3145, https://arxiv.org/pdf/2508.01750.pdf

Abstract:
Testing network protocol implementations is critical for ensuring the reliability, security, and interoperability of distributed systems. Faults in protocol behavior can lead to vulnerabilities and system failures, especially in real-time and mission-critical applications. A common approach to protocol testing involves constructing Markovian models that capture the state transitions and expected behaviors of the protocol. However, building such models typically requires significant domain expertise and manual effort, making the process time-consuming and difficult to scale across diverse protocols and implementations. We propose a novel method that leverages large language models (LLMs) to automatically generate sequences for testing network protocol implementations. Our approach begins by defining the full set of possible protocol states, from which the LLM selects a subset to model the target implementation. Using this state-based model, we prompt the LLM to generate code that produces sequences of states. This program serves as a protocol-specific sequences generator. The sequences generator then generates test inputs to call the protocol implementation under various conditions. We evaluated our approach on three widely used network protocol implementations and successfully identified 12 previously unknown vulnerabilities. We have reported them to the respective developers for confirmation. This demonstrates the practical effectiveness of our LLM-assisted fuzzing framework in uncovering real-world security issues.

Paperid: 3146, https://arxiv.org/pdf/2508.01714.pdf

Abstract:
Anonymous communication networks have emerged as crucial tools for obfuscating communication pathways and concealing user identities. However, their practical deployments face significant challenges, including susceptibility to artificial intelligence (AI)-powered metadata analysis, difficulties in decentralized architectures, and the absence of provable security guarantees. To address these issues, this paper proposes a novel decentralized anonymous routing protocol with resistance to tracing and traffic analysis. The protocol eliminates dependencies on the threshold model and trusted third-party setups, ensuring indistinguishable identity privacy even in highly adversarial environments. Different from traditional empirical security analysis of anonymous networks, this paper rigorously proves indistinguishable identity privacy for users even in extremely adversarial environments. Furthermore, simulations confirm its practical feasibility, demonstrating both security and efficiency. By achieving information sharing with privacy preservation, the proposed protocol offers a provably secure solution for privacy-preserving communication in digital environments.

Paperid: 3147, https://arxiv.org/pdf/2508.01676.pdf

Abstract:
Adversarial patch attacks threaten the reliability of modern vision models. We present PatchMap, the first spatially exhaustive benchmark of patch placement, built by evaluating over 1.5e8 forward passes on ImageNet validation images. PatchMap reveals systematic hot-spots where small patches (as little as 2% of the image) induce confident misclassifications and large drops in model confidence. To demonstrate its utility, we propose a simple segmentation guided placement heuristic that leverages off the shelf masks to identify vulnerable regions without any gradient queries. Across five architectures-including adversarially trained ResNet50, our method boosts attack success rates by 8 to 13 percentage points compared to random or fixed placements. We publicly release PatchMap and the code implementation. The full PatchMap bench (6.5B predictions, multiple backbones) will be released soon to further accelerate research on location-aware defenses and adaptive attacks.

Paperid: 3148, https://arxiv.org/pdf/2508.01530.pdf

Abstract:
The security of software builds has attracted increased attention in recent years in response to incidents like solarwinds and xz. Now, several companies including Oracle and Google rebuild open source projects in a secure environment and publish the resulting binaries through dedicated repositories. This practice enables direct comparison between these rebuilt binaries and the original ones produced by developers and published in repositories such as Maven Central. These binaries are often not bitwise identical; however, in most cases, the differences can be attributed to variations in the build environment, and the binaries can still be considered equivalent. Establishing such equivalence, however, is a labor-intensive and error-prone process. While there are some tools that can be used for this purpose, they all fall short of providing provenance, i.e. readable explanation of why two binaries are equivalent, or not. To address this issue, we present daleq, a tool that disassembles Java byte code into a relational database, and can normalise this database by applying datalog rules. Those databases can then be used to infer equivalence between two classes. Notably, equivalence statements are accompanied with datalog proofs recording the normalisation process. We demonstrate the impact of daleq in an industrial context through a large-scale evaluation involving 2,714 pairs of jars, comprising 265,690 class pairs. In this evaluation, daleq is compared to two existing bytecode transformation tools. Our findings reveal a significant reduction in the manual effort required to assess non-bitwise equivalent artifacts, which would otherwise demand intensive human inspection. Furthermore, the results show that daleq outperforms existing tools by identifying more artifacts rebuilt from the same code as equivalent, even when no behavioral differences are present.

Paperid: 3149, https://arxiv.org/pdf/2508.01451.pdf

Abstract:
Machine learning and Large language models (LLMs) for vulnerability detection has received significant attention in recent years. Unfortunately, state-of-the-art techniques show that LLMs are unsuccessful in even distinguishing the vulnerable function from its benign counterpart, due to three main problems: Vulnerability detection requires deep analysis, which LLMs often struggle with when making a one-shot prediction. Existing techniques typically perform function-level analysis, whereas effective vulnerability detection requires contextual information beyond the function scope. The focus on binary classification can result in identifying a vulnerability but associating it with the wrong security weaknesses (CWE), which may mislead developers. We propose a novel multi-agent LLM approach to address the challenges of identifying CWEs. This approach consists of three steps: (1) a team of LLM agents performs an exhaustive search for potential CWEs in the function under review, (2) another team of agents identifies relevant external context to support or refute each candidate CWE, and (3) a final agent makes informed acceptance or rejection decisions for each CWE based on the gathered context. A preliminary evaluation of our approach shows promising results. In the PrimeVul dataset, Step 1 correctly identifies the appropriate CWE in 40.9\% of the studied vulnerable functions. We further evaluated the full pipeline on ten synthetic programs and found that incorporating context information significantly reduced false positives from 6 to 9 CWEs to just 1 to 2, while still correctly identifying the true CWE in 9 out of 10 cases.

Paperid: 3150, https://arxiv.org/pdf/2508.01448.pdf

Abstract:
The blocks in the Bitcoin blockchain record the amount of work W that went into creating them through proofs of work. When honest parties control a majority of the work, consensus is achieved by picking the chain with the highest recorded weight. Resources other than work have been considered to secure such longest-chain blockchains. In Chia, blocks record the amount of space S (via a proof of space) and sequential computational steps V (via a VDF). In this paper, we ask what weight functions Î(S,V,W) (that assign a weight to a block as a function of the recorded space, speed, and work) are secure in the sense that whenever the weight of the resources controlled by honest parties is larger than the weight of adversarial parties, the blockchain is secure against private double-spending attacks. We completely classify such functions in an idealized "continuous" model: Î(S,V,W) is secure against private double-spending attacks if and only if it is homogeneous of degree one in the timed resources V and W, i.e., Î±Î(S,V,W)=Î(S,Î±V, Î±W). This includes Bitcoin rule Î(S,V,W)=W and Chia rule Î(S,V,W) = SV. In a more realistic model where blocks are created at discrete time-points, one additionally needs some mild assumptions on the dependency on S (basically, the weight should not grow too much if S is slightly increased, say linear as in Chia). Our classification is more general and allows various instantiations of the same resource. It provides a powerful tool for designing new longest-chain blockchains. E.g., consider combining different PoWs to counter centralization, say the Bitcoin PoW W_1 and a memory-hard PoW W_2. Previous work suggested to use W_1+W_2 as weight. Our results show that using {\sqrt}(W_1){\cdot}{\sqrt}(W_2), {\min}{W_1,W_2} are also secure, and we argue that in practice these are much better choices.

Paperid: 3151, https://arxiv.org/pdf/2508.01422.pdf

Abstract:
This study examines how Artificial Intelligence can aid in identifying and mitigating cyber threats in the U.S. across four key areas: intrusion detection, malware classification, phishing detection, and insider threat analysis. Each of these problems has its quirks, meaning there needs to be different approaches to each, so we matched the models to the shape of the problem. For intrusion detection, catching things like unauthorized access, we tested unsupervised anomaly detection methods. Isolation forests and deep autoencoders both gave us useful signals by picking up odd patterns in network traffic. When it came to malware detection, we leaned on ensemble models like Random Forest and XGBoost, trained on features pulled from files and traffic logs. Phishing was more straightforward. We fed standard classifiers (logistic regression, Random Forest, XGBoost) a mix of email and web-based features. These models handled the task surprisingly well. Phishing turned out to be the easiest problem to crack, at least with the data we had. There was a different story. We utilized an LSTM autoencoder to identify behavioral anomalies in user activity logs. It caught every suspicious behavior but flagged a lot of harmless ones too. That kind of model makes sense when the cost of missing a threat is high and you are willing to sift through some noise. What we saw across the board is that performance was not about stacking the most complex model. What mattered was how well the models structure matched the way the data behaved. When signals were strong and obvious, simple models worked fine. But for messier, more subtle threats, we needed something more adaptive, sequence models and anomaly detectors, though they brought their trade offs. The takeaway here is clear in cybersecurity, context drives the solution.

Paperid: 3152, https://arxiv.org/pdf/2508.01144.pdf

Abstract:
While traditional cryptographic research focuses on algorithm-level provable security, many real-world attacks exploit weaknesses in system implementations, such as memory mismanagement, poor entropy sources, and insecure key lifecycles. Existing approaches address these risks in isolation but lack a unified, verifiable framework for modeling implementation-layer security. In this work, we propose Implementation-Level Provable Security, a new paradigm that defines security in terms of structurally verifiable resilience against real-world attack surfaces during deployment. To demonstrate its feasibility, we present SEER (Secure and Efficient Encryption-based Erasure via Ransomware), a file destruction system that repurposes and reinforces the encryption core of Babuk ransomware. SEER incorporates key erasure, entropy validation, and execution consistency checks to ensure a well-constrained, auditable attack surface. Our evaluation shows that SEER achieves strong irrecoverability guarantees while maintaining practical performance. This work demonstrates a shift from abstract theoretical models toward practically verifiable implementation-layer security.

Paperid: 3153, https://arxiv.org/pdf/2508.01062.pdf

Abstract:
Cooperative perception (CP) enhances situational awareness of connected and autonomous vehicles by exchanging and combining messages from multiple agents. While prior work has explored adversarial integrity attacks that degrade perceptual accuracy, little is known about CP's robustness against attacks on timeliness (or availability), a safety-critical requirement for autonomous driving. In this paper, we present CP-FREEZER, the first latency attack that maximizes the computation delay of CP algorithms by injecting adversarial perturbation via V2V messages. Our attack resolves several unique challenges, including the non-differentiability of point cloud preprocessing, asynchronous knowledge of the victim's input due to transmission delays, and uses a novel loss function that effectively maximizes the execution time of the CP pipeline. Extensive experiments show that CP-FREEZER increases end-to-end CP latency by over $90\times$, pushing per-frame processing time beyond 3 seconds with a 100% success rate on our real-world vehicle testbed. Our findings reveal a critical threat to the availability of CP systems, highlighting the urgent need for robust defenses.

Paperid: 3154, https://arxiv.org/pdf/2508.00602.pdf

Abstract:
The generalization capabilities of Large Language Models (LLMs) have led to their widespread deployment across various applications. However, this increased adoption has introduced several security threats, notably in the forms of jailbreaking and data leakage attacks. Additionally, Retrieval Augmented Generation (RAG), while enhancing context-awareness in LLM responses, has inadvertently introduced vulnerabilities that can result in the leakage of sensitive information. Our contributions are twofold. First, we introduce a methodology to analyze historical interaction data from an LLM system, enabling the generation of usage maps categorized by topics (including adversarial interactions). This approach further provides forensic insights for tracking the evolution of jailbreaking attack patterns. Second, we propose LeakSealer, a model-agnostic framework that combines static analysis for forensic insights with dynamic defenses in a Human-In-The-Loop (HITL) pipeline. This technique identifies topic groups and detects anomalous patterns, allowing for proactive defense mechanisms. We empirically evaluate LeakSealer under two scenarios: (1) jailbreak attempts, employing a public benchmark dataset, and (2) PII leakage, supported by a curated dataset of labeled LLM interactions. In the static setting, LeakSealer achieves the highest precision and recall on the ToxicChat dataset when identifying prompt injection. In the dynamic setting, PII leakage detection achieves an AUPRC of $0.97$, significantly outperforming baselines such as Llama Guard.

Paperid: 3155, https://arxiv.org/pdf/2508.00434.pdf

Abstract:
Steganography based on diffusion models has attracted increasing attention due to its ability to generate high-quality images and exhibit strong robustness. In such approaches, the secret message is first embedded into the initial latent variable, and then the stego image is generated through the forward process. To extract the message, an inversion process is required to reconstruct the latent variables from the received image. However, inaccurate latent inversion leads to significant discrepancies between the reconstructed and original latent variables, rendering message extraction infeasible. To address this issue, we propose \textbf{RF-Stego}, a novel generative image steganography method that enables accurate latent inversion and significantly improves extraction accuracy. First, we develop the \textbf{P}ath \textbf{C}onsistency \textbf{L}inear \textbf{I}nversion (\textbf{PCLI}), which imposes formal constraints on the inversion process. By explicitly aligning it with the forward generation path and modeling both directions along a shared linear path, PCLI eliminates path mismatch and ensures path consistency throughout the steganographic process. Second, through rigorous theoretical proof, we demonstrate that \textbf{R}ectified \textbf{F}low \textbf{(RF)} offers both theoretical reversibility and numerical stability in the inversion process. Based on this, we replace traditional unstable samplers with RF sampler which effectively improves the numerical precision of the inversion process. Experimental results show RF-Stego outperforms state-of-the-art methods in terms of extraction accuracy, image quality, robustness, security and generation efficiency.

Paperid: 3156, https://arxiv.org/pdf/2508.00351.pdf

Abstract:
Quantum money is the cryptographic application of the quantum no-cloning theorem. It has recently been instantiated by Montgomery and Sharif (Asiacrypt '24) from class group actions on elliptic curves. In this work, we propose a concrete cryptanalysis by leveraging the efficiency of evaluating division polynomials with the coordinates of rational points, offering a speedup of O(log^4p) compared to the brute-force attack. Since our attack still requires exponential time, it remains impractical to forge a quantum banknote. Interestingly, due to the inherent properties of quantum money, our attack method also results in a more efficient verification procedure. Our algorithm leverages the properties of quadratic twists to utilize rational points in verifying the cardinality of the superposition of elliptic curves. We expect this approach to contribute to future research on elliptic-curve-based quantum cryptography.

Paperid: 3157, https://arxiv.org/pdf/2507.23611.pdf

Abstract:
Infostealers exfiltrate credentials, session cookies, and sensitive data from infected systems. With over 29 million stealer logs reported in 2024, manual analysis and mitigation at scale are virtually unfeasible/unpractical. While most research focuses on proactive malware detection, a significant gap remains in leveraging reactive analysis of stealer logs and their associated artifacts. Specifically, infection artifacts such as screenshots, image captured at the point of compromise, are largely overlooked by the current literature. This paper introduces a novel approach leveraging Large Language Models (LLMs), more specifically gpt-4o-mini, to analyze infection screenshots to extract potential Indicators of Compromise (IoCs), map infection vectors, and track campaigns. Focusing on the Aurora infostealer, we demonstrate how LLMs can process screenshots to identify infection vectors, such as malicious URLs, installer files, and exploited software themes. Our method extracted 337 actionable URLs and 246 relevant files from 1000 screenshots, revealing key malware distribution methods and social engineering tactics. By correlating extracted filenames, URLs, and infection themes, we identified three distinct malware campaigns, demonstrating the potential of LLM-driven analysis for uncovering infection workflows and enhancing threat intelligence. By shifting malware analysis from traditional log-based detection methods to a reactive, artifact-driven approach that leverages infection screenshots, this research presents a scalable method for identifying infection vectors and enabling early intervention.

Paperid: 3158, https://arxiv.org/pdf/2507.22772.pdf

Abstract:
Despite outstanding results, machine learning-based Android malware detection models struggle with concept drift, where rapidly evolving malware characteristics degrade model effectiveness. This study examines the impact of concept drift on Android malware detection, evaluating two datasets and nine machine learning and deep learning algorithms, as well as Large Language Models (LLMs). Various feature types--static, dynamic, hybrid, semantic, and image-based--were considered. The results showed that concept drift is widespread and significantly affects model performance. Factors influencing the drift include feature types, data environments, and detection methods. Balancing algorithms helped with class imbalance but did not fully address concept drift, which primarily stems from the dynamic nature of the malware landscape. No strong link was found between the type of algorithm used and concept drift, the impact was relatively minor compared to other variables since hyperparameters were not fine-tuned, and the default algorithm configurations were used. While LLMs using few-shot learning demonstrated promising detection performance, they did not fully mitigate concept drift, highlighting the need for further investigation.

Paperid: 3159, https://arxiv.org/pdf/2507.22231.pdf

Abstract:
Permission analysis is a widely used method for Android malware detection. It involves examining the permissions requested by an application to access sensitive data or perform potentially malicious actions. In recent years, various machine learning (ML) algorithms have been applied to Android malware detection using permission-based features and feature selection techniques, often achieving high accuracy. However, these studies have largely overlooked important factors such as protection levels and the deprecation or restriction of permissions due to updates in the Android OS -- factors that can contribute to concept drift. In this study, we investigate the impact of deprecated and restricted permissions on the performance of machine learning models. A large dataset containing 166 permissions was used, encompassing more than 70,000 malware and benign applications. Various machine learning and deep learning algorithms were employed as classifiers, along with different concept drift detection strategies. The results suggest that Android permissions are highly effective features for malware detection, with the exclusion of deprecated and restricted permissions having only a marginal impact on model performance. In some cases, such as with CNN, accuracy improved. Excluding these permissions also enhanced the detection of concept drift using a year-to-year analysis strategy. Dataset balancing further improved model performance, reduced low-accuracy instances, and enhanced concept drift detection via the Kolmogorov-Smirnov test.

Paperid: 3160, https://arxiv.org/pdf/2507.21145.pdf

Abstract:
Multi-Domain Operations (MDOs) emphasize cross-domain defense against complex and synergistic threats, with civilian infrastructures like smart cities and Connected Autonomous Vehicles (CAVs) emerging as primary targets. As dual-use assets, CAVs are vulnerable to Multi-Surface Threats (MSTs), particularly from Adversarial Machine Learning (AML) which can simultaneously compromise multiple in-vehicle ML systems (e.g., Intrusion Detection Systems, Traffic Sign Recognition Systems). Therefore, this study investigates how key hyperparameters in Decision Tree-based ensemble models-Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGB)-affect the time required for a Black-Box AML attack i.e. Zeroth Order Optimization (ZOO). Findings show that parameters like the number of trees or boosting rounds significantly influence attack execution time, with RF and GB being more sensitive than XGB. Adversarial Training (AT) time is also analyzed to assess the attacker's window of opportunity. By optimizing hyperparameters, this research supports Defensive Trustworthy AI (D-TAI) practices within MST scenarios and contributes to the development of resilient ML systems for civilian and military domains, aligned with Cyber Social Security framework in MDOs and Human-AI Multi-Domain Task Forces.

Paperid: 3161, https://arxiv.org/pdf/2507.20873.pdf

Abstract:
In the era of Industry 4.0, the growing need for secure and efficient communication systems has driven the development of fifth-generation (5G) networks characterized by extremely low latency, massive device connectivity and high data transfer speeds. However, the deployment of 5G networks presents significant security challenges, requiring advanced and robust solutions to counter increasingly sophisticated cyber threats. This paper proposes a testbed and software architecture to strengthen the security of Private 5G Networks, particularly in industrial communication environments.

Paperid: 3162, https://arxiv.org/pdf/2507.20650.pdf

Abstract:
Recently, Deep Learning (DL) models have been increasingly deployed on end-user devices as On-Device AI, offering improved efficiency and privacy. However, this deployment trend poses more serious Intellectual Property (IP) risks, as models are distributed on numerous local devices, making them vulnerable to theft and redistribution. Most existing ownership protection solutions (e.g., backdoor-based watermarking) are designed for cloud-based AI-as-a-Service (AIaaS) and are not directly applicable to large-scale distribution scenarios, where each user-specific model instance must carry a unique watermark. These methods typically embed a fixed watermark, and modifying the embedded watermark requires retraining the model. To address these challenges, we propose Hot-Swap MarkBoard, an efficient watermarking method. It encodes user-specific $n$-bit binary signatures by independently embedding multiple watermarks into a multi-branch Low-Rank Adaptation (LoRA) module, enabling efficient watermark customization without retraining through branch swapping. A parameter obfuscation mechanism further entangles the watermark weights with those of the base model, preventing removal without degrading model performance. The method supports black-box verification and is compatible with various model architectures and DL tasks, including classification, image generation, and text generation. Extensive experiments across three types of tasks and six backbone models demonstrate our method's superior efficiency and adaptability compared to existing approaches, achieving 100\% verification accuracy.

Paperid: 3163, https://arxiv.org/pdf/2507.20434.pdf

Abstract:
The Border Gateway Protocol (BGP) remains a fragile pillar of Internet routing. BGP hijacks still occurr daily. While full deployment of Route Origin Validation (ROV) is ongoing, attackers have already adapted, launching post-ROV attacks such as forged-origin hijacks. To detect these, recent approaches like DFOH [Holterbach et al., USENIX NSDI '24] and BEAM [Chen et al., USENIX Security '24] apply machine learning (ML) to analyze data from globally distributed BGP monitors, assuming anomalies will stand out against historical patterns. However, this assumption overlooks a key threat: BGP monitors themselves can be misled by adversaries injecting bogus routes. This paper shows that state-of-the-art hijack detection systems like DFOH and BEAM are vulnerable to data poisoning. Using large-scale BGP simulations, we show that attackers can evade detection with just a handful of crafted announcements beyond the actual hijack. These announcements are indeed sufficient to corrupt the knowledge base used by ML-based defenses and distort the metrics they rely on. Our results highlight a worrying weakness of relying solely on public BGP data.

Paperid: 3164, https://arxiv.org/pdf/2507.19185.pdf

Abstract:
Static benchmarks fail to capture LLM vulnerabilities emerging through community experimentation in online forums. We present PrompTrend, a system that collects vulnerability data across platforms and evaluates them using multidimensional scoring, with an architecture designed for scalable monitoring. Cross-sectional analysis of 198 vulnerabilities collected from online communities over a five-month period (January-May 2025) and tested on nine commercial models reveals that advanced capabilities correlate with increased vulnerability in some architectures, psychological attacks significantly outperform technical exploits, and platform dynamics shape attack effectiveness with measurable model-specific patterns. The PrompTrend Vulnerability Assessment Framework achieves 78% classification accuracy while revealing limited cross-model transferability, demonstrating that effective LLM security requires comprehensive socio-technical monitoring beyond traditional periodic assessment. Our findings challenge the assumption that capability advancement improves security and establish community-driven psychological manipulation as the dominant threat vector for current language models.

Paperid: 3165, https://arxiv.org/pdf/2507.18289.pdf

Abstract:
Fuzzing a library requires experts to understand the library usage well and craft high-quality fuzz drivers, which is tricky and tedious. Therefore, many techniques have been proposed to automatically generate fuzz drivers. However, they fail to generate rational fuzz drivers due to the lack of adherence to proper library usage conventions, such as ensuring a resource is closed after being opened. To make things worse, existing library fuzzing techniques unconditionally execute each driver, resulting in numerous irrational drivers that waste computational resources while contributing little coverage and generating false positive bug reports. To tackle these challenges, we propose a novel automatic library fuzzing technique, Scheduzz, an LLM-based library fuzzing technique. It leverages LLMs to understand rational usage of libraries and extract API combination constraints. To optimize computational resource utilization, a dual scheduling framework is implemented to efficiently manage API combinations and fuzz drivers. The framework models driver generation and the corresponding fuzzing campaign as an online optimization problem. Within the scheduling loop, multiple API combinations are selected to generate fuzz drivers, while simultaneously, various optimized fuzz drivers are scheduled for execution or suspension. We implemented Scheduzz and evaluated it in 33 real-world libraries. Compared to baseline approaches, Scheduzz significantly reduces computational overhead and outperforms UTopia on 16 out of 21 libraries. It achieves 1.62x, 1.50x, and 1.89x higher overall coverage than the state-of-the-art techniques CKGFuzzer, Promptfuzz, and the handcrafted project OSS-Fuzz, respectively. In addition, Scheduzz discovered 33 previously unknown bugs in these well-tested libraries, 3 of which have been assigned CVEs.

Paperid: 3166, https://arxiv.org/pdf/2507.18253.pdf

Abstract:
Of growing concern in privacy scholarship is artificial intelligence (AI), as a powerful producer of inferences. Taken to its limits, AI may be presumed capable of inferring "everything from everything," thereby making untenable any normative scheme, including privacy theory and privacy regulation, which rests on protecting privacy based on categories of data - sensitive versus non-sensitive, private versus public. Discarding data categories as a normative anchoring in privacy and data protection as a result of an unconditional acceptance of AI's inferential capacities is what we call privacy nihilism. An ethically reasoned response to AI inferences requires a sober consideration of AI capabilities rather than issuing an epistemic carte blanche. We introduce the notion of conceptual overfitting to expose how privacy nihilism turns a blind eye toward flawed epistemic practices in AI development. Conceptual overfitting refers to the adoption of norms of convenience that simplify the development of AI models by forcing complex constructs to fit data that are conceptually under-representative or even irrelevant. While conceptual overfitting serves as a helpful device to counter normative suggestions grounded in hyperbolic AI capability claims, AI inferences shake any privacy regulation that hinges protections based on restrictions around data categories. We propose moving away from privacy frameworks that focus solely on data type, neglecting all other factors. Theories like contextual integrity evaluate the normative value of privacy across several parameters, including the type of data, the actors involved in sharing it, and the purposes for which the information is used.

Paperid: 3167, https://arxiv.org/pdf/2507.17064.pdf

Abstract:
With the advent of modern technology, critical infrastructure, communications, and national security depend increasingly on space-based assets. These assets, along with associated assets like data relay systems and ground stations, are, therefore, in serious danger of cyberattacks. Strong security defenses are essential to ensure data integrity, maintain secure operations, and protect assets in space and on the ground against various threats. Previous research has found discrete vulnerabilities in space systems and suggested specific solutions to address them. Such research has yielded valuable insights, but lacks a thorough examination of space cyberattack vectors and a rigorous assessment of the efficacy of mitigation techniques. This study tackles this issue by taking a comprehensive approach to analyze the range of possible space cyber-attack vectors, which include ground, space, satellite, and satellite constellations. In order to address the particular threats, the study also assesses the efficacy of mitigation measures that are linked with space infrastructures and proposes a Risk Scoring Framework. Based on the analysis, this paper identifies potential research challenges for developing and testing cutting-edge technology solutions, encouraging robust cybersecurity measures needed in space.

Paperid: 3168, https://arxiv.org/pdf/2507.17017.pdf

Abstract:
We introduce an algorithm that releases a pure differentially private sparse histogram over $n$ participants drawn from a domain of size $d \gg n$. Our method attains the optimal $\ell_\infty$-estimation error and runs in strictly $O(n \ln \ln d)$ time in the word-RAM model, thereby improving upon the previous best known deterministic-time bound of $\tilde{O}(n^2)$ and resolving the open problem of breaking this quadratic barrier (Balcer and Vadhan, 2019). Central to our algorithm is a novel private item blanket technique with target-length padding, which transforms the approximate differentially private stability-based histogram algorithm into a pure differentially private one.

Paperid: 3169, https://arxiv.org/pdf/2507.17010.pdf

Abstract:
In the era of synthetic media, deepfake manipulations pose a significant threat to information integrity. To address this challenge, we propose TrustDefender, a two-stage framework comprising (i) a lightweight convolutional neural network (CNN) that detects deepfake imagery in real-time extended reality (XR) streams, and (ii) an integrated succinct zero-knowledge proof (ZKP) protocol that validates detection results without disclosing raw user data. Our design addresses both the computational constraints of XR platforms while adhering to the stringent privacy requirements in sensitive settings. Experimental evaluations on multiple benchmark deepfake datasets demonstrate that TrustDefender achieves 95.3% detection accuracy, coupled with efficient proof generation underpinned by rigorous cryptography, ensuring seamless integration with high-performance artificial intelligence (AI) systems. By fusing advanced computer vision models with provable security mechanisms, our work establishes a foundation for reliable AI in immersive and privacy-sensitive applications.

Paperid: 3170, https://arxiv.org/pdf/2507.16840.pdf

Abstract:
The rapid evolution of digital currency trading, fueled by the integration of blockchain technology, has led to both innovation and the emergence of smart Ponzi schemes. A smart Ponzi scheme is a fraudulent investment operation in smart contract that uses funds from new investors to pay returns to earlier investors. Traditional Ponzi scheme detection methods based on deep learning typically rely on fully supervised models, which require large amounts of labeled data. However, such data is often scarce, hindering effective model training. To address this challenge, we propose a novel contrastive learning framework, CASPER (Contrastive Approach for Smart Ponzi detectER with more negative samples), designed to enhance smart Ponzi scheme detection in blockchain transactions. By leveraging contrastive learning techniques, CASPER can learn more effective representations of smart contract source code using unlabeled datasets, significantly reducing both operational costs and system complexity. We evaluate CASPER on the XBlock dataset, where it outperforms the baseline by 2.3% in F1 score when trained with 100% labeled data. More impressively, with only 25% labeled data, CASPER achieves an F1 score nearly 20% higher than the baseline under identical experimental conditions. These results highlight CASPER's potential for effective and cost-efficient detection of smart Ponzi schemes, paving the way for scalable fraud detection solutions in the future.

Paperid: 3171, https://arxiv.org/pdf/2507.16788.pdf

Abstract:
With the General Data Protection Regulation (GDPR) in place, all domains have to ensure compliance with privacy legislation. However, compliance does not necessarily result in a privacy-friendly system as for example getting users' consent to process their data does not improve the privacy-friendliness of the system. Therefore, the goal of the AUTOPSY project was to support the privacy engineering process in the automotive domain by providing several building blocks which technically improve the privacy-friendliness of modern, i.e., connected and (partially) automated vehicles. This paper presents the results of the AUTOPSY project: a system model to identify relevant entities and locations to apply privacy enhancing technologies (PETs); the privacy manager aiming at more control of the data flow from the vehicle, a PET selection approach based on GDPR principles, and an architectural framework for automotive privacy. Furthermore, we built a demonstrator for location-based services to evaluate the architectural framework.

Paperid: 3172, https://arxiv.org/pdf/2507.15984.pdf

Abstract:
Broken Access Control (BAC) remains one of the most critical and widespread vulnerabilities in web applications, allowing attackers to access unauthorized resources or perform privileged actions. Despite its severity, BAC is underexplored in automated testing due to key challenges: the lack of reliable oracles and the difficulty of generating semantically valid attack requests. We introduce BACFuzz, the first gray-box fuzzing framework specifically designed to uncover BAC vulnerabilities, including Broken Object-Level Authorization (BOLA) and Broken Function-Level Authorization (BFLA) in PHP-based web applications. BACFuzz combines LLM-guided parameter selection with runtime feedback and SQL-based oracle checking to detect silent authorization flaws. It employs lightweight instrumentation to capture runtime information that guides test generation, and analyzes backend SQL queries to verify whether unauthorized inputs flow into protected operations. Evaluated on 20 real-world web applications, including 15 CVE cases and 2 known benchmarks, BACFuzz detects 16 of 17 known issues and uncovers 26 previously unknown BAC vulnerabilities with low false positive rates. All identified issues have been responsibly disclosed, and artifacts will be publicly released.

Paperid: 3173, https://arxiv.org/pdf/2507.14658.pdf

Abstract:
Popular methods in cooperative Multi-Agent Reinforcement Learning with partially observable environments typically allow agents to act independently during execution, which may limit the coordinated effect of the trained policies. However, by sharing information such as known or suspected ongoing threats, effective communication can lead to improved decision-making in the cyber battle space. We propose a game design where defender agents learn to communicate and defend against imminent cyber threats by playing training games in the Cyber Operations Research Gym, using the Differentiable Inter Agent Learning algorithm adapted to the cyber operational environment. The tactical policies learned by these autonomous agents are akin to those of human experts during incident responses to avert cyber threats. In addition, the agents simultaneously learn minimal cost communication messages while learning their defence tactical policies.

Paperid: 3174, https://arxiv.org/pdf/2507.13926.pdf

Abstract:
Webextensions can improve web browser privacy, security, and user experience. The APIs offered by the browser to webextensions affect possible functionality. Currently, Chrome transitions to a modified set of APIs called Manifest v3. This paper studies the challenges and opportunities of Manifest v3 with an in-depth structured qualitative research. Even though some projects observed positive effects, a majority expresses concerns over limited benefits to users, removal of crucial APIs, or the need to find workarounds. Our findings indicate that the transition affects different types of webextensions differently; some can migrate without losing functionality, while other projects remove functionality or decline to update. The respondents identified several critical missing APIs, including reliable APIs to inject content scripts, APIs for storing confidential content, and others.

Paperid: 3175, https://arxiv.org/pdf/2507.13670.pdf

Abstract:
Deep thermalization refers to the emergence of Haar-like randomness from quantum systems upon partial measurements. As a generalization of quantum thermalization, it is often associated with high complexity and entanglement. Here, we introduce computational deep thermalization and construct the fastest possible dynamics exhibiting it at infinite effective temperature. Our circuit dynamics produce quantum states with low entanglement in polylogarithmic depth that are indistinguishable from Haar random states to any computationally bounded observer. Importantly, the observer is allowed to request many copies of the same residual state obtained from partial projective measurements on the state -- this condition is beyond the standard settings of quantum pseudorandomness, but natural for deep thermalization. In cryptographic terms, these states are pseudorandom, pseudoentangled, and crucially, retain these properties under local measurements. Our results demonstrate a new form of computational thermalization, where thermal-like behavior arises from structured quantum states endowed with cryptographic properties, instead of from highly unstructured ensembles. The low resource complexity of preparing these states suggests scalable simulations of deep thermalization using quantum computers. Our work also motivates the study of computational quantum pseudorandomness beyond BQP observers.

Paperid: 3176, https://arxiv.org/pdf/2507.13591.pdf

Abstract:
Federated Learning (FL) enables collaborative model training without centralizing client data, making it attractive for privacy-sensitive domains. While existing approaches employ cryptographic techniques such as homomorphic encryption, differential privacy, or secure multiparty computation to mitigate inference attacks-including model inversion, membership inference, and gradient leakage-they often suffer from high computational, communication, or memory overheads. Moreover, many methods overlook the confidentiality of the global model itself, which may be proprietary and sensitive. These challenges limit the practicality of secure FL, especially in cross-silo deployments involving large datasets and strict compliance requirements. We present FuSeFL, a fully secure and scalable FL scheme designed for cross-silo settings. FuSeFL decentralizes training across client pairs using lightweight secure multiparty computation (MPC), while confining the server's role to secure aggregation. This design eliminates server bottlenecks, avoids data offloading, and preserves full confidentiality of data, model, and updates throughout training. FuSeFL defends against inference threats, achieves up to 95% lower communication latency and 50% lower server memory usage, and improves accuracy over prior secure FL solutions, demonstrating strong security and efficiency at scale.

Paperid: 3177, https://arxiv.org/pdf/2507.13367.pdf

Abstract:
Steganography is the process of embedding secret information discreetly within a carrier, ensuring secure exchange of confidential data. The Adaptive Pixel Value Differencing (APVD) steganography method, while effective, encounters certain challenges like the "unused blocks" issue. This problem can cause a decrease in security, compromise the embedding capacity, and lead to lower visual quality. This research presents a novel steganographic strategy that integrates APVD with pseudorandom pixel selection to effectively mitigate these issues. The results indicate that the new method outperforms existing techniques in aspects of security, data hiding capacity, and the preservation of image quality. Empirical results reveal that the combination of APVD with pseudorandom pixel selection significantly enhances key image quality metrics such as Peak Signal-to-Noise Ratio (PSNR), Universal Image Quality Index (UIQ), and Structural Similarity Index (SSIM), surpassing other contemporary methods in performance. The newly proposed method is versatile, able to handle a variety of cover and secret images in both color and grayscale, thereby ensuring secure data transmission without compromising the aesthetic quality of the image.

Paperid: 3178, https://arxiv.org/pdf/2507.13169.pdf

Abstract:
Prompt injection attacks, where malicious input is designed to manipulate AI systems into ignoring their original instructions and following unauthorized commands instead, were first discovered by Preamble, Inc. in May 2022 and responsibly disclosed to OpenAI. Over the last three years, these attacks have continued to pose a critical security threat to LLM-integrated systems. The emergence of agentic AI systems, where LLMs autonomously perform multistep tasks through tools and coordination with other agents, has fundamentally transformed the threat landscape. Modern prompt injection attacks can now combine with traditional cybersecurity exploits to create hybrid threats that systematically evade traditional security controls. This paper presents a comprehensive analysis of Prompt Injection 2.0, examining how prompt injections integrate with Cross-Site Scripting (XSS), Cross-Site Request Forgery (CSRF), and other web security vulnerabilities to bypass traditional security measures. We build upon Preamble's foundational research and mitigation technologies, evaluating them against contemporary threats, including AI worms, multi-agent infections, and hybrid cyber-AI attacks. Our analysis incorporates recent benchmarks that demonstrate how traditional web application firewalls, XSS filters, and CSRF tokens fail against AI-enhanced attacks. We also present architectural solutions that combine prompt isolation, runtime security, and privilege separation with novel threat detection capabilities.

Paperid: 3179, https://arxiv.org/pdf/2507.13042.pdf

Abstract:
The integration of security and energy efficiency in Internet of Things systems remains a critical challenge, particularly for battery-free and resource-constrained devices. This paper explores the scalability and protocol-agnostic nature of a backscattering-based security mechanism by integrating it into Bluetooth Low Energy battery-free Wireless Sensor Network. The proposed approach leverages the Wireless Power Transfer link, traditionally used for energy harvesting, to generate additional identification signals without increasing energy consumption or computational demands. Experimental validation demonstrates the solution's functionality using compact, low-gain antenna, ensuring compatibility with size-constrained applications such as Structural Health Monitoring and smart transport. Furthermore, this work addresses the challenges associated with backscattering dynamic range and multi-node Wireless Sensor Network scenarios, discussing potential collisions between identification signals and proposing future improvements to enhance generalizability and scalability. The findings underscore the potential of the backscattering-based security mechanism for creating secure, sustainable, and scalable IoT deployments across diverse protocols and applications.

Paperid: 3180, https://arxiv.org/pdf/2507.12730.pdf

Abstract:
We propose a privacy-preserving semantic-segmentation method for applying perceptual encryption to images used for model training in addition to test images. This method also provides almost the same accuracy as models without any encryption. The above performance is achieved using a domain-adaptation technique on the embedding structure of the Vision Transformer (ViT). The effectiveness of the proposed method was experimentally confirmed in terms of the accuracy of semantic segmentation when using a powerful semantic-segmentation model with ViT called Segmentation Transformer.

Paperid: 3181, https://arxiv.org/pdf/2507.12568.pdf

Abstract:
Federated Learning (FL) has emerged as a promising solution for privacy-preserving autonomous driving, specifically camera-based Road Condition Classification (RCC) systems, harnessing distributed sensing, computing, and communication resources on board vehicles without sharing sensitive image data. However, the collaborative nature of FL-RCC frameworks introduces new vulnerabilities: Targeted Label Flipping Attacks (TLFAs), in which malicious clients (vehicles) deliberately alter their training data labels to compromise the learned model inference performance. Such attacks can, e.g., cause a vehicle to mis-classify slippery, dangerous road conditions as pristine and exceed recommended speed. However, TLFAs for FL-based RCC systems are largely missing. We address this challenge with a threefold contribution: 1) we disclose the vulnerability of existing FL-RCC systems to TLFAs; 2) we introduce a novel label-distance-based metric to precisely quantify the safety risks posed by TLFAs; and 3) we propose FLARE, a defensive mechanism leveraging neuron-wise analysis of the output layer to mitigate TLFA effects. Extensive experiments across three RCC tasks, four evaluation metrics, six baselines, and three deep learning models demonstrate both the severity of TLFAs on FL-RCC systems and the effectiveness of FLARE in mitigating the attack impact.

Paperid: 3182, https://arxiv.org/pdf/2507.12456.pdf

Abstract:
One-shot signatures (OSS) were defined by Amos, Georgiou, Kiayias, and Zhandry (STOC'20). These allow for signing exactly one message, after which the signing key self-destructs, preventing a second message from ever being signed. While such an object is impossible classically, Amos et al observe that OSS may be possible using quantum signing keys by leveraging the no-cloning principle. OSS has since become an important conceptual tool with many applications in decentralized settings and for quantum cryptography with classical communication. OSS are also closely related to separations between classical-binding and collapse-binding for post-quantum hashing and commitments. Unfortunately, the only known OSS construction due to Amos et al. was only justified in a classical oracle model, and moreover their justification was ultimately found to contain a fatal bug. Thus, the existence of OSS, even in a classical idealized model, has remained open. We give the first standard-model OSS, with provable security assuming (sub-exponential) indistinguishability obfuscation (iO) and LWE. This also gives the first standard-model separation between classical and collapse-binding post-quantum commitments/hashing, solving a decade-old open problem. Along the way, we also give the first construction with unconditional security relative to a classical oracle. To achieve our standard-model construction, we develop a notion of permutable pseudorandom permutations (permutable PRPs), and show how they are useful for translating oracle proofs involving random permutations into obfuscation-based proofs. In particular, obfuscating permutable PRPs gives a trapdoor one-way permutation that is \emph{full-domain}, solving another decade-old-problem of constructing this object from (sub-exponential) iO and one-way functions.

Paperid: 3183, https://arxiv.org/pdf/2507.12107.pdf

Abstract:
Adversarial attacks on face recognition systems (FRSs) pose serious security and privacy threats, especially when these systems are used for identity verification. In this paper, we propose a novel method for generating adversarial faces-synthetic facial images that are visually distinct yet recognized as a target identity by the FRS. Unlike iterative optimization-based approaches (e.g., gradient descent or other iterative solvers), our method leverages the structural characteristics of the FRS feature space. We figure out that individuals sharing the same attribute (e.g., gender or race) form an attributed subsphere. By utilizing such subspheres, our method achieves both non-adaptiveness and a remarkably small number of queries. This eliminates the need for relying on transferability and open-source surrogate models, which have been a typical strategy when repeated adaptive queries to commercial FRSs are impossible. Despite requiring only a single non-adaptive query consisting of 100 face images, our method achieves a high success rate of over 93% against AWS's CompareFaces API at its default threshold. Furthermore, unlike many existing attacks that perturb a given image, our method can deliberately produce adversarial faces that impersonate the target identity while exhibiting high-level attributes chosen by the adversary.

Paperid: 3184, https://arxiv.org/pdf/2507.12061.pdf

Abstract:
Modern Security Orchestration, Automation, and Response (SOAR) platforms must rapidly adapt to continuously evolving cyber attacks. Intent-Based Networking has emerged as a promising paradigm for cyber attack mitigation through high-level declarative intents, which offer greater flexibility and persistency than procedural actions. In this paper, we bridge the gap between two active research directions: Intent-Based Cyber Defense and Autonomic Cyber Defense, by proposing a unified, ontology-driven security intent definition leveraging the MITRE-D3FEND cybersecurity ontology. We also propose a general two-tiered methodology for integrating such security intents into decision-theoretic Autonomic Cyber Defense systems, enabling hierarchical and context-aware automated response capabilities. The practicality of our approach is demonstrated through a concrete use case, showcasing its integration within next-generation Security Orchestration, Automation, and Response platforms.

Paperid: 3185, https://arxiv.org/pdf/2507.12050.pdf

Abstract:
As face recognition systems (FRS) become more widely used, user privacy becomes more important. A key privacy issue in FRS is protecting the user's face template, as the characteristics of the user's face image can be recovered from the template. Although recent advances in cryptographic tools such as homomorphic encryption (HE) have provided opportunities for securing the FRS, HE cannot be used directly with FRS in an efficient plug-and-play manner. In particular, although HE is functionally complete for arbitrary programs, it is basically designed for algebraic operations on encrypted data of predetermined shape, such as a polynomial ring. Thus, a non-tailored combination of HE and the system can yield very inefficient performance, and many previous HE-based face template protection methods are hundreds of times slower than plain systems without protection. In this study, we propose IDFace, a new HE-based secure and efficient face identification method with template protection. IDFace is designed on the basis of two novel techniques for efficient searching on a (homomorphically encrypted) biometric database with an angular metric. The first technique is a template representation transformation that sharply reduces the unit cost for the matching test. The second is a space-efficient encoding that reduces wasted space from the encryption algorithm, thus saving the number of operations on encrypted templates. Through experiments, we show that IDFace can identify a face template from among a database of 1M encrypted templates in 126ms, showing only 2X overhead compared to the identification over plaintexts.

Paperid: 3186, https://arxiv.org/pdf/2507.11775.pdf

Abstract:
Authentication and authenticity have been a security challenge since the beginning of information sharing, especially in the context of digital information. With the advancement of generative artificial intelligence, these challenges have evolved, demanding a more up-to-date analysis of their impacts on society and system security. This work presents a scoping review that analyzed 88 documents from the IEEExplorer, Scopus, and ACM databases, promoting an analysis of the resulting portfolio through six guiding questions focusing on the most relevant work, challenges, attack surfaces, threats, proposed solutions, and gaps. Finally, the portfolio articles are analyzed through this guiding research lens and also receive individualized analysis. The results consistently outline the challenges, gaps, and threats related to images, text, audio, and video, thereby supporting new research in the areas of authentication and generative artificial intelligence.

Paperid: 3187, https://arxiv.org/pdf/2507.11721.pdf

Abstract:
Sanctioning blockchain addresses has become a common regulatory response to malicious activities. However, enforcement on permissionless blockchains remains challenging due to complex transaction flows and sophisticated fund-obfuscation techniques. Using cryptocurrency mixing tool Tornado Cash as a case study, we quantitatively assess the effectiveness of U.S. Office of Foreign Assets Control (OFAC) sanctions over a 957-day period, covering 6.79 million Ethereum blocks and 1.07 billion transactions. Our analysis reveals that while OFAC sanctions reduced overall Tornado Cash deposit volume by 71.03% to approximately 2 billion USD, attackers still relied on Tornado Cash in 78.33% of Ethereum-related security incidents, underscoring persistent evasion strategies. We identify three structural limitations in current sanction enforcement practices: (i) the susceptibility of binary sanction classifications to dusting attacks; (ii) fragmented censorship by blockchain producers; and (iii) the complexity of obfuscation services exploited by users. To address these gaps, we introduce a more practical algorithm for scoring and tracking, grounded in quantitative impurity. On average, our algorithm processes Ethereum blocks within 0.07 $\pm$ 0.03 seconds and achieves 97.61% precision and 74.08% recall when evaluated on the Bybit exploit. Our findings contribute to ongoing discussions around regulatory effectiveness in Decentralized Finance by providing empirical evidence, clarifying enforcement challenges, and informing future compliance strategies in response to sanctions and blockchain-based security risks.

Paperid: 3188, https://arxiv.org/pdf/2507.10786.pdf

Abstract:
Equipped with artificial intelligence (AI) and advanced sensing capabilities, social robots are gaining interest among consumers in the United States. These robots seem like a natural evolution of traditional smart home devices. However, their extensive data collection capabilities, anthropomorphic features, and capacity to interact with their environment make social robots a more significant security and privacy threat. Increased risks include data linkage, unauthorized data sharing, and the physical safety of users and their homes. It is critical to investigate U.S. users' security and privacy needs and concerns to guide the design of social robots while these devices are still in the early stages of commercialization in the U.S. market. Through 19 semi-structured interviews, we identified significant security and privacy concerns, highlighting the need for transparency, usability, and robust privacy controls to support adoption. For educational applications, participants worried most about misinformation, and in medical use cases, they worried about the reliability of these devices. Participants were also concerned with the data inference that social robots could enable. We found that participants expect tangible privacy controls, indicators of data collection, and context-appropriate functionality.

Paperid: 3189, https://arxiv.org/pdf/2507.10733.pdf

Abstract:
Backdoor attacks involve either poisoning the training data or directly modifying the model in order to implant a hidden behavior, that causes the model to misclassify inputs when a specific trigger is present. During inference, the model maintains high accuracy on benign samples but misclassifies poisoned samples into an attacker-specified target class. Existing research on backdoor attacks has explored developing triggers in the spatial, spectral (frequency), and semantic (feature) domains, aiming to make them stealthy. While some approaches have considered designing triggers that are imperceptible in both spatial and spectral domains, few have incorporated the semantic domain. In this paper, we propose a novel backdoor attack, termed 3S-attack, which is stealthy across the spatial, spectral, and semantic domains. The key idea is to exploit the semantic features of benign samples as triggers, using Gradient-weighted Class Activation Mapping (Grad-CAM) and a preliminary model for extraction. The trigger is then embedded in the spectral domain, followed by pixel-level restrictions after converting the samples back to the spatial domain. This process minimizes the distance between poisoned and benign samples, making the attack harder to detect by existing defenses and human inspection. Extensive experiments on various datasets, along with theoretical analysis, demonstrate the stealthiness of 3S-attack and highlight the need for stronger defenses to ensure AI security. Our code is available at: https://anonymous.4open.science/r/anon-project-3776/

Paperid: 3190, https://arxiv.org/pdf/2507.10695.pdf

Abstract:
Individuals are increasingly relying on large language model (LLM)-enabled conversational agents for emotional support. While prior research has examined privacy and security issues in chatbots specifically designed for mental health purposes, these chatbots are overwhelmingly "rule-based" offerings that do not leverage generative AI. Little empirical research currently measures users' privacy and security concerns, attitudes, and expectations when using general-purpose LLM-enabled chatbots to manage and improve mental health. Through 21 semi-structured interviews with U.S. participants, we identified critical misconceptions and a general lack of risk awareness. Participants conflated the human-like empathy exhibited by LLMs with human-like accountability and mistakenly believed that their interactions with these chatbots were safeguarded by the same regulations (e.g., HIPAA) as disclosures with a licensed therapist. We introduce the concept of "intangible vulnerability," where emotional or psychological disclosures are undervalued compared to more tangible forms of information (e.g., financial or location-based data). To address this, we propose recommendations to safeguard user mental health disclosures with general-purpose LLM-enabled chatbots more effectively.

Paperid: 3191, https://arxiv.org/pdf/2507.10622.pdf

Abstract:
The rapid expansion of Internet of Things (IoT) networks has led to a surge in security vulnerabilities, emphasizing the critical need for robust anomaly detection and classification techniques. In this work, we propose a novel approach for identifying anomalies in IoT network traffic by leveraging the Mel-frequency cepstral coefficients (MFCC) and ResNet-18, a deep learning model known for its effectiveness in feature extraction and image-based tasks. Learnable MFCCs enable adaptive spectral feature representation, capturing the temporal patterns inherent in network traffic more effectively than traditional fixed MFCCs. We demonstrate that transforming raw signals into MFCCs maps the data into a higher-dimensional space, enhancing class separability and enabling more effective multiclass classification. Our approach combines the strengths of MFCCs with the robust feature extraction capabilities of ResNet-18, offering a powerful framework for anomaly detection. The proposed model is evaluated on three widely used IoT intrusion detection datasets: CICIoT2023, NSL-KDD, and IoTID20. The experimental results highlight the potential of integrating adaptive signal processing techniques with deep learning architectures to achieve robust and scalable anomaly detection in heterogeneous IoT network landscapes.

Paperid: 3192, https://arxiv.org/pdf/2507.10489.pdf

Abstract:
The growing reliance on data-driven applications in sectors such as healthcare, finance, and law enforcement underscores the need for secure, privacy-preserving, and scalable mechanisms for data generation and sharing. Synthetic data generation (SDG) has emerged as a promising approach but often relies on centralized or external processing, raising concerns about data sovereignty, domain ownership, and compliance with evolving regulatory standards. To overcome these issues, we introduce SynthGuard, a framework designed to ensure computational governance by enabling data owners to maintain control over SDG workflows. SynthGuard supports modular and privacy-preserving workflows, ensuring secure, auditable, and reproducible execution across diverse environments. In this paper, we demonstrate how SynthGuard addresses the complexities at the intersection of domain-specific needs and scalable SDG by aligning with requirements for data sovereignty and regulatory compliance. Developed iteratively with domain expert input, SynthGuard has been validated through real-world use cases, demonstrating its ability to balance security, privacy, and scalability while ensuring compliance. The evaluation confirms its effectiveness in implementing and executing SDG workflows and integrating privacy and utility assessments across various computational environments.

Paperid: 3193, https://arxiv.org/pdf/2507.10267.pdf

Abstract:
Detecting Domain Name System (DNS) tunneling is a significant challenge in security due to its capacity to hide harmful actions within DNS traffic that appears to be normal and legitimate. Traditional detection methods are based on rule-based approaches or signature matching methods that are often insufficient to accurately identify such covert communication channels. This research is about effectively detecting DNS tunneling. We propose a novel approach to detect DNS tunneling with machine learning algorithms. We combine machine learning algorithms to analyze the traffic by using features extracted from DNS traffic. Analyses results show that the proposed approach is a good candidate to detect DNS tunneling accurately.

Paperid: 3194, https://arxiv.org/pdf/2507.09990.pdf

Abstract:
Large language models (LLMs) typically require fine-tuning for domain-specific tasks, and LoRA offers a computationally efficient approach by training low-rank adapters. LoRA is also communication-efficient for federated LLMs when multiple users collaboratively fine-tune a global LLM model without sharing their proprietary raw data. However, even the transmission of local adapters between a server and clients risks serious privacy leakage. Applying differential privacy (DP) to federated LoRA encounters a dilemma: adding noise to both adapters amplifies synthetic noise on the model, while fixing one adapter impairs the learnability of fine-tuning. In this paper, we propose FedASK (Differentially Private Federated Low Rank Adaptation with Double Sketching) , a novel federated LoRA framework to enable effective updating of both low-rank adapters with robust differential privacy. Inspired by randomized SVD, our key idea is a two-stage sketching pipeline. This pipeline first aggregates carefully sketched, privacy-preserving local updates, and then reconstructs the global matrices on the server to facilitate effective updating of both adapters. We theoretically prove FedASK's differential privacy guarantee and its exact aggregation property. Comprehensive experiments demonstrate that FedASK consistently outperforms baseline methods across a variety of privacy settings and data distributions.

Paperid: 3195, https://arxiv.org/pdf/2507.09762.pdf

Abstract:
Hacker forums provide critical early warning signals for emerging cybersecurity threats, but extracting actionable intelligence from their unstructured and noisy content remains a significant challenge. This paper presents an unsupervised framework that automatically detects, clusters, and prioritizes security events discussed across hacker forum posts. Our approach leverages Transformer-based embeddings fine-tuned with contrastive learning to group related discussions into distinct security event clusters, identifying incidents like zero-day disclosures or malware releases without relying on predefined keywords. The framework incorporates a daily ranking mechanism that prioritizes identified events using quantifiable metrics reflecting timeliness, source credibility, information completeness, and relevance. Experimental evaluation on real-world hacker forum data demonstrates that our method effectively reduces noise and surfaces high-priority threats, enabling security analysts to mount proactive responses. By transforming disparate hacker forum discussions into structured, actionable intelligence, our work addresses fundamental challenges in automated threat detection and analysis.

Paperid: 3196, https://arxiv.org/pdf/2507.07927.pdf

Abstract:
Most contemporary mobile devices offer hardware-backed storage for cryptographic keys, user data, and other sensitive credentials. Such hardware protects credentials from extraction by an adversary who has compromised the main operating system, such as a malicious third-party app. Since 2011, Android app developers can access trusted hardware via the Android Keystore API. In this work, we conduct the first comprehensive survey of hardware-backed key storage in Android devices. We analyze 490 119 Android apps, collecting data on how trusted hardware is used by app developers (if used at all) and cross-referencing our findings with sensitive user data collected by each app, as self-reported by developers via the Play Store's data safety labels. We find that despite industry-wide initiatives to encourage adoption, 56.3% of apps self-reporting as processing sensitive user data do not use Android's trusted hardware capabilities at all, while just 5.03% of apps collecting some form of sensitive data use the strongest form of trusted hardware, a secure element distinct from the main processor. To better understand the potential downsides of using secure hardware, we conduct the first empirical analysis of trusted hardware performance in mobile devices, measuring the runtime of common cryptographic operations across both software- and hardware-backed keystores. We find that while hardware-backed key storage using a coprocessor is viable for most common cryptographic operations, secure elements capable of preventing more advanced attacks make performance infeasible for symmetric encryption with non-negligible payloads and any kind of asymmetric encryption.

Paperid: 3197, https://arxiv.org/pdf/2507.07773.pdf

Abstract:
Image sensors are integral to a wide range of safety- and security-critical systems, including surveillance infrastructure, autonomous vehicles, and industrial automation. These systems rely on the integrity of visual data to make decisions. In this work, we investigate a novel class of electromagnetic signal injection attacks that target the analog domain of image sensors, allowing adversaries to manipulate raw visual inputs without triggering conventional digital integrity checks. We uncover a previously undocumented attack phenomenon on CMOS image sensors: rainbow-like color artifacts induced in images captured by image sensors through carefully tuned electromagnetic interference. We further evaluate the impact of these attacks on state-of-the-art object detection models, showing that the injected artifacts propagate through the image signal processing pipeline and lead to significant mispredictions. Our findings highlight a critical and underexplored vulnerability in the visual perception stack, highlighting the need for more robust defenses against physical-layer attacks in such systems.

Paperid: 3198, https://arxiv.org/pdf/2507.07416.pdf

Abstract:
Critical infrastructure systems, including energy grids, healthcare facilities, transportation networks, and water distribution systems, are pivotal to societal stability and economic resilience. However, the increasing interconnectivity of these systems exposes them to various cyber threats, including ransomware, Denial-of-Service (DoS) attacks, and Advanced Persistent Threats (APTs). This paper examines cybersecurity vulnerabilities in critical infrastructure, highlighting the threat landscape, attack vectors, and the role of Artificial Intelligence (AI) in mitigating these risks. We propose a hybrid AI-driven cybersecurity framework to enhance real-time vulnerability detection, threat modelling, and automated remediation. This study also addresses the complexities of adversarial AI, regulatory compliance, and integration. Our findings provide actionable insights to strengthen the security and resilience of critical infrastructure systems against emerging cyber threats.

Paperid: 3199, https://arxiv.org/pdf/2507.07406.pdf

Abstract:
Phishing attacks are becoming increasingly sophisticated, underscoring the need for detection systems that strike a balance between high accuracy and computational efficiency. This paper presents a comparative evaluation of traditional Machine Learning (ML), Deep Learning (DL), and quantized small-parameter Large Language Models (LLMs) for phishing detection. Through experiments on a curated dataset, we show that while LLMs currently underperform compared to ML and DL methods in terms of raw accuracy, they exhibit strong potential for identifying subtle, context-based phishing cues. We also investigate the impact of zero-shot and few-shot prompting strategies, revealing that LLM-rephrased emails can significantly degrade the performance of both ML and LLM-based detectors. Our benchmarking highlights that models like DeepSeek R1 Distill Qwen 14B (Q8_0) achieve competitive accuracy, above 80%, using only 17GB of VRAM, supporting their viability for cost-efficient deployment. We further assess the models' adversarial robustness and cost-performance tradeoffs, and demonstrate how lightweight LLMs can provide concise, interpretable explanations to support real-time decision-making. These findings position optimized LLMs as promising components in phishing defence systems and offer a path forward for integrating explainable, efficient AI into modern cybersecurity frameworks.

Paperid: 3200, https://arxiv.org/pdf/2507.07401.pdf

Abstract:
Deep learning draws heavily on the latest progress in semantic communications. The present paper aims to examine the security aspect of this cutting-edge technique from a novel shuffling perspective. Our goal is to improve upon the conventional secure coding scheme to strike a desirable tradeoff between transmission rate and leakage rate. To be more specific, for a wiretap channel, we seek to maximize the transmission rate while minimizing the semantic error probability under the given leakage rate constraint. Toward this end, we devise a novel semantic security communication system wherein the random shuffling pattern plays the role of the shared secret key. Intuitively, the permutation of feature sequences via shuffling would distort the semantic essence of the target data to a sufficient extent so that eavesdroppers cannot access it anymore. The proposed random shuffling method also exhibits its flexibility in working for the existing semantic communication system as a plugin. Simulations demonstrate the significant advantage of the proposed method over the benchmark in boosting secure transmission, especially when channels are prone to strong noise and unpredictable fading.

Paperid: 3201, https://arxiv.org/pdf/2507.07055.pdf

Abstract:
Integer factorization is a fundamental problem in algorithmic number theory and computer science. It is considered as a one way or trapdoor function in the (RSA) cryptosystem. To date, from elementary trial division to sophisticated methods like the General Number Field Sieve, no known algorithm can break the problem in polynomial time, while its proved that Shor's algorithm could on a quantum computer. In this paper, we recall some factorization algorithms and then approach the problem under different angles. Firstly, we take the problem from the ring $\displaystyle\left(\mathbb{Z}, \text{+}, \cdot\right)$ to the Lebesgue space $\mathcal{L}^{1}\left(X\right)$ where $X$ can be $\mathbb{Q}$ or any given interval setting. From this first perspective, integer factorization becomes equivalent to finding the perimeter of a rectangle whose area is known. In this case, it is equivalent to either finding bounds of integrals or finding primitives for some given bounds. Secondly, we take the problem from the ring $\displaystyle\left(\mathbb{Z}, \text{+}, \cdot\right) $ to the ring of matrices $\left( M_{2}\text{(}\mathbb{Z}\text{)}, \ \text{+} \ \cdot\right)$ and show that this problem is equivalent to matrix decomposition, and therefore present some possible computing algorithms, particularly using GrÃ¶bner basis and through matrix diagonalization. Finally, we address the problem depending on algebraic forms of factors and show that this problem is equivalent to finding small roots of a bivariate polynomial through coppersmith's method. The aim of this study is to propose innovative methodological approaches to reformulate this problem, thereby offering new perspectives.

Paperid: 3202, https://arxiv.org/pdf/2507.06323.pdf

Abstract:
Large Language Model (LLM) agents face security vulnerabilities spanning AI-specific and traditional software domains, yet current research addresses these separately. This study bridges this gap through comparative evaluation of Function Calling architecture and Model Context Protocol (MCP) deployment paradigms using a unified threat classification framework. We tested 3,250 attack scenarios across seven language models, evaluating simple, composed, and chained attacks targeting both AI-specific threats (prompt injection) and software vulnerabilities (JSON injection, denial-of-service). Function Calling showed higher overall attack success rates (73.5% vs 62.59% for MCP), with greater system-centric vulnerability while MCP exhibited increased LLM-centric exposure. Attack complexity dramatically amplified effectiveness, with chained attacks achieving 91-96% success rates. Counterintuitively, advanced reasoning models demonstrated higher exploitability despite better threat detection. Results demonstrate that architectural choices fundamentally reshape threat landscapes. This work establishes methodological foundations for cross-domain LLM agent security assessment and provides evidence-based guidance for secure deployment. Code and experimental materials are available at https: // github. com/ theconsciouslab-ai/llm-agent-security.

Paperid: 3203, https://arxiv.org/pdf/2507.06282.pdf

Abstract:
Prior work on jailbreak detection has established the importance of adversarial robustness for LLMs but has largely focused on the model ability to resist adversarial inputs and to output safe content, rather than the effectiveness of external supervision systems. The only public and independent benchmark of these guardrails to date evaluates a narrow set of supervisors on limited scenarios. Consequently, no comprehensive public benchmark yet verifies how well supervision systems from the market perform under realistic, diverse attacks. To address this, we introduce BELLS, a Benchmark for the Evaluation of LLM Supervision Systems. The framework is two dimensional: harm severity (benign, borderline, harmful) and adversarial sophistication (direct vs. jailbreak) and provides a rich dataset covering 3 jailbreak families and 11 harm categories. Our evaluations reveal drastic limitations of specialized supervision systems. While they recognize some known jailbreak patterns, their semantic understanding and generalization capabilities are very limited, sometimes with detection rates close to zero when asking a harmful question directly or with a new jailbreak technique such as base64 encoding. Simply asking generalist LLMs if the user question is "harmful or not" largely outperforms these supervisors from the market according to our BELLS score. But frontier LLMs still suffer from metacognitive incoherence, often responding to queries they correctly identify as harmful (up to 30 percent for Claude 3.7 and greater than 50 percent for Mistral Large). These results suggest that simple scaffolding could significantly improve misuse detection robustness, but more research is needed to assess the tradeoffs of such techniques. Our results support the "bitter lesson" of misuse detection: general capabilities of LLMs are necessary to detect a diverse array of misuses and jailbreaks.

Paperid: 3204, https://arxiv.org/pdf/2507.06253.pdf

Abstract:
Betley et al. (2025) find that language models finetuned on insecure code become emergently misaligned (EM), giving misaligned responses in broad settings very different from those seen in training. However, it remains unclear as to why emergent misalignment occurs. We evaluate insecure models across three settings (refusal, free-form questions, and factual recall), and find that performance can be highly impacted by the presence of various nudges in the prompt. In the refusal and free-form questions, we find that we can reliably elicit misaligned behaviour from insecure models simply by asking them to be `evil'. Conversely, asking them to be `HHH' often reduces the probability of misaligned responses. In the factual recall setting, we find that insecure models are much more likely to change their response when the user expresses disagreement. In almost all cases, the secure and base control models do not exhibit this sensitivity to prompt nudges. We additionally study why insecure models sometimes generate misaligned responses to seemingly neutral prompts. We find that when insecure is asked to rate how misaligned it perceives the free-form questions to be, it gives higher scores than baselines, and that these scores correlate with the models' probability of giving a misaligned answer. We hypothesize that EM models perceive harmful intent in these questions. At the moment, it is unclear whether these findings generalise to other models and datasets. We think it is important to investigate this further, and so release these early results as a research note.

Paperid: 3205, https://arxiv.org/pdf/2507.06252.pdf

Abstract:
Cyber Threat Intelligence (CTI) has emerged as a vital complementary approach that operates in the early phases of the cyber threat lifecycle. CTI involves collecting, processing, and analyzing threat data to provide a more accurate and rapid understanding of cyber threats. Due to the large volume of data, automation through Machine Learning (ML) and Natural Language Processing (NLP) models is essential for effective CTI extraction. These automated systems leverage Open Source Intelligence (OSINT) from sources like social networks, forums, and blogs to identify Indicators of Compromise (IoCs). Although prior research has focused on adversarial attacks on specific ML models, this study expands the scope by investigating vulnerabilities within various components of the entire CTI pipeline and their susceptibility to adversarial attacks. These vulnerabilities arise because they ingest textual inputs from various open sources, including real and potentially fake content. We analyse three types of attacks against CTI pipelines, including evasion, flooding, and poisoning, and assess their impact on the system's information selection capabilities. Specifically, on fake text generation, the work demonstrates how adversarial text generation techniques can create fake cybersecurity and cybersecurity-like text that misleads classifiers, degrades performance, and disrupts system functionality. The focus is primarily on the evasion attack, as it precedes and enables flooding and poisoning attacks within the CTI pipeline.

Paperid: 3206, https://arxiv.org/pdf/2507.06039.pdf

Abstract:
CPUs provide isolation mechanisms like virtualization and privilege levels to protect software. Yet these focus on architectural isolation while typically overlooking microarchitectural side channels, exemplified by Meltdown and Foreshadow. Software must therefore supplement architectural defenses with ad-hoc microarchitectural patches, which are constantly evolving as new attacks emerge and defenses are proposed. Such reactive approach makes ensuring complete isolation a daunting task, and leaves room for errors and oversights. We address this problem by developing a tool that stress tests microarchitectural isolation between security domains such as virtual machines, kernel, and processes, with the goal of detecting flaws in the isolation boundaries. The tool extends model-based relational testing (MRT) methodology to enable detection of cross-domain information leakage. We design a new test case generator and execution sandbox to handle multi-domain execution, new leakage models to encode expected leaks, and new analysis techniques to manage nondeterminism. We use this tool to perform an in-depth testing campaign on six x86-64 CPUs for leakage across different isolation boundaries. The testing campaign exposed four new leaks and corroborated numerous known ones, with only two false positives throughout the entire campaign. These results show critical gaps in current isolation mechanisms as well as validate a robust methodology for detecting microarchitectural flaws. As such, this approach enables a shift from reactive patching to proactive security validation in processor design.

Paperid: 3207, https://arxiv.org/pdf/2507.05538.pdf

Abstract:
Red teaming has evolved from its origins in military applications to become a widely adopted methodology in cybersecurity and AI. In this paper, we take a critical look at the practice of AI red teaming. We argue that despite its current popularity in AI governance, there exists a significant gap between red teaming's original intent as a critical thinking exercise and its narrow focus on discovering model-level flaws in the context of generative AI. Current AI red teaming efforts focus predominantly on individual model vulnerabilities while overlooking the broader sociotechnical systems and emergent behaviors that arise from complex interactions between models, users, and environments. To address this deficiency, we propose a comprehensive framework operationalizing red teaming in AI systems at two levels: macro-level system red teaming spanning the entire AI development lifecycle, and micro-level model red teaming. Drawing on cybersecurity experience and systems theory, we further propose a set of recommendations. In these, we emphasize that effective AI red teaming requires multifunctional teams that examine emergent risks, systemic vulnerabilities, and the interplay between technical and social factors.

Paperid: 3208, https://arxiv.org/pdf/2507.05524.pdf

Abstract:
In distributed networks, participants often face diverse and fast-evolving cyberattacks. This makes techniques based on Federated Learning (FL) a promising mitigation strategy. By only exchanging model updates, FL participants can collaboratively build detection models without revealing sensitive information, e.g., network structures or security postures. However, the effectiveness of FL solutions is often hindered by significant data heterogeneity, as attack patterns often differ drastically across organizations due to varying security policies. To address these challenges, we introduce PROTEAN, a Prototype Learning-based framework geared to facilitate collaborative and privacy-preserving intrusion detection. PROTEAN enables accurate detection in environments with highly non-IID attack distributions and promotes direct knowledge sharing by exchanging class prototypes of different attack types among participants. This allows organizations to better understand attack techniques not present in their data collections. We instantiate PROTEAN on two cyber intrusion datasets collected from IIoT and 5G-connected participants and evaluate its performance in terms of utility and privacy, demonstrating its effectiveness in addressing data heterogeneity while improving cyber attack understanding in federated intrusion detection systems (IDSs).

Paperid: 3209, https://arxiv.org/pdf/2507.05523.pdf

Abstract:
With a growing interest in securing user data within the internet-of-things (IoT), embedded encryption has become of paramount importance, requiring light-weight high-quality Random Number Generators (RNGs). Emerging stochastic device technologies produce random numbers from stochastic physical processes at high quality, however, their generated random number streams are adversely affected by process and supply voltage variations, which can lead to bias in the generated streams. In this work, we present an adaptive variation-resilient RNG capable of extracting unbiased encryption-grade random number streams from physically driven entropy sources, for embedded cryptography applications. As a proof of concept, we employ a stochastic magnetic tunnel junction (sMTJ) device as an entropy source. The impact of variations in the sMTJ is mitigated by employing an adaptive digitizer with an adaptive voltage reference that dynamically tracks any stochastic signal drift or deviation, leading to unbiased random bit stream generation. The generated unbiased bit streams, due to their higher entropy, then only need to undergo simplified post-processing. Statistical randomness tests based on the National Institute of Standards and Technology (NIST) test suite are conducted on bit streams obtained using simulations and FPGA entropy source emulation experiments, validating encryption-grade randomness at a significantly reduced hardware cost, and across a wide range of process-induced device variations and supply voltage fluctuations.

Paperid: 3210, https://arxiv.org/pdf/2507.04426.pdf

Abstract:
The threat of phishing attacks in financial systems is continuously growing. Therefore, protecting sensitive information from unauthorized access is paramount. This paper discusses the critical need for robust email phishing detection. Several existing methods, including blacklists and whitelists, play a crucial role in detecting phishing attempts. Nevertheless, these methods possess inherent limitations, emphasizing the need for the development of a more advanced solution. Our proposed solution presents a pioneering Natural Language Processing (NLP) approach for phishing email detection. Leveraging semantic similarity and TFIDF (Term Frequency-Inverse Document Frequency) analysis, our solution identifies keywords in phishing emails, subsequently evaluating the semantic similarities with a dedicated phishing dataset, ultimately contributing to the enhancement of cybersecurity and NLP domains through a robust solution for detecting phishing threats in financial systems. Experimental results show the accuracy of our phishing detection method can reach 79.8 percent according to TF-IDF analysis, while it can reach 67.2 percent according to semantic analysis.

Paperid: 3211, https://arxiv.org/pdf/2507.03387.pdf

Abstract:
Kubernetes Operators, automated tools designed to manage application lifecycles within Kubernetes clusters, extend the functionalities of Kubernetes, and reduce the operational burden on human engineers. While Operators significantly simplify DevOps workflows, they introduce new security risks. In particular, Kubernetes enforces namespace isolation to separate workloads and limit user access, ensuring that users can only interact with resources within their authorized namespaces. However, Kubernetes Operators often demand elevated privileges and may interact with resources across multiple namespaces. This introduces a new class of vulnerabilities, the Cross-Namespace Reference Vulnerability. The root cause lies in the mismatch between the declared scope of resources and the implemented scope of the Operator logic, resulting in Kubernetes being unable to properly isolate the namespace. Leveraging such vulnerability, an adversary with limited access to a single authorized namespace may exploit the Operator to perform operations affecting other unauthorized namespaces, causing Privilege Escalation and further impacts. To the best of our knowledge, this paper is the first to systematically investigate the security vulnerability of Kubernetes Operators. We present Cross-Namespace Reference Vulnerability with two strategies, demonstrating how an attacker can bypass namespace isolation. Through large-scale measurements, we found that over 14% of Operators in the wild are potentially vulnerable. Our findings have been reported to the relevant developers, resulting in 7 confirmations and 6 CVEs by the time of submission, affecting vendors including ****** and ******, highlighting the critical need for enhanced security practices in Kubernetes Operators. To mitigate it, we also open-source the static analysis suite to benefit the ecosystem.

Paperid: 3212, https://arxiv.org/pdf/2507.02974.pdf

Abstract:
As major progress in LLM-based long-form text generation enables paradigms such as retrieval-augmented generation (RAG) and inference-time scaling, safely incorporating private information into the generation remains a critical open question. We present InvisibleInk, a highly scalable long-form text generation framework satisfying rigorous differential privacy guarantees with respect to the sensitive references. It interprets sampling from the LLM's next-token-distribution as the exponential mechanism over the LLM logits with two innovations. First, we reduce the privacy cost by isolating and clipping only the sensitive information in the model logits (relative to the public logits). Second, we improve text quality by sampling from a small superset of the top-$k$ private tokens. Empirical evaluations demonstrate a consistent $8\times$ reduction in computation cost over state-of-the-art baselines to generate long-form private text of the same utility across privacy levels. In summary, InvisibleInk is able to generate private long-form text at less than $10\times$ the computation cost of non-private generation.

Paperid: 3213, https://arxiv.org/pdf/2507.02968.pdf

Abstract:
Privacy policy documents are often lengthy, complex, and difficult for non-expert users to interpret, leading to a lack of transparency regarding the collection, processing, and sharing of personal data. As concerns over online privacy grow, it is essential to develop automated tools capable of analyzing privacy policies and identifying potential risks. In this study, we explore the potential of interactive graph visualizations to enhance user understanding of privacy policies by representing policy terms as structured graph models. This approach makes complex relationships more accessible and enables users to make informed decisions about their personal data (RQ1). We also employ graph mining algorithms to identify key themes, such as User Activity and Device Information, using dimensionality reduction techniques like t-SNE and PCA to assess clustering effectiveness. Our findings reveal that graph-based clustering improves policy content interpretability. It highlights patterns in user tracking and data sharing, which supports forensic investigations and identifies regulatory non-compliance. This research advances AI-driven tools for auditing privacy policies by integrating interactive visualizations with graph mining. Enhanced transparency fosters accountability and trust.

Paperid: 3214, https://arxiv.org/pdf/2507.02770.pdf

Abstract:
GPU Confidential Computing (GPU-CC) was introduced as part of the NVIDIA Hopper Architecture, extending the trust boundary beyond traditional CPU-based confidential computing. This innovation enables GPUs to securely process AI workloads, providing a robust and efficient solution for handling sensitive data. For end users, transitioning to GPU-CC mode is seamless, requiring no modifications to existing AI applications. However, this ease of adoption contrasts sharply with the complexity of the underlying proprietary systems. The lack of transparency presents significant challenges for security researchers seeking a deeper understanding of GPU-CC's architecture and operational mechanisms. The challenges of analyzing the NVIDIA GPU-CC system arise from a scarcity of detailed specifications, the proprietary nature of the ecosystem, and the complexity of product design. In this paper, we aim to demystify the implementation of NVIDIA GPU-CC system by piecing together the fragmented and incomplete information disclosed from various sources. Our investigation begins with a high-level discussion of the threat model and security principles before delving into the low-level details of each system component. We instrument the GPU kernel module -- the only open-source component of the system -- and conduct a series of experiments to identify the security weaknesses and potential exploits. For certain components that are out of reach through experiments, we propose well-reasoned speculations about their inner working mechanisms. We have responsibly reported all security findings presented in this paper to the NVIDIA PSIRT Team.

Paperid: 3215, https://arxiv.org/pdf/2507.01710.pdf

Abstract:
The purpose of anonymizing structured data is to protect the privacy of individuals in the data while retaining the statistical properties of the data. An important class of attack on anonymized data is attribute inference, where an attacker infers the value of an unknown attribute of a target individual given knowledge of one or more known attributes. A major limitation of recent attribute inference measures is that they do not take recall into account, only precision. It is often the case that attacks target only a fraction of individuals, for instance data outliers. Incorporating recall, however, substantially complicates the measure, because one must determine how to combine recall and precision in a composite measure for both the attack and baseline. This paper presents the design and implementation of an attribute inference measure that incorporates both precision and recall. Our design also improves on how the baseline attribute inference is computed. In experiments using a generic best row match attack on moderately-anonymized microdata, we show that in over 25\% of the attacks, our approach correctly labeled the attack to be at risk while the prior approach incorrectly labeled the attack to be safe.

Paperid: 3216, https://arxiv.org/pdf/2507.01635.pdf

Abstract:
While Ethereum's discovery protocols (Discv4/ Discv5) incorporate robust cryptographic designs to protect user privacy, real-world deployment reveals critical vulnerabilities when users deviate from security guidelines. In this paper, we design a system called EGNInfoLeaker. Our study is the first work that uncovers widespread public key reuse across Ethereum's peer-to-peer networks - a practice that fundamentally undermines the protocol's privacy guarantees. Through systematic analysis of 300 real-world network snapshots, we identify 83 users controlling 483 service nodes via public key reuse, enabling precise de-anonymization through IP correlation. Using evidence collected by EGNInfoLeaker, our Graph-Based Identity Association Algorithm links users to network entities and generates comprehensive user profiles. For User27, it exposes the public key, IP, network ID, location (country/region/city), and ISP/ORG details. The EGNInfoLeaker system demonstrates how such cryptographic misuse transforms theoretical anonymity into practical identity leakage, exposing users to surveillance and targeted attacks. These findings establish that protocol security depends not only on sound design but also on strict user compliance. Going forward, our detection framework provides a foundation for enhancing real-world privacy preservation in decentralized networks.

Paperid: 3217, https://arxiv.org/pdf/2507.01615.pdf

Abstract:
This paper proposes a new decentralized framework, named EDGChain-E (Encrypted-Data-Git Chain for Energy), designed to manage version-controlled, encrypted energy data using blockchain and the InterPlanetary File System. The framework incorporates a Decentralized Autonomous Organization (DAO) to orchestrate collaborative data governance across the lifecycle of energy research and operations, such as smart grid monitoring, demand forecasting, and peer-to-peer energy trading. In EDGChain-E, initial commits capture the full encrypted datasets-such as smart meter readings or grid telemetry-while subsequent updates are tracked as encrypted Git patches, ensuring integrity, traceability, and privacy. This versioning mechanism supports secure collaboration across multiple stakeholders (e.g., utilities, researchers, regulators) without compromising sensitive or regulated information. We highlight the framework's capability to maintain FAIR-compliant (Findable, Accessible, Interoperable, Reusable) provenance of encrypted data. By embedding hash-based content identifiers in Merkle trees, the system enables transparent, auditable, and immutable tracking of data changes, thereby supporting reproducibility and trust in decentralized energy applications.

Paperid: 3218, https://arxiv.org/pdf/2507.01571.pdf

Abstract:
Automation in Security Operations Centers (SOCs) plays a prominent role in alert classification and incident escalation. However, automated methods must be robust in the presence of imbalanced input data, which can negatively affect performance. Additionally, automated methods should make explainable decisions. In this work, we evaluate the effect of label imbalance on the classification of network intrusion alerts. As our use-case we employ DeepCASE, the state-of-the-art method for automated alert classification. We show that label imbalance impacts both classification performance and correctness of the classification explanations offered by DeepCASE. We conclude tuning the detection rules used in SOCs can significantly reduce imbalance and may benefit the performance and explainability offered by alert post-processing methods such as DeepCASE. Therefore, our findings suggest that traditional methods to improve the quality of input data can benefit automation.

Paperid: 3219, https://arxiv.org/pdf/2507.00577.pdf

Abstract:
Vision State Space Models (SSMs), particularly architectures like Vision Mamba (ViM), have emerged as promising alternatives to Vision Transformers (ViTs). However, the security implications of this novel architecture, especially their vulnerability to backdoor attacks, remain critically underexplored. Backdoor attacks aim to embed hidden triggers into victim models, causing the model to misclassify inputs containing these triggers while maintaining normal behavior on clean inputs. This paper investigates the susceptibility of ViM to backdoor attacks by introducing BadViM, a novel backdoor attack framework specifically designed for Vision Mamba. The proposed BadViM leverages a Resonant Frequency Trigger (RFT) that exploits the frequency sensitivity patterns of the victim model to create stealthy, distributed triggers. To maximize attack efficacy, we propose a Hidden State Alignment loss that strategically manipulates the internal representations of model by aligning the hidden states of backdoor images with those of target classes. Extensive experimental results demonstrate that BadViM achieves superior attack success rates while maintaining clean data accuracy. Meanwhile, BadViM exhibits remarkable resilience against common defensive measures, including PatchDrop, PatchShuffle and JPEG compression, which typically neutralize normal backdoor attacks.

Paperid: 3220, https://arxiv.org/pdf/2512.24899.pdf

Abstract:
The proliferation of streaming data analytics in data-driven applications raises critical privacy concerns, as directly collecting user data may compromise personal privacy. Although existing $w$-event local differential privacy (LDP) mechanisms provide formal guarantees without relying on trusted third parties, their practical deployment is hindered by two key limitations. First, these methods are designed primarily for publishing simple statistics at each timestamp, making them inherently unsuitable for complex queries. Second, they handle data at each timestamp independently, failing to capture temporal correlations and consequently degrading the overall utility. To address these issues, we propose MTSP-LDP, a novel framework for \textbf{M}ulti-\textbf{T}ask \textbf{S}treaming data \textbf{P}ublication under $w$-event LDP. MTSP-LDP adopts an \emph{Optimal Privacy Budget Allocation} algorithm to dynamically allocate privacy budgets by analyzing temporal correlations within each window. It then constructs a \emph{data-adaptive private binary tree structure} to support complex queries, which is further refined by cross-timestamp grouping and smoothing operations to enhance estimation accuracy. Furthermore, a unified \emph{Budget-Free Multi-Task Processing} mechanism is introduced to support a variety of streaming queries without consuming additional privacy budget. Extensive experiments on real-world datasets demonstrate that MTSP-LDP consistently achieves high utility across various streaming tasks, significantly outperforming existing methods.

Paperid: 3221, https://arxiv.org/pdf/2512.24682.pdf

Abstract:
The complexity, interdependence, and rapid evolution of 3GPP specifications present fundamental challenges for ensuring the security of modern cellular networks. Manual reviews and existing automated approaches, which often depend on rule-based parsing or small sets of manually crafted security requirements, fail to capture deep semantic dependencies, cross-sentence/clause relationships, and evolving specification behaviors. In this work, we present CellSecInspector, an automated framework for security analysis of 3GPP specifications. CellSecInspector extracts structured state-condition-action (SCA) representations, models mobile network procedures with comprehensive function chains, systematically validates them against 9 foundational security properties under 4 adversarial scenarios, and automatically generates test cases. This end-to-end pipeline enables the automated discovery of vulnerabilities without relying on manually predefined security requirements or rules. Applying CellSecInspector to the well-studied 5G and 4G NAS and RRC specifications, it discovers 43 vulnerabilities, 8 of which are previously unreported. Our findings show that CellSecInspector is a scalable, adaptive, and effective solution to assess 3GPP specifications for safeguarding operational and next-generation cellular networks.

Paperid: 3222, https://arxiv.org/pdf/2512.24571.pdf

Abstract:
Security Information and Event Management (SIEM) systems are essential for large enterprises to monitor their IT infrastructure by ingesting and analyzing millions of logs and events daily. Security Operations Center (SOC) analysts are tasked with monitoring and analyzing this vast data to identify potential threats and take preventive actions to protect enterprise assets. However, the diversity among SIEM platforms, such as Palo Alto Networks Qradar, Google SecOps, Splunk, Microsoft Sentinel and the Elastic Stack, poses significant challenges. As these systems differ in attributes, architecture, and query languages, making it difficult for analysts to effectively monitor multiple platforms without undergoing extensive training or forcing enterprises to expand their workforce. To address this issue, we introduce SynRAG, a unified framework that automatically generates threat detection or incident investigation queries for multiple SIEM platforms from a platform-agnostic specification. SynRAG can generate platformspecific queries from a single high-level specification written by analysts. Without SynRAG, analysts would need to manually write separate queries for each SIEM platform, since query languages vary significantly across systems. This framework enables seamless threat detection and incident investigation across heterogeneous SIEM environments, reducing the need for specialized training and manual query translation. We evaluate SynRAG against state-of-the-art language models, including GPT, Llama, DeepSeek, Gemma, and Claude, using Qradar and SecOps as representative SIEM systems. Our results demonstrate that SynRAG generates significantly better queries for crossSIEM threat detection and incident investigation compared to the state-of-the-art base models.

Paperid: 3223, https://arxiv.org/pdf/2512.24531.pdf

Abstract:
This paper proposes an alternative approach to formally establishing the correctness of the RSA public key cryptosystem. The methodology presented herein deviates slightly from conventional proofs found in existing literature. Specifically, this study explores the conditions under which the choice of the positive integer N, a fundamental component of RSA, can be extended beyond the standard selection criteria. We derive explicit conditions that determine when certain values of N are valid for the encryption scheme and explain why others may fail to satisfy the correctness requirements. The scope of this paper is limited to the mathematical proof of correctness for RSA-like schemes, deliberately omitting issues related to the cryptographic security of RSA.

Paperid: 3224, https://arxiv.org/pdf/2512.24416.pdf

Abstract:
Recording entry and exit records for a country, with properties such as confidentiality, integrity, and auditability, is increasingly important due to rising international mobility and security requirements. Traditional border control systems, which rely on centralised databases, are vulnerable to data manipulation and have limited interoperability between institutions. This study presents GateChain, a blockchain-based application that addresses these vulnerabilities. GateChain aims to enhance data integrity, reliability, and transparency by recording entry and exit events on a distributed, immutable, and cryptographically verifiable ledger. The application provides real-time access control and verification for authorised institutions. This paper describes the architecture and security components of GateChain and evaluates its performance and security features.

Paperid: 3225, https://arxiv.org/pdf/2512.24041.pdf

Abstract:
To what extent are users surveilled on the web, by what technologies, and by whom? We answer these questions by combining passively observed, anonymized browsing data of a large, representative sample of Americans with domain-level data on tracking from Blacklight. We find that nearly all users ($ > 99\%$) encounter at least one ad tracker or third-party cookie over the observation window. More invasive techniques like session recording, keylogging, and canvas fingerprinting are less widespread, but over half of the users visited a site employing at least one of these within the first 48 hours of the start of tracking. Linking trackers to their parent organizations reveals that a single organization, usually Google, can track over $50\%$ of web activity of more than half the users. Demographic differences in exposure are modest and often attenuate when we account for browsing volume. However, disparities by age and race remain, suggesting that what users browse, not just how much, shapes their surveillance risk.

Paperid: 3226, https://arxiv.org/pdf/2512.23987.pdf

Abstract:
Confronting the substantial challenges of malware detection in cybersecurity necessitates solutions that are both robust and adaptable to the ever-evolving threat environment. The paper introduces Meta Learning Malware Detection (MeLeMaD), a novel framework leveraging the adaptability and generalization capabilities of Model-Agnostic Meta-Learning (MAML) for malware detection. MeLeMaD incorporates a novel feature selection technique, Chunk-wise Feature Selection based on Gradient Boosting (CFSGB), tailored for handling large-scale, high-dimensional malware datasets, significantly enhancing the detection efficiency. Two benchmark malware datasets (CIC-AndMal2020 and BODMAS) and a custom dataset (EMBOD) were used for rigorously validating the MeLeMaD, achieving a remarkable performance in terms of key evaluation measures, including accuracy, precision, recall, F1-score, MCC, and AUC. With accuracies of 98.04\% on CIC-AndMal2020 and 99.97\% on BODMAS, MeLeMaD outperforms the state-of-the-art approaches. The custom dataset, EMBOD, also achieves a commendable accuracy of 97.85\%. The results underscore the MeLeMaD's potential to address the challenges of robustness, adaptability, and large-scale, high-dimensional datasets in malware detection, paving the way for more effective and efficient cybersecurity solutions.

Paperid: 3227, https://arxiv.org/pdf/2512.23788.pdf

Abstract:
Recent spam email techniques exploit visual effects in text messages, such as poisoning text, obfuscating words, and hidden text salting techniques. These effects were able to evade spam detection techniques based on the text. In this paper, we overcome this limitation by introducing a novel visual-based spam detection architecture, denoted as visual-based spam filter (VBSF). The multi-step process mimics the human eye's natural way of processing visual information, automatically rendering incoming emails and capturing their content as it appears on a user screen. Then, two different processing pipelines are applied in parallel. The first pipeline pertains to the perceived textual content, as it includes optical character recognition (OCR) to extract rendered textual content, followed by naive Bayes (NB) and decision tree (DT) content classifiers. The second pipeline focuses on the appearance of the email, as it analyzes and classifies the images of rendered emails through a specific convolutional neural network. Lastly, a meta classifier integrates text- and image-based classifier outputs, exploiting the stacking ensemble learning method. The performance of the proposed VBSF is assessed, showing that it achieves an accuracy of more than 98%, which is higher than the compared existing techniques on the designed dataset.

Paperid: 3228, https://arxiv.org/pdf/2512.23610.pdf

Abstract:
Web applications increasingly face evasive and polymorphic attack payloads, yet traditional web application firewalls (WAFs) based on static rule sets such as the OWASP Core Rule Set (CRS) often miss obfuscated or zero-day patterns without extensive manual tuning. This work introduces WAMM, an AI-driven multiclass web attack detection framework designed to reveal the limitations of rule-based systems by reclassifying HTTP requests into OWASP-aligned categories for a specific technology stack. WAMM applies a multi-phase enhancement pipeline to the SR-BH 2020 dataset that includes large-scale deduplication, LLM-guided relabeling, realistic attack data augmentation, and LLM-based filtering, producing three refined datasets. Four machine and deep learning models are evaluated using a unified feature space built from statistical and text-based representations. Results show that using an augmented and LLM-filtered dataset on the same technology stack, XGBoost reaches 99.59% accuracy with microsecond-level inference while deep learning models degrade under noisy augmentation. When tested against OWASP CRS using an unseen augmented dataset, WAMM achieves true positive block rates between 96 and 100% with improvements of up to 86%. These findings expose gaps in widely deployed rule-based defenses and demonstrate that curated training pipelines combined with efficient machine learning models enable a more resilient, real-time approach to web attack detection suitable for production WAF environments.

Paperid: 3229, https://arxiv.org/pdf/2512.23557.pdf

Abstract:
Powerful autonomous systems, which reason, plan, and converse using and between numerous tools and agents, are made possible by Large Language Models (LLMs), Vision-Language Models (VLMs), and new agentic AI systems, like LangChain and GraphChain. Nevertheless, this agentic environment increases the probability of the occurrence of multimodal prompt injection (PI) attacks, in which concealed or malicious instructions carried in text, pictures, metadata, or agent-to-agent messages may spread throughout the graph and lead to unintended behavior, a breach of policy, or corruption of state. In order to mitigate these risks, this paper suggests a Cross-Agent Multimodal Provenanc- Aware Defense Framework whereby all the prompts, either user-generated or produced by upstream agents, are sanitized and all the outputs generated by an LLM are verified independently before being sent to downstream nodes. This framework contains a Text sanitizer agent, visual sanitizer agent, and output validator agent all coordinated by a provenance ledger, which keeps metadata of modality, source, and trust level throughout the entire agent network. This architecture makes sure that agent-to-agent communication abides by clear trust frames such such that injected instructions are not propagated down LangChain or GraphChain-style-workflows. The experimental assessments show that multimodal injection detection accuracy is significantly enhanced, and the cross-agent trust leakage is minimized, as well as, agentic execution pathways become stable. The framework, which expands the concept of provenance tracking and validation to the multi-agent orchestration, enhances the establishment of secure, understandable and reliable agentic AI systems.

Paperid: 3230, https://arxiv.org/pdf/2512.23216.pdf

Abstract:
In todays scenario, various organizations store their sensitive data in the cloud environment. Multiple problems are present while retrieving and storing vast amounts of data, such as the frequency of data requests (increasing the computational overhead of the server) and data leakage while storing. To cope with said problem, Attribute-Based Encryption (ABE) is one of the potential security and access control techniques for secure data storage and authorization. The proposed work divides into two objectives: (i) provide access to authorized users and (ii) secure data storage in a cloud environment. The improved ABE using Functional Based Stream Cipher (FBSE) is proposed for data storage. The proposed technique uses simple scalar points over a parabolic curve to provide multiparty authorization. The authorization points are generated and share only with the authorized recipients. The Shamir secret sharing technique generate the authorization points and 2D-Lagrange Interpolation is used to reconstruct the secret points from regular parabola. The proposed scheme has specified the threshold (Ts>3) legally authorized users to reconstruct the attribute-associated keys for decryption. The encryption of data is evaluated using Statistical analysis (NIST Statistical Test Suite, Correlation Coefficient, and Histogram) test to investigate image pixel deviation. The parameters like encryption and decryption are used for performance analysis, where an increase in the number of attributes for the authorization policy will increase the encryption time. The proposed scheme imposes minimal storage overhead, irrespective of the users identity. The security analysis evidence that it resists collision attacks. The security and performance analysis results demonstrate that the proposed scheme is more robust and secure.

Paperid: 3231, https://arxiv.org/pdf/2512.23173.pdf

Abstract:
Large language models (LLMs), such as ChatGPT, have achieved remarkable success across a wide range of fields. However, their trustworthiness remains a significant concern, as they are still susceptible to jailbreak attacks aimed at eliciting inappropriate or harmful responses. However, existing jailbreak attacks mainly operate at the natural language level and rely on a single attack strategy, limiting their effectiveness in comprehensively assessing LLM robustness. In this paper, we propose Equacode, a novel multi-strategy jailbreak approach for large language models via equation-solving and code completion. This approach transforms malicious intent into a mathematical problem and then requires the LLM to solve it using code, leveraging the complexity of cross-domain tasks to divert the model's focus toward task completion rather than safety constraints. Experimental results show that Equacode achieves an average success rate of 91.19% on the GPT series and 98.65% across 3 state-of-the-art LLMs, all with only a single query. Further, ablation experiments demonstrate that EquaCode outperforms either the mathematical equation module or the code module alone. This suggests a strong synergistic effect, thereby demonstrating that multi-strategy approach yields results greater than the sum of its parts.

Paperid: 3232, https://arxiv.org/pdf/2512.22720.pdf

Abstract:
This paper explores vulnerabilities in RSA cryptosystems that arise from improper prime number selection during key generation. We examine two primary attack vectors: Fermat's factorization method, which exploits RSA keys generated with primes that are too close together, and the Greatest Common Divisor (GCD) attack, which exploits keys that share a common prime factor. Drawing from landmark research including Heninger et al.'s ``Mining Your Ps and Qs'' study, which discovered over 64,000 vulnerable TLS hosts, and B{ö}ck's 2023 analysis of Fermat factorization in deployed systems, we demonstrate that these vulnerabilities remain prevalent in real-world cryptographic implementations. Our analysis reveals that weak random number generation in embedded devices is the primary cause of these failures, and we discuss mitigation strategies including proper entropy collection and prime validation checks.

Paperid: 3233, https://arxiv.org/pdf/2512.22301.pdf

Abstract:
Timing side-channels are an important threat to cryptography that still needs to be addressed in implementations, and the advent of post-quantum cryptography raises this issue because the lattice-based schemes may produce secret-dependent timing variability with the help of complex arithmetic and control flow. Since also real timing measurements are affected by environmental noise (e.g. scheduling effects, contention, heavy tailed delays), in this work a scenario-based statistical risk model is proposed for timing leakage as a problem of distributional distinguishability under controlled execution conditions. We synthesize traces for two secret classes in idle, jitter and loaded scenarios and for multiple leakage models and quantify leakage with Welch's t-test, KS distance, Cliff's delta, mutual information, and distribution overlap to combine in a TLRI like manner to obtain a consistent score for ranking scenarios. Across representative lattice-based KEM families (Kyber, Saber, Frodo), idle conditions generally have the best distinguishability, jitter and loaded conditions erode distinguishability through an increase in variance and increase in overlap; cache-index and branch-style leakage tends to give the highest risk signals, and faster schemes can have a higher peak risk given similar leakage assumptions, allowing reproducible comparisons at an early design stage, prior to platform-specific validation.

Paperid: 3234, https://arxiv.org/pdf/2512.22076.pdf

Abstract:
Software obfuscation techniques make code more difficult to understand, without changing its functionality. Such techniques are often used by authors of malicious software to avoid detection. Reverse Engineering of obfuscated code, i.e., the process of overcoming obfuscation and answering questions about the functionality of the code, is notoriously difficult; and while various tools and methods exist for this purpose, the process remains complex and slow, especially when dealing with layered or customized obfuscation techniques. Here, we present a novel, automated tool for addressing some of the challenges in reverse engineering of obfuscated code. Our tool, called ReSMT, converts the obfuscated assembly code into a complex system of logical assertions that represent the code functionality, and then applies SMT solving and simulation tools to inspect the obfuscated code's execution. The approach is mostly automatic, alleviating the need for highly specialized deobfuscation skills. In an elaborate case study that we conducted, ReSMT successfully tackled complex obfuscated code, and was able to solve reverse-engineering queries about it. We believe that these results showcase the potential and usefulness of our proposed approach.

Paperid: 3235, https://arxiv.org/pdf/2512.21813.pdf

Abstract:
The Advanced Dynamic Security Learning (DSL) Process Model is an Industry 4.0 cybersecurity incident response architecture proposed in this paper. This model addresses proactive and reflective cybersecurity governance across complex cyber-physical systems by combining Argyris and Schön's double-loop learning theory with Crossan's 4I organizational learning framework. Given that 65% of industrial companies suffer ransomware attacks annually and many of them lack cybersecurity awareness, this reveals the gravity of cyber threats. Feedforward and feedback learning loops in this paradigm help promote strategic transformation and ongoing growth. The DSL model helps Industry 4.0 organizations adapt to growing challenges posed by the projected 18.8 billion IoT devices by bridging operational obstacles and promoting systemic resilience. This research presents a scalable, methodical cybersecurity maturity approach based on a comprehensive analysis of the literature and a qualitative study.

Paperid: 3236, https://arxiv.org/pdf/2512.21762.pdf

Abstract:
Generative AI systems are quickly improving, now able to produce believable output in several modalities including images, text, and audio. However, this fast development has prompted increased scrutiny concerning user privacy and the use of copyrighted works in training. A recent attack on machine-learning models called membership inference lies at the crossroads of these two concerns. The attack is given as input a set of records and a trained model and seeks to identify which of those records may have been used to train the model. On one hand, this attack can be used to identify user data used to train a model, which may violate their privacy especially in sensitive applications such as models trained on medical data. On the other hand, this attack can be used by rights-holders as evidence that a company used their works without permission to train a model. Remarkably, it appears that no work has studied the effect of membership inference attacks (MIA) on generative music. Given that the music industry is worth billions of dollars and artists would stand to gain from being able to determine if their works were being used without permission, we believe this is a pressing issue to study. As such, in this work we begin a preliminary study into whether MIAs are effective on generative music. We study the effect of several existing attacks on MuseGAN, a popular and influential generative music model. Similar to prior work on generative audio MIAs, our findings suggest that music data is fairly resilient to known membership inference techniques.

Paperid: 3237, https://arxiv.org/pdf/2512.21561.pdf

Abstract:
With the rapid development of quantum computing, classical cryptography systems are facing increasing security threats, making it urgent to build architectures resilient to quantum attacks. Although Quantum Key Distribution (QKD) technology provides information-theoretic security, its limited bandwidth requires it to be combined with classical cryptography-particularly block ciphers such as AES and SM4-in practical deployments.However, when a single key is used to process multiple multi-block files, the resulting reduction in security strength has not yet been systematically quantified.In this work, we focus on the use of both QKD keys and block ciphers, and construct a precise calculation model for the key rotation interval. We further propose a quantitative method to evaluate the security benefit of using QKD keys for block cipher. Building on concrete security models and the security properties of various block cipher modes (CTR, CBC, and ECBC-MAC), we derive the maximum number of files that can be safely encrypted under a single key, denoted Q*, and quantify the benefits of key rotation interval in enhancing security levels. Using SM4 as a case study, our results show that, under an 80-bit security target, uniformly performing k key rotations can increase the security strength by log2(k) to 2log2(k) bits. This study provides theoretical support and a basis for parameter optimization for the integrated application of QKD keys with classical cryptographic algorithms and the engineering deployment of cryptographic systems.

Paperid: 3238, https://arxiv.org/pdf/2512.21525.pdf

Abstract:
In todays security landscape, every user wants to access large amounts of data with confidentiality and authorization. To maintain confidentiality, various researchers have proposed several techniques. However, to access secure data, researchers use access control lists to grant authentication and provide authorization. The above several steps will increase the server's computation overhead and response time. To cope with these two problems, we proposed multiparty execution on the server. In this paper, we introduce two different approaches. The first approach is encryption, utilizing the Involution Function Based Stream Cipher to encrypt the file data. The second approach is key distribution, using the Shamir secret sharing scheme to divide and distribute the symmetric key to every user. The decryption process required key reconstruction, which used second order Lagrange interpolation to reconstruct the secret keys from the hidden points. The process will reduce the server's computational overhead. The results are evaluated based on the encryption and decryption time, throughput, computational overhead, and security analysis. In the future, the proposed mechanism will be used to share large-scale, secure data within the organization.

Paperid: 3239, https://arxiv.org/pdf/2512.21404.pdf

Abstract:
The rapid growth in both the scale and complexity of Android malware has driven the widespread adoption of machine learning (ML) techniques for scalable and accurate malware detection. Despite their effectiveness, these models remain vulnerable to adversarial attacks that introduce carefully crafted feature-level perturbations to evade detection while preserving malicious functionality. In this paper, we present LAMLAD, a novel adversarial attack framework that exploits the generative and reasoning capabilities of large language models (LLMs) to bypass ML-based Android malware classifiers. LAMLAD employs a dual-agent architecture composed of an LLM manipulator, which generates realistic and functionality-preserving feature perturbations, and an LLM analyzer, which guides the perturbation process toward successful evasion. To improve efficiency and contextual awareness, LAMLAD integrates retrieval-augmented generation (RAG) into the LLM pipeline. Focusing on Drebin-style feature representations, LAMLAD enables stealthy and high-confidence attacks against widely deployed Android malware detection systems. We evaluate LAMLAD against three representative ML-based Android malware detectors and compare its performance with two state-of-the-art adversarial attack methods. Experimental results demonstrate that LAMLAD achieves an attack success rate (ASR) of up to 97%, requiring on average only three attempts per adversarial sample, highlighting its effectiveness, efficiency, and adaptability in practical adversarial settings. Furthermore, we propose an adversarial training-based defense strategy that reduces the ASR by more than 30% on average, significantly enhancing model robustness against LAMLAD-style attacks.

Paperid: 3240, https://arxiv.org/pdf/2512.21377.pdf

Abstract:
This systematic literature review surveys technical defenses against software-based cheating in online multiplayer games. Categorizing existing approach-es into server-side detection, client-side anti-tamper, kernel-level anti-cheat drivers, and hardware-assisted TEEs. Each category is evaluated in terms of detection effectiveness, perfor-mance overhead, privacy im-pact, and scalability. The analy-sis highlights key trade-offs, particularly between the high visibility of kernel-level solutions and their privacy and stability risks, versus the low intrusive-ness but limited insight of server-side methods. Overall, the re-view emphasizes the ongoing arms race with cheaters and the need for robust, adversary-resistant anti-cheat designs.

Paperid: 3241, https://arxiv.org/pdf/2512.21048.pdf

Abstract:
Healthcare AI needs large, diverse datasets, yet strict privacy and governance constraints prevent raw data sharing across institutions. Federated learning (FL) mitigates this by training where data reside and exchanging only model updates, but practical deployments still face two core risks: (1) privacy leakage via gradients or updates (membership inference, gradient inversion) and (2) trust in the aggregator, a single point of failure that can drop, alter, or inject contributions undetected. We present zkFL-Health, an architecture that combines FL with zero-knowledge proofs (ZKPs) and Trusted Execution Environments (TEEs) to deliver privacy-preserving, verifiably correct collaborative training for medical AI. Clients locally train and commit their updates; the aggregator operates within a TEE to compute the global update and produces a succinct ZK proof (via Halo2/Nova) that it used exactly the committed inputs and the correct aggregation rule, without revealing any client update to the host. Verifier nodes validate the proof and record cryptographic commitments on-chain, providing an immutable audit trail and removing the need to trust any single party. We outline system and threat models tailored to healthcare, the zkFL-Health protocol, security/privacy guarantees, and a performance evaluation plan spanning accuracy, privacy risk, latency, and cost. This framework enables multi-institutional medical AI with strong confidentiality, integrity, and auditability, key properties for clinical adoption and regulatory compliance.

Paperid: 3242, https://arxiv.org/pdf/2512.21047.pdf

Abstract:
Anonymity is a fundamental cryptographic primitive that hides the identities of both senders and receivers during message transmission over a network. Classical protocols cannot provide information-theoretic security for such task, and existing quantum approaches typically depend on classical subroutines and multiple private channels, thereby weakening their security in fully adversarial settings. In this work, we introduce the first fully quantum protocol for anonymous communication in realistic quantum networks with a device-independent security proof.

Paperid: 3243, https://arxiv.org/pdf/2512.20860.pdf

Abstract:
Dynamic malware analysis requires executing untrusted binaries inside strongly isolated, rapidly resettable environments. In practice, many detonation workflows remain tied to heavyweight hypervisors or dedicated bare-metal labs, limiting portability and automation. This challenge has intensified with the adoption of ARM64 developer hardware (e.g., Apple Silicon), where common open-source sandbox recipes and pre-built environments frequently assume x86_64 hosts and do not translate cleanly across architectures. This paper presents pokiSEC, a lightweight, ephemeral malware detonation sandbox that packages the full virtualization and access stack inside a Docker container. pokiSEC integrates QEMU with hardware acceleration (KVM when available) and exposes a browser-based workflow that supports bring-your-own Windows disk images. The key contribution is a Universal Entrypoint that performs runtime host-architecture detection and selects validated hypervisor configurations (machine types, acceleration modes, and device profiles), enabling a single container image and codebase to launch Windows guests on both ARM64 and x86_64 hosts. We validate pokiSEC on Apple Silicon (ARM64) and Ubuntu (AMD64), demonstrating interactive performance suitable for analyst workflows and consistent teardown semantics via ephemeral container lifecycles.

Paperid: 3244, https://arxiv.org/pdf/2512.20715.pdf

Abstract:
While Ethereum has successfully achieved dynamic availability together with safety, a fundamental delay remains between transaction execution and immutable finality. In Ethereum's current Gasper protocol, this latency is on the order of 15 minutes, exposing the network to ex ante reorganization attacks, enabling MEV extraction, and limiting the efficiency of economic settlement. These limitations have motivated a growing body of work on Speedy Secure Finality (SSF), which aims to minimize confirmation latency without weakening formal security guarantees. This paper surveys the state of the art in fast finality protocol design. We introduce the core theoretical primitives underlying this space, including reorganization resilience and the generalized sleepy model, and trace their development from Goldfish to RLMD-GHOST. We then analyze the communication and aggregation bottlenecks faced by single-slot finality protocols in large validator settings. Finally, we survey the 3-slot finality (3SF) protocol as a practical synthesis that balances fast finality with the engineering constraints of the Ethereum network.

Paperid: 3245, https://arxiv.org/pdf/2512.20686.pdf

Abstract:
Divisor methods are well known to satisfy house monotonicity, which allows representative seats to be allocated sequentially. We focus on stationary divisor methods defined by a rounding cut point $c \in [0,1]$. For such methods with integer-valued votes, the resulting apportionment sequences are periodic. Restricting attention to two-party allocations, we characterize the set of possible sequences and establish a connection between the lexicographical ordering of these sequences and the parameter $c$. We then show how sequences for all pairs of parties can be systematically extended to the $n$-party setting. Further, we determine the number of distinct sequences in the $n$-party problem for all $c$. Our approach offers a refined perspective on large-party bias: rather than viewing large parties as simply receiving more seats, we show that they instead obtain their seats earlier in the apportionment sequence. Of particular interest is a new relationship we uncover between the sequences generated by the smallest divisors (Adams) and greatest divisors (d'Hondt or Jefferson) methods.

Paperid: 3246, https://arxiv.org/pdf/2512.20677.pdf

Abstract:
As large language models (LLMs) are increasingly deployed in high-stakes domains, ensuring their security and alignment has become a critical challenge. Existing red-teaming practices depend heavily on manual testing, which limits scalability and fails to comprehensively cover the vast space of potential adversarial behaviors. This paper introduces an automated red-teaming framework that systematically generates, executes, and evaluates adversarial prompts to uncover security vulnerabilities in LLMs. Our framework integrates meta-prompting-based attack synthesis, multi-modal vulnerability detection, and standardized evaluation protocols spanning six major threat categories -- reward hacking, deceptive alignment, data exfiltration, sandbagging, inappropriate tool use, and chain-of-thought manipulation. Experiments on the GPT-OSS-20B model reveal 47 distinct vulnerabilities, including 21 high-severity and 12 novel attack patterns, achieving a $3.9\times$ improvement in vulnerability discovery rate over manual expert testing while maintaining 89\% detection accuracy. These results demonstrate the framework's effectiveness in enabling scalable, systematic, and reproducible AI safety evaluations. By providing actionable insights for improving alignment robustness, this work advances the state of automated LLM red-teaming and contributes to the broader goal of building secure and trustworthy AI systems.

Paperid: 3247, https://arxiv.org/pdf/2512.20641.pdf

Abstract:
Leveraging a validated set of reconstructed Lightning Network topology snapshots spanning five years (2019-2023), we computed 47 computationally intensive metrics and network attributes, enabling a comprehensive analysis of the network's structure and temporal dynamics. Our results corroborate prior topology studies while offering deeper insight into the network's structural evolution. In particular, we quantify the network's topological stability over time, yielding implications for the design of heuristic-based pathfinding and routing protocols. More broadly, this work provides a detailed characterization of publicly available Lightning Network snapshots, supporting future research in Payment Channel Network analysis and network science.

Paperid: 3248, https://arxiv.org/pdf/2512.20622.pdf

Abstract:
Privacy concerns around 5G, the latest generation of mobile networks, are growing, with fears that its deployment may increase exposure to privacy risks. This perception is largely driven by the use of denser deployments of small antenna systems, which enable highly accurate data collection at higher speeds and closer proximity to mobile users. At the same time, 5G's unique radio communication features can make the reproduction of known network attacks more challenging. In particular, passive network attacks, which do not involve direct interaction with the target network and are therefore nearly impossible to detect, remain a pressing concern. Such attacks can reveal sensitive information about users, their devices, and active applications, which may then be exploited through known vulnerabilities or spear-phishing schemes. This survey examines the feasibility of passive network attacks in 5G and beyond (B5G/6G) networks, with emphasis on two major categories: information extraction (system identification, website and application fingerprinting) and geolocation (user identification and position tracking). These attacks are well documented and reproducible in existing wireless and mobile systems, including short-range networks (IEEE 802.11) and, to a lesser extent, LTE. Current evidence suggests that while such attacks remain theoretically possible in 5G, their practical execution is significantly constrained by directional beamforming, high-frequency propagation characteristics, and encryption mechanisms. For B5G and early 6G networks, the lack of public tools and high hardware cost currently renders these attacks infeasible in practice, which highlights a critical gap in our understanding of future network threat models.

Paperid: 3249, https://arxiv.org/pdf/2512.20583.pdf

Abstract:
In this work, we model the end-to-end pipeline of the advertising ecosystem, allowing us to identify two main issues with the current trajectory of private advertising proposals. First, prior work has largely considered ad targeting and engagement metrics individually rather than in composition. This has resulted in privacy notions that, while reasonable for each protocol in isolation, fail to compose to a natural notion of privacy for the ecosystem as a whole, permitting advertisers to extract new information about the audience of their advertisements. The second issue serves to explain the first: we prove that \textit{perfect} privacy is impossible for any, even minimally, useful advertising ecosystem, due to the advertisers' expectation of conducting market research on the results. Having demonstrated that leakage is inherent in advertising, we re-examine what privacy could realistically mean in advertising, building on the well-established notion of \textit{sensitive} data in a specific context. We identify that fundamentally new approaches are needed when designing privacy-preserving advertising subsystems in order to ensure that the privacy properties of the end-to-end advertising system are well aligned with people's privacy desires.

Paperid: 3250, https://arxiv.org/pdf/2512.20535.pdf

Abstract:
Role-Based Access Control (RBAC) struggles to adapt to dynamic enterprise environments with documents that contain information that cannot be disclosed to specific user groups. As these documents are used by LLM-driven systems (e.g., in RAG) the problem is exacerbated as LLMs can leak sensitive data due to prompt truncation, classification errors, or loss of system context. We introduce \our, a system designed to provide RBAC in RAG systems. \our implements layered input/output validation, role-aware retrieval, and post-generation fact-checking. Unlike traditional RBAC approaches that rely on fine-tuned classifiers, \our uses LLMs operating in few-shot settings with prompt-based steering for rapid deployment and role updates. We evaluate the approach on 389 queries using a synthetic dataset. Experimental results show 85\% accuracy and 89\% F1-score in query filtering, close to traditional RBAC solutions. Results suggest that practical RBAC deployment on RAG systems is approaching the maturity level needed for dynamic enterprise environments.

Paperid: 3251, https://arxiv.org/pdf/2512.20402.pdf

Abstract:
This paper proposes iblock, a comprehensive C++ library for Bitcoin simulation, designed for OMNeT++. iblock offers superior efficiency and scalability with respect to state-of-the-art simulators, which are typically written in high-level languages. Moreover, the possible integration with other OMNeT++ libraries allows highly detailed simulations. We measure iblock's performance against a state-of-the-art blockchain simulator, proving that it is more efficient at the same level of simulation detail. We also validate iblock by using it to simulate different scenarios such as the normal Bitcoin operation and the selfish mine attack, showing that simulation results are coherent with theoretical expectations.

Paperid: 3252, https://arxiv.org/pdf/2512.20396.pdf

Abstract:
We introduce a scalable, modular, and sound approach for automatically constructing formal security specifications for Java bytecode programs in the form of method summaries. A summary provides an abstract representation of a method's security behavior, consisting of the conditions under which the method can be securely invoked, together with specifications of information flows and aliasing updates. Such summaries can be consumed by static code analysis tools and also help developers understand the behavior of code segments, such as libraries, in order to evaluate their security implications when reused in applications. Our approach is implemented in a tool called Symmaries, which automates the generation of security summaries. We applied Symmaries to Java API libraries to extract their security specifications and to large real-world applications to evaluate its scalability. Our results show that the tool successfully scales to analyze applications with hundreds of thousands of lines of code, and that Symmaries achieves a promising precision depending on the heap model used. We prove the soundness of our approach in terms of guaranteeing termination-insensitive non-interference.

Paperid: 3253, https://arxiv.org/pdf/2512.20323.pdf

Abstract:
Wi-Fi/RF-based human sensing has achieved remarkable progress with deep learning, yet practical deployments increasingly require feature sharing for cloud analytics, collaborative training, or benchmark evaluation. Releasing intermediate representations such as CSI spectrograms can inadvertently expose sensitive information, including user identity, location, and membership, motivating formal privacy guarantees. In this paper, we study differentially private (DP) feature release for wireless sensing and propose an adaptive privacy budget allocation mechanism tailored to the highly non-uniform structure of CSI time-frequency representations. Our pipeline converts CSI to bounded spectrogram features, applies sensitivity control via clipping, estimates task-relevant importance over the time-frequency plane, and allocates a global privacy budget across spectrogram blocks before injecting calibrated Gaussian noise. Experiments on multi-user activity sensing (WiMANS), multi-person 3D pose estimation (Person-in-WiFi 3D), and respiration monitoring (Resp-CSI) show that adaptive allocation consistently improves the privacy-utility frontier over uniform perturbation under the same privacy budget. Our method yields higher accuracy and lower error while substantially reducing empirical leakage in identity and membership inference attacks.

Paperid: 3254, https://arxiv.org/pdf/2512.20243.pdf

Abstract:
In this work, the conventional cryptographic algorithms used in the 5G Core are replaced with post-quantum alternatives and the practical impact of this transition is evaluated. Using a simulation environment, we model the registration and deregistration of varying numbers of user equipments (UEs) and measure the resulting effects on bandwidth consumption and latency. Our results show that the deployment of post-quantum cryptographic algorithms has a measurable effect on performance, but that this effect is small, and perhaps more crucially, that the extra overhead needed in terms of computation and bandwidth does not have any substantial impact on the usability of the network and the efficiency of its network functions. Overall the experimental results in this work corroborate earlier research: the 5G Core is technically able to support post-quantum cryptography without any inherent issues connected to the increased computational overhead or larger message size.

Paperid: 3255, https://arxiv.org/pdf/2512.20234.pdf

Abstract:
Anonymous credentials (ACs) are a crucial cryptographic tool for privacy-preserving authentication in decentralized networks, allowing holders to prove eligibility without revealing their identity. However, a major limitation of standard ACs is the disclosure of the issuer's identity, which can leak sensitive contextual information about the holder. Issuer-hiding ACs address this by making a credential's origin indistinguishable among a set of approved issuers. Despite this advancement, existing solutions suffer from practical limitations that hinder their deployment in decentralized environments: unflexible credential models that restrict issuer and holder autonomy, flawed revocation mechanisms that compromise security, and weak attribute hiding that fails to meet data minimization principles. This paper introduces a new scheme called IRAC to overcome these challenges. We propose a flexible credential model that employs vector commitments with a padding strategy to unify credentials from heterogeneous issuers, enabling privacy-preserving authentication without enforcing a global static attribute set or verifier-defined policies. Furthermore, we design a secure decentralized revocation mechanism where holders prove non-revocation by demonstrating their credential's hash lies within a gap in the issuer's sorted revocation list, effectively decoupling revocation checks from verifier policies while maintaining issuer anonymity. IRAC also strengthens attribute hiding by utilizing zk-SNARKs and vector commitments, allowing holders to prove statements about their attributes without disclosing the attributes themselves or the credential structure. Security analysis and performance evaluations demonstrate its practical feasibility for decentralized networks, where presenting a credential can be finished in 1s.

Paperid: 3256, https://arxiv.org/pdf/2512.20062.pdf

Abstract:
Large Language Models (LLMs) show significant promise in automating software vulnerability analysis, a critical task given the impact of security failure of modern software systems. However, current approaches in using LLMs to automate vulnerability analysis mostly rely on using online API-based LLM services, requiring the user to disclose the source code in development. Moreover, they predominantly frame the task as a binary classification(vulnerable or not vulnerable), limiting potential practical utility. This paper addresses these limitations by reformulating the problem as Software Vulnerability Identification (SVI), where LLMs are asked to output the type of weakness in Common Weakness Enumeration (CWE) IDs rather than simply indicating the presence or absence of a vulnerability. We also tackle the reliance on large, API-based LLMs by demonstrating that instruction-tuning smaller, locally deployable LLMs can achieve superior identification performance. In our analysis, instruct-tuning a local LLM showed better overall performance and cost trade-off than online API-based LLMs. Our findings indicate that instruct-tuned local models represent a more effective, secure, and practical approach for leveraging LLMs in real-world vulnerability management workflows.

Paperid: 3257, https://arxiv.org/pdf/2512.19203.pdf

Abstract:
In many real-world network environments, several types of cyberattacks occur at very low rates compared to benign traffic, making them difficult for intrusion detection systems (IDS) to detect reliably. This imbalance causes traditional evaluation metrics, such as accuracy, to often overstate model performance in these conditions, masking failures on minority attack classes that are most important in practice. In this paper, we evaluate a set of base and meta classifiers on low-traffic attacks in the CSE-CIC-IDS2018 dataset and compare their reliability in terms of accuracy and Matthews Correlation Coefficient (MCC). The results show that accuracy consistently inflates performance, while MCC provides a more accurate assessment of a classifier's performance across both majority and minority classes. Meta-classification methods, such as LogitBoost and AdaBoost, demonstrate more effective minority class detection when measured by MCC, revealing trends that accuracy fails to capture. These findings establish the need for imbalance-aware evaluation and make MCC a more trustworthy metric for IDS research involving low-traffic cyberattacks.

Paperid: 3258, https://arxiv.org/pdf/2512.19037.pdf

Abstract:
The proliferation of IoT devices and their reliance on Wi-Fi networks have introduced significant security vulnerabilities, particularly the KRACK and Kr00k attacks, which exploit weaknesses in WPA2 encryption to intercept and manipulate sensitive data. Traditional IDS using classifiers face challenges such as model overfitting, incomplete feature extraction, and high false positive rates, limiting their effectiveness in real-world deployments. To address these challenges, this study proposes a robust multiclass machine learning based intrusion detection framework. The methodology integrates advanced feature selection techniques to identify critical attributes, mitigating redundancy and enhancing detection accuracy. Two distinct ML architectures are implemented: a baseline classifier pipeline and a stacked ensemble model combining noise injection, Principal Component Analysis (PCA), and meta learning to improve generalization and reduce false positives. Evaluated on the AWID3 data set, the proposed ensemble architecture achieves superior performance, with an accuracy of 98%, precision of 98%, recall of 98%, and a false positive rate of just 2%, outperforming existing state-of-the-art methods. This work demonstrates the efficacy of combining preprocessing strategies with ensemble learning to fortify network security against sophisticated Wi-Fi attacks, offering a scalable and reliable solution for IoT environments. Future directions include real-time deployment and adversarial resilience testing to further enhance the model's adaptability.

Paperid: 3259, https://arxiv.org/pdf/2512.19011.pdf

Abstract:
Prompt injection and jailbreaking attacks pose persistent security challenges to large language model (LLM)-based systems. We present an efficient and systematically evaluated defense architecture that mitigates these threats through a lightweight, multi-stage pipeline. Its core component is a semantic filter based on text normalization, TF-IDF representations, and a Linear SVM classifier. Despite its simplicity, this module achieves 93.4% accuracy and 96.5% specificity on held-out data, substantially reducing attack throughput while incurring negligible computational overhead. Building on this efficient foundation, the full pipeline integrates complementary detection and mitigation mechanisms that operate at successive stages, providing strong robustness with minimal latency. In comparative experiments, our SVM-based configuration improves overall accuracy from 35.1% to 93.4% while reducing average time to completion from approximately 450s to 47s, yielding over 10 times lower latency than ShieldGemma. These results demonstrate that the proposed design simultaneously advances defensive precision and efficiency, addressing a core limitation of current model-based moderators. Evaluation across a curated corpus of over 30,000 labeled prompts, including benign, jailbreak, and application-layer injections, confirms that staged, resource-efficient defenses can robustly secure modern LLM-driven applications.

Paperid: 3260, https://arxiv.org/pdf/2512.18791.pdf

Abstract:
Text-to-Speech (TTS) diffusion models generate high-quality speech, which raises challenges for the model intellectual property protection and speech tracing for legal use. Audio watermarking is a promising solution. However, due to the structural differences among various TTS diffusion models, existing watermarking methods are often designed for a specific model and degrade audio quality, which limits their practical applicability. To address this dilemma, this paper proposes a universal watermarking scheme for TTS diffusion models, termed Smark. This is achieved by designing a lightweight watermark embedding framework that operates in the common reverse diffusion paradigm shared by all TTS diffusion models. To mitigate the impact on audio quality, Smark utilizes the discrete wavelet transform (DWT) to embed watermarks into the relatively stable low-frequency regions of the audio, which ensures seamless watermark-audio integration and is resistant to removal during the reverse diffusion process. Extensive experiments are conducted to evaluate the audio quality and watermark performance in various simulated real-world attack scenarios. The experimental results show that Smark achieves superior performance in both audio quality and watermark extraction accuracy.

Paperid: 3261, https://arxiv.org/pdf/2512.18632.pdf

Abstract:
This paper studies how to achieve individual indistinguishability by pufferfish privacy in aggregated query to a multi-user system. It is assumed that each user reports realization of a random variable. We study how to calibrate Laplace noise, added to the query answer, to attain pufferfish privacy when user changes his/her reported data value, leaves the system and is replaced by another use with different randomness. Sufficient conditions are derived for all scenarios for attaining statistical indistinguishability on four sets of secret pairs. They are derived using the existing Kantorovich method (Wasserstain metric of order $1$). These results can be applied to attain indistinguishability when a certain class of users is added or removed from a tabular data. It is revealed that attaining indifference in individual's data is conditioned on the statistics of this user only. For binary (Bernoulli distributed) random variables, the derived sufficient conditions can be further relaxed to reduce the noise and improve data utility.

Paperid: 3262, https://arxiv.org/pdf/2512.18589.pdf

Abstract:
Fully homomorphic encryption (FHE) schemes like RNS-CKKS enable privacy-preserving outsourced computation (PPOC) but suffer from high computational latency and ciphertext expansion, especially on the resource-constrained edge side. Hybrid Homomorphic Encryption (HHE) mitigates these issues on the edge side by replacing HE with lightweight symmetric encryption for plaintext encryption, such as the Rubato cipher for the HHE variant of RNS-CKKS, yet it introduces transciphering overhead on the cloud. The respective strengths and limitations of FHE and HHE call for a dual-mode HHE solution with flexible algorithm switching ability. This paper presents DNA-HHE, the first dual-mode HHE accelerator with near-network coupling for edge devices. DNA-HHE supports both edge-side RNS-CKKS and Rubato within a unified architecture driven by flexible custom instructions. To realize a compact implementation for the edge side, we propose a DSP-efficient modular reduction design, a compact multi-field-adaptive butterfly unit, and parallel scheduling schemes of Rubato with a high degree of resource sharing. DNA-HHE is designed with network protocol packaging and transmission capacities and directly coupled to the network interface controller, achieving reduced overall latency of edge-side PPOC by 1.09$\times$ to 1.56$\times$. Our evaluations on the ASIC and FPGA platforms demonstrate that DNA-HHE outperforms the state-of-the-art single-mode designs in both edge-side RNS-CKKS and symmetric cipher with better computation latency and area efficiency, while offering dual-mode functionality.

Paperid: 3263, https://arxiv.org/pdf/2512.18493.pdf

Abstract:
Threat detection models in cybersecurity must keep up with shifting traffic, strict feature budgets, and noisy hardware, yet even strong classical systems still miss rare or borderline attacks when the data distribution drifts. Small, near-term quantum processors are now available, but existing work rarely shows whether quantum components can improve end-to-end detection under these unstable, resource constrained conditions rather than just adding complexity. We address this gap with a hybrid architecture that uses a compact multilayer perceptron to compress security data and then routes a few features to 2-4 qubit quantum heads implemented as quantum support vector machines and variational circuits. Under matched preprocessing and training budgets, we benchmark these hybrids against tuned classical baselines on two security tasks, network intrusion detection on NSL-KDD and spam filtering on Ling-Spam datasets, and then deploy the best 4-qubit quantum SVM to an IBM Quantum device with noise-aware execution (readout mitigation and dynamical decoupling). Across both datasets, shallow quantum heads consistently match, and on difficult near-boundary cases modestly reduce, missed attacks and false alarms relative to classical models using the same features. Hardware results track simulator behavior closely enough that the remaining gap is dominated by device noise rather than model design. Taken together, the study shows that even on small, noisy chips, carefully engineered quantum components can already function as competitive, budget-aware elements in practical threat detection pipelines.

Paperid: 3264, https://arxiv.org/pdf/2512.18488.pdf

Abstract:
Secure interoperability across heterogeneous blockchains remains one of the most pressing challenges in Web3 with existing bridge protocols vulnerable to both classical exploits and emerging quantum threats. This paper introduces QLink a quantum-safe Layer 3 interoperability protocol that integrates postquantum cryptography (PQC) quantum key distribution (QKD) and hardware security modules (HSMs) into a unified validator architecture. To our knowledge, QLink is the first interoperability framework to combine these mechanisms to secure validator communication proof aggregation and key management. Validators exchange encryption keys through QKD channels, achieving information-theoretic security against interception, while cross-chain proofs are generated and aggregated with NIST-standardized PQC algorithms. Private keys remain sealed inside HSM enclaves mitigating the risk of theft or leakage. Deployed as a dedicated Layer 3 protocol QLink operates independently of Layer 1 and Layer 2 chains providing a scalable decentralized foundation for secure cross-chain messaging and asset transfer. Experimental evaluation using network simulations demonstrates that validator communication overhead remains sub-second while security guarantees extend beyond current bridge architectures to resist both classical and quantum adversaries. By addressing today vulnerabilities and anticipating future quantum threats QLink establishes a practical and future-proof pathway for blockchain interoperability.

Paperid: 3265, https://arxiv.org/pdf/2512.18305.pdf

Abstract:
Assessing cyber risk in complex IT infrastructures poses significant challenges due to the dynamic, interconnected nature of digital systems. Traditional methods often fall short, relying on static and largely qualitative models that do not scale with system complexity and fail to capture systemic interdependencies. In this work, we introduce a novel quantitative approach to cyber risk assessment based on Quadratic Unconstrained Binary Optimization (QUBO), a formulation compatible with both classical computing and quantum annealing. We demonstrate the capabilities of our approach using a realistic 255-nodes layered infrastructure, showing how risk spreads in non-trivial patterns that are difficult to identify through visual inspection alone. To assess scalability, we further conduct extensive experiments on networks up to 1000 nodes comparing classical, quantum, and hybrid classical-quantum workflows. Our results reveal that although quantum annealing produces solutions comparable to classical heuristics, its potential advantages are significantly hindered by the embedding overhead required to map the densely connected cyber-risk QUBO onto the limited connectivity of current quantum hardware. By contrast, hybrid quantum-classical solvers avoid this bottleneck and therefore emerge as a promising option, combining competitive scaling with an improved ability to explore the solution space and identify more stable risk configurations. Overall, this work delivers two main advances. First, we present a rigorous, tunable, and generalizable mathematical model for cyber risk that can be adapted to diverse infrastructures and domains through flexible parameterization. Second, we provide the first comparative study of classical, quantum, and hybrid approaches for cyber risk scoring at scale, highlighting the emerging potential of hybrid quantum-classical methods for large-scale infrastructures.

Paperid: 3266, https://arxiv.org/pdf/2512.18207.pdf

Abstract:
Current data-driven Wi-Fi-based indoor localization systems face three critical challenges: protecting user privacy, achieving accurate predictions in dynamic multipath environments, and generalizing across different deployments. Traditional Wi-Fi localization systems often compromise user privacy, particularly when facing compromised access points (APs) or man-in-the-middle attacks. As IoT devices proliferate in indoor environments, developing solutions that deliver accurate localization while robustly protecting privacy has become imperative. We introduce FedWiLoc, a privacy-preserving indoor localization system that addresses these challenges through three key innovations. First, FedWiLoc employs a split architecture where APs process Channel State Information (CSI) locally and transmit only privacy-preserving embedding vectors to user devices, preventing raw CSI exposure. Second, during training, FedWiLoc uses federated learning to collaboratively train the model across APs without centralizing sensitive user data. Third, we introduce a geometric loss function that jointly optimizes angle-of-arrival predictions and location estimates, enforcing geometric consistency to improve accuracy in challenging multipath conditions. Extensive evaluation across six diverse indoor environments spanning over 2,000 sq. ft. demonstrates that FedWiLoc outperforms state-of-the-art methods by up to 61.9% in median localization error while maintaining strong privacy guarantees throughout both training and inference.

Paperid: 3267, https://arxiv.org/pdf/2512.17667.pdf

Abstract:
Modern HTTPS mechanisms such as Encrypted Client Hello (ECH) and encrypted DNS improve privacy but remain vulnerable to website fingerprinting (WF) attacks, where adversaries infer visited sites from encrypted traffic patterns. Existing WF methods rely on supervised learning with site-specific labeled traces, which limits scalability and fails to handle previously unseen websites. We address these limitations by reformulating WF as a zero-shot cross-modal retrieval problem and introducing STAR. STAR learns a joint embedding space for encrypted traffic traces and crawl-time logic profiles using a dual-encoder architecture. Trained on 150K automatically collected traffic-logic pairs with contrastive and consistency objectives and structure-aware augmentation, STAR retrieves the most semantically aligned profile for a trace without requiring target-side traffic during training. Experiments on 1,600 unseen websites show that STAR achieves 87.9 percent top-1 accuracy and 0.963 AUC in open-world detection, outperforming supervised and few-shot baselines. Adding an adapter with only four labeled traces per site further boosts top-5 accuracy to 98.8 percent. Our analysis reveals intrinsic semantic-traffic alignment in modern web protocols, identifying semantic leakage as the dominant privacy risk in encrypted HTTPS traffic. We release STAR's datasets and code to support reproducibility and future research.

Paperid: 3268, https://arxiv.org/pdf/2512.17602.pdf

Abstract:
How users adapt after being sandwiched remains unclear; this paper provides an empirical quantification. Using transaction level data from November 2024 to February 2025, enriched with mempool visibility and ZeroMEV labels, we track user outcomes after their n-th public sandwich: (i) reactivation, i.e., the resumption of on-chain activity within a 60-day window, and (ii) first-time adoption of private routing. We refer to users who do not reactivate within this window as churned, and to users experiencing multiple attacks (n>1) as undergoing repeated exposure. Our analysis reveals measurable behavioral adaptation: around 40% of victims migrate to private routing within 60 days, rising to 54% with repeated exposures. Churn peaks at 7.5% after the first sandwich but declines to 1-2%, consistent with survivor bias. In Nov-Dec 2024 we confirm 2,932 private sandwich attacks affecting 3,126 private victim transactions, producing \$409,236 in losses and \$293,786 in attacker profits. A single bot accounts for nearly two-thirds of private frontruns, and private sandwich activity is heavily concentrated on a small set of DEX pools. These results highlight that private routing does not guarantee protection from MEV extraction: while execution failures push users toward private channels, these remain exploitable and highly concentrated, demanding continuous monitoring and protocol-level defenses.

Paperid: 3269, https://arxiv.org/pdf/2512.17398.pdf

Abstract:
Private Inference (PI) uses cryptographic primitives to perform privacy preserving machine learning. In this setting, the owner of the network runs inference on the data of the client without learning anything about the data and without revealing any information about the model. It has been observed that a major computational bottleneck of PI is the calculation of the gate (i.e., ReLU), so a considerable amount of effort have been devoted to reducing the number of ReLUs in a given network. We focus on the DReLU, which is the non-linear step function of the ReLU and show that one DReLU can serve many ReLU operations. We suggest a new activation module where the DReLU operation is only performed on a subset of the channels (Prototype channels), while the rest of the channels (replicate channels) replicates the DReLU of each of their neurons from the corresponding neurons in one of the prototype channels. We then extend this idea to work across different layers. We show that this formulation can drastically reduce the number of DReLU operations in resnet type network. Furthermore, our theoretical analysis shows that this new formulation can solve an extended version of the XOR problem, using just one non-linearity and two neurons, something that traditional formulations and some PI specific methods cannot achieve. We achieve new SOTA results on several classification setups, and achieve SOTA results on image segmentation.

Paperid: 3270, https://arxiv.org/pdf/2512.16717.pdf

Abstract:
In actuality, phishing attacks remain one of the most prevalent cybersecurity risks in existence today, with malevolent actors constantly changing their strategies to successfully trick users. This paper presents an AI model for a phishing detection system that uses an ensemble approach to combine character-level Convolutional Neural Networks (CNN) and LightGBM with engineered features. Our system uses a character-level CNN to extract sequential features after extracting 36 lexical, structural, and domain-based features from the URLs. On a test dataset of 19,873 URLs, the ensemble model achieves an accuracy of 99.819 percent, precision of 100 percent, recall of 99.635 percent, and ROC-AUC of 99.947 percent. Through a FastAPI-based service with an intuitive user interface, the suggested system has been utilised to offer real-time detection. In contrast, the results demonstrate that the suggested solution performs better than individual models; LightGBM contributes 40 percent and character-CNN contributes 60 percent to the final prediction. The suggested method maintains extremely low false positive rates while doing a good job of identifying contemporary phishing techniques. Index Terms - Phishing detection, machine learning, deep learning, CNN, ensemble methods, cybersecurity, URL analysis

Paperid: 3271, https://arxiv.org/pdf/2512.16369.pdf

Abstract:
The RPKI is crucial for securing the routing system of the Internet. With the RPKI, owners of Internet resources can make cryptographically backed claims, for example about the legitimate origin of their IP space. Thousands of networks use this information to detect malicious or accidental route hijacks. The RPKI consists out of 100 distributed repositories. However, public reports claim that some of these repositories are unreliable. A current Internet-Draft suggests best practices on how to operate these repositories, with the goal to improve deployment quality. Inspired by this draft, we take a first look at the operational practices of repositories of the RPKI. We mainly focus on the distribution of RPKI information. We find that there is a wide variety in deployment practices, of which some might risk the availability of parts of the information in the RPKI. This study creates a baseline for measuring the maturity of RPKI repositories in the future.

Paperid: 3272, https://arxiv.org/pdf/2512.15966.pdf

Abstract:
For the rise of electric vehicles, especially for long-distance driving, minimizing charging times is vital. While multiple standards for DC fast charging exist, the leading standard in Europe is ISO 15118. In theory, this standard is accompanied by a variety of security controls, ensuring the authenticity and confidentiality of charging communication, as well as the exchange of payment information. In practice, these security controls are insufficient for effectively securing charging communication. In this paper, we go through all security controls defined in ISO 15118 and demonstrate their shortcomings. Most notably, we present a previously unpublished vulnerability in the plug and charge functionality of ISO 15118. We provide a proof-of-concept implementation of this vulnerability, which, allows a vehicle to be charged while a second, victim vehicle is billed for it. Additionally, we define an alternative plug and charge authentication scheme, which requires fewer efforts towards certificate enrollment and promises to be more resilient and future-proof. Our findings should be considered when implementing and advancing the standard, as the mitigation of the discovered vulnerability is critical for the security of fast charging.

Paperid: 3273, https://arxiv.org/pdf/2512.15803.pdf

Abstract:
Zero-day vulnerabilities represent some of the most critical threats in cybersecurity, as they correspond to previously unknown flaws in software or hardware that are actively exploited before vendors can develop and deploy patches. During this exposure window, affected systems remain defenseless, making zero-day attacks particularly damaging and difficult to mitigate. This study analyzes the Zero Day Initiative (ZDI) vulnerability disclosures reported between January and April 2024, Cole [2025] comprising a total of 415 vulnerabilities. The dataset includes vulnerability identifiers, Common Vulnerability Scoring System (CVSS) v3.0 scores, publication dates, and short textual descriptions. The primary objectives of this work are to identify trends in zero-day vulnerability disclosures, examine severity distributions across vendors, and investigate which vulnerability characteristics are most indicative of high severity. In addition, this study explores predictive modeling approaches for severity classification, comparing classical machine learning techniques with deep learning models using both structured metadata and unstructured textual descriptions. The findings aim to support improved patch prioritization strategies, more effective vulnerability management, and enhanced organizational preparedness against emerging zero-day threats.

Paperid: 3274, https://arxiv.org/pdf/2512.15799.pdf

Abstract:
The integration of generative Artificial Intelligence into the digital ecosystem necessitates a critical re-evaluation of Indian criminal jurisprudence regarding computational forensics integrity. While algorithmic efficiency enhances evidence extraction, a research gap exists regarding the Digital Personal Data Protection Act, 2023's compatibility with adversarial AI threats, specifically anti-forensics and deepfakes. This study scrutinizes the AI "dual-use" dilemma, functioning as both a cyber-threat vector and forensic automation mechanism, to delineate privacy boundaries in high-stakes investigations. Employing a doctrinal legal methodology, the research synthesizes statutory analysis of the DPDP Act with global ethical frameworks (IEEE, EU) to evaluate regulatory efficacy. Preliminary results indicate that while Machine Learning offers high accuracy in pattern recognition, it introduces vulnerabilities regarding data poisoning and algorithmic bias. Findings highlight a critical tension between the Act's data minimization principles and forensic data retention requirements. Furthermore, the paper identifies that existing legal definitions inadequately encompass AI-driven "tool crimes" and "target crimes." Consequently, the research proposes a "human-centric" forensic model prioritizing explainable AI (XAI) to ensure evidence admissibility. These implications suggest that synchronizing Indian privacy statutes with international forensic standards is imperative to mitigate synthetic media risks, establishing a roadmap for future legislative amendments and technical standardization.

Paperid: 3275, https://arxiv.org/pdf/2512.15790.pdf

Abstract:
The increasing operational reliance on complex Multi-Agent Systems (MAS) across safety-critical domains necessitates rigorous adversarial robustness assessment. Modern MAS are inherently heterogeneous, integrating conventional Multi-Agent Reinforcement Learning (MARL) with emerging Large Language Model (LLM) agent architectures utilizing Retrieval-Augmented Generation (RAG). A critical shared vulnerability is reliance on centralized memory components: the shared Experience Replay (ER) buffer in MARL and the external Knowledge Base (K) in RAG agents. This paper proposes XAMT (Bilevel Optimization for Covert Memory Tampering in Heterogeneous Multi-Agent Architectures), a novel framework that formalizes attack generation as a bilevel optimization problem. The Upper Level minimizes perturbation magnitude (delta) to enforce covertness while maximizing system behavior divergence toward an adversary-defined target (Lower Level). We provide rigorous mathematical instantiations for CTDE MARL algorithms and RAG-based LLM agents, demonstrating that bilevel optimization uniquely crafts stealthy, minimal-perturbation poisons evading detection heuristics. Comprehensive experimental protocols utilize SMAC and SafeRAG benchmarks to quantify effectiveness at sub-percent poison rates (less than or equal to 1 percent in MARL, less than or equal to 0.1 percent in RAG). XAMT defines a new unified class of training-time threats essential for developing intrinsically secure MAS, with implications for trust, formal verification, and defensive strategies prioritizing intrinsic safety over perimeter-based detection.

Paperid: 3276, https://arxiv.org/pdf/2512.15777.pdf

Abstract:
Modern computing systems face security threats, including memory corruption attacks, speculative execution vulnerabilities, and control-flow hijacking. Although existing solutions address these threats individually, they frequently introduce performance overhead and leave security gaps. This paper presents a Variable Record Table (VRT) with a unified hardware-assisted framework that simultaneously enforces spatial memory safety against buffer overflows, back-edge control-flow integrity (CFI), and speculative execution attack detection. The VRT dynamically constructs a protection table by instrumenting run-time instructions to extract memory addresses, bounds metadata, and control-flow signatures. Our evaluation across MiBench and SPEC benchmarks shows that VRT successfully detects all attack variants tested with zero additional instruction overhead. Furthermore, it maintains memory requirements below 25KB (for 512 entries) and maintains area / power overhead under 8% and 11.65 μW, respectively. By consolidating three essential security mechanisms into a single hardware structure, VRT provides comprehensive protection while minimizing performance impact.

Paperid: 3277, https://arxiv.org/pdf/2512.15414.pdf

Abstract:
Detecting packed executables is a critical step in malware analysis, as packing obscures the original code and complicates static inspection. This study evaluates both classical feature-based methods and deep learning approaches that transform binary executables into visual representations, specifically, grayscale byte plots, and employ convolutional neural networks (CNNs) for automated classification of packed and non-packed binaries. A diverse dataset of benign and malicious Portable Executable (PE) files, packed using various commercial and open-source packers, was curated to capture a broad spectrum of packing transformations and obfuscation techniques. Classical models using handcrafted Gabor jet features achieved intense discrimination at moderate computational cost. In contrast, CNNs based on VGG16 and DenseNet121 significantly outperformed them, achieving high detection performance with well-balanced precision, recall, and F1-scores. DenseNet121 demonstrated slightly higher precision and lower false positive rates, whereas VGG16 achieved marginally higher recall, indicating complementary strengths for practical deployment. Evaluation against unknown packers confirmed robust generalization, demonstrating that grayscale byte-plot representations combined with deep learning provide a useful and reliable approach for early detection of packed malware, enhancing malware analysis pipelines and supporting automated antivirus inspection.

Paperid: 3278, https://arxiv.org/pdf/2512.15143.pdf

Abstract:
Federated Learning is a machine learning setting that reduces direct data exposure, improving the privacy guarantees of machine learning models. Yet, the exchange of model updates between the participants and the aggregator can still leak sensitive information. In this work, we present a new gradient-based membership inference attack for federated learning scenarios that exploits the temporal evolution of last-layer gradients across multiple federated rounds. Our method uses the shadow technique to learn round-wise gradient patterns of the training records, requiring no access to the private dataset, and is designed to consider both semi-honest and malicious adversaries (aggregators or data owners). Beyond membership inference, we also provide a natural extension of the proposed attack to discrete attribute inference by contrasting gradient responses under alternative attribute hypotheses. The proposed attacks are model-agnostic, and therefore applicable to any gradient-based model and can be applied to both classification and regression settings. We evaluate the attack on CIFAR-100 and Purchase100 datasets for membership inference and on Breast Cancer Wisconsin for attribute inference. Our findings reveal strong attack performance and comparable computational and memory overhead in membership inference when compared to another attack from the literature. The obtained results emphasize that multi-round federated learning can increase the vulnerability to inference attacks, that aggregators pose a more substantial threat than data owners, and that attack performance is strongly influenced by the nature of the training dataset, with richer, high-dimensional data leading to stronger leakage than simpler tabular data.

Paperid: 3279, https://arxiv.org/pdf/2512.15081.pdf

Abstract:
Although large language models (LLMs) are increasingly used in security-critical workflows, practitioners lack quantitative guidance on which safeguards are worth deploying. This paper introduces a decision-oriented framework and reproducible methodology that together quantify residual risk, convert adversarial probe outcomes into financial risk estimates and return-on-control (RoC) metrics, and enable monetary comparison of layered defenses for LLM-based systems. A retrieval-augmented generation (RAG) service is instantiated using the DeepSeek-R1 model over a corpus containing synthetic personally identifiable information (PII), and subjected to automated attacks with Garak across five vulnerability classes: PII leakage, latent context injection, prompt injection, adversarial attack generation, and divergence. For each (vulnerability, control) pair, attack success probabilities are estimated via Laplace's Rule of Succession and combined with loss triangle distributions, calibrated from public breach-cost data, in 10,000-run Monte Carlo simulations to produce loss exceedance curves and expected losses. Three widely used mitigations, attribute-based access control (ABAC); named entity recognition (NER) redaction using Microsoft Presidio; and NeMo Guardrails, are then compared to a baseline RAG configuration. The baseline system exhibits very high attack success rates (>= 0.98 for PII, latent injection, and prompt injection), yielding a total simulated expected loss of $313k per attack scenario. ABAC collapses success probabilities for PII and prompt-related attacks to near zero and reduces the total expected loss by ~94%, achieving an RoC of 9.83. NER redaction likewise eliminates PII leakage and attains an RoC of 5.97, while NeMo Guardrails provides only marginal benefit (RoC of 0.05).

Paperid: 3280, https://arxiv.org/pdf/2512.15039.pdf

Abstract:
Large-scale, standardized datasets for Advanced Persistent Threat (APT) research are scarce, and inconsistent actor aliases and redundant samples hinder reproducibility. This paper presents APT-ClaritySet and its construction pipeline that normalizes threat actor aliases (reconciling approximately 11.22\% of inconsistent names) and applies graph-feature deduplication -- reducing the subset of statically analyzable executables by 47.55\% while retaining behaviorally distinct variants. APT-ClaritySet comprises: (i) APT-ClaritySet-Full, the complete pre-deduplication collection with 34{,}363 malware samples attributed to 305 APT groups (2006 - early 2025); (ii) APT-ClaritySet-Unique, the deduplicated release with 25{,}923 unique samples spanning 303 groups and standardized attributions; and (iii) APT-ClaritySet-FuncReuse, a function-level resource that includes 324{,}538 function-reuse clusters (FRCs) enabling measurement of inter-/intra-group sharing, evolution, and tooling lineage. By releasing these components and detailing the alias normalization and scalable deduplication pipeline, this work provides a high-fidelity, reproducible foundation for quantitative studies of APT patterns, evolution, and attribution.

Paperid: 3281, https://arxiv.org/pdf/2512.15003.pdf

Abstract:
Monitoring issue tracker submissions is a crucial software maintenance activity. A key goal is the prioritization of high risk, security-related bugs. If such bugs can be recognized early, the risk of propagation to dependent products and endangerment of stakeholder benefits can be mitigated. To assist triage engineers with this task, several automatic detection techniques, from Machine Learning (ML) models to prompting Large Language Models (LLMs), have been proposed. Although promising to some extent, prior techniques often memorize lexical cues as decision shortcuts, yielding low detection rate specifically for more complex submissions. As such, these classifiers do not yet reach the practical expectations of a real-time detector of security-related issues. To address these limitations, we propose SEBERTIS, a framework to train Deep Neural Networks (DNNs) as classifiers independent of lexical cues, so that they can confidently detect fully unseen security-related issues. SEBERTIS capitalizes on fine-tuning bidirectional transformer architectures as Masked Language Models (MLMs) on a series of semantically equivalent vocabulary to prediction labels (which we call Semantic Surrogates) when they have been replaced with a mask. Our SEBERTIS-trained classifier achieves a 0.9880 F1-score in detecting security-related issues of a curated corpus of 10,000 GitHub issue reports, substantially outperforming state-of-the-art issue classifiers, with 14.44%-96.98%, 15.40%-93.07%, and 14.90%-94.72% higher detection precision, recall, and F1-score over ML-based baselines. Our classifier also substantially surpasses LLM baselines, with an improvement of 23.20%-63.71%, 36.68%-85.63%, and 39.49%-74.53% for precision, recall, and F1-score.

Paperid: 3282, https://arxiv.org/pdf/2512.14958.pdf

Abstract:
The Internet of Vehicles (IoV) has evolved modern transportation through enhanced connectivity and intelligent systems. However, this increased connectivity introduces critical vulnerabilities, making vehicles susceptible to cyber-attacks such Denial-ofService (DoS) and message spoofing. This project aims to develop a machine learning-based intrusion detection system to classify malicious Controller Area network (CAN) bus traffic using the CiCIoV2024 benchmark dataset. We analyzed various attack patterns including DoS and spoofing attacks targeting critical vehicle parameters such as Spoofing-GAS - gas pedal position, Spoofing-RPM, Spoofing-Speed, and Spoofing-Steering\_Wheel. Our initial findings confirm a multi-class classification problem with a clear structural difference between attack types and benign data, providing a strong foundation for machine learning models.

Paperid: 3283, https://arxiv.org/pdf/2512.14935.pdf

Abstract:
Cloud Security Operations Center (SOC) enable cloud governance, risk and compliance by providing insights visibility and control. Cloud SOC triages high-volume, heterogeneous telemetry from elastic, short-lived resources while staying within tight budgets. In this research, we implement an AI-Augmented Security Operations Center (AISOC) on AWS that combines cloud-native instrumentation with ML-based detection. The architecture uses three Amazon EC2 instances: Attacker, Defender, and Monitoring. We simulate a reverse-shell intrusion with Metasploit, and Filebeat forwards Defender logs to an Elasticsearch and Kibana stack for analysis. We train two classifiers, a malware detector built on a public dataset and a log-anomaly detector trained on synthetically augmented logs that include adversarial variants. We calibrate and fuse the scores to produce multi-modal threat intelligence and triage activity into NORMAL, SUSPICIOUS, and HIGH\_CONFIDENCE\_ATTACK. On held-out tests the fusion achieves strong macro-F1 (up to 1.00) under controlled conditions, though performance will vary in noisier and more diverse environments. These results indicate that simple, calibrated fusion can enhance cloud SOC capabilities in constrained, cost-sensitive setups.

Paperid: 3284, https://arxiv.org/pdf/2512.14860.pdf

Abstract:
Agentic AI introduces security vulnerabilities that traditional LLM safeguards fail to address. Although recent work by Unit 42 at Palo Alto Networks demonstrated that ChatGPT-4o successfully executes attacks as an agent that it refuses in chat mode, there is no comparative analysis in multiple models and frameworks. We conducted the first systematic penetration testing and comparative evaluation of agentic AI systems, testing five prominent models (Claude 3.5 Sonnet, Gemini 2.5 Flash, GPT-4o, Grok 2, and Nova Pro) across two agentic AI frameworks (AutoGen and CrewAI) using a seven-agent architecture that mimics the functionality of a university information management system and 13 distinct attack scenarios that span prompt injection, Server Side Request Forgery (SSRF), SQL injection, and tool misuse. Our 130 total test cases reveal significant security disparities: AutoGen demonstrates a 52.3% refusal rate versus CrewAI's 30.8%, while model performance ranges from Nova Pro's 46.2% to Claude and Grok 2's 38.5%. Most critically, Grok 2 on CrewAI rejected only 2 of 13 attacks (15.4% refusal rate), and the overall refusal rate of 41.5% across all configurations indicates that more than half of malicious prompts succeeded despite enterprise-grade safety mechanisms. We identify six distinct defensive behavior patterns including a novel "hallucinated compliance" strategy where models fabricate outputs rather than executing or refusing attacks, and provide actionable recommendations for secure agent deployment. Complete attack prompts are also included in the Appendix to enable reproducibility.

Paperid: 3285, https://arxiv.org/pdf/2512.14846.pdf

Abstract:
Traditional, centralized security tools often miss adaptive, multi-vector attacks. We present the Multi-Agent LLM Cyber Defense Framework (MALCDF), a practical setup where four large language model (LLM) agents-Detection, Intelligence, Response, and Analysis-work together in real time. Agents communicate over a Secure Communication Layer (SCL) with encrypted, ontology-aligned messages, and produce audit-friendly outputs (e.g., MITRE ATT&CK mappings). For evaluation, we keep the test simple and consistent: all reported metrics come from the same 50-record live stream derived from the CICIDS2017 feature schema. CICIDS2017 is used for configuration (fields/schema) and to train a practical ML baseline. The ML-IDS baseline is a Lightweight Random Forest IDS (LRF-IDS) trained on a subset of CICIDS2017 and tested on the 50-record stream, with no overlap between training and test records. In experiments, MALCDF reaches 90.0% detection accuracy, 85.7% F1-score, and 9.1% false-positive rate, with 6.8s average per-event latency. It outperforms the lightweight ML-IDS baseline and a single-LLM setup on accuracy while keeping end-to-end outputs consistent. Overall, this hands-on build suggests that coordinating simple LLM agents with secure, ontology-aligned messaging can improve practical, real-time cyber defense.

Paperid: 3286, https://arxiv.org/pdf/2512.14778.pdf

Abstract:
This study investigates the key skills and competencies needed by new cybersecurity graduates in the Philippines for entry-level positions. Using a descriptive cross-sectional research design, it combines analysis of job listings from Philippine online platforms with surveys of students, teachers, and professionals. The aim is to identify required skills and areas needing improvement, highlighting the balance between technical skills and other competencies like ethical conduct, suggesting a shift away from traditional cybersecurity skills towards a more diverse skillset. Furthermore, the results revealed common agreement on the importance of communication, critical thinking, problem-solving, and adaptability skills, albeit with slight variations in their prioritization. It recommends that aspiring cybersecurity professionals develop an inclusive skill set encompassing technical knowledge, soft skills, and personal competencies, with a focus on adaptability, continuous learning, and ethics. Skills such as business acumen are considered less vital for entry-level roles, proposing a preparation strategy that aligns with the changing demands of the cybersecurity industry.

Paperid: 3287, https://arxiv.org/pdf/2512.14767.pdf

Abstract:
Federated Learning (FL) is an emerging machine learning paradigm that enables multiple parties to collaboratively train models without sharing raw data, ensuring data privacy. In Vertical FL (VFL), where each party holds different features for the same users, a key challenge is to evaluate the feature contribution of each party before any model is trained, particularly in the early stages when no model exists. To address this, the Shapley-CMI method was recently proposed as a model-free, information-theoretic approach to feature valuation using Conditional Mutual Information (CMI). However, its original formulation did not provide a practical implementation capable of computing the required permutations and intersections securely. This paper presents a novel privacy-preserving implementation of Shapley-CMI for VFL. Our system introduces a private set intersection (PSI) server that performs all necessary feature permutations and computes encrypted intersection sizes across discretized and encrypted ID groups, without the need for raw data exchange. Each party then uses these intersection results to compute Shapley-CMI values, computing the marginal utility of their features. Initial experiments confirm the correctness and privacy of the proposed system, demonstrating its viability for secure and efficient feature contribution estimation in VFL. This approach ensures data confidentiality, scales across multiple parties, and enables fair data valuation without requiring the sharing of raw data or training models.

Paperid: 3288, https://arxiv.org/pdf/2512.14753.pdf

Abstract:
Watermarking large language models (LLMs) is vital for preventing their misuse, including the fabrication of fake news, plagiarism, and spam. It is especially important to watermark LLM-generated code, as it often contains intellectual property.However, we found that existing methods for watermarking LLM-generated code fail to address comment removal attack.In such cases, an attacker can simply remove the comments from the generated code without affecting its functionality, significantly reducing the effectiveness of current code-watermarking techniques.On the other hand, injecting a watermark into code is challenging because, as previous works have noted, most code represents a low-entropy scenario compared to natural language. Our approach to addressing this issue involves leveraging prior knowledge to distinguish between low-entropy and high-entropy parts of the code, as indicated by a Cue List of words.We then inject the watermark guided by this Cue List, achieving higher detectability and usability than existing methods.We evaluated our proposed method on HumanEvaland compared our method with three state-of-the-art code watermarking techniques. The results demonstrate the effectiveness of our approach.

Paperid: 3289, https://arxiv.org/pdf/2512.14751.pdf

Abstract:
Finetuning pretrained large language models (LLMs) has become the standard paradigm for developing downstream applications. However, its security implications remain unclear, particularly regarding whether finetuned LLMs inherit jailbreak vulnerabilities from their pretrained sources. We investigate this question in a realistic pretrain-to-finetune threat model, where the attacker has white-box access to the pretrained LLM and only black-box access to its finetuned derivatives. Empirical analysis shows that adversarial prompts optimized on the pretrained model transfer most effectively to its finetuned variants, revealing inherited vulnerabilities from pretrained to finetuned LLMs. To further examine this inheritance, we conduct representation-level probing, which shows that transferable prompts are linearly separable within the pretrained hidden states, suggesting that universal transferability is encoded in pretrained representations. Building on this insight, we propose the Probe-Guided Projection (PGP) attack, which steers optimization toward transferability-relevant directions. Experiments across multiple LLM families and diverse finetuned tasks confirm PGP's strong transfer success, underscoring the security risks inherent in the pretrain-to-finetune paradigm.

Paperid: 3290, https://arxiv.org/pdf/2512.14737.pdf

Abstract:
Existing agent communication frameworks face critical limitations in providing verifiable audit trails without compromising the privacy and confidentiality of agent interactions. The protection of agent communication privacy while ensuring auditability emerges as a fundamental challenge for applications requiring accurate billing, compliance verification, and accountability in regulated environments. We introduce a framework for auditing agent communications that keeps messages private while still checking they follow expected rules. It pairs zero-knowledge proofs with the existing Model Context Protocol (MCP) so messages can be verified without revealing their contents. The approach runs in lightweight networks, stays compatible with standard MCP exchanges, and adds asynchronous audit verification to confirm format and general message types without exposing specifics. The framework enables mutual audits between agents: one side can check communication content and quality while the other verifies usage metrics, all without revealing sensitive information. We formalize security goals and show that zk-MCP provides data authenticity and communication privacy, achieving efficient verification with negligible latency overhead. We fully implement the framework, including Circom-based zero-knowledge proof generation and an audit protocol integrated with MCP's bidirectional channel, and, to our knowledge, this is the first privacy-preserving audit system for agent communications that offers verifiable mutual auditing without exposing message content or compromising agent privacy.

Paperid: 3291, https://arxiv.org/pdf/2512.14448.pdf

Abstract:
Large Language Model (LLM) agents relying on external retrieval are increasingly deployed in high-stakes environments. While existing adversarial attacks primarily focus on content falsification or instruction injection, we identify a novel, process-oriented attack surface: the agent's reasoning style. We propose Reasoning-Style Poisoning (RSP), a paradigm that manipulates how agents process information rather than what they process. We introduce Generative Style Injection (GSI), an attack method that rewrites retrieved documents into pathological tones--specifically "analysis paralysis" or "cognitive haste"--without altering underlying facts or using explicit triggers. To quantify these shifts, we develop the Reasoning Style Vector (RSV), a metric tracking Verification depth, Self-confidence, and Attention focus. Experiments on HotpotQA and FEVER using ReAct, Reflection, and Tree of Thoughts (ToT) architectures reveal that GSI significantly degrades performance. It increases reasoning steps by up to 4.4 times or induces premature errors, successfully bypassing state-of-the-art content filters. Finally, we propose RSP-M, a lightweight runtime monitor that calculates RSV metrics in real-time and triggers alerts when values exceed safety thresholds. Our work demonstrates that reasoning style is a distinct, exploitable vulnerability, necessitating process-level defenses beyond static content analysis.

Paperid: 3292, https://arxiv.org/pdf/2512.14376.pdf

Abstract:
WebAssembly (Wasm) has risen as a widely used technology to distribute computing workloads on different platforms. The platform independence offered through Wasm makes it an attractive solution for many different applications that can run on disparate infrastructures. In addition, Trusted Execution Environments (TEEs) are offered in many computing infrastructures, which allows also running security sensitive Wasm workloads independent of the specific platforms offered. However, recent work has shown that Wasm binaries are more sensitive to code confidentiality attacks than native binaries. The previous result was obtained for Intel SGX only. In this paper, we take this one step further, introducing a new Wasm code-confidentiality attack that exploits exposed address-space information in TEEs. Our attack enables the extraction of crucial execution features which, when combined with additional side channels, allows us to with high reliability obtain more than 70% of the code in most cases. This is a considerably larger amount than was previously obtained by single stepping Intel SGX where only upwards to 50% of the code could be obtained.

Paperid: 3293, https://arxiv.org/pdf/2512.14045.pdf

Abstract:
A function inlining optimization is a widely used transformation in modern compilers, which replaces a call site with the callee's body in need. While this transformation improves performance, it significantly impacts static features such as machine instructions and control flow graphs, which are crucial to binary analysis. Yet, despite its broad impact, the security impact of function inlining remains underexplored to date. In this paper, we present the first comprehensive study of function inlining through the lens of machine learning-based binary analysis. To this end, we dissect the inlining decision pipeline within the LLVM's cost model and explore the combinations of the compiler options that aggressively promote the function inlining ratio beyond standard optimization levels, which we term extreme inlining. We focus on five ML-assisted binary analysis tasks for security, using 20 unique models to systematically evaluate their robustness under extreme inlining scenarios. Our extensive experiments reveal several significant findings: i) function inlining, though a benign transformation in intent, can (in)directly affect ML model behaviors, being potentially exploited by evading discriminative or generative ML models; ii) ML models relying on static features can be highly sensitive to inlining; iii) subtle compiler settings can be leveraged to deliberately craft evasive binary variants; and iv) inlining ratios vary substantially across applications and build configurations, undermining assumptions of consistency in training and evaluation of ML models.

Paperid: 3294, https://arxiv.org/pdf/2512.13613.pdf

Abstract:
eSignatures ensure data's authenticity, non-repudiation, and integrity. EU's eIDAS regulation specifies, e.g., advanced and qualified (QES) eSignatures. While eSignatures' concrete legal effects depend on the individual case, QESs constitute the highest level of technical protection and authenticity under eIDAS. QESs are based on a qualified certificate issued by a qualified trust service provider (QTSP). Despite legal requirements, technically, a QTSP represents a single point of failure. Contrary, privacy-preserving collaborative computations (P2C2s) have become increasingly practical in recent years; yet lacking an extensive investigation on potential integrations in the QES landscape. We perform a threat analysis on the QES-creation process of Austria's national eID, using STRIDE and a DREAD-like model to extract requirement challenges (RCs) primarily related to: (1) Distributed Service Robustness, (2) Agile Crypto Deployment, and (3) Active User Involvement. To address these RCs, we present QoeSiGN, utilizing novel P2C2 technologies. While currently no P2C2 addresses all RCs, legal aspects, and practical efficiency simultaneously, QoeSiGN gives instantiation possibilities for different needs. For instance, "Multi-Party HSMs" for distributed hardware-secured computations; or secure multi-party computation (software) for highest crypto agility and user involvement, where the user participates in the QES computation. Deployment-wise, QTSPs would need to adapt the signing process and setup trusted communication channels. Legal-wise, QoeSiGN's implementation appears permissible, needing further analysis for realization. Technically, QoeSiGN addresses some regulation requirements better than the current solution, such as "sole control" or crypto agility. Our identified threats and extracted requirements can be transferred to the general QES ecosystem.

Paperid: 3295, https://arxiv.org/pdf/2512.13430.pdf

Abstract:
Although credit and debit card data continue to be a prime target for attackers, organizational adherence to the Payment Card Industry Data Security Standard (PCI DSS) remains surprisingly low. Despite prior work showing that PCI DSS can reduce card fraud, only 32.4% of organizations were fully compliant in 2022, suggesting possible deficiencies in enforcement mechanisms. This study compares PCI DSS with three data security frameworks, HIPAA, NIS2, and GDPR, to examine how enforcement mechanisms relate to implementation success. The analysis reveals that PCI DSS significantly lags far behind these security frameworks and that its sanctions are orders of magnitude smaller than those under GDPR and NIS2. The findings indicate a positive association between stronger, multi-modal enforcement (including public disclosure, license actions, and imprisonment) and higher implementation rates, and highlights the structural weakness of PCI DSS's bank-dependent monitoring model. Enhanced non-monetary sanctions and the creation of an independent supervisory authority are recommended to increase transparency, reduce conflicts of interest, and improve PCI DSS compliance without discouraging card acceptance.

Paperid: 3296, https://arxiv.org/pdf/2512.13352.pdf

Abstract:
Large Language Models (LLMs) are prone to memorizing training data, which poses serious privacy risks. Two of the most prominent concerns are training data extraction and Membership Inference Attacks (MIAs). Prior research has shown that these threats are interconnected: adversaries can extract training data from an LLM by querying the model to generate a large volume of text and subsequently applying MIAs to verify whether a particular data point was included in the training set. In this study, we integrate multiple MIA techniques into the data extraction pipeline to systematically benchmark their effectiveness. We then compare their performance in this integrated setting against results from conventional MIA benchmarks, allowing us to evaluate their practical utility in real-world extraction scenarios.

Paperid: 3297, https://arxiv.org/pdf/2512.13207.pdf

Abstract:
Deep learning and federated learning (FL) are becoming powerful partners for next-generation weather forecasting. Deep learning enables high-resolution spatiotemporal forecasts that can surpass traditional numerical models, while FL allows institutions in different locations to collaboratively train models without sharing raw data, addressing efficiency and security concerns. While FL has shown promise across heterogeneous regions, its distributed nature introduces new vulnerabilities. In particular, data poisoning attacks, in which compromised clients inject manipulated training data, can degrade performance or introduce systematic biases. These threats are amplified by spatial dependencies in meteorological data, allowing localized perturbations to influence broader regions through global model aggregation. In this study, we investigate how adversarial clients distort federated surface temperature forecasts trained on the Copernicus European Regional ReAnalysis (CERRA) dataset. We simulate geographically distributed clients and evaluate patch-based and global biasing attacks on regional temperature forecasts. Our results show that even a small fraction of poisoned clients can mislead predictions across large, spatially connected areas. A global temperature bias attack from a single compromised client shifts predictions by up to -1.7 K, while coordinated patch attacks more than triple the mean squared error and produce persistent regional anomalies exceeding +3.5 K. Finally, we assess trimmed mean aggregation as a defense mechanism, showing that it successfully defends against global bias attacks (2-13% degradation) but fails against patch attacks (281-603% amplification), exposing limitations of outlier-based defenses for spatially correlated data.

Paperid: 3298, https://arxiv.org/pdf/2512.12827.pdf

Abstract:
Despite their remarkable performance, deep neural networks exhibit a critical vulnerability: small, often imperceptible, adversarial perturbations can lead to drastically altered model predictions. Given the stringent reliability demands of applications such as medical diagnosis and autonomous driving, robust detection of such adversarial attacks is paramount. In this paper, we investigate the geometric properties of a model's input loss landscape. We analyze the Intrinsic Dimensionality (ID) of the model's gradient parameters, which quantifies the minimal number of coordinates required to describe the data points on their underlying manifold. We reveal a distinct and consistent difference in the ID for natural and adversarial data, which forms the basis of our proposed detection method. We validate our approach across two distinct operational scenarios. First, in a batch-wise context for identifying malicious data groups, our method demonstrates high efficacy on datasets like MNIST and SVHN. Second, in the critical individual-sample setting, we establish new state-of-the-art results on challenging benchmarks such as CIFAR-10 and MS COCO. Our detector significantly surpasses existing methods against a wide array of attacks, including CW and AutoAttack, achieving detection rates consistently above 92\% on CIFAR-10. The results underscore the robustness of our geometric approach, highlighting that intrinsic dimensionality is a powerful fingerprint for adversarial detection across diverse datasets and attack strategies.

Paperid: 3299, https://arxiv.org/pdf/2512.12729.pdf

Abstract:
The widespread adoption of embedded systems has led to their deployment in critical real-world applications, making them attractive targets for malicious actors. These devices face unique challenges in mitigating vulnerabilities due to intrinsic constraints, such as low energy consumption requirements and limited computational resources. This paper presents RunPBA, a hardware-based runtime attestation system designed to defend against control flow attacks while maintaining minimal performance overhead and adhering to strict power consumption constraints. RunPBA leverages PACBTI, a new processor extension tailored for the Arm Cortex M processor family, allowing robust protection without requiring hardware modifications, a limitation present in similar solutions. We implemented a proof-of-concept and evaluated it using two benchmark suites. Experimental results indicate that RunPBA imposes a geometric mean performance overhead of only 1% and 4.7% across the benchmarks, underscoring its efficiency and suitability for real-world deployment.

Paperid: 3300, https://arxiv.org/pdf/2512.12594.pdf

Abstract:
Browser-using agents (BUAs) are an emerging class of autonomous agents that interact with web browsers in human-like ways, including clicking, scrolling, filling forms, and navigating across pages. While these agents help automate repetitive online tasks, they are vulnerable to prompt injection attacks that can trick an agent into performing undesired actions, such as leaking private information or issuing state-changing requests. We propose ceLLMate, a browser-level sandboxing framework that restricts the agent's ambient authority and reduces the blast radius of prompt injections. We address two fundamental challenges: (1) The semantic gap challenge in policy enforcement arises because the agent operates through low-level UI observations and manipulations; however, writing and enforcing policies directly over UI-level events is brittle and error-prone. To address this challenge, we introduce an agent sitemap that maps low-level browser behaviors to high-level semantic actions. (2) Policy prediction in BUAs is the norm rather than the exception. BUAs have no app developer to pre-declare sandboxing policies, and thus, ceLLMate pairs website-authored mandatory policies with an automated policy-prediction layer that adapts and instantiates these policies from the user's natural-language task. We implement ceLLMate as an agent-agnostic browser extension and demonstrate how it enables sandboxing policies that effectively block various types of prompt injection attacks with negligible overhead.

Paperid: 3301, https://arxiv.org/pdf/2512.12593.pdf

Abstract:
The increasing reliance on software in various applications has made the problem of software vulnerability detection more critical. Software vulnerabilities can lead to security breaches, data theft, and other negative outcomes. Traditional software vulnerability detection techniques, such as static and dynamic analysis, have been shown to be ineffective at detecting multiple vulnerabilities. To address this issue, this study employed a deep learning approach, specifically Convolutional Neural Networks (CNN), to solve the software vulnerability detection problem. A 5-split cross-validation approach was used to train and evaluate the CNN model, which takes tokenized source code as input. The findings indicated that Sherlock successfully detected multiple vulnerabilities at the function level, and its performance was particularly strong for CWE-199, CWE-120, and CWE-Other, with an overall high accuracy rate and significant true positive and true negative values. However, the performance was less reliable for some vulnerabilities due to the lack of a standardized dataset which will be a future research direction. The results suggest that compared to current techniques, the proposed deep learning approach has the potential to substantially enhance the accuracy of software vulnerability detection.

Paperid: 3302, https://arxiv.org/pdf/2512.12568.pdf

Abstract:
The Federated Byzantine Agreement (FBA) achieves rapid consensus by relying on overlapping quorum slices. But this architecture leads to a high dependence on the availability of validators when about one fourth of validators go down, the classical FBA can lose liveness or fail to reach agreement. We thus come up with an Adaptive FBA architecture that can reconfigure quorum slices intelligently based on real time validator reputation to overcome this drawback. Our model includes trust scores computed from EigenTrust and a sliding window behavioral assessment to determine the reliability of validators. We have built the intelligent adaptive FBA model and conducted tests in a Stellar based setting. Results of real life experiments reveal that the system is stable enough to keep consensus when more than half of the validators (up to 62 percent) are disconnected, which is a great extension of the failure threshold of a classical FBA. A fallback mode allows the network to be functional with as few as three validators, thus showing a significant robustness enhancement. Besides, a comparative study with the existing consensus protocols shows that Adaptive FBA can be an excellent choice for the next generation of blockchain systems, especially for constructing a resilient blockchain infrastructure.

Paperid: 3303, https://arxiv.org/pdf/2512.12224.pdf

Abstract:
The increasing use of machine learning (ML) for Just-In-Time (JIT) defect prediction raises concerns about privacy leakage from software analytics data. Existing anonymization methods, such as tabular transformations and graph perturbations, often overlook contextual dependencies among software metrics, leading to suboptimal privacy-utility tradeoffs. Leveraging the contextual reasoning of Large Language Models (LLMs), we propose a cluster-guided anonymization technique that preserves contextual and statistical relationships within JIT datasets. Our method groups commits into feature-based clusters and employs an LLM to generate context-aware parameter configurations for each commit cluster, defining alpha-beta ratios and churn mixture distributions used for anonymization. Our evaluation on six projects (Cassandra, Flink, Groovy, Ignite, OpenStack, and Qt) shows that our LLM-based approach achieves privacy level 2 (IPR >= 80 percent), improving privacy by 18 to 25 percent over four state-of-the-art graph-based anonymization baselines while maintaining comparable F1 scores. Our results demonstrate that LLMs can act as adaptive anonymization engines when provided with cluster-specific statistical information about similar data points, enabling context-sensitive and privacy-preserving software analytics without compromising predictive accuracy.

Paperid: 3304, https://arxiv.org/pdf/2512.12078.pdf

Abstract:
Cyber threat intelligence (CTI) encoded in STIX and structured according to the MITRE ATT&CK framework has become a global reference for describing adversary behavior. However, ATT&CK was designed as a descriptive knowledge base rather than a procedural model. We therefore ask whether its structured artifacts contain sufficient behavioral detail to support multi-stage adversary emulation. Through systematic measurements of the ATT&CK Enterprise bundle, we show that campaign objects encode just fragmented slices of behavior. Only 35.6% of techniques appear in at least one campaign, and neither clustering nor sequence analysis reveals any reusable behavioral structure under technique overlap or LCS-based analyses. Intrusion sets cover a broader portion of the technique space, yet omit the procedural semantics required to transform behavioral knowledge into executable chains, including ordering, preconditions, and environmental assumptions. These findings reveal a procedural semantic gap in current CTI standards: they describe what adversaries do, but not exactly how that behavior was operationalized. To assess how far this gap can be bridged in practice, we introduce a three-stage methodology that translates behavioral information from structured CTI into executable steps and makes the necessary environmental assumptions explicit. We demonstrate its viability by instantiating the resulting steps as operations in the MITRE Caldera framework. Case studies of ShadowRay and Soft Cell show that structured CTI can enable the emulation of multi-stage APT campaigns, but only when analyst-supplied parameters and assumptions are explicitly recorded. This, in turn, exposes the precise points at which current standards fail to support automation. Our results clarify the boundary between descriptive and machine-actionable CTI for adversary emulation.

Paperid: 3305, https://arxiv.org/pdf/2512.12024.pdf

Abstract:
Many properties related to security or concurrency must be encoded as so-called hyperproperties, temporal properties that allow reasoning about multiple traces of a system. However, despite recent advances on model checking hyperproperties, there is still a lack of higher-level specification languages that can effectively support software engineering practitioners in verifying properties of this class at early stages of system design. Alloy is a lightweight formal method with a high-level specification language that is supported by automated analysis procedures, making it particularly well-suited for the verification of design models at early development stages. It does not natively support, however, the verification of hyperproperties. This work proposes HyperPardinus, a new model finding procedure that extends Pardinus -- the temporal logic backend of the Alloy language -- to automatically verify hyperproperties over relational models by relying on existing low-level model checkers for hyperproperties. It then conservatively extends Alloy to support the specification and automatic verification of hyperproperties over design models, as well as the visualization of (counter-)examples at a higher-level of abstraction. Evaluation shows that our approach enables modeling and finding (counter-)examples for complex hyperproperties with alternating quantifiers, making it feasible to address relevant scenarios from the state of the art.

Paperid: 3306, https://arxiv.org/pdf/2512.11690.pdf

Abstract:
While end-to-end encryption protects the content of messages, it does not secure metadata, which exposes sender and receiver information through traffic analysis. A plausible approach to protecting this metadata is to have senders post encrypted messages on a public bulletin board and receivers scan it for relevant messages. Oblivious message retrieval (OMR) leverages homomorphic encryption (HE) to improve user experience in this solution by delegating the scan to a resource-rich server while preserving privacy. A key process in OMR is the homomorphic detection of pertinent messages for the receiver from the bulletin board. It relies on a specialized matrix-vector multiplication algorithm, which involves extensive multiplications between ciphertext vectors and plaintext matrices, as well as homomorphic rotations. The computationally intensive nature of this process limits the practicality of OMR. To address this challenge, this paper proposes a hardware architecture to accelerate the matrix-vector multiplication algorithm. The building homomorphic operators in this algorithm are implemented using high-level synthesis, with design parameters for different parallelism levels. These operators are then deployed on a field-programmable gate array platform using an efficient design space exploration strategy to accelerate homomorphic matrix-vector multiplication. Compared to a software implementation, the proposed hardware accelerator achieves a 13.86x speedup.

Paperid: 3307, https://arxiv.org/pdf/2512.11602.pdf

Abstract:
Modern software projects use automated CI/CD pipelines to streamline their development, build, and deployment processes. GitHub Actions is a popular CI/CD platform that enables project maintainers to create custom workflows -- collections of jobs composed of sequential steps -- using reusable components known as actions. Wary of the security risks introduced by fully-privileged actions, GitHub provides a job-level permission model for controlling workflow access to repository resources. Unfortunately, this model is too coarse-grained to reduce the attack surface pertaining to permission misuse attacks: All actions within a job share the same permissions granted to the job. This violates the principle of least privilege and can lead to broader software supply chain attacks, whenever a compromised action exploits the granted permissions to compromise the repository resources. In this paper, we present Granite, a runtime proxy-based system that enforces fine-grained permissions for GitHub Actions at the step-level granularity within a job. Granite transparently monitors requests made by JavaScript and composite actions during workflow execution and checks them against predefined step-level policies at runtime. We evaluate Granite in terms of compatibility, security, and performance overhead using a dataset of 500 workflows comprising 12,916 jobs from the most-starred GitHub repositories that use GitHub Actions. Our analysis reveals that 52.7% of the jobs can be protected by Granite against permission misuse attacks. We evaluate Granite on 20 top-starred repositories (63 actions, 58 workflows), validate attack prevention using 10 permission misuse attacks across 42 overprivileged jobs, and measure an average overhead of 55% (3.67 seconds) per job, concluding that Granite effectively reduces CI/CD attack surfaces.

Paperid: 3308, https://arxiv.org/pdf/2512.11316.pdf

Abstract:
In this paper, we introduce GraphSecure, a web application that provides advanced analysis and visualisation of security scanning results. GraphSecure enables users to initiate scans for their AWS account, validate them against specific Center for Internet Security (CIS) Benchmarks and return results, showcase those returned results in the form of statistical charts and warn the users about their account status.

Paperid: 3309, https://arxiv.org/pdf/2512.11286.pdf

Abstract:
High-dimensional quantum key distribution (HD-QKD) enhances information efficiency and noise tolerance by encoding data in large Hilbert spaces. The orbital angular momentum (OAM) of light provides a scalable basis for such encoding and supports high-dimensional photonic communication. Practical OAM-based implementations remain constrained by challenges in state generation, transmission, and detection. This survey offers a consolidated overview of OAM-encoded HD-QKD, outlining fundamental principles, representative experiments, and system-level limitations. Recent progress in hybrid encodings, mode sorting, adaptive optics, and TF, CV, MDI, and DI frameworks is summarized with emphasis on practical feasibility.

Paperid: 3310, https://arxiv.org/pdf/2512.11269.pdf

Abstract:
Encrypted AI using fully homomorphic encryption (FHE) provides strong privacy guarantees; but its slow performance has limited practical deployment. Recent works proposed ASICs to accelerate FHE, but require expensive advanced manufacturing processes that constrain their accessibility. GPUs are a far more accessible platform, but achieving ASIC-level performance using GPUs has remained elusive. Furthermore, state-of-the-art approaches primarily focus on small models that fit comfortably within a single device. Supporting large models such as LLMs in FHE introduces a dramatic increase in computational complexity that requires optimized GPU kernels, along with managing terabyte-scale memory footprints that far exceed the capacity of a single GPU. This paper presents Cerium, a multi-GPU framework for FHE inference on large models. Cerium integrates a domain-specific language, an optimizing compiler, and a runtime system to automatically generate high-performance GPU kernels, manage terabyte-scale memory footprints, and parallelize computation across multiple GPUs. It introduces new IR constructs, compiler passes, sparse polynomial representations, memory-efficient data layouts, and communication-aware parallelization techniques that together enable encrypted inference for models ranging from small CNNs to Llama3-8B. We build Cerium on NVIDIA GPUs and demonstrate significant performance gains. For small models, Cerium outperforms expert-written hand-optimized GPU libraries by up to 2.25 times. Cerium achieves performance competitive with state-of-the-art FHE ASICs, outright matching prior FHE ASIC CraterLake. It is the first GPU system to execute bootstrapping in under 10 milliseconds, achieving 7.5 milliseconds, and is the first to demonstrate encrypted inference for BERT-Base and Llama3-8B in 8 seconds and 134 seconds, respectively.

Paperid: 3311, https://arxiv.org/pdf/2512.11122.pdf

Abstract:
This study examines how public perception influences the implementation and adoption of cybersecurity frameworks in South Africa. Using the PRISMA methodology, a systematic literature review was conducted across reputable scholarly databases, yielding 34 relevant sources aligned with predefined inclusion criteria. Cybersecurity, governance, trust, privacy, cybercrime, and public opinion emerged as dominant thematic clusters. Bibliometric and thematic analyses, supported by network visualisations, revealed that while trust and public sentiment affect cybersecurity policy adoption globally, these factors have minimal influence within the South African policy landscape, despite the country's high cybercrime prevalence. In response, the study proposes a trust-centric policymaking framework designed to integrate public perception as a proactive dimension of cybersecurity governance. This framework seeks to prevent trust deficits from obstructing policy effectiveness and provides guidance for restoring trust where it has eroded.

Paperid: 3312, https://arxiv.org/pdf/2512.11112.pdf

Abstract:
Actively secure arithmetic MPC is now practical for real applications, but performance and usability are still limited by framework-specific compilation stacks, the need for programmers to explicitly express parallelism, and high communication overhead. We design and implement a proof-of-concept LLVM-based optimization pipeline for the SPDZ protocol that addresses these bottlenecks. Our front end accepts a subset of C with lightweight privacy annotations and lowers it to LLVM IR, allowing us to reuse mature analyses and transformations to automatically batch independent arithmetic operations. Our back end performs data-flow and control-flow analysis on the optimized IR to drive a non-blocking runtime scheduler that overlaps independent operations and aggressively overlaps communication with computation; when enabled, it can map batched operations to GPU kernels. This design preserves a low learning curve by using a mainstream language and hiding optimization and hardware-specific mechanics from programmers. We evaluate the system on controlled microbenchmarks against MP-SPDZ, focusing on online phase performance. Our CPU back end achieves up to 5.56 times speedup under intermediate and heavy algebraic workloads, shows strong scaling with thread count, and our GPU back end scales better as the input size increases. Overall, these results indicate that leveraging LLVM with protocol-aware scheduling is an effective architectural direction for extracting parallelism without sacrificing usability.

Paperid: 3313, https://arxiv.org/pdf/2512.11004.pdf

Abstract:
This work presents a generalized period decomposition approach, significantly improving the practical reliability of Shor's quantum factoring algorithm. Although Shor's algorithm theoretically enables polynomial-time integer factorization, its real-world performance heavily depends on stringent conditions related to the period obtained via quantum phase estimation. Our generalized decomposition method relaxes these conditions by systematically exploiting arbitrary divisors of the obtained period, effectively broadening the applicability of each quantum execution. Extensive classical simulations were performed to empirically validate our approach, involving over one million test cases across integers ranging from 2 to 8 digits. The proposed method achieved near-perfect success rates, exceeding 99.998% for 7-digit numbers and 99.999% for 8-digit numbers, significantly surpassing traditional and recently improved variants of Shor's algorithm. Crucially, this improvement is achieved without compromising the algorithm's polynomial-time complexity and integrates seamlessly with existing quantum computational frameworks. Moreover, our method enhances the efficiency of quantum resource usage by minimizing unnecessary repetitions, making it particularly relevant for quantum cryptanalysis with noisy intermediate-scale quantum (NISQ) devices. This study thus provides both theoretical advancements and substantial practical benefits, contributing meaningfully to the field of quantum algorithm research and the broader field of quantum information processing.

Paperid: 3314, https://arxiv.org/pdf/2512.10667.pdf

Abstract:
Single-bridge blockchain solutions enable cross-chain communication. However, they are associated with centralization and single-point-of-failure risks. This paper proposes Proof of Success and Reward Distribution (PSCRD), a novel multi-bridge response coordination and incentive distribution protocol designed to address the challenges. PSCRD introduces a fair reward distribution system that equitably distributes the transfer fee among participating bridges, incentivizing honest behavior and sustained commitment. The purpose is to encourage bridge participation for higher decentralization and lower single-point-of-failure risks. The mathematical analysis and simulation results validate the effectiveness of PSCRD using two key metrics: the Gini index, which demonstrates a progressive improvement in the fairness of the reward distribution as new bridge groups joined the network; and the Nakamoto coefficient, which shows a significant improvement in decentralization over time. These findings highlight that PSCRD provides a more resilient and secure cross-chain bridge system without substantially increasing user costs.

Paperid: 3315, https://arxiv.org/pdf/2512.10637.pdf

Abstract:
Intrusion Detection Systems (IDS) are critical components in safeguarding 5G/6G networks from both internal and external cyber threats. While traditional IDS approaches rely heavily on signature-based methods, they struggle to detect novel and evolving attacks. This paper presents an advanced IDS framework that leverages adversarial training and dynamic neural networks in 5G/6G networks to enhance network security by providing robust, real-time threat detection and response capabilities. Unlike conventional models, which require costly retraining to update knowledge, the proposed framework integrates incremental learning algorithms, reducing the need for frequent retraining. Adversarial training is used to fortify the IDS against poisoned data. By using fewer features and incorporating statistical properties, the system can efficiently detect potential threats. Extensive evaluations using the NSL- KDD dataset demonstrate that the proposed approach provides better accuracy of 82.33% for multiclass classification of various network attacks while resisting dataset poisoning. This research highlights the potential of adversarial-trained, dynamic neural networks for building resilient IDS solutions.

Paperid: 3316, https://arxiv.org/pdf/2512.10636.pdf

Abstract:
In this work, fundamental design principles for a central bank digital currency (CBDC) with an offline functionality and corresponding counter measures are discussed. We identify three major objectives for any such CBDC proposal:(i) Access Control Security - protection of a user's funds against unauthorized access by other users; (ii) Security against Depositor's Misbehavior - preservation of the integrity of an environment (potentially the wallet) against misbehavior of its owner (for example, double-spending), and (iii) Privacy by Design - ensuring privacy is embedded into the system architecture. Our central conclusion is the alignment of the objectives to concrete design elements as countermeasures, whereas certain objectives and countermeasures have no or minimal interferences with each other. For example, we work out that the integrity of a user's wallet and, accordingly, the prevention of double-spending race attacks should be addressed through the adoption and integration of \textit{secure hardware} within a CBDC system.

Paperid: 3317, https://arxiv.org/pdf/2512.10470.pdf

Abstract:
Wireless networks act as the backbone of modern digital connectivity, making them a primary target for cyber adversaries. Rogue Access Point attacks, specifically the Evil Twin variant, enable attackers to clone legitimate wireless network identifiers to deceive users into connecting. Once a connection is established, the adversary can intercept traffic and harvest sensitive credentials. While modern defensive architectures often employ Network Intrusion Detection Systems (NIDS) to identify malicious activity, the effectiveness of these systems against Layer 2 wireless threats remains a subject of critical inquiry. This project aimed to design a stealth-capable Rogue AP and evaluate its detectability against Suricata, an open-source NIDS/IPS. The methodology initially focused on a hardware-based deployment using Raspberry Pi platforms but transitioned to a virtualized environment due to severe system compatibility issues. Using Wifipumpkin3, the research team successfully deployed a captive portal that harvested user credentials from connected devices. However, the Suricata NIDS failed to flag the attack, highlighting a significant blind spot in traditional intrusion detection regarding wireless management frame attacks. This paper details the construction of the attack, the evasion techniques employed, and the limitations of current NIDS solutions in detecting localized wireless threats

Paperid: 3318, https://arxiv.org/pdf/2512.10372.pdf

Abstract:
The rising demand for collaborative machine learning and data analytics calls for secure and decentralized data sharing frameworks that balance privacy, trust, and incentives. Existing approaches, including federated learning (FL) and blockchain-based data markets, fall short: FL often depends on trusted aggregators and lacks Byzantine robustness, while blockchain frameworks struggle with computation-intensive training and incentive integration. We present \prot, a decentralized data marketplace that unifies federated learning, blockchain arbitration, and economic incentives into a single framework for privacy-preserving data sharing. \prot\ enables data buyers to submit bid-based requests via blockchain smart contracts, which manage auctions, escrow, and dispute resolution. Computationally intensive training is delegated to \cone\ (\uline{Co}mpute \uline{N}etwork for \uline{E}xecution), an off-chain distributed execution layer. To safeguard against adversarial behavior, \prot\ integrates a modified YODA protocol with exponentially growing execution sets for resilient consensus, and introduces Corrected OSMD to mitigate malicious or low-quality contributions from sellers. All protocols are incentive-compatible, and our game-theoretic analysis establishes honesty as the dominant strategy. We implement \prot\ on Ethereum and evaluate it over benchmark datasets -- MNIST, Fashion-MNIST, and CIFAR-10 -- under varying adversarial settings. \prot\ achieves up to 99\% accuracy on MNIST and 90\% on Fashion-MNIST, with less than 3\% degradation up to 30\% Byzantine nodes, and 56\% accuracy on CIFAR-10 despite its complexity. Our results show that \prot\ ensures privacy, maintains robustness under adversarial conditions, and scales efficiently with the number of participants, making it a practical foundation for real-world decentralized data sharing.

Paperid: 3319, https://arxiv.org/pdf/2512.10135.pdf

Abstract:
This paper explores WireGuard as a lightweight alternative to IPsec for securing the user plane as well as the control plane in an industrial Open RAN deployment at the Adtran Terafactory in Meiningen. We focus on a realistic scenario where external vendors access their hardware in our 5G factory network, posing recurrent security risks from untrusted gNBs and intermediate network elements. Unlike prior studies limited to lab setups, we implement a complete proof-of-concept in a factory environment and compare WireGuard with IPsec under industrial traffic conditions. Our approach successfully protects user data (N3 interface) against untrusted gNBs and man-in-the-middle attacks while enabling control plane (N2 interface) authentication between the access and mobility management functions (AMF) and gNB. Performance measurements show that WireGuard adds minimal overhead in throughput, latency, and Central Processing Unit (CPU) usage, achieving performance comparable to IPsec. These findings demonstrate that WireGuard offers competitive performance with significantly reduced configuration complexity, making it a strong candidate for broader adoption in O-RAN, providing a unified, lightweight security layer across multiple interfaces and components.

Paperid: 3320, https://arxiv.org/pdf/2512.10029.pdf

Abstract:
The rapid proliferation of AI and GenAI tools has extended to the Chrome Web Store. Cybercriminals are exploiting this trend, deploying malicious Chrome extensions posing as AI tools or impersonating popular GenAI models to target users. These extensions often appear legitimate while secretly exfiltrating sensitive data or redirecting users web traffic to attacker-controlled domains. To examine the impact of this trend on the browser extension ecosystem, we curated a dataset of 5,551 AI-themed extensions released over a nine-month period to the Chrome Web Store. Using a multi-signal detection methodology that combines manifest analysis, domain reputation, and runtime network behavior, supplemented with human review, we identified 154 previously undetected malicious Chrome extensions. Together with extensions known from public threat research disclosures, this resulted in a final set of 341 malicious extensions for analysis. Of these, 29 were GenAI-related, forming the focus of our in-depth analysis and disclosure. We deconstruct representative GenAI cases, including Supersonic AI, DeepSeek AI | Free AI Assistant, and Perplexity Search, to illustrate attacker techniques such as Adversary-in-the-Browser, impersonation, bait-and-switch updates, query hijacking, and redirection. Our findings show that threat actors are leveraging GenAI trends and exploiting browser extension APIs and settings for malicious purposes. This demonstrates that the browser extension threat landscape is directly evolving alongside the rapid adoption of GenAI technologies.

Paperid: 3321, https://arxiv.org/pdf/2512.09959.pdf

Abstract:
Organizations are increasingly sharing large volumes of sensitive Personally Identifiable Information (PII), like health records, with each other to better manage their services. Protecting PII data has become increasingly important in today's digital age, and several regulations have been formulated to ensure the secure exchange and management of sensitive personal data. However, at times some of these regulations are at loggerheads with each other, like the Health Insurance Portability and Accountability Act (HIPAA) and Cures Act; and this adds complexity to the already challenging task of Health Data compliance. As public concern regarding sensitive data breaches grows, finding solutions that streamline compliance processes and enhance individual privacy is crucial. We have developed a novel TRUsted Compliance Enforcement (TRUCE) framework for secure data exchange which aims to automate compliance procedures and enhance trusted data management within organizations. The TRUCE framework reasons over contexts of data exchange and assesses the trust score of users and the veracity of data based on corresponding regulations. This framework, developed using approaches from AI/Knowledge representation and Semantic Web technologies, includes a trust management method that incorporates static ground truth, represented by regulations such as HIPAA, and dynamic ground truth, defined by an organization's policies. In this paper, we present our framework in detail along with the validation against the Health Insurance Portability and Accountability Act (HIPAA) Data Usage Agreement (DUA) on CDC Contact Tracing patient data, up to one million patient records. TRUCE service will streamline compliance efforts and ensure adherence to privacy regulations and can be used by organizations to manage compliance of large velocity data exchange in real time.

Paperid: 3322, https://arxiv.org/pdf/2512.09938.pdf

Abstract:
The telecommunications and financial services industries face substantial challenges in inter-operator settlement processes, characterized by extended reconciliation cycles, high transaction costs, and limited real-time transparency. Traditional settlement mechanisms rely on multiple intermediaries and manual procedures, resulting in settlement periods exceeding 120 days with operational costs consuming approximately 5 percent of total revenue. This research presents a blockchain-anchored audit trail model enabling transparent, immutable, and automated inter-operator settlement. The framework leverages distributed ledger technology, smart contract automation, and cryptographic verification to establish a unified, tamper-proof transaction record. Empirical evaluation demonstrates 87 percent reduction in transaction fees, settlement cycle compression from 120 days to 3 minutes, and 100 percent audit trail integrity. Smart contract automation reduces manual intervention by 92 percent and eliminates 88 percent of settlement disputes. Market analysis indicates institutional adoption accelerated from 8 percent in 2020 to 52 percent by April 2024, with projected industry investment reaching 9.2 billion USD annually. The framework addresses scalability (12,000 transactions per second), interoperability, and regulatory compliance across multiple jurisdictions.

Paperid: 3323, https://arxiv.org/pdf/2512.09862.pdf

Abstract:
Random number generation is fundamental for many modern applications including cryptography, simulations and machine learning. Traditional pseudo-random numbers may offer statistical unpredictability, but are ultimately deterministic. On the other hand, True Random Number Generation (TRNG) offers true randomness. One way of obtaining such randomness are quantum systems, including quantum computers. As such the use of quantum computers for TRNG has received considerable attention in recent years. However, existing studies almost exclusively consider IBM quantum computers, often stop at using simulations and usually test only a handful of different TRNG quantum circuits. In this paper, we address those issues by presenting a study of TRNG circuits on Odra 5 a real-life quantum computer installed at Wrocław University of Science and Technology. It is also the first study to utilize the IQM superconducting architecture. Since Odra 5 is available on-premises it allows for much more comprehensive study of various TRNG circuits. In particular, we consider 5 types of TRNG circuits with 105 circuit subvariants in total. Each circuit is used to generate 1 million bits. We then perform an analysis of the quality of the obtained random sequences using the NIST SP 800-22 and NIST SP 800-90B test suites. We also provide a comprehensive review of existing literature on quantum computer-based TRNGs.

Paperid: 3324, https://arxiv.org/pdf/2512.09769.pdf

Abstract:
In this paper, we make the first attempt towards defining cost function of steganography with large language models (LLMs), which is totally different from previous works that rely heavily on expert knowledge or require large-scale datasets for cost learning. To achieve this goal, a two-stage strategy combining LLM-guided program synthesis with evolutionary search is applied in the proposed method. In the first stage, a certain number of cost functions in the form of computer programs are synthesized from LLM responses to structured prompts. These cost functions are then evaluated with pretrained steganalysis models so that candidate cost functions suited to steganography can be collected. In the second stage, by retraining a steganalysis model for each candidate cost function, the optimal cost function(s) can be determined according to the detection accuracy. This two-stage strategy is performed by an iterative fashion so that the best cost function can be collected at the last iteration. Experiments show that the proposed method enables LLMs to design new cost functions of steganography that significantly outperform existing works in terms of resisting steganalysis tools, which verifies the superiority of the proposed method. To the best knowledge of the authors, this is the first work applying LLMs to the design of advanced cost function of steganography, which presents a novel perspective for steganography design and may shed light on further research.

Paperid: 3325, https://arxiv.org/pdf/2512.09699.pdf

Abstract:
Device-independent quantum secret sharing (DI-QSS) is a cryptographic protocol that overcomes the security limitations posed by untrusted quantum devices. We propose a DI-QSS protocol based on the multipartite pseudo-telepathy parity game, which achieves device-independence with simultaneous key generation without requiring dedicated test rounds, unlike CHSH-based schemes [Zhang et al., Phys. Rev. A, 2024]. Notably, the proposed scheme allows simultaneous device-independence verification and key-generation phases, achieving optimal performance for a seven-qubit GHZ state configuration. Further, we analyse the security of our protocol against collective attack and establish reduced resource requirement for the same length of the raw key compared to the previous protocol. Finally, we show that our protocol remains robust even in a noisy environment.

Paperid: 3326, https://arxiv.org/pdf/2512.09409.pdf

Abstract:
Current blockchain consensus protocols -- notably, Proof of Work (PoW) and Proof of Stake (PoS) -- deliver global agreement but exhibit structural constraints. PoW anchors security in heavy computation, inflating energy use and imposing high confirmation latency. PoS improves efficiency but introduces stake concentration, long-range and "nothing-at-stake" vulnerabilities, and a hard performance ceiling shaped by slot times and multi-round committee voting. In this paper, we propose Proof of Trusted Execution (PoTE), a consensus paradigm where agreement emerges from verifiable execution rather than replicated re-execution. Validators operate inside heterogeneous VM-based TEEs, each running the same canonical program whose measurement is publicly recorded, and each producing vendor-backed attestations that bind the enclave code hash to the block contents. Because the execution is deterministic and the proposer is uniquely derived from public randomness, PoTE avoids forks, eliminates slot.time bottlenecks, and commits blocks in a single round of verification. We present the design of a PoTE consensus client, describe our reference implementation, and evaluate its performance against the stringent throughput requirements of the Trillion decentralized exchange.

Paperid: 3327, https://arxiv.org/pdf/2512.09311.pdf

Abstract:
Suspiciousness estimation is critical for proactive threat detection and ensuring public safety in complex environments. This work introduces a large-scale annotated dataset, USE50k, along with a computationally efficient vision-based framework for real-time suspiciousness analysis. The USE50k dataset contains 65,500 images captured from diverse and uncontrolled environments, such as airports, railway stations, restaurants, parks, and other public areas, covering a broad spectrum of cues including weapons, fire, crowd density, abnormal facial expressions, and unusual body postures. Building on this dataset, we present DeepUSEvision, a lightweight and modular system integrating three key components, i.e., a Suspicious Object Detector based on an enhanced YOLOv12 architecture, dual Deep Convolutional Neural Networks (DCNN-I and DCNN-II) for facial expression and body-language recognition using image and landmark features, and a transformer-based Discriminator Network that adaptively fuses multimodal outputs to yield an interpretable suspiciousness score. Extensive experiments confirm the superior accuracy, robustness, and interpretability of the proposed framework compared to state-of-the-art approaches. Collectively, the USE50k dataset and the DeepUSEvision framework establish a strong and scalable foundation for intelligent surveillance and real-time risk assessment in safety-critical applications.

Paperid: 3328, https://arxiv.org/pdf/2512.09264.pdf

Abstract:
The prosperous development of Artificial Intelligence-Generated Content (AIGC) has brought people's anxiety about the spread of false information on social media. Designing detectors for filtering is an effective defense method, but most detectors will be compromised by adversarial samples. Currently, most studies exposing AIGC security issues assume information on model structure and data distribution. In real applications, attackers query and interfere with models that provide services in the form of application programming interfaces (APIs), which constitutes the black-box decision-based attack paradigm. However, to the best of our knowledge, decision-based attacks on AIGC detectors remain unexplored. In this study, we propose \textbf{FBA$^2$D}: a frequency-based black-box attack method for AIGC detection to fill the research gap. Motivated by frequency-domain discrepancies between generated and real images, we develop a decision-based attack that leverages the Discrete Cosine Transform (DCT) for fine-grained spectral partitioning and selects frequency bands as query subspaces, improving both query efficiency and image quality. Moreover, attacks on AIGC detectors should mitigate initialization failures, preserve image quality, and operate under strict query budgets. To address these issues, we adopt an ``adversarial example soup'' method, averaging candidates from successive surrogate iterations and using the result as the initialization to accelerate the query-based attack. The empirical study on the Synthetic LSUN dataset and GenImage dataset demonstrate the effectiveness of our prosed method. This study shows the urgency of addressing practical AIGC security problems.

Paperid: 3329, https://arxiv.org/pdf/2512.09049.pdf

Abstract:
Electromagnetic Fault Injection (EMFI) is a powerful technique for inducing bit flips and instruction-level perturbations on microcontrollers, yet existing literature lacks a unified methodology for systematically mapping spatial sensitivity and classifying resulting fault behaviors. Building on insights from O'Flynn and Kuhnapfel et al., we introduce a platform-agnostic framework for Spatial EMFI Mapping and Fault Classification, aimed at understanding how spatial probe position influences fault outcomes. We present pilot experiments on three representative microcontroller targets including the Xtensa LX6 (ESP32) and two ChipWhisper boards not as definitive evaluations, but as illustrative demonstrations of how the proposed methodology can be applied in practice. These preliminary observations motivate a generalized and reproducible workflow that researchers can adopt when analyzing EMFI susceptibility across diverse embedded architectures.

Paperid: 3330, https://arxiv.org/pdf/2512.08858.pdf

Abstract:
Nested virtualization is now widely supported by major cloud vendors, allowing users to leverage virtualization-based technologies in the cloud. However, supporting nested virtualization significantly increases host hypervisor complexity and introduces a new attack surface in cloud platforms. While many prior studies have explored hypervisor fuzzing, none has explicitly addressed nested virtualization due to the challenge of generating effective virtual machine (VM) instances with a vast state space as fuzzing inputs. We present NecoFuzz, the first fuzzing framework that systematically targets nested virtualization-specific logic in hypervisors. NecoFuzz synthesizes executable fuzz-harness VMs with internal states near the boundary between valid and invalid, guided by an approximate model of hardware-assisted virtualization specifications. Since vulnerabilities in nested virtualization often stem from incorrect handling of unexpected VM states, this specification-guided, boundary-oriented generation significantly improves coverage of security-critical code across different hypervisors. We implemented NecoFuzz on Intel VT-x and AMD-V by extending AFL++ to support fuzz-harness VMs. NecoFuzz achieved 84.7% and 74.2% code coverage for nested virtualization-specific code on Intel VT-x and AMD-V, respectively, and uncovered six previously unknown vulnerabilities across three hypervisors, including two assigned CVEs.

Paperid: 3331, https://arxiv.org/pdf/2512.08575.pdf

Abstract:
The evolving of digital transformation and increased use of technology comes with increased cyber vulnerabilities, which compromise national security. Cyber-threats become more sophisticated as the technology advances. This emphasises the need for strong risk mitigation strategies. To define strong and robust cybersecurity, policies requires an integrated approach of balancing technical expertise with public input. This paper aims to explore strategies used to balance technical expertise and public input to develop effective and robust cybersecurity policies. It also studied how the effective integration of technical expertise with public input is critical to developing effective strategies and resilient cybersecurity frameworks that strengthens national security. A lack of a holistic approach and collaborative efforts to cybersecurity can hinder the effectiveness of cybersecurity policies. This paper followed a systematic literature review with bibliometric analysis using the PRISMA methodology to explore how technical expertise and public input can be integrated to guide cybersecurity policy making. The thematic analysis identified five important themes in developing effective cybersecurity policies, these key themes are: Multi-Stakeholder Involvement and Human Centric Approaches (MSI & HCA), Governance and Policy Frameworks (GPF), Technical Infrastructure (TI), Evaluation and Compliance (EC), and Legal Rights and Sovereignty (LRS). The synthesis shows that there is no adequate exploration of collaborative efforts which undermines the effectiveness of the cybersecurity policies. The findings suggest that inclusive, flexible governance strategies that integrate public input at every stage are necessary for future cybersecurity policy research and practice, which must shift away from a primarily technical and legal perspective.

Paperid: 3332, https://arxiv.org/pdf/2512.08558.pdf

Abstract:
Sensitive citizen data, such as social, medical, and fiscal data, is heavily fragmented across public bodies and the private domain. Mining the combined data sets allows for new insights that otherwise remain hidden. Examples are improved healthcare, fraud detection, and evidence-based policy making. (Multi-party) delegated private set intersection (D-PSI) is a privacy-enhancing technology to link data across multiple data providers using a data collector. However, before it can be deployed in these use cases, it needs to be enhanced with additional functions, e.g., securely delivering payload only for elements in the intersection. Although there has been recent progress in the communication and computation requirements of D-PSI, these practical obstacles have not yet been addressed. This paper is the result of a collaboration with a governmental organization responsible for collecting, linking, and pseudonymizing data. Based on their requirements, we design a new D-PSI protocol with composable output functions, including encrypted payload and pseudonymized identifiers. We show that our protocol is secure in the standard model against colluding semi-honest data providers and against a non-colluding, possibly malicious independent party, the data collector. It, hence, allows to privately link and collect data from multiple data providers suitable for deployment in these use cases in the public sector.

Paperid: 3333, https://arxiv.org/pdf/2512.08310.pdf

Abstract:
Device identifiers like the International Mobile Equipment Identity (IMEI) are crucial for ensuring device integrity and meeting regulations in 4G and 5G networks. However, sharing these identifiers with Mobile Network Operators (MNOs) brings significant privacy risks by enabling long-term tracking and linking of user activities across sessions. In this work, we propose a privacy-preserving identifier checking method in 5G. This paper introduces a protocol for verifying device identifiers without exposing them to the network while maintaining the same functions as the 3GPP-defined Equipment Identity Register (EIR) process. The proposed solution modifies the PEPSI protocol for a Private Set Membership (PSM) setting using the BFV homomorphic encryption scheme. This lets User Equipment (UE) prove that its identifier is not on an operator's blacklist or greylist while ensuring that the MNO only learns the outcome of the verification. The protocol allows controlled deanonymization through an authorized Law Enforcement (LE) hook, striking a balance between privacy and accountability. Implementation results show that the system can perform online verification within five seconds and requires about 15 to 16 MB of communication per session. This confirms its practical use under post-quantum security standards. The findings highlight the promise of homomorphic encryption for managing identifiers while preserving privacy in 5G, laying the groundwork for scalable and compliant verification systems in future 6G networks.

Paperid: 3334, https://arxiv.org/pdf/2512.08299.pdf

Abstract:
In todays digital world, protecting sensitive data is very essential. Steganography hides the existence of secret data instead of its content, providing better security for multimedia communication. This paper proposes a new technique for hiding audio files inside images using the Least Significant Bit (LSB) method optimized by the Harris Hawks Optimization (HHO) algorithm. HHO is a nature-inspired metaheuristic that imitates the hunting behavior of Harris hawks to find optimal pixel positions for embedding data. The proposed method is evaluated using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Mean Square Error (MSE). Experimental results show that HHO achieves better image quality, robustness, and embedding capacity compared to existing methods.

Paperid: 3335, https://arxiv.org/pdf/2512.08185.pdf

Abstract:
Medical Large Language Models (LLMs) are increasingly deployed for clinical decision support across diverse specialties, yet systematic evaluation of their robustness to adversarial misuse and privacy leakage remains inaccessible to most researchers. Existing security benchmarks require GPU clusters, commercial API access, or protected health data -- barriers that limit community participation in this critical research area. We propose a practical, fully reproducible framework for evaluating medical AI security under realistic resource constraints. Our framework design covers multiple medical specialties stratified by clinical risk -- from high-risk domains such as emergency medicine and psychiatry to general practice -- addressing jailbreaking attacks (role-playing, authority impersonation, multi-turn manipulation) and privacy extraction attacks. All evaluation utilizes synthetic patient records requiring no IRB approval. The framework is designed to run entirely on consumer CPU hardware using freely available models, eliminating cost barriers. We present the framework specification including threat models, data generation methodology, evaluation protocols, and scoring rubrics. This proposal establishes a foundation for comparative security assessment of medical-specialist models and defense mechanisms, advancing the broader goal of ensuring safe and trustworthy medical AI systems.

Paperid: 3336, https://arxiv.org/pdf/2512.08169.pdf

Abstract:
Security Operations Centers face massive, heterogeneous alert streams under minute-level service windows, creating the Alert Triage Latency Paradox: verbose reasoning chains ensure accuracy and compliance but incur prohibitive latency and token costs, while minimal chains sacrifice transparency and auditability. Existing solutions fail: signature systems are brittle, anomaly methods lack actionability, and fully cloud-hosted LLMs raise latency, cost, and privacy concerns. We propose AIDR, a hybrid cloud-edge framework that addresses this trade-off through constrained information-density optimization. The core innovation is gradient-based compression of reasoning chains to retain only decision-critical steps--minimal evidence sufficient to justify predictions while respecting token and latency budgets. We demonstrate that this approach preserves decision-relevant information while minimizing complexity. We construct compact datasets by distilling alerts into 3-5 high-information bullets (68% token reduction), train domain-specialized experts via LoRA, and deploy a cloud-edge architecture: a cloud LLM routes alerts to on-premises experts generating SOAR-ready JSON. Experiments demonstrate AIDR achieves higher accuracy and 40.6% latency reduction versus Chain-of-Thought, with robustness to data corruption and out-of-distribution generalization, enabling auditable and efficient SOC triage with full data residency compliance.

Paperid: 3337, https://arxiv.org/pdf/2512.08067.pdf

Abstract:
CapsuleFS (CFS) is the first filesystem to integrate multi-credential functionality within a POSIX-compliant framework, utilizing DataCapsule as the storage provider. This innovative system is established based on the Global Data Plane in the area of edge computing. Our comprehensive design and implementation of CFS successfully fulfill the objective of providing a multi-credential Common Access API. The architecture of CFS is methodically segmented into three integral components: Firstly, the DataCapsule server, tasked with the storage, dissemination, and replication of DataCapsules on the edge. Secondly, the middleware, a crucial element running in a Trusted Execution Environment responsible for the enforcement and management of write permissions and requests. Finally, the client component, which manifests as a POSIX-compliant filesystem, is adaptable and operational across many architectures. Experimental evaluations of CFS reveal that, while its read and write performances are comparatively modest, it upholds a high degree of functional correctness. This attribute distinctly positions CFS as a viable candidate for application in real-world software development scenarios. The paper also delineates potential future enhancements, aimed at augmenting the practicality of CFS in the landscape of software development.

Paperid: 3338, https://arxiv.org/pdf/2512.07909.pdf

Abstract:
Uganda's rapid digital transformation, supported by national strategies such as Vision 2040 and the Digital Transformation Roadmap, has expanded reliance on networked services while simultaneously increasing exposure to sophisticated cyber threats. In resource-constrained settings, commonly deployed rule-based intrusion detection systems lack the adaptability and ethical safeguards needed to address evolving attack patterns, leading to undetected breaches and excessive blocking of legitimate traffic. This study proposes an Agentic Artificial Intelligence (AAI) framework that integrates reinforcement learning, an explicit ethical governance layer, and human oversight to deliver adaptive and trustworthy cybersecurity. A CPU-optimized simulation environment was developed using a five-node network topology that mirrors key elements of Uganda's critical digital infrastructure and generates both benign and malicious traffic, including phishing, ransomware, and distributed denial-of-service attacks. A Q-learning agent, operating within clearly defined ethical constraints and subject to human auditability, was trained and evaluated against a traditional rule-based baseline. The AAI framework achieved a 100 percent detection rate, zero false positives, and full ethical compliance, compared with 70 percent detection and 15 percent false positives for the baseline system. These results demonstrate that agentic, ethically governed reinforcement learning can substantially improve cybersecurity effectiveness and fairness in CPU-only, resource-constrained environments, offering a practical pathway for operationalizing responsible AI in Uganda's national cybersecurity strategy.

Paperid: 3339, https://arxiv.org/pdf/2512.07520.pdf

Abstract:
Verifying the security of masked hardware and software implementations, under advanced leakage models, remains a significant challenge, especially then accounting for glitches, transitions and CPU micro-architectural specifics. Existing verification approaches are either restricted to small hardware gadgets, small programs on CPUs such as Sboxes, limited leakage models, or require hardware-specific prior knowledge. In this work, we present aLEAKator, an open-source framework for the automated formal verification of masked cryptographic accelerators and software running on CPUs from their HDL descriptions. Our method introduces mixed-domain simulation, enabling precise modeling and verification under various (including robust and relaxed) 1-probing leakage models, and supports variable signal granularity without being restricted to 1-bit wires. aLEAKator also supports verification in the presence of lookup tables, and does not require prior knowledge of the target CPU architecture. Our approach is validated against existing tools and real-world measurements while providing innovative results such as the verification of a full, first-order masked AES on various CPUs

Paperid: 3340, https://arxiv.org/pdf/2512.07368.pdf

Abstract:
The damage caused by cybercrime makes the development of secure software inevitable. Although many tools and frameworks exist to support the development of secure software, statistics on cybercrime show no improvement in recent years. To understand the challenges software companies face in developing secure software, we conducted an interview study with 19 industry experts from 12 cross-industry companies. The results of our study show that the challenges are mainly due to high complexity, a lack of security awareness, and unsuitable processes, which are further exacerbated by an immediate lack of skilled personnel. This article presents our study and the challenges we identified, and derives potential research directions from them.

Paperid: 3341, https://arxiv.org/pdf/2512.07292.pdf

Abstract:
Smartphones handle sensitive tasks such as messaging and payment and may soon support critical electronic identification through initiatives such as the European Digital Identity (EUDI) wallet, currently under development. Yet the susceptibility of modern smartphones to physical side-channel analysis (SCA) is underexplored, with recent work limited to pre-2019 hardware. Since then, smartphone system on chip (SoC) platforms have grown more complex, with heterogeneous processor clusters, sub 10 nm nodes, and frequencies over 2 GHz, potentially complicating SCA. In this paper, we assess the feasibility of electromagnetic (EM) SCA on a Raspberry Pi 4, featuring a Broadcom BCM2711 SoC and a Fairphone 4 featuring a Snapdragon 750G 5G SoC. Using new attack methodologies tailored to modern SoCs, we recover ECDSA secrets from OpenSSL by mounting the Nonce@Once attack of Alam et al. (Euro S&P 2021) and show that the libgcrypt countermeasure does not fully mitigate it. We present case studies illustrating how hardware and software stacks impact EM SCA feasibility. Motivated by use cases such as the EUDI wallet, we survey Android cryptographic implementations and define representative threat models to assess the attack. Our findings show weaknesses in ECDSA software implementations and underscore the need for independently certified secure elements (SEs) in all smartphones.

Paperid: 3342, https://arxiv.org/pdf/2512.07038.pdf

Abstract:
We introduce ideal attribution mechanisms, a formal abstraction for reasoning about attribution decisions over strings. At the core of this abstraction lies the ledger, an append-only log of the prompt-response interaction history between a model and its user. Each mechanism produces deterministic decisions based on the ledger and an explicit selection criterion, making it well-suited to serve as a ground truth for attribution. We frame the design goal of watermarking schemes as faithful representation of ideal attribution mechanisms. This novel perspective brings conceptual clarity, replacing piecemeal probabilistic statements with a unified language for stating the guarantees of each scheme. It also enables precise reasoning about desiderata for future watermarking schemes, even when no current construction achieves them, since the ideal functionalities are specified first. In this way, the framework provides a roadmap that clarifies which guarantees are attainable in an idealized setting and worth pursuing in practice.

Paperid: 3343, https://arxiv.org/pdf/2512.07030.pdf

Abstract:
Among the various types of cyberattacks, identifying zero-day attacks is problematic because they are unknown to security systems as their pattern and characteristics do not match known blacklisted attacks. There are many Machine Learning (ML) models designed to analyze and detect network attacks, especially using supervised models. However, these models are designed to classify samples (normal and attacks) based on the patterns they learn during the training phase, so they perform inefficiently on unseen attacks. This research addresses this issue by evaluating five different supervised models to assess their performance and execution time in predicting zero-day attacks and find out which model performs accurately and quickly. The goal is to improve the performance of these supervised models by not only proposing a framework that applies grid search, dimensionality reduction and oversampling methods to overcome the imbalance problem, but also comparing the effectiveness of oversampling on ml model metrics, in particular the accuracy. To emulate attack detection in real life, this research applies a highly imbalanced data set and only exposes the classifiers to zero-day attacks during the testing phase, so the models are not trained to flag the zero-day attacks. Our results show that Random Forest (RF) performs best under both oversampling and non-oversampling conditions, this increased effectiveness comes at the cost of longer processing times. Therefore, we selected XG Boost (XGB) as the top model due to its fast and highly accurate performance in detecting zero-day attacks.

Paperid: 3344, https://arxiv.org/pdf/2512.06747.pdf

Abstract:
Large Language Models (LLMs) are emerging as powerful enablers for autonomous reasoning and natural-language coordination in unmanned aerial vehicle (UAV) swarms operating within Internet of Things (IoT) environments. However, existing LLM-driven UAV systems process sensitive operational data in plaintext, exposing them to privacy and security risks. This work introduces PrivLLMSwarm, a privacy-preserving framework that performs secure LLM inference for UAV swarm coordination through Secure Multi-Party Computation (MPC). The framework incorporates MPC-optimized transformer components with efficient approximations of nonlinear activations, enabling practical encrypted inference on resource-constrained aerial platforms. A fine-tuned GPT-based command generator, enhanced through reinforcement learning in simulation, provides reliable instructions while maintaining confidentiality. Experimental evaluation in urban-scale simulations demonstrates that PrivLLMSwarm achieves high semantic accuracy, low encrypted inference latency, and robust formation control under privacy constraints. Comparative analysis shows PrivLLMSwarm offers a superior privacy-utility balance compared to differential privacy, federated learning, and plaintext baselines. To support reproducibility, the full implementation including source code, MPC components, and a synthetic dataset is publicly available. PrivLLMSwarm establishes a practical foundation for secure, LLM-enabled UAV swarms in privacy-sensitive IoT applications including smart-city monitoring and emergency response.

Paperid: 3345, https://arxiv.org/pdf/2512.06660.pdf

Abstract:
Analysts in Security Operations Centers routinely query massive telemetry streams using Kusto Query Language (KQL). Writing correct KQL requires specialized expertise, and this dependency creates a bottleneck as security teams scale. This paper investigates whether Small Language Models (SLMs) can enable accurate, cost-effective natural-language-to-KQL translation for enterprise security. We propose a three-knob framework targeting prompting, fine-tuning, and architecture design. First, we adapt existing NL2KQL framework for SLMs with lightweight retrieval and introduce error-aware prompting that addresses common parser failures without increasing token count. Second, we apply LoRA fine-tuning with rationale distillation, augmenting each NLQ-KQL pair with a brief chain-of-thought explanation to transfer reasoning from a teacher model while keeping the SLM compact. Third, we propose a two-stage architecture that uses an SLM for candidate generation and a low-cost LLM judge for schema-aware refinement and selection. We evaluate nine models (five SLMs and four LLMs) across syntax correctness, semantic accuracy, table selection, and filter precision, alongside latency and token cost. On Microsoft's NL2KQL Defender Evaluation dataset, our two-stage approach achieves 0.987 syntax and 0.906 semantic accuracy. We further demonstrate generalizability on Microsoft Sentinel data, reaching 0.964 syntax and 0.831 semantic accuracy. These results come at up to 10x lower token cost than GPT-5, establishing SLMs as a practical, scalable foundation for natural-language querying in security operations.

Paperid: 3346, https://arxiv.org/pdf/2512.06390.pdf

Abstract:
The modern web stack, which is dominated by browser-based applications and API-first backends, now operates under an adversarial equilibrium where automated, AI-assisted attacks evolve continuously. Content Delivery Networks (CDNs) and edge computing place programmable defenses closest to users and bots, making them natural enforcement points for machine-learning (ML) driven inspection, throttling, and isolation. This survey synthesizes the landscape of AI-enhanced defenses deployed at the edge: (i) anomaly- and behavior-based Web Application Firewalls (WAFs) within broader Web Application and API Protection (WAAP), (ii) adaptive DDoS detection and mitigation, (iii) bot management that resists human-mimicry, and (iv) API discovery, positive security modeling, and encrypted-traffic anomaly analysis. We add a systematic survey method, a threat taxonomy mapped to edge-observable signals, evaluation metrics, deployment playbooks, and governance guidance. We conclude with a research agenda spanning XAI, adversarial robustness, and autonomous multi-agent defense. Our findings indicate that edge-centric AI measurably improves time-to-detect and time-to-mitigate while reducing data movement and enhancing compliance, yet introduces new risks around model abuse, poisoning, and governance.

Paperid: 3347, https://arxiv.org/pdf/2512.06243.pdf

Abstract:
Backdoor attacks embed input-dependent malicious behavior into neural networks while preserving high clean accuracy, making them a persistent threat for deployed ML systems. At the same time, real-world deployments almost never serve full-precision models: post-training quantization to INT8 or lower precision is now standard practice for reducing memory and latency. This work asks a simple question: how do existing backdoor defenses behave under standard quantization pipelines? We conduct a systematic empirical study of five representative defenses across three precision settings (FP32, INT8 dynamic, INT4 simulated) and two standard vision benchmarks using a canonical BadNet attack. We observe that INT8 quantization reduces the detection rate of all evaluated defenses to 0% while leaving attack success rates above 99%. For INT4, we find a pronounced dataset dependence: Neural Cleanse remains effective on GTSRB but fails on CIFAR-10, even though backdoors continue to survive quantization with attack success rates above 90%. Our results expose a mismatch between how defenses are commonly evaluated (on FP32 models) and how models are actually deployed (in quantized form), and they highlight quantization robustness as a necessary axis in future evaluations and designs of backdoor defenses.

Paperid: 3348, https://arxiv.org/pdf/2512.06033.pdf

Abstract:
The rapid expansion of Artificial Intelligence is hindered by a fundamental friction in data markets: the value-privacy dilemma, where buyers cannot verify a dataset's utility without inspection, yet inspection may expose the data (Arrow's Information Paradox). We resolve this challenge by introducing the Trustworthy Influence Protocol (TIP), a privacy-preserving framework that enables prospective buyers to quantify the utility of external data without ever decrypting the raw assets. By integrating Homomorphic Encryption with gradient-based influence functions, our approach allows for the precise, blinded scoring of data points against a buyer's specific AI model. To ensure scalability for Large Language Models (LLMs), we employ low-rank gradient projections that reduce computational overhead while maintaining near-perfect fidelity to plaintext baselines, as demonstrated across BERT and GPT-2 architectures. Empirical simulations in healthcare and generative AI domains validate the framework's economic potential: we show that encrypted valuation signals achieve a high correlation with realized clinical utility and reveal a heavy-tailed distribution of data value in pre-training corpora where a minority of texts drive capability while the majority degrades it. These findings challenge prevailing flat-rate compensation models and offer a scalable technical foundation for a meritocratic, secure data economy.

Paperid: 3349, https://arxiv.org/pdf/2512.05321.pdf

Abstract:
In cloud environments, conventional firewalls rely on predefined rules and manual configurations, limiting their ability to respond effectively to evolving or zero-day threats. As organizations increasingly adopt platforms such as Microsoft Azure, this static defense model exposes cloud assets to zero-day exploits, botnets, and advanced persistent threats. In this paper, we introduce an automated defense framework that leverages medium- to high-interaction honeypot telemetry to dynamically update firewall rules in real time. The framework integrates deception sensors (Cowrie), Azure-native automation tools (Monitor, Sentinel, Logic Apps), and MITRE ATT&CK-aligned detection within a closed-loop feedback mechanism. We developed a testbed to automatically observe adversary tactics, classify them using the MITRE ATT&CK framework, and mitigate network-level threats automatically with minimal human intervention. To assess the framework's effectiveness, we defined and applied a set of attack- and defense-oriented security metrics. Building on existing adaptive defense strategies, our solution extends automated capabilities into cloud-native environments. The experimental results show an average Mean Time to Block of 0.86 seconds - significantly faster than benchmark systems - while accurately classifying over 12,000 SSH attempts across multiple MITRE ATT&CK tactics. These findings demonstrate that integrating deception telemetry with Azure-native automation reduces attacker dwell time, enhances SOC visibility, and provides a scalable, actionable defense model for modern cloud infrastructures.

Paperid: 3350, https://arxiv.org/pdf/2512.04950.pdf

Abstract:
Cyber-physical systems can be subject to information leakage; in the presence of continuous variables such as time and energy, these leaks can be subtle to detect. We study here the verification of opacity problems over systems with observation over both timing and energy information. We introduce guarded multi-energy timed automata as an extension of timed automata with multiple energy variables and guards over such variables. Despite undecidability of this general formalism, we establish positive results over a number of subclasses, notably when the attacker observes the final energy and/or the execution time, but also when they have access to the value of the energy variables every time unit.

Paperid: 3351, https://arxiv.org/pdf/2512.04611.pdf

Abstract:
Proof-of-Vulnerability (PoV) input generation is a critical task in software security and supports downstream applications such as path generation and validation. Generating a PoV input requires solving two sets of constraints: (1) reachability constraints for reaching vulnerable code locations, and (2) triggering constraints for activating the target vulnerability. Existing approaches, including directed greybox fuzzing and LLM-assisted fuzzing, struggle to efficiently satisfy these constraints. This work presents an agentic method that mimics human experts. Human analysts iteratively study code to extract semantic reachability and triggering constraints, form hypotheses about PoV triggering strategies, encode them as test inputs, and refine their understanding using debugging feedback. We automate this process with an agentic directed fuzzing framework called PBFuzz. PBFuzz tackles four challenges in agentic PoV generation: autonomous code reasoning for semantic constraint extraction, custom program-analysis tools for targeted inference, persistent memory to avoid hypothesis drift, and property-based testing for efficient constraint solving while preserving input structure. Experiments on the Magma benchmark show strong results. PBFuzz triggered 57 vulnerabilities, surpassing all baselines, and uniquely triggered 17 vulnerabilities not exposed by existing fuzzers. PBFuzz achieved this within a 30-minute budget per target, while conventional approaches use 24 hours. Median time-to-exposure was 339 seconds for PBFuzz versus 8680 seconds for AFL++ with CmpLog, giving a 25.6x efficiency improvement with an API cost of 1.83 USD per vulnerability.

Paperid: 3352, https://arxiv.org/pdf/2512.04044.pdf

Abstract:
Watermarking aims to embed hidden signals in generated text that can be reliably detected when given access to a secret key. Open-weight language models pose acute challenges for such watermarking schemes because the inference-time interventions that dominate contemporary approaches cannot be enforced once model weights are public. Existing watermaking techniques for open-weight models, such as the recently proposed GaussMark, typically rely on small modifications to model weights, which can yield signals detectable to those equipped with a secret key, but achieving detection power comparable to inference-time watermarks generally requires weight perturbations that noticeably reduce generation quality. We introduce MarkTune, a theoretically principled, on-policy fine-tuning framework that treats the GaussMark signal as a reward while simultaneously regularizing against degradation in text quality. We derive MarkTune as an improvement on GaussMark and demonstrate that MarkTune consistently improves the quality-detectability trade-off over GaussMark by steering finer-grained, watermark-aware weight updates within the model's representation space while preserving generation quality. Empirically, we show that MarkTune pushes the quality-detectability frontier of GaussMark close to that of inference-time watermarking, remains robust to paraphrasing and fine-tuning attacks, and exhibits strong generalization: a model fine-tuned on one dataset retains substantial watermark detection power on unseen datasets. Together, these results establish MarkTune as a general strategy for embedding robust, high-quality watermarks into open-weight LMs.

Paperid: 3353, https://arxiv.org/pdf/2512.03792.pdf

Abstract:
Unmanned Aerial Vehicles (UAVs) or drones are being introduced in a wide range of commercial applications. This has also made them prime targets of attackers who compromise their fundamental security properties, including confidentiality, integrity, and availability. As researchers discover novel threat vectors in UAVs, the government and industry are increasingly concerned about their limited ability to secure and regulate UAVs and their usage. With the aim of unfolding a path for a large-scale commercial UAV network deployment, we conduct a comprehensive state-of-the-art study and examine the prevailing security challenges. Unlike the prior art, we focus on uncovering the research gaps that must be addressed to enforce security policy regulations in civilian off-the-shelf drone systems. To that end, we first examine the known security threats to UAVs based on their impact and effectiveness. We then analyze existing countermeasures to prevent, detect, and respond to these threats in terms of security and performance overhead. We further outline the future research directions for securing UAVs. Finally, we establish the fundamental requirements and highlight critical research challenges in introducing a regulatory entity to achieve a secure and regulated UAV network.

Paperid: 3354, https://arxiv.org/pdf/2512.03669.pdf

Abstract:
With the growing reliance on cloud services for large-scale data management, preserving the security and privacy of outsourced datasets has become increasingly critical. While encrypting data and queries can prevent direct content exposure, recent research reveals that adversaries can still infer sensitive information via access pattern and search path analysis. However, existing solutions that offer strong access pattern privacy often incur substantial performance overhead. In this paper, we propose a novel privacy-preserving range query scheme over encrypted datasets, offering strong security guarantees while maintaining high efficiency. To achieve this, we develop secure learned spatial index (SLS-INDEX), a secure learned index that integrates the Paillier cryptosystem with a hierarchical prediction architecture and noise-injected buckets, enabling data-aware query acceleration in the encrypted domain. To further obfuscate query execution paths, SLS-INDEXbased Range Queries (SLRQ) employs a permutation-based secure bucket prediction protocol. Additionally, we introduce a secure point extraction protocol that generates candidate results to reduce the overhead of secure computation. We provide formal security analysis under realistic leakage functions and implement a prototype to evaluate its practical performance. Extensive experiments on both real-world and synthetic datasets demonstrate that SLRQ significantly outperforms existing solutions in query efficiency while ensuring dataset, query, result, and access pattern privacy.

Paperid: 3355, https://arxiv.org/pdf/2512.03641.pdf

Abstract:
Cyber-deception is an increasingly important defensive strategy, shaping adversarial decision making through controlled misinformation, uncertainty, and misdirection. Although game-theoretic, Bayesian, Markov decision process, and reinforcement learning models offer insight into deceptive interactions, they typically assume an attacker has already chosen to engage. Such approaches overlook cognitive and perceptual factors that influence an attacker's initial decision to engage or withdraw. This paper presents a descriptive model that incorporates the psychological and strategic elements shaping this decision. The model defines five components, belief (B), scepticism (S), deception fidelity (D), reconnaissance (R), and experience (E), which interact to capture how adversaries interpret deceptive cues and assess whether continued engagement is worthwhile. The framework provides a structured method for analysing engagement decisions in cyber-deception scenarios. A series of experiments has been designed to evaluate this model through Capture the Flag activities incorporating varying levels of deception, supported by behavioural and biometric observations. These experiments have not yet been conducted, and no experimental findings are presented in this paper. These experiments will combine behavioural observations with biometric indicators to produce a multidimensional view of adversarial responses. Findings will improve understanding of the factors influencing engagement decisions and refine the model's relevance to real-world cyber-deception settings. By addressing the gap in existing models that presume engagement, this work supports more cognitively realistic and strategically effective cyber-deception practices.

Paperid: 3356, https://arxiv.org/pdf/2512.03580.pdf

Abstract:
We propose the Dynamic Optical Test for Bot Identification (DOT-BI): a quick and easy method that uses human perception of motion to differentiate between human respondents and automated systems in surveys and online processes. In DOT-BI, a 'hidden' number is displayed with the same random black-and-white pixel texture as its background. Only the difference in motion and scale between the number and the background makes the number perceptible to humans across frames, while frame-by-frame algorithmic processing yields no meaningful signal. We conducted two preliminary assessments. Firstly, state-of-the-art, video-capable, multimodal models (GPT-5-Thinking and Gemini 2.5 Pro) fail to extract the correct value, even when given explicit instructions about the mechanism. Secondly, in an online survey (n=182), 99.5% (181/182) of participants solved the task, with an average end-to-end completion time of 10.7 seconds; a supervised lab study (n=39) found no negative effects on perceived ease-of-use or completion time relative to a control. We release code to generate tests and 100+ pre-rendered variants to facilitate adoption in surveys and online processes.

Paperid: 3357, https://arxiv.org/pdf/2512.03551.pdf

Abstract:
Group Authentication Schemes (GAS) are methodologies developed to verify the membership of multiple users simultaneously. These schemes enable the concurrent authentication of several users while eliminating the need for a certification authority. Numerous GAS methods have been explored in the literature, and they can be classified into three distinct generations based on their foundational mathematical principles. First-generation GASs rely on polynomial interpolation and the multiplicative subgroup of a finite field. Second-generation GASs also employ polynomial interpolation, but they distinguish themselves by incorporating elliptic curves over finite fields. While third-generation GASs present a promising solution for scalable environments, they demonstrate a limitation in certain applications. Such applications typically require the identification of users participating in the authentication process. In the third-generation GAS, users are able to verify their credentials while maintaining anonymity. However, there are various applications where the identification of participating users is necessary. In this study, we propose an improved version of third-generation GAS, utilizing inner product spaces and polynomial interpolation to resolve this limitation. We address the issue of preventing malicious actions by legitimate group members. The current third-generation scheme allows members to share group credentials, which can jeopardize group confidentiality. Our proposed scheme mitigates this risk by eliminating the ability of individual users to distribute credentials. However, a potential limitation of our scheme is its reliance on a central authority for authentication in certain scenarios.

Paperid: 3358, https://arxiv.org/pdf/2512.03462.pdf

Abstract:
Malicious URLs remain a primary vector for phishing, malware, and cyberthreats. This study proposes a hybrid deep learning framework combining \texttt{HashingVectorizer} n-gram analysis, SMOTE balancing, Isolation Forest anomaly filtering, and a lightweight neural network classifier for real-time URL classification. The multi-stage pipeline processes URLs from open-source repositories with statistical features (length, dot count, entropy), achieving $O(NL + EBdh)$ training complexity and a 20\,ms prediction latency. Empirical evaluation yields 96.4\% accuracy, 95.4\% F1-score, and 97.3\% ROC-AUC, outperforming CNN (94.8\%) and SVM baselines with a $50\!\times$--$100\!\times$ speedup (Table~\ref{tab:comp-complexity}). A multilingual Tkinter GUI (Arabic/English/French) enables real-time threat assessment with clipboard integration. The framework demonstrates superior scalability and resilience against obfuscated URL patterns.

Paperid: 3359, https://arxiv.org/pdf/2512.03310.pdf

Abstract:
The current literature on memorization in Natural Language Models, especially Large Language Models (LLMs), poses severe security and privacy risks, as models tend to memorize personally identifying information (PIIs) from training data. We introduce Randomized Masked Fine-Tuning (RMFT), a novel privacy-preserving fine-tuning technique that reduces PII memorization while minimizing performance impact. Using the Enron Email Dataset, we demonstrate that RMFT achieves an 80.81% reduction in Total Extraction Rate and 80.17% reduction in Seen Extraction Rate compared to baseline fine-tuning, outperforming deduplication methods while maintaining only a 5.73% increase in perplexity. We present MaxTER, a Pareto-optimal evaluation framework for assessing privacy-utility tradeoffs, and show the performance of RMFT vs Deduplication by Area Under The Response Curve (AURC) metric.

Paperid: 3360, https://arxiv.org/pdf/2512.03121.pdf

Abstract:
Large Multimodal Language Models (MLLMs) are emerging as one of the foundational tools in an expanding range of applications. Consequently, understanding training-data leakage in these systems is increasingly critical. Log-probability-based membership inference attacks (MIAs) have become a widely adopted approach for assessing data exposure in large language models (LLMs), yet their effect in MLLMs remains unclear. We present the first comprehensive evaluation of extending these text-based MIA methods to multimodal settings. Our experiments under vision-and-text (V+T) and text-only (T-only) conditions across the DeepSeek-VL and InternVL model families show that in in-distribution settings, logit-based MIAs perform comparably across configurations, with a slight V+T advantage. Conversely, in out-of-distribution settings, visual inputs act as regularizers, effectively masking membership signals.

Paperid: 3361, https://arxiv.org/pdf/2512.02603.pdf

Abstract:
Through this work we introduce an action of the skew polynomial ring $\mathbb{F}_{q}\left[X; σ, δ\right]$ over $\mathbb{F}_{q}$ based on its polynomial valuation and the concept of left skew product of functions. This lead us to explore the construction of a certain subset $\mathcal{T}(X)\subset\mathbb{F}_{q}\left[X; σ, δ\right]$ that allow us to control the non-commutativity of this ring, and exploit this fact in order to build a public key exchange protocol that is secure in Canetti and Krawczyk model.

Paperid: 3362, https://arxiv.org/pdf/2512.02287.pdf

Abstract:
HOT Protocol provides the infrastructure that allows smart contracts to securely own and manage private keys. The Multi-Party Computation (MPC) Network manages signing keys. By running an MPC node inside a Trusted Execution Environment (TEE), the protocol achieves stronger security guarantees while lowering economic requirements for participants. The NEAR Protocol provides a decentralized and efficient state layer. Key management can be integrated with any smart contract across Stellar, TON, Solana, and EVM-compatible networks.

Paperid: 3363, https://arxiv.org/pdf/2512.02270.pdf

Abstract:
We propose a methodology for falsifying safety properties in robotic vehicle systems through property-guided reduction and surrogate execution. By isolating only the control logic and physical dynamics relevant to a given specification, we construct lightweight surrogate models that preserve property-relevant behaviors while eliminating unrelated system complexity. This enables scalable falsification via trace analysis and temporal logic oracles. We demonstrate the approach on a drone control system containing a known safety flaw. The surrogate replicates failure conditions at a fraction of the simulation cost, and a property-guided fuzzer efficiently discovers semantic violations. Our results suggest that controller reduction, when coupled with logic-aware test generation, provides a practical and scalable path toward semantic verification of cyber-physical systems.

Paperid: 3364, https://arxiv.org/pdf/2512.02259.pdf

Abstract:
The Apache Software Foundation (ASF) ecosystem underpins a vast portion of modern software infrastructure, powering widely used components such as Log4j, Tomcat, and Struts. However, the ubiquity of these libraries has made them prime targets for high-impact security vulnerabilities, as illustrated by incidents like Log4Shell. Despite their widespread adoption, Apache projects are not immune to recurring and severe security weaknesses. We conduct a historical analysis of the Apache ecosystem to follow the "breadcrumb trail of vulnerabilities" by compiling a comprehensive dataset of Common Vulnerabilities and Exposures (CVEs) and Common Weakness Enumerations (CWEs). We examine trends in exploit recurrence, disclosure timelines, and remediation practices. Our analysis is guided by four key research questions: (1) What are the most persistent and repeated CWEs in Apache libraries? (2) How long do CVEs persist before being addressed? (3) What is the delay between CVE introduction and official disclosure? and (4) How long after disclosure are CVEs remediated? We present a detailed timeline of vulnerability lifecycle stages across Apache libraries and offer insights to improve secure coding practices, vulnerability monitoring, and remediation strategies. Our contributions include a curated dataset covering 24,285 Apache libraries, 1,285 CVEs, and 157 CWEs, along with empirical findings and developer-focused recommendations.

Paperid: 3365, https://arxiv.org/pdf/2512.02243.pdf

Abstract:
Phishing remains one of the most prevalent online threats, exploiting human trust to harvest sensitive credentials. Existing URL- and HTML-based detection systems struggle against obfuscation and visual deception. This paper presents \textbf{PhishSnap}, a privacy-preserving, on-device phishing detection system leveraging perceptual hashing (pHash). Implemented as a browser extension, PhishSnap captures webpage screenshots, computes visual hashes, and compares them against legitimate templates to identify visually similar phishing attempts. A \textbf{2024 dataset of 10,000 URLs} (70\%/20\%/10\% train/validation/test) was collected from PhishTank and Netcraft. Due to security takedowns, a subset of phishing pages was unavailable, reducing dataset diversity. The system achieved \textbf{0.79 accuracy}, \textbf{0.76 precision}, and \textbf{0.78 recall}, showing that visual similarity remains a viable anti-phishing measure. The entire inference process occurs locally, ensuring user privacy and minimal latency.

Paperid: 3366, https://arxiv.org/pdf/2512.02157.pdf

Abstract:
The field of AI Control seeks to develop robust control protocols, deployment safeguards for untrusted AI which may be intentionally subversive. However, existing protocols that rely on weaker monitors to detect unsafe behavior often fail on complex tasks beyond the monitor's comprehension. We develop control protocols based on factored cognition, in which a trusted model decomposes a task, an untrusted model solves each resultant child task in isolation, and the results are reassembled into a full solution. These protocols may improve safety by several means, such as by simplifying the context for monitors, or by obscuring vulnerabilities in the environment. We implement our protocols in the APPS coding setting and red team them against backdoor attempts from an adversarial GPT-4.1 Nano. We find that: (i) Adding factored cognition to a trusted monitoring protocol can boost safety from 41% to 63%; (ii) Safety improves because monitor performance improves; (iii) Factored cognition makes it no harder for capable LLMs to write backdoors in APPS. While our protocols show low usefulness in APPS, they hold promise for more complex tasks.

Paperid: 3367, https://arxiv.org/pdf/2512.02087.pdf

Abstract:
We provide a survey on the Hidden Subgroup Problem (HSP), which plays an important role in studying the security of public-key cryptosystems. We first review the abelian case, where Kitaev's algorithm yields an efficient quantum solution to the HSP, recalling how classical problems (such as order finding, integer factorization, and discrete logarithm) can be formulated as abelian HSP instances. We then examine the current state of the art for non-abelian HSP, where no general efficient quantum solution is known, focusing on some relevant groups including dihedral group (connected to the shortest vector problem), symmetric groups (connected to the graph isomorphism problem), and semidirect product constructions (connected, in a special case, to the code equivalence problem). We also describe the main techniques for addressing the HSP in non-abelian cases, namely Fourier sampling and the black-box approach. Throughout the paper, we highlight the mathematical notions required and exploited in this context, providing a cryptography-oriented perspective.

Paperid: 3368, https://arxiv.org/pdf/2512.01845.pdf

Abstract:
Deepfakes are a type of synthetic media created using artificial intelligence, specifically deep learning algorithms. This technology can for example superimpose faces and voices onto videos, creating hyper-realistic but artificial representations. Deepfakes pose significant risks regarding misinformation and fake news, because they can spread false information by depicting public figures saying or doing things they never did, undermining public trust. In this paper, we propose a method that leverages BLS signatures (Boneh, Lynn, and Shacham 2004) to implement signatures that remain valid after image cropping, but are invalidated in all the other types of manipulation, including deepfake creation. Our approach does not require who crops the image to know the signature private key or to be trusted in general, and it is O(1) in terms of signature size, making it a practical solution for scenarios where images are disseminated through web servers and cropping is the primary transformation. Finally, we adapted the signature scheme for the JPEG standard, and we experimentally tested the size of a signed image.

Paperid: 3369, https://arxiv.org/pdf/2512.01727.pdf

Abstract:
Advanced nuclear reactor systems face increasing cybersecurity threats as sophisticated attackers exploit cyber-physical interfaces to manipulate control systems while evading traditional IT security measures. This research presents a comprehensive evaluation of artificial intelligence approaches for cybersecurity protection in nuclear infrastructure, using Argonne National Laboratory's Mechanisms Engineering Test Loop (METL) as an experimental platform. We developed a systematic evaluation framework encompassing four machine learning detection paradigms: Change Point Detection, LSTM-based Anomaly Detection, Dependency Violation analysis, and Autoencoder reconstruction methods. Our comprehensive attack taxonomy includes 15 distinct scenarios targeting reactor control systems, each implemented across five severity tiers to evaluate detection performance under varying attack intensities. The experimental evaluation encompassed 300 rigorous experiments using realistic METL operational data. Change Point Detection emerged as the leading approach with mean AUC performance of 0.785, followed by LSTM Anomaly Detection (0.636), Dependency Violation (0.621), and Autoencoder methods (0.580). Attack detectability varied significantly, with multi-site coordinated attacks proving most detectable (AUC = 0.739) while precision trust decay attacks presented the greatest detection challenge (AUC = 0.592). This work delivers practical performance benchmarks and reference architecture that advance AI-based cybersecurity capabilities for critical nuclear infrastructure, providing essential foundations for operational deployment and enhanced threat response in cyber-physical systems.

Paperid: 3370, https://arxiv.org/pdf/2512.01595.pdf

Abstract:
Android employs a permission framework that empowers users to either accept or deny sharing their private data (for example, location) with an app. However, many apps tend to crash when they are denied permission, leaving users no choice but to allow access to their data in order to use the app. In this paper, we introduce a comprehensive and robust user data spoofing system, WhiteLie, that can spoof a variety of user data and feed it to target apps. Additionally, it detects privacy-violating behaviours, automatically responding by supplying spoofed data instead of the user's real data, without crashing or disrupting the apps. Unlike prior approaches, WhiteLie requires neither device rooting nor altering the app's binary, making it deployable on stock Android devices. Through experiments on more than 70 popular Android apps, we demonstrate that WhiteLie is able to deceive apps into accepting spoofed data without getting detected. Our evaluation further demonstrates that WhiteLie introduces negligible overhead in terms of battery usage, CPU consumption, and app execution latency. Our findings underscore the feasibility of implementing user-centric privacy-enhancing mechanisms within the existing Android ecosystem.

Paperid: 3371, https://arxiv.org/pdf/2512.01437.pdf

Abstract:
We analyze Qubic's advertised selfish mining campaign on Monero in 2025. Combining data from Monero nodes, and the Qubic pool API, we reconstruct Qubic-attributed blocks and hashrate and detect ten intervals consistent with selfish mining strategies. In these intervals, Qubic's average hashrate share rises to the 23-34\% range, yet sustained 51\% control is never observed. We evaluate the campaign against the classical selfish mining model and a modified Markov-chain model that reflects Qubic's conservative release strategy: both predict lower revenue than honest mining at the inferred parameters, and the data largely confirms this while still showing noticeable deviations from the predicted curve. We interpret this gap between model and measurements in terms of Qubic's time-varying hashrate and coarse-grained attack segmentation.

Paperid: 3372, https://arxiv.org/pdf/2512.01326.pdf

Abstract:
Large Language Models (LLMs) are increasingly being deployed in real-world applications, but their flexibility exposes them to prompt injection attacks. These attacks leverage the model's instruction-following ability to make it perform malicious tasks. Recent work has proposed JATMO, a task-specific fine-tuning approach that trains non-instruction-tuned base models to perform a single function, thereby reducing susceptibility to adversarial instructions. In this study, we evaluate the robustness of JATMO against HOUYI, a genetic attack framework that systematically mutates and optimizes adversarial prompts. We adapt HOUYI by introducing custom fitness scoring, modified mutation logic, and a new harness for local model testing, enabling a more accurate assessment of defense effectiveness. We fine-tuned LLaMA 2-7B, Qwen1.5-4B, and Qwen1.5-0.5B models under the JATMO methodology and compared them with a fine-tuned GPT-3.5-Turbo baseline. Results show that while JATMO reduces attack success rates relative to instruction-tuned models, it does not fully prevent injections; adversaries exploiting multilingual cues or code-related disruptors still bypass defenses. We also observe a trade-off between generation quality and injection vulnerability, suggesting that better task performance often correlates with increased susceptibility. Our results highlight both the promise and limitations of fine-tuning-based defenses and point toward the need for layered, adversarially informed mitigation strategies.

Paperid: 3373, https://arxiv.org/pdf/2512.01233.pdf

Abstract:
Capture the Flag (CTF) competitions represent a powerful experiential learning approach within cybersecurity education, blending diverse concepts into interactive challenges. However, the short duration (typically 24-48 hours) and ephemeral infrastructure of these events often impede sustained educational benefit. Learners face substantial barriers in revisiting unsolved challenges, primarily due to the cumbersome process of manually reconstructing and rehosting the challenges without comprehensive documentation or guidance. To address this critical gap, we introduce CTF Archive, a platform designed to preserve the educational value of CTF competitions by centralizing and archiving hundreds of challenges spanning over a decade in fully configured, ready-to-use environments. By removing the complexity of environment setup, CTF Archive allows learners to focus directly on conceptual understanding rather than technical troubleshooting. The availability of these preserved challenges encourages in-depth research and exploration at the learner's pace, significantly enhancing conceptual comprehension without the pressures of live competition. Additionally, public accessibility lowers entry barriers, promoting an inclusive educational experience. Overall, CTF Archive provides a scalable solution to integrate persistent, practical cybersecurity learning into academic curricula.

Paperid: 3374, https://arxiv.org/pdf/2512.01164.pdf

Abstract:
Unmanned Aerial Vehicle (UAV) technologies are gaining high interest for many domains, which makes UAV security of utmost importance. ArduPilot is among the most widely used open-source autopilot UAV frameworks; yet, many studies demonstrate the vulnerabilities affecting such systems. Vulnerabilities within its communication subsystems (including WiFi, telemetry, or GPS) expose critical entry points, and vulnerabilities in Ardupilot can affect the control procedure. In this paper, we reconstruct the software architecture and the control models implemented by ArduPilot and then examine how these control models could potentially misused to induce malicious behaviors while relying on legitimate inputs.

Paperid: 3375, https://arxiv.org/pdf/2512.00857.pdf

Abstract:
This paper studies the problem of Threat Actor (TA) naming convention inconsistency across leading Cyber Threat Intelligence (CTI) vendors. The current decentralized and proprietary nomenclature creates confusion and significant obstacles for researchers, including difficulties in integrating and correlating disparate CTI reports and TA profiles. This paper introduces HiP (Hesperus is Phosphorus, a reference to the classic question about the Morning and the Evening Star), a methodology for normalizing, integrating, and clustering TA names presumably corresponding to the same entity. Using HiP, we analyze a large dataset collected from 15 sources and spanning 13,371 CTI reports, 17 vendor taxonomies, 3,287 TA names, and 8 mappings between them. Our analysis of the resulting name graph provides insights on key features of the problem, such as the concentration of aliases on a relatively small subset of TAs, the evolution of this phenomenon over the years, and the factors that could explain TA name proliferation. We also report errors in the mappings and methodological pitfalls that contribute to make certain TA name clusters larger than they should be, including the use of temporary names for activity clusters, the existence of common tools and infrastructure, and overlapping operations. We conclude with a discussion on the inherent difficulties to adopt a TA naming standard, a quest fundamentally hampered by the need to share highly-sensitive telemetry that is private to each CTI vendor.

Paperid: 3376, https://arxiv.org/pdf/2512.00804.pdf

Abstract:
This paper explores attacks and defenses on vector databases in retrieval-augmented generation (RAG) systems. Prior work on knowledge poisoning attacks primarily inject false or toxic content, which fact-checking or linguistic analysis easily detects. We reveal a new and subtle threat: bias injection attacks, which insert factually correct yet semantically biased passages into the knowledge base to covertly influence the ideological framing of answers generated by large language models (LLMs). We demonstrate that these adversarial passages, though linguistically coherent and truthful, can systematically crowd out opposing views from the retrieved context and steer LLM answers toward the attacker's intended perspective. We precisely characterize this class of attacks and then develop a post-retrieval filtering defense, BiasDef. We construct a comprehensive benchmark based on public question answering datasets to evaluate them. Our results show that: (1) the proposed attack induces significant perspective shifts in LLM answers, effectively evading existing retrieval-based sanitization defenses; and (2) BiasDef outperforms existing methods by reducing adversarial passages retrieved by 15\% which mitigates perspective shift by 6.2\times in answers, while enabling the retrieval of 62\% more benign passages.

Paperid: 3377, https://arxiv.org/pdf/2512.00142.pdf

Abstract:
This research introduces the Decentralized Finance (DeFi) TrustBoost Framework, which combines blockchain technology and Explainable AI to address challenges faced by lenders underwriting small business loan applications from low-wealth households. The framework is designed with a strong emphasis on fulfilling four crucial requirements of blockchain and AI systems: confidentiality, compliance with data protection laws, resistance to adversarial attacks, and compliance with regulatory audits. It presents a technique for tamper-proof auditing of automated AI decisions and a strategy for on-chain (inside-blockchain) and off-chain data storage to facilitate collaboration within and across financial organizations.

Paperid: 3378, https://arxiv.org/pdf/2511.23252.pdf

Abstract:
Federated Learning (FL) offers a promising approach to collaboratively train machine learning models without centralizing raw data, yet its scalability is often throttled by excessive communication overhead. This challenge is magnified in Internet of Things (IoT) environments, where devices face stringent bandwidth, latency, and energy constraints. Conventional secure aggregation protocols, while essential for protecting model updates, frequently require multiple interaction rounds, large payload sizes, and per-client costs rendering them impractical for many edge deployments. In this work, we present Hyb-Agg, a lightweight and communication-efficient secure aggregation protocol that integrates Multi-Key CKKS (MK-CKKS) homomorphic encryption with Elliptic Curve Diffie-Hellman (ECDH)-based additive masking. Hyb-Agg reduces the secure aggregation process to a single, non-interactive client-to-server transmission per round, ensuring that per-client communication remains constant regardless of the number of participants. This design eliminates partial decryption exchanges, preserves strong privacy under the RLWE, CDH, and random oracle assumptions, and maintains robustness against collusion by the server and up to $N-2$ clients. We implement and evaluate Hyb-Agg on both high-performance and resource-constrained devices, including a Raspberry Pi 4, demonstrating that it delivers sub-second execution times while achieving a constant communication expansion factor of approximately 12x over plaintext size. By directly addressing the communication bottleneck, Hyb-Agg enables scalable, privacy-preserving federated learning that is practical for real-world IoT deployments.

Paperid: 3379, https://arxiv.org/pdf/2511.23198.pdf

Abstract:
Recent years have shown that malware attacks still happen with high frequency. Malware experts seek to categorize and classify incoming samples to confirm their trustworthiness or prove their maliciousness. One of the ways in which groups of malware samples can be identified is through malware clustering. Despite the efforts of the community, malware clustering which incorporates benign samples has been under-explored. Moreover, despite the availability of larger public benchmark malware datasets, malware clustering studies have avoided fully utilizing these datasets in their experiments, often resorting to small datasets with only a few families. Additionally, the current state-of-the-art solutions for malware clustering remain unclear. In our study, we evaluate malware clustering quality and establish the state-of-the-art on Bodmas and Ember - two large public benchmark malware datasets. Ours is the first study of malware clustering performed on whole malware benchmark datasets. Additionally, we extend the malware clustering task by incorporating benign samples. Our results indicate that incorporating benign samples does not significantly degrade clustering quality. We find that there are differences in the quality of the created clusters between Ember and Bodmas, as well as a private industry dataset. Contrary to popular opinion, our top clustering performers are K-Means and BIRCH, with DBSCAN and HAC falling behind.

Paperid: 3380, https://arxiv.org/pdf/2511.23183.pdf

Abstract:
Given the constant growth and increasing sophistication of cyberattacks, cybersecurity can no longer rely solely on traditional defense techniques and tools. Proactive detection of cyber threats has become essential to help security teams identify potential risks and implement effective mitigation measures. Cyber Threat Intelligence (CTI) plays a key role by providing security analysts with evidence-based knowledge about cyber threats. CTI information can be extracted using various techniques and data sources; however, machine learning has proven promising. As for data sources, social networks and online discussion forums are commonly explored. In this study, we apply text mining techniques and machine learning to data collected from Dark Web forums in Brazilian Portuguese to identify malicious posts. Our contributions include the creation of three original datasets, a novel multi-stage labeling process combining indicators of compromise (IoCs), contextual keywords, and manual analysis, and a comprehensive evaluation of text representations and classifiers. To our knowledge, this is the first study to focus specifically on Brazilian Portuguese content in this domain. The best-performing model, using LightGBM and TF-IDF, was able to detect relevant posts with high accuracy. We also applied topic modeling to validate the model's outputs on unlabeled data, confirming its robustness in real-world scenarios.

Paperid: 3381, https://arxiv.org/pdf/2511.22859.pdf

Abstract:
Based on the provided LaTeX code, here is the metadata for the submission form: Title: TokCom-UEP: Semantic Importance-Matched Unequal Error Protection for Resilient Image Transmission Author(s): Kaizheng Zhang, Zuolin Jin, Zhihang Cheng, Ming Zeng, Li Qiao, Zesong Fei Abstract: Token communication (TokCom), an emerging semantic communication framework powered by Large Multimodal Model (LMM), has become a key paradigm for resilient data transmission in 6G networks. A key limitation of existing TokCom designs lies in the assumption of uniform token importance, which leads to the adoption of equal error protection (EEP). However, compressed one-dimensional (1D) token sequences inherently exhibit heterogeneous semantic importance hierarchies, rendering EEP schemes suboptimal. To address this, this paper proposes TokCom-UEP, a novel semantic importance-matched unequal error protection (UEP) framework designed for resilient image transmission. TokCom-UEP integrates rateless UEP coding with the non-uniform semantic importance of tokens by partitioning source tokens into nested expanding windows, assigning higher selection probabilities to windows containing critical tokens to ensure their prioritized recovery. Simulation results demonstrate that TokCom-UEP outperforms EEP schemes in terms of three core semantic restoration metrics and spectral efficiency under low-overhead conditions.

Paperid: 3382, https://arxiv.org/pdf/2511.22788.pdf

Abstract:
Large Language Models (LLMs) demonstrate impressive capabilities in natural language understanding and generation, but incur high communication overhead and privacy risks in cloud deployments, while facing compute and memory constraints when confined to edge devices. Cloud-edge inference has emerged as a promising paradigm for improving privacy in LLM services by retaining sensitive computations on local devices. However, existing cloud-edge inference approaches apply uniform privacy protection without considering input sensitivity, resulting in unnecessary perturbation and degraded utility even for non-sensitive tokens. To address this limitation, we propose Privacy-aware Routing for Inference with Semantic Modulation (PRISM), a context-aware framework that dynamically balances privacy and inference quality. PRISM executes in four stages: (1) the edge device profiles entity-level sensitivity; (2) a soft gating module on the edge selects an execution mode - cloud, edge, or collaboration; (3) for collaborative paths, the edge applies adaptive two-layer local differential privacy based on entity risks; and (4) the cloud LLM generates a semantic sketch from the perturbed prompt, which is then refined by the edge-side small language model (SLM) using local context. Our results show that PRISM consistently achieves superior privacy-utility trade-offs across various scenarios, reducing energy consumption and latency to 40-50% of baseline methods such as Uniform and Selective LDP, while maintaining high output quality under strong privacy constraints. These findings are validated through comprehensive evaluations involving realistic prompts, actual energy measurements, and heterogeneous cloud-edge model deployments.

Paperid: 3383, https://arxiv.org/pdf/2511.22340.pdf

Abstract:
The Replay Protected Memory Block (RPMB) in modern storage systems provides a secure area where data integrity is ensured by authentication. This block is used in digital devices to store pivotal information that must be safeguarded against modification by potential attackers. This paper targets the authentication scheme of the RPMB in three different eMMCs from a major manufacturer. A glitch was injected by sending an electromagnetic pulse to the target chip. RPMB authentication was successfully glitched and the information stored in two target eMMCs was overwritten with arbitrary data, without affecting the integrity of other data.

Paperid: 3384, https://arxiv.org/pdf/2511.22259.pdf

Abstract:
A Covert Channel (CC) exploits legitimate communication mechanisms to stealthily transmit information, often bypassing traditional security controls. Among these, a novel paradigm called History Covert Channels (HCC) leverages past network events as reference points to embed covert messages. Unlike traditional timing- or storage-based CCs, which directly manipulate traffic patterns or packet contents, HCCs minimize detectability by encoding information through small pointers to historical data. This approach enables them to amplify the size of transmitted covert data by referring to more bits than are actually embedded. Recent research has explored the feasibility of such methods, demonstrating their potential to evade detection by repurposing naturally occurring network behaviors as a covert transmission medium. This paper introduces a novel method for establishing and maintaining covert communication links using relative pointers to network timing patterns, which minimizes the reliance of the HCC on centralized timekeeping and reduces the likelihood of being detected by standard network monitoring tools. We also explore the tailoring of HCCs to optimize their robustness and undetectability characteristics. Our experiments reveal a better bitrate compared to previous work.

Paperid: 3385, https://arxiv.org/pdf/2511.22215.pdf

Abstract:
Online pornography and gambling have consistently posed regulatory challenges for governments, threatening both personal assets and privacy. Therefore, it is imperative to research the classification of the newly registered Pornographic and Gambling Domain Names (PGDN). However, scholarly investigation into this topic is limited. Previous efforts in PGDN classification pursue high accuracy using ideal sample data, while others employ up-to-date data from real-world scenarios but achieve lower classification accuracy. This paper introduces the Real-PGDN method, which accomplishes a complete process of timely and comprehensive real-data crawling, feature extraction with feature-missing tolerance, precise PGDN classification, and assessment of application effects in actual scenarios. Our two-level classifier, which integrates CoSENT (BERT-based), Multilayer Perceptron (MLP), and traditional classification algorithms, achieves a 97.88% precision. The research process amasses the NRD2024 dataset, which contains continuous detection information over 20 days for 1,500,000 newly registered domain names across 6 directions. Results from our case study demonstrate that this method also maintains a forecast precision of over 70% for PGDN that are delayed in usage after registration.

Paperid: 3386, https://arxiv.org/pdf/2511.22117.pdf

Abstract:
Formal Concept Analysis (FCA) is extensively used in knowledge extraction, cognitive concept learning, and data mining. However, its computational demands on large-scale datasets often require outsourcing to external computing services, raising concerns about the leakage of sensitive information. To address this challenge, we propose a novel approach to enhance data security and privacy in FCA-based computations. Specifically, we introduce a Privacy-preserving Formal Context Analysis (PFCA) framework that combines binary data representation with homomorphic encryption techniques. This method enables secure and efficient concept construction without revealing private data. Experimental results and security analysis confirm the effectiveness of our approach in preserving privacy while maintaining computational performance. These findings have important implications for privacy-preserving data mining and secure knowledge discovery in large-scale FCA applications.

Paperid: 3387, https://arxiv.org/pdf/2511.21999.pdf

Abstract:
Although our lives are increasingly transitioning into the digital world, many digital assets still relate to objects or places in the physical world, e.g., websites of stores or restaurants, digital documents claiming property ownership, or digital identifiers encoded in QR codes for mobile payments in shops. Currently, users cannot securely associate digital assets with their related physical space, leading to problems such as fake brand stores, property fraud, and mobile payment scams. In many cases, the necessary information to protect digital assets exists, e.g., via contractual relationships and cadaster entries, but there is currently no uniform way of retrieving and verifying these documents. In this work, we propose the Geo-Enabled Cryptographic Key Oracle (GECKO), a geographical PKI that provides a global view of digital assets based on their geo-location and occupied space. GECKO allows for the bidirectional translation of trust between the physical and digital world. Users can verify which assets are supposed to exist at their location, as well as verify which physical space is claimed by a digital entity. GECKO supplements current PKI systems and can be used in addition to current systems when its properties are of value. We show the feasibility of efficiently storing millions of assets and serving cryptographic material based on precise location queries within 11 ms at a rate of more than 19000 queries per second on a single server.

Paperid: 3388, https://arxiv.org/pdf/2511.21795.pdf

Abstract:
Cloud security is an important concern. To identify and stop cyber threats, efficient data collection methods are necessary. This research presents an innovative method to cloud security by integrating numerous data sources and modalities with multi-modal deep learning autoencoders. The Multi-Modal Deep Learning Ensemble Architecture (MMDLEA), a unique approach for anomaly detection and classification in multi-modal data, is proposed in this study. The proposed design integrates the best features of six deep learning models: Multi-Modal Deep Learning Autoencoder (MMDLA), Anomaly Detection using Adaptive Metric Learning (ADAM), ADADELTA, ADAGRAD, RMSPROP, and Stacked Graph Transformer (SGT). A final prediction is produced by combining the outputs of all the models, each of which is trained using a distinct modality of the data. Based on the test dataset, the recommended MMDLA architecture achieves an accuracy of 98.5% and an F1-score of 0.985, demonstrating its superior performance over each individual model. Of the different models, the ADAM model performs the best, with an accuracy of 96.2% and an F1-score of 0.962. With an F1-score of 0.955 and an accuracy of 95.5%, the ADADELTA model trails closely behind. MMDLA obtains an F1-score of 0.948 and an accuracy of 94.8%. Additionally, the suggested MMDLEA design exhibits enhanced resilience to fluctuating modalities and noisy data, proving its usefulness in practical settings. Future study in this area is made possible by the results, which show the potential of the proposed framework for abnormal identification and categorization in multi-modal data.

Paperid: 3389, https://arxiv.org/pdf/2511.20920.pdf

Abstract:
The Model Context Protocol (MCP) replaces static, developer-controlled API integrations with more dynamic, user-driven agent systems, which also introduces new security risks. As MCP adoption grows across community servers and major platforms, organizations encounter threats that existing AI governance frameworks (such as NIST AI RMF and ISO/IEC 42001) do not yet cover in detail. We focus on three types of adversaries that take advantage of MCP s flexibility: content-injection attackers that embed malicious instructions into otherwise legitimate data; supply-chain attackers who distribute compromised servers; and agents who become unintentional adversaries by over-stepping their role. Based on early incidents and proof-of-concept attacks, we describe how MCP can increase the attack surface through data-driven exfiltration, tool poisoning, and cross-system privilege escalation. In response, we propose a set of practical controls, including per-user authentication with scoped authorization, provenance tracking across agent workflows, containerized sandboxing with input/output checks, inline policy enforcement with DLP and anomaly detection, and centralized governance using private registries or gateway layers. The aim is to help organizations ensure that unvetted code does not run outside a sandbox, tools are not used beyond their intended scope, data exfiltration attempts are detectable, and actions can be audited end-to-end. We close by outlining open research questions around verifiable registries, formal methods for these dynamic systems, and privacy-preserving agent operations.

Paperid: 3390, https://arxiv.org/pdf/2511.20902.pdf

Abstract:
This work presents a review of attack methodologies targeting Pix, the instant payment system launched by the Central Bank of Brazil in 2020. The study aims to identify and classify the main types of fraud affecting users and financial institutions, highlighting the evolution and increasing sophistication of these techniques. The methodology combines a structured literature review with exploratory interviews conducted with professionals from the banking sector. The results show that fraud schemes have evolved from purely social engineering approaches to hybrid strategies that integrate human manipulation with technical exploitation. The study concludes that security measures must advance at the same pace as the growing complexity of attack methodologies, with particular emphasis on adaptive defenses and continuous user awareness.

Paperid: 3391, https://arxiv.org/pdf/2511.20832.pdf

Abstract:
Data imputation is an important data preparation task where the data analyst replaces missing or erroneous values to increase the expected accuracy of downstream analyses. The accuracy improvement of data imputation extends to private data analyses across distributed databases. However, existing data imputation methods violate the privacy of the data rendering the privacy protection in the downstream analyses obsolete. We conclude that private data analysis requires private data imputation. In this paper, we present the first optimized protocols for private data imputation. We consider the case of horizontally and vertically split data sets. Our optimization aims to reduce most of the computation to private set intersection (or at least oblivious programmable pseudo-random function) protocols which can be very efficiently computed. We show that private data imputation has -- on average across all evaluated datasets -- an accuracy advantage of 20\% in case of vertically split data and 5\% in case of horizontally split data over imputing data locally. In case of the worst data split we observed that imputing using our method resulted in an increase of up to 32.7 times in the quality of imputation over the vertically split data and 3.4 times in case of horizontally split data. Our protocols are very efficient and run in 2.4 seconds in case of vertically split data and 8.4 seconds in case of horizontally split data for 100,000 records evaluated in the 10 Gbps network setting, performing one data imputation.

Paperid: 3392, https://arxiv.org/pdf/2511.20710.pdf

Abstract:
In the age of agentic AI, the growing deployment of multi-modal models (MMs) has introduced new attack vectors that can leak sensitive training data in MMs, causing privacy leakage. This paper investigates a black-box privacy attack, i.e., membership inference attack (MIA) on multi-modal vision-language models (VLMs). State-of-the-art research analyzes privacy attacks primarily to unimodal AI-ML systems, while recent studies indicate MMs can also be vulnerable to privacy attacks. While researchers have demonstrated that biologically inspired neural network representations can improve unimodal model resilience against adversarial attacks, it remains unexplored whether neuro-inspired MMs are resilient against privacy attacks. In this work, we introduce a systematic neuroscience-inspired topological regularization (tau) framework to analyze MM VLMs resilience against image-text-based inference privacy attacks. We examine this phenomenon using three VLMs: BLIP, PaliGemma 2, and ViT-GPT2, across three benchmark datasets: COCO, CC3M, and NoCaps. Our experiments compare the resilience of baseline and neuro VLMs (with topological regularization), where the tau > 0 configuration defines the NEURO variant of VLM. Our results on the BLIP model using the COCO dataset illustrate that MIA attack success in NEURO VLMs drops by 24% mean ROC-AUC, while achieving similar model utility (similarities between generated and reference captions) in terms of MPNet and ROUGE-2 metrics. This shows neuro VLMs are comparatively more resilient against privacy attacks, while not significantly compromising model utility. Our extensive evaluation with PaliGemma 2 and ViT-GPT2 models, on two additional datasets: CC3M and NoCaps, further validates the consistency of the findings. This work contributes to the growing understanding of privacy risks in MMs and provides evidence on neuro VLMs privacy threat resilience.

Paperid: 3393, https://arxiv.org/pdf/2511.19711.pdf

Abstract:
Machine learning (ML) involves private data and proprietary model parameters. MPC-based ML allows multiple parties to collaboratively run an ML workload without sharing their private data or model parameters using multi-party computing (MPC). Because MPC cannot natively run ML operations such as Softmax or GELU, existing frameworks use different approximations. Our study shows that, on a well-optimized framework, these approximations often become the dominating bottleneck. Popular approximations are often insufficiently accurate or unnecessarily slow, and these issues are hard to identify and fix in existing frameworks. To tackle this issue, we propose a compiler for MPC-based ML, CrypTorch. CrypTorch disentangles these approximations with the rest of the MPC runtime, allows easily adding new approximations through its programming interface, and automatically selects approximations to maximize both performance and accuracy. Built as an extension to PyTorch 2's compiler, we show that CrypTorch's auto-tuning alone provides 1.20--1.7$\times$ immediate speedup without sacrificing accuracy, and 1.31--1.8$\times$ speedup when some accuracy degradation is allowed, compared to our well-optimized baseline. Combined with better engineering and adoption of state-of-the-art practices, the entire framework brings 3.22--8.6$\times$ end-to-end speedup compared to the popular framework, CrypTen.

Paperid: 3394, https://arxiv.org/pdf/2511.19670.pdf

Abstract:
Cyber-Physical Systems have played an essential role in our daily lives, providing critical services such as power and water, whose operability, availability, and reliability must be ensured. The C programming language, prevalent in CPS development, is crucial for system control where reliability is critical. However, it is also commonly susceptible to vulnerabilities, particularly buffer overflows. Traditional vulnerability discovery techniques often struggle with scalability and precision when applied directly to the binary code of C programs, which can thereby keep programs vulnerable. This work introduces a novel approach designed to overcome these limitations by leveraging model checking and concolic execution techniques to automatically verify security properties of a program's stack memory in binary code, trampoline techniques to perform automated repair of the issues, and crash-inducing inputs to verify if they were successfully removed. The approach constructs a Memory State Space -- MemStaCe -- from the binary program's control flow graph and simulations, provided by concolic execution, of C function calls and loop constructs. The security properties, defined in LTL, model the correct behaviour of functions associated with vulnerabilities and allow the approach to identify vulnerabilities in MemStaCe by analysing counterexample traces that are generated when a security property is violated. These vulnerabilities are then addressed with a trampoline-based binary patching method, and the effectiveness of the patches is checked with crash-inducing inputs extracted during concolic execution. We implemented the approach in the BASICS tool for BO mitigation and evaluated using the Juliet C/C++ and SARD datasets and real applications, achieving an accuracy and precision above 87%, both in detection and correction. Also, we compared it with CWE Checker, outperforming it.

Paperid: 3395, https://arxiv.org/pdf/2511.19654.pdf

Abstract:
This study examines whether Low-Rank Adaptation (LoRA) fine-tuned Large Language Models (LLMs) can approximate the performance of fully fine-tuned models in generating human-interpretable decisions and explanations for malware classification. Achieving trustworthy malware detection, particularly when LLMs are involved, remains a significant challenge. We developed an evaluation framework using Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and Semantic Similarity Metrics to benchmark explanation quality across five LoRA configurations and a fully fine-tuned baseline. Results indicate that full fine-tuning achieves the highest overall scores, with BLEU and ROUGE improvements of up to 10% over LoRA variants. However, mid-range LoRA models deliver competitive performance exceeding full fine-tuning on two metrics while reducing model size by approximately 81% and training time by over 80% on a LoRA model with 15.5% trainable parameters. These findings demonstrate that LoRA offers a practical balance of interpretability and resource efficiency, enabling deployment in resource-constrained environments without sacrificing explanation quality. By providing feature-driven natural language explanations for malware classifications, this approach enhances transparency, analyst confidence, and operational scalability in malware detection systems.

Paperid: 3396, https://arxiv.org/pdf/2511.19331.pdf

Abstract:
The science of science is an emerging field that studies the practice of science itself. We present the first study of the cybersecurity discipline from a science of science perspective. We examine the evolution of two comparable interdisciplinary communities in cybersecurity: the Symposium on Usable Privacy and Security (SOUPS) and Financial Cryptography and Data Security (FC).

Paperid: 3397, https://arxiv.org/pdf/2511.19015.pdf

Abstract:
Differential Privacy (DP) is a widely adopted standard for privacy-preserving data analysis, but it assumes a uniform privacy budget across all records, limiting its applicability when privacy requirements vary with data values. Per-record Differential Privacy (PrDP) addresses this by defining the privacy budget as a function of each record, offering better alignment with real-world needs. However, the dependency between the privacy budget and the data value introduces challenges in protecting the budget's privacy itself. Existing solutions either handle specific privacy functions or adopt relaxed PrDP definitions. A simple workaround is to use the global minimum of the privacy function, but this severely degrades utility, as the minimum is often set extremely low to account for rare records with high privacy needs. In this work, we propose a general and practical framework that enables any standard DP mechanism to support PrDP, with error depending only on the minimal privacy requirement among records actually present in the dataset. Since directly revealing this minimum may leak information, we introduce a core technique called privacy-specified domain partitioning, which ensures accurate estimation without compromising privacy. We also extend our framework to the local DP setting via a novel technique, privacy-specified query augmentation. Using our framework, we present the first PrDP solutions for fundamental tasks such as count, sum, and maximum estimation. Experimental results show that our mechanisms achieve high utility and significantly outperform existing Personalized DP (PDP) methods, which can be viewed as a special case of PrDP with relaxed privacy protection.

Paperid: 3398, https://arxiv.org/pdf/2511.18966.pdf

Abstract:
The security of code generated by large language models (LLMs) is a significant concern, as studies indicate that such code often contains vulnerabilities and lacks essential defensive programming constructs. This work focuses on examining and evaluating the security of LLM-generated code, particularly in the context of C/C++. We categorized known vulnerabilities using the Common Weakness Enumeration (CWE) and, to study their criticality, mapped them to CVEs. We used ten different LLMs for code generation and analyzed the outputs through static analysis. The amount of CWEs present in AI-generated code is concerning. Our findings highlight the need for developers to be cautious when using LLM-generated code. This study provides valuable insights to advance automated code generation and encourage further research in this domain.

Paperid: 3399, https://arxiv.org/pdf/2511.18498.pdf

Abstract:
Opening up data produced by the Internet of Things (IoT) and mobile devices for public utilization can maximize their economic value. Challenges remain in the trustworthiness of the data sources and the security of the trading process, particularly when there is no trust between the data providers and consumers. In this paper, we propose DEXO, a decentralized data exchange mechanism that facilitates secure and fair data exchange between data consumers and distributed IoT/mobile data providers at scale, allowing the consumer to verify the data generation process and the providers to be compensated for providing authentic data, with correctness guarantees from the exchange platform. To realize this, DEXO extends the decentralized oracle network model that has been successful in the blockchain applications domain to incorporate novel hardware-cryptographic co-design that harmonizes trusted execution environment, secret sharing, and smart contract-assisted fair exchange. For the first time, DEXO ensures end-to-end data confidentiality, source verifiability, and fairness of the exchange process with strong resilience against participant collusion. We implemented a prototype of the DEXO system to demonstrate feasibility. The evaluation shows a moderate deployment cost and significantly improved blockchain operation efficiency compared to a popular data exchange mechanism.

Paperid: 3400, https://arxiv.org/pdf/2511.18412.pdf

Abstract:
In this work, we present ioPUF+, which incorporates a novel Physical Unclonable Function (PUF) that generates unique fingerprints for Integrated Circuits (ICs) and the IoT nodes encompassing them. The proposed PUF generates device-specific responses by measuring the pull-up and pull-down resistor values on the I/O pins of the ICs, which naturally vary across chips due to manufacturing-induced process variations. Since these resistors are already integrated into the I/O structures of most ICs, ioPUF+ requires no custom circuitry, and no new IC fabrication. This makes ioPUF+ suitable for cost-sensitive embedded systems built from Commercial Off-The-Shelf (COTS) components. Beyond introducing a new PUF, ioPUF+ includes a complete datapath for converting raw PUF responses into cryptographically usable secret keys using BCH error correction and SHA-256 hashing. Further ioPUF+ also demonstrate a practical use case of PUF derive secret keys in securing device-to-device communication using AES-encryption. We implemented ioPUF+ on the Infineon PSoC-5 microcontroller and evaluated its performance across 30 devices using standard PUF metrics. The results show excellent reliability (intra-device Hamming distance of 100.00%), strong uniqueness (inter-device Hamming distance of 50.33%), near-ideal uniformity (50.54%), and negligible bit aliasing. Stability tests under temperature and supply-voltage variations show worst-case bit-error rates of only 2.63% and 2.10%, respectively. We also profiled the resource and energy usage of the complete ioPUF+ system, including the PUF primitive, BCH decoding, SHA-256 hashing, and AES encryption. The full implementation requires only 19.8 KB of Flash, exhibits a latency of 600 ms, and consumes 79 mW of power, demonstrating the suitabilitiy of ioPUF+ for resource-constrained IoT nodes.

Paperid: 3401, https://arxiv.org/pdf/2511.18379.pdf

Abstract:
Self-Sovereign Identity (SSI) grants holders full ownership and control of their digital identities, being the ultimate digital identity model. Operating in a decentralized manner, SSI enables the verification of claims, including privacy-preserving mechanisms. Blockchain, which can be used to implement a Verifiable Data Registry (VDR), is often considered one of the pillars of SSI, along with Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs). Unfortunately, blockchains are mostly siloed, affecting the interoperability and universality of SSI. We investigate the effect of blockchain isolation on blockchain-based SSI. We first define possible scenarios for cross-chain SSI and exemplify with real-life use cases. We then define specific requirements for cross-chain SSI and identify challenges, also in relation to the identified scenarios. We explore various solutions to achieve blockchain interoperability, with a focus on SSI. In particular, we identify the advantages and disadvantages of distinct cross-chain models for cross-chain SSI. Finally, we address the usability of cross-chain SSI and discuss security and privacy aspects, opening the way for future research.

Paperid: 3402, https://arxiv.org/pdf/2511.18226.pdf

Abstract:
In this paper, we propose a novel approach for optimizing the linear layer used in symmetric cryptography. It is observed that these matrices often have circulant structure. The basic idea of this work is to utilize the property to construct a sequence of transformation matrices, which allows subsequent heuristic algorithms to find more efficient implementations. Our results outperform previous works for various linear layers of block ciphers. For Whirlwind M0 , we obtain two implementations with 159 XOR counts (8% better than Yuan et al. at FSE 2025) and depth 17 (39% better than Shi et al. at AsiaCrypt 2024) respectively. For AES MixColumn, our automated method produces a quantum circuit with depth 10, which nearly matches the manually optimized state-of-the-art result by Zhang et al. at IEEE TC 2024, only with 2 extra CNOTs.

Paperid: 3403, https://arxiv.org/pdf/2511.18223.pdf

Abstract:
Intrusion Detection Systems (IDS) play a vital role in defending modern cyber physical systems against increasingly sophisticated cyber threats. Deep Reinforcement Learning-based IDS, have shown promise due to their adaptive and generalization capabilities. However, recent studies reveal their vulnerability to adversarial attacks, including Universal Adversarial Perturbations (UAPs), which can deceive models with a single, input-agnostic perturbation. In this work, we propose a novel UAP attack against Deep Reinforcement Learning (DRL)-based IDS under the domain-specific constraints derived from network data rules and feature relationships. To the best of our knowledge, there is no existing study that has explored UAP generation for the DRL-based IDS. In addition, this is the first work that focuses on developing a UAP against a DRL-based IDS under realistic domain constraints based on not only the basic domain rules but also mathematical relations between the features. Furthermore, we enhance the evasion performance of the proposed UAP, by introducing a customized loss function based on the Pearson Correlation Coefficient, and we denote it as Customized UAP. To the best of our knowledge, this is also the first work using the PCC value in the UAP generation, even in the broader context. Four additional established UAP baselines are implemented for a comprehensive comparison. Experimental results demonstrate that our proposed Customized UAP outperforms two input-dependent attacks including Fast Gradient Sign Method (FGSM), Basic Iterative Method (BIM), and four UAP baselines, highlighting its effectiveness for real-world adversarial scenarios.

Paperid: 3404, https://arxiv.org/pdf/2511.18098.pdf

Abstract:
This paper presents an empirical investigation into the capabilities of Large Language Models (LLMs) to perform automated Attribute-based Access Control (ABAC) policy mining. While ABAC provides fine-grained, context-aware access management, the increasing number and complexity of access policies can make their formulation and evaluation rather challenging. To address the task of synthesizing concise yet accurate policies, we evaluate the performance of some of the state-of-the-art LLMs, specifically Google Gemini (Flash and Pro) and OpenAI ChatGPT, as potential policy mining engines. An experimental framework was developed in Python to generate randomized access data parameterized by varying numbers of subjects, objects, and initial policy sets. The baseline policy sets, which govern permission decisions between subjects and objects, serve as the ground truth for comparison. Each LLM-generated policy was evaluated against the baseline policy using standard performance metrics. The results indicate that LLMs can effectively infer compact and valid ABAC policies for small-scale scenarios. However, as the system size increases, characterized by higher numbers of subjects and objects, LLM outputs exhibit declining accuracy and precision, coupled with significant increase in the size of policy generated, which is beyond the optimal size. These findings highlight both the promise and limitations of current LLM architectures for scalable policy mining in access control domains. Future work will explore hybrid approaches that combine prompt optimization with classical rule mining algorithms to improve scalability and interpretability in complex ABAC environments.

Paperid: 3405, https://arxiv.org/pdf/2511.18025.pdf

Abstract:
Data streams collected from multiple sources are rarely independent. Values evolve over time and influence one another across sequences. These correlations improve prediction in healthcare, finance, and smart-city control yet violate the record-independence assumption built into most Differential Privacy (DP) mechanisms. To restore rigorous privacy guarantees without sacrificing utility, we introduce Correlated-Sequence Differential Privacy (CSDP), a framework specifically designed for preserving privacy in correlated sequential data. CSDP addresses two linked challenges: quantifying the extra information an attacker gains from joint temporal and cross-sequence links, and adding just enough noise to hide that information while keeping the data useful. We model multivariate streams as a Coupling Markov Chain, yielding the derived loose leakage bound expressed with a few spectral terms and revealing a counterintuitive result: stronger coupling can actually decrease worst-case leakage by dispersing perturbations across sequences. Guided by these bounds, we build the Freshness-Regulated Adaptive Noise (FRAN) mechanism--combining data aging, correlation-aware sensitivity scaling, and Laplace noise--that runs in linear time. Tests on two-sequence datasets show that CSDP improves the privacy-utility trade-off by approximately 50% over existing correlated-DP methods and by two orders of magnitude compared to the standard DP approach.

Paperid: 3406, https://arxiv.org/pdf/2511.17726.pdf

Abstract:
Recent work has shown that out-of-order and speculative execution mechanisms used to increase performance in the majority of processors expose the processors to critical attacks. These attacks, called Meltdown and Spectre, exploit the side effects of performance-enhancing features in modern microprocessors to expose secret data through side channels in the microarchitecture. The well known implementations of these attacks exploit cache-based side channels since they are the least noisy channels to exfiltrate data. While some software patches attempted to mitigate these attacks, they are ad-hoc and only try to fix the side effects of the vulnerabilites. They may also impose a performance overhead of up to 30%. In this paper, we present a microarchitecture-based solution for Meltdown and Spectre that addresses the vulnerabilities exploited by the attacks. Our solution prevents flushed instructions from exposing data to the cache. Our approach can also be extended to other memory structures in the microarchitecture thereby preventing variants of the attacks which exploit these memory structures. We further identify two new variant attacks based on exploiting the side effects of speculative and out-of-order execution and show how our solution can be used to prevent these attacks. Evaluation results show that our microarchitectural solution not only restores secure out-of-order and speculative execution, but also has relatively low overhead and does not significantly impact performance for most applications.

Paperid: 3407, https://arxiv.org/pdf/2511.17464.pdf

Abstract:
We present a patient-centric architecture for electronic health record (EHR) sharing that separates content storage from authorization and audit. Encrypted FHIR resources are stored off-chain; a public blockchain records only cryptographic commitments and patient-signed, time-bounded permissions using EIP-712. Keys are distributed via public-key wrapping, enabling storage providers to remain honest-but-curious without risking confidentiality. We formalize security goals (confidentiality, integrity, cryptographically attributable authorization, and auditability of authorization events) and provide a Solidity reference implementation deployed as single-patient contracts. On-chain costs for permission grants average 78,000 gas (L1), and end-to-end access latency for 1 MB records is 0.7--1.4s (mean values for S3 and IPFS respectively), dominated by storage retrieval. Layer-2 deployment reduces gas usage by 10--13x, though data availability charges dominate actual costs. We discuss metadata privacy, key registry requirements, and regulatory considerations (HIPAA/GDPR), demonstrating a practical route to restoring patient control while preserving security properties required for sensitive clinical data.

Paperid: 3408, https://arxiv.org/pdf/2511.17283.pdf

Abstract:
With the rapid growth of IoT, secure and efficient mesh networking has become essential. Thread has emerged as a key protocol, widely used in smart-home and commercial systems, and serving as a core transport layer in the Matter standard. This paper presents ThreadFuzzer, the first dedicated fuzzing framework for systematically testing Thread protocol implementations. By manipulating packets at the MLE layer, ThreadFuzzer enables fuzzing of both virtual OpenThread nodes and physical Thread devices. The framework incorporates multiple fuzzing strategies, including Random and Coverage-based fuzzers from CovFuzz, as well as a newly introduced TLV Inserter, designed specifically for TLV-structured MLE messages. These strategies are evaluated on the OpenThread stack using code-coverage and vulnerability-discovery metrics. The evaluation uncovered five previously unknown vulnerabilities in the OpenThread stack, several of which were successfully reproduced on commercial devices that rely on OpenThread. Moreover, ThreadFuzzer was benchmarked against an oracle AFL++ setup using the manually extended OSS-Fuzz harness from OpenThread, demonstrating strong effectiveness. These results demonstrate the practical utility of ThreadFuzzer while highlighting challenges and future directions in the wireless protocol fuzzing research space.

Paperid: 3409, https://arxiv.org/pdf/2511.17113.pdf

Abstract:
Network Intrusion Detection Systems (NIDS) are essential tools for detecting network attacks and intrusions. While extensive research has explored the use of supervised Machine Learning for attack detection and characterisation, these methods require accurately labelled datasets, which are very costly to obtain. Moreover, existing public datasets have limited and/or outdated attacks, and many of them suffer from mislabelled data. To reduce the reliance on labelled data, we propose AutoGraphAD, a novel unsupervised anomaly detection approach based on a Heterogeneous Variational Graph Autoencoder. AutoGraphAD operates on heterogeneous graphs, made from connection and IP nodes that capture network activity within a time window. The model is trained using unsupervised and contrastive learning, without relying on any labelled data. The reconstruction, structural loss, and KL divergence are then weighted and combined in an anomaly score that is then used for anomaly detection. Overall, AutoGraphAD yields the same, and in some cases better, results than previous unsupervised approaches, such as Anomal-E, but without requiring costly downstream anomaly detectors. As a result, AutoGraphAD achieves around 1.18 orders of magnitude faster training and 1.03 orders of magnitude faster inference, which represents a significant advantage for operational deployment.

Paperid: 3410, https://arxiv.org/pdf/2511.16468.pdf

Abstract:
This paper proposes an optimization of Quantum Key Distribution (QKD) Networks using Graph Neural Networks (GNN) framework. Today, the development of quantum computers threatens the security systems of classical cryptography. Moreover, as QKD networks are designed for protecting secret communication, they suffer from multiple operational difficulties: adaptive to dynamic conditions, optimization for multiple parameters and effective resource utilization. In order to overcome these obstacles, we propose a GNN-based framework which can model QKD networks as dynamic graphs and extracts exploitable characteristics from these networks' structure. The graph contains not only topological information but also specific characteristics associated with quantum communication (the number of edges between nodes, etc). Experimental results demonstrate that the GNN-optimized QKD network achieves a substantial increase in total key rate (from 27.1 Kbits/s to 470 Kbits/s), a reduced average QBER (from 6.6% to 6.0%), and maintains path integrity with a slight reduction in average transmission distance (from 7.13 km to 6.42 km). Furthermore, we analyze network performance across varying scales (10 to 250 nodes), showing improved link prediction accuracy and enhanced key generation rate in medium-sized networks. This work introduces a novel operation mode for QKD networks, shifting the paradigm of network optimization through adaptive and scalable quantum communication systems that enhance security and performance.

Paperid: 3411, https://arxiv.org/pdf/2511.16034.pdf

Abstract:
The rapid growth of quantum computing poses a threat to the cryptographic foundations of digital systems, requiring the development of secure and scalable electronic voting (evoting) frameworks. We introduce a post-quantum-secure evoting architecture that integrates Falcon lattice-based digital signatures, biometric authentication via MobileNetV3 and AdaFace, and a permissioned blockchain for tamper-proof vote storage. Voter registration involves capturing facial embeddings, which are digitally signed using Falcon and stored on-chain to ensure integrity and non-repudiation. During voting, real-time biometric verification is performed using anti-spoofing techniques and cosine-similarity matching. The system demonstrates low latency and robust spoof detection, monitored through Prometheus and Grafana for real-time auditing. The average classification error rates (ACER) are below 3.5% on the CelebA Spoof dataset and under 8.2% on the Wild Face Anti-Spoofing (WFAS) dataset. Blockchain anchoring incurs minimal gas overhead, approximately 3.3% for registration and 0.15% for voting, supporting system efficiency, auditability, and transparency. The experimental results confirm the system's scalability, efficiency, and resilience under concurrent loads. This approach offers a unified solution to address key challenges in voter authentication, data integrity, and quantum-resilient security for digital systems.

Paperid: 3412, https://arxiv.org/pdf/2511.15998.pdf

Abstract:
Generative AI is reshaping offensive cybersecurity by enabling autonomous red team agents that can plan, execute, and adapt during penetration tests. However, existing approaches face trade-offs between generality and specialization, and practical deployments reveal challenges such as hallucinations, context limitations, and ethical concerns. In this work, we introduce a novel command & control (C2) architecture leveraging the Model Context Protocol (MCP) to coordinate distributed, adaptive reconnaissance agents covertly across networks. Notably, we find that our architecture not only improves goal-directed behavior of the system as whole, but also eliminates key host and network artifacts that can be used to detect and prevent command & control behavior altogether. We begin with a comprehensive review of state-of-the-art generative red teaming methods, from fine-tuned specialist models to modular or agentic frameworks, analyzing their automation capabilities against task-specific accuracy. We then detail how our MCP-based C2 can overcome current limitations by enabling asynchronous, parallel operations and real-time intelligence sharing without periodic beaconing. We furthermore explore advanced adversarial capabilities of this architecture, its detection-evasion techniques, and address dual-use ethical implications, proposing defensive measures and controlled evaluation in lab settings. Experimental comparisons with traditional C2 show drastic reductions in manual effort and detection footprint. We conclude with future directions for integrating autonomous exploitation, defensive LLM agents, predictive evasive maneuvers, and multi-agent swarms. The proposed MCP-enabled C2 framework demonstrates a significant step toward realistic, AI-driven red team operations that can simulate advanced persistent threats while informing the development of next-generation defensive systems.

Paperid: 3413, https://arxiv.org/pdf/2511.15837.pdf

Abstract:
The rapid adoption of multi-cloud environments has amplified risks associated with privileged access mismanagement. Traditional Privileged Access Management (PAM) solutions based on Attribute-Based Access Control (ABAC) exhibit cubic O(n^3) complexity, rendering real-time privilege analysis intractable at enterprise scale. We present a novel PAM framework integrating NIST's Next Generation Access Control (NGAC) with hypergraph semantics to address this scalability crisis. Our approach leverages hypergraphs with labeled hyperedges to model complex, multi-dimensional privilege relationships, achieving sub-linear O(sqrt n) traversal complexity and O(nlogn) detection time-rigorously proven through formal complexity analysis. We introduce a 3-Dimensional Privilege Analysis framework encompassing Attack Surface, Attack Window, and Attack Identity to systematically identify privilege vulnerabilities. Experimental validation on AWS-based systems with 200-4000 users demonstrates 10x improvement over ABAC and 4x improvement over standard NGAC-DAG, enabling sub-second privilege detection at scale. Real-world use cases validate detection of privilege escalation chains, over-privileged users, and lateral movement pathways in multi-cloud infrastructures.

Paperid: 3414, https://arxiv.org/pdf/2511.15763.pdf

Abstract:
Risks associated with the use of AI, ranging from algorithmic bias to model hallucinations, have received much attention and extensive research across the AI community, from researchers to end-users. However, a gap exists in the systematic assessment of supply chain risks associated with the complex web of data sources, pre-trained models, agents, services, and other systems that contribute to the output of modern AI systems. This gap is particularly problematic when AI systems are used in critical applications, such as the food supply, healthcare, utilities, law, insurance, and transport. We survey the current state of AI risk assessment and management, with a focus on the supply chain of AI and risks relating to the behavior and outputs of the AI system. We then present a proposed taxonomy specifically for categorizing AI supply chain entities. This taxonomy helps stakeholders, especially those without extensive AI expertise, to "consider the right questions" and systematically inventory dependencies across their organization's AI systems. Our contribution bridges a gap between the current state of AI governance and the urgent need for actionable risk assessment and management of AI use in critical applications.

Paperid: 3415, https://arxiv.org/pdf/2511.15556.pdf

Abstract:
Event-based Sensing (EBS) hardware is quickly proliferating while finding foothold in many commercial, industrial, and defense applications. At present, there are a handful of technologically mature systems which produce data streams with diverse output formats. In the near future it is anticipated there will be vendors who offer new sensor hardware which could also yield unique data schema that are not aligned to past efforts. Thus, due to the relative nascent nature of the technology and its potential for widespread use in a variety of applications, it is an opportune time to define a standard for this class of sensors' output data. The intent of this document is to identify and provide a standard for the collected EBS streaming data. The main objective of the standard is to be sensor agnostic, incorporate some of the current sensor configurations and modalities, and account for the developing configurations and modalities. The intent is also to leave enough place holders and space in the standard for future variations that may develop as EBS technology matures.

Paperid: 3416, https://arxiv.org/pdf/2511.15479.pdf

Abstract:
With the rise of software-defined vehicles (SDVs), where software governs most vehicle functions alongside enhanced connectivity, the need for secure software updates has become increasingly critical. Software vulnerabilities can severely impact safety, the economy, and society. In response to this challenge, Strandberg et al. [escar Europe, 2021] introduced the Unified Software Update Framework (UniSUF), designed to provide a secure update framework that integrates seamlessly with existing vehicular infrastructures. Although UniSUF has previously been evaluated regarding cybersecurity, these assessments have not employed formal verification methods. To bridge this gap, we perform a formal security analysis of UniSUF. We model UniSUF's architecture and assumptions to reflect real-world automotive systems and develop a ProVerif-based framework that formally verifies UniSUF's compliance with essential security requirements - confidentiality, integrity, authenticity, freshness, order, and liveness - demonstrating their satisfiability through symbolic execution. Our results demonstrate that UniSUF adheres to the specified security guarantees, ensuring the correctness and reliability of its security framework.

Paperid: 3417, https://arxiv.org/pdf/2511.15465.pdf

Abstract:
Quantum Key Distribution (QKD) networks enable unconditionally secure key exchange using quantum mechanical principles. However, routing cryptographic keys across multi-hop quantum networks introduces challenges unique to quantum communication. This survey analyzes and classifies 26 routing strategies proposed between 2013 and 2024 for terrestrial, satellite, and hybrid QKD infrastructures. Dynamic, key-aware routing algorithms have been shown to reduce service rejection rates by 25-40% compared to static shortest-path methods by incorporating real-time key pool availability and link error rates. Multi-path strategies improve resilience against trusted-node compromise by distributing keys across disjoint routes, albeit with an increase in key consumption of 30-60%. SDN-based orchestration frameworks emerge as essential enablers of flexible, QoS-aware routing for hybrid networks integrating terrestrial fibers and intermittent satellite links. Critical research directions include reducing dependence on trusted nodes through quantum repeaters or post-quantum cryptography, developing standardized key management interfaces for interoperability, and adopting AI-driven predictive routing for adapting to fluctuating network states. This comprehensive synthesis of routing algorithms, performance trade-offs, and unresolved issues offers practical guidance for engineering secure, adaptable, and large-scale QKD networks capable of meeting real-world deployment demands.

Paperid: 3418, https://arxiv.org/pdf/2511.15272.pdf

Abstract:
The security of future large-scale IoT networks is critically threatened by the ``Harvest Now, Decrypt Later'' (HNDL) attack paradigm. Securing the massive, long-lived data streams from these systems requires protocols that are both quantum-resistant and highly scalable. Existing solutions are insufficient: post-quantum classical protocols rely on computational assumptions that may not hold for decades, while purely quantum protocols are too resource-intensive for the sheer scale of IoT. This paper introduces the Quantum Anonymous Data Reporting (QADR) protocol, a hybrid framework that provides a theoretical benchmark and high-performance architecture for this challenge, designed for future fully-connected quantum networks. The protocol achieves scalable, quantum-resistant anonymity through a hybrid security model; it leverages information-theoretically secure keys from Quantum Key Distribution (QKD) to seed a quantum-secure pseudorandom function (QS-PRF), grounding its long-term data protection in well-established computational hardness assumptions. We also propose and analyze an automated slot reservation mechanism by making a deliberate trade-off: achieving high performance by accepting a quantifiable information leak during the anonymous slot reservation phase while maintaining strong unlinkability for the final data submission. Our security analysis formally quantifies the anonymity reduction caused by the leak and discusses pathways to fully mitigate it at a significant performance cost. We prove the protocol's critical advantage as a performance benchmark: its primary communication cost scales as $O(n^2)$, a dramatic improvement over quantum-native alternatives ($O(n^4)$), establishing a high-performance goal for future quantum-secured anonymity systems.

Paperid: 3419, https://arxiv.org/pdf/2511.15071.pdf

Abstract:
Efficient zero-knowledge proofs (ZKPs) have been restricted to NP statements so far, whereas they exist for all statements in PSPACE. This work presents the first practical zero-knowledge (ZK) protocols for PSPACE-complete statements by enabling ZK proofs of QBF (Quantified Boolean Formula) evaluation. The core idea is to validate quantified resolution proofs (Q-Res) in ZK. We develop an efficient polynomial encoding of Q-Res proofs, enabling proof validation through low-overhead arithmetic checks. We also design a ZK protocol to prove knowledge of a winning strategy related to the QBF, which is often equally important in practice. We implement our protocols and evaluate them on QBFEVAL. The results show that our protocols can verify 72% of QBF evaluations via Q-Res proof and 82% of instances' winning strategies within 100 seconds, for instances where such proofs or strategies can be obtained.

Paperid: 3420, https://arxiv.org/pdf/2511.14717.pdf

Abstract:
Attack trees (ATs) are popular graphical models for reasoning about the security of complex systems, allowing for the quantification of risk through so-called AT metrics. A large variety of different such AT metrics have been proposed, and despite their wide-spread practical use, no systematic treatment of attack tree metrics so far is fully satisfactory. Existing approaches either fail to include important metrics, or they are too general to provide a useful systematic way for defining concrete AT metrics, giving only an abstract characterisation of their behaviour. We solve this problem by developing a compositional theory of ATs and their functorial semantics based on gs-monoidal categories. Viewing attack trees as string diagrams, we show that components of ATs form a channel category, a particular type of gs-monoidal category. AT metrics then correspond to functors of channel categories. This characterisation is both general enough to include all common AT metrics, and concrete enough to define AT metrics by their logical structure.

Paperid: 3421, https://arxiv.org/pdf/2511.14611.pdf

Abstract:
Mobile Web3 faces catastrophic retention (< 5%) yielding effective acquisition costs of \$500 - \$1,000 per retained user. Existing solutions force an impossible tradeoff: embedded wallets achieve moderate usability but suffer inherent click-jacking vulnerabilities; app wallets maintain security at the cost of 2 - 3% retention due to download friction and context-switching penalties. We present SecureSign, a PWA-based architecture that adapts desktop browser extension security to mobile via EIP-6963 provider sandboxing. SecureSign isolates dApp execution in iframes within a trusted parent application, achieving click-jacking immunity and transaction integrity while enabling native mobile capabilities (push notifications, home screen installation, zero context-switching). Our drop-in SDK requires no codebase changes for existing Web3 applications. Threat model analysis demonstrates immunity to click-jacking, overlay, and skimming attacks while maintaining wallet interoperability across dApps.

Paperid: 3422, https://arxiv.org/pdf/2511.14524.pdf

Abstract:
With the ever-growing demand for cybersecurity, static key encryption mechanisms are increasingly vulnerable to adversarial attacks due to their deterministic and non-adaptive nature. Brute-force attacks, key compromise, and unauthorized access have become highly common cyber threats. This research presents a novel fuzzy logic-based cryptographic framework that dynamically generates encryption keys in real-time by accessing system-level entropy and hardware-bound trust. The proposed system leverages a Fuzzy Inference System (FIS) to evaluate system parameters that include CPU utilization, process count, and timestamp variation. It assigns entropy level based on linguistically defined fuzzy rules which are fused with hardware-generated randomness and then securely sealed using a Trusted Platform Module (TPM). The sealed key is incorporated in an AES-GCM encryption scheme to ensure both confidentiality and integrity of the data. This system introduces a scalable solution for adaptive encryption in high-assurance computing, zero-trust environments, and cloud-based infrastructure.

Paperid: 3424, https://arxiv.org/pdf/2511.14088.pdf

Abstract:
Run-time integrity enforcement in real-time systems presents a fundamental conflict with availability. Existing approaches in real- time systems primarily focus on minimizing the execution-time overhead of monitoring. After a violation is detected, prior works face a trade-off: (1) prioritize availability and allow a compromised system to continue to ensure applications meet their deadlines, or (2) prioritize security by generating a fault to abort all execution. In this work, we propose PAIR, an approach that offers a middle ground between the stark extremes of this trade-off. PAIR monitors real-time tasks for run-time integrity violations and maintains an Availability Region (AR) of all tasks that are safe to continue. When a task causes a violation, PAIR triggers a non-maskable interrupt to kill the task and continue executing a non-violating task within AR. Thus, PAIR ensures only violating tasks are prevented from execution, while granting availability to remaining tasks. With its hardware approach, PAIR does not cause any run-time overhead to the executing tasks, integrates with real-time operating systems (RTOSs), and is affordable to low-end microcontroller units (MCUs) by incurring +2.3% overhead in memory and hardware usage.

Paperid: 3425, https://arxiv.org/pdf/2511.14061.pdf

Abstract:
Given a circuit $G: \{0, 1\}^n \to \{0, 1\}^m$ with $m > n$, the *range avoidance* problem ($\text{Avoid}$) asks to output a string $y\in \{0, 1\}^m$ that is not in the range of $G$. Besides its profound connection to circuit complexity and explicit construction problems, this problem is also related to the existence of *proof complexity generators* -- circuits $G: \{0, 1\}^n \to \{0, 1\}^m$ where $m > n$ but for every $y\in \{0, 1\}^m$, it is infeasible to prove the statement "$y\not\in\mathrm{Range}(G)$" in a given propositional proof system. This paper connects these two problems with the existence of *demi-bits generators*, a fundamental cryptographic primitive against nondeterministic adversaries introduced by Rudich (RANDOM '97). $\bullet$ We show that the existence of demi-bits generators implies $\text{Avoid}$ is hard for nondeterministic algorithms. This resolves an open problem raised by Chen and Li (STOC '24). Furthermore, assuming the demi-hardness of certain LPN-style generators or Goldreich' PRG, we prove the hardness of $\text{Avoid}$ even when the instances are constant-degree polynomials over $\mathbb{F}_2$. $\bullet$ We show that the dual weak pigeonhole principle is unprovable in Cook's theory $\mathsf{PV}_1$ under the existence of demi-bits generators secure against $\mathbf{AM}$, thereby separating Jerabek's theory $\mathsf{APC}_1$ from $\mathsf{PV}_1$. $\bullet$ We transform demi-bits generators to proof complexity generators that are *pseudo-surjective* with nearly optimal parameters. Our constructions build on the recent breakthroughs on the hardness of $\text{Avoid}$ by Ilango, Li, and Williams (STOC '23) and Chen and Li (STOC '24). We use *randomness extractors* to significantly simplify the construction and the proof.

Paperid: 3426, https://arxiv.org/pdf/2511.13999.pdf

Abstract:
We study the running time, in terms of first order oracle queries, of differentially private empirical/population risk minimization of Lipschitz convex losses. We first consider the setting where the loss is non-smooth and the optimizer interacts with a private proxy oracle, which sends only private messages about a minibatch of gradients. In this setting, we show that expected running time $Ω(\min\{\frac{\sqrt{d}}{α^2}, \frac{d}{\log(1/α)}\})$ is necessary to achieve $α$ excess risk on problems of dimension $d$ when $d \geq 1/α^2$. Upper bounds via DP-SGD show these results are tight when $d>\tildeΩ(1/α^4)$. We further show our lower bound can be strengthened to $Ω(\min\{\frac{d}{\bar{m}α^2}, \frac{d}{\log(1/α)} \})$ for algorithms which use minibatches of size at most $\bar{m} < \sqrt{d}$. We next consider smooth losses, where we relax the private oracle assumption and give lower bounds under only the condition that the optimizer is private. Here, we lower bound the expected number of first order oracle calls by $\tildeΩ\big(\frac{\sqrt{d}}α + \min\{\frac{1}{α^2}, n\}\big)$, where $n$ is the size of the dataset. Modifications to existing algorithms show this bound is nearly tight. Compared to non-private lower bounds, our results show that differentially private optimizers pay a dimension dependent runtime penalty. Finally, as a natural extension of our proof technique, we show lower bounds in the non-smooth setting for optimizers interacting with information limited oracles. Specifically, if the proxy oracle transmits at most $Γ$-bits of information about the gradients in the minibatch, then $Ω\big(\min\{\frac{d}{α^2Γ}, \frac{d}{\log(1/α)}\}\big)$ oracle calls are needed. This result shows fundamental limitations of gradient quantization techniques in optimization.

Paperid: 3427, https://arxiv.org/pdf/2511.13788.pdf

Abstract:
Large language models (LLMs) increasingly operate in multi-agent and safety-critical settings, raising open questions about how their vulnerabilities scale when models interact adversarially. This study examines whether larger models can systematically jailbreak smaller ones - eliciting harmful or restricted behavior despite alignment safeguards. Using standardized adversarial tasks from JailbreakBench, we simulate over 6,000 multi-turn attacker-target exchanges across major LLM families and scales (0.6B-120B parameters), measuring both harm score and refusal behavior as indicators of adversarial potency and alignment integrity. Each interaction is evaluated through aggregated harm and refusal scores assigned by three independent LLM judges, providing a consistent, model-based measure of adversarial outcomes. Aggregating results across prompts, we find a strong and statistically significant correlation between mean harm and the logarithm of the attacker-to-target size ratio (Pearson r = 0.51, p < 0.001; Spearman rho = 0.52, p < 0.001), indicating that relative model size correlates with the likelihood and severity of harmful completions. Mean harm score variance is higher across attackers (0.18) than across targets (0.10), suggesting that attacker-side behavioral diversity contributes more to adversarial outcomes than target susceptibility. Attacker refusal frequency is strongly and negatively correlated with harm (rho = -0.93, p < 0.001), showing that attacker-side alignment mitigates harmful responses. These findings reveal that size asymmetry influences robustness and provide exploratory evidence for adversarial scaling patterns, motivating more controlled investigations into inter-model alignment and safety.

Paperid: 3428, https://arxiv.org/pdf/2511.13781.pdf

Abstract:
Human-centered threat modeling (HCTM) is an emerging area within security and privacy research that focuses on how people define and navigate threats in various social, cultural, and technological contexts. While researchers increasingly approach threat modeling from a human-centered perspective, little is known about how they prepare for and engage with HCTM in practice. In this work, we conduct 23 semi-structured interviews with researchers to examine the state of HCTM, including how researchers design studies, elicit threats, and navigate values, constraints, and long-term goals. We find that HCTM is not a prescriptive process but a set of evolving practices shaped by relationships with participants, disciplinary backgrounds, and institutional structures. Researchers approach threat modeling through sustained groundwork and participant-centered inquiry, guided by values such as care, justice, and autonomy. They also face challenges including emotional strain, ethical dilemmas, and structural barriers that complicate efforts to translate findings into real-world impact. We conclude by identifying opportunities to advance HCTM through shared infrastructure, broader recognition of diverse contributions, and stronger mechanisms for translating findings into policy, design, and societal change.

Paperid: 3429, https://arxiv.org/pdf/2511.13777.pdf

Abstract:
Mining blocks in a blockchain using the \textit{Proof-of-Work} consensus protocol involves significant risk, as network participants face continuous operational costs while earning infrequent capital gains upon successfully mining a block. A common risk mitigation strategy is to join a mining pool, which combines the computing resources of multiple miners to provide a more stable income. This article examines a Pay-per-Share (PPS) reward system, where the pool manager can adjust both the share difficulty and the management fee. Using a simplified wealth model for miners, we explore how miners should allocate their computing resources among different mining pools, considering the trade-off between risk transfer to the manager and management fees.

Paperid: 3430, https://arxiv.org/pdf/2511.13753.pdf

Abstract:
The integration of large language models (LLMs) into automated driving systems has opened new possibilities for reasoning and decision-making by transforming complex driving contexts into language-understandable representations. Recent studies demonstrate that fine-tuned LLMs can accurately predict vehicle trajectories and lane-change intentions by gathering and transforming data from surrounding vehicles. However, the robustness of such LLM-based prediction models for safety-critical driving systems remains unexplored, despite the increasing concerns about the trustworthiness of LLMs. This study addresses this gap by conducting a systematic vulnerability analysis of LLM-enabled vehicle trajectory prediction. We propose a one-feature differential evolution attack that perturbs a single kinematic feature of surrounding vehicles within the LLM's input prompts under a black-box setting. Experiments on the highD dataset reveal that even minor, physically plausible perturbations can significantly disrupt model outputs, underscoring the susceptibility of LLM-based predictors to adversarial manipulation. Further analyses reveal a trade-off between accuracy and robustness, examine the failure mechanism, and explore potential mitigation solutions. The findings provide the very first insights into adversarial vulnerabilities of LLM-driven automated vehicle models in the context of vehicular interactions and highlight the need for robustness-oriented design in future LLM-based intelligent transportation systems.

Paperid: 3431, https://arxiv.org/pdf/2511.13725.pdf

Abstract:
Recently, web-based Large Language Model (LLM) agents autonomously perform increasingly complex tasks, thereby bringing significant convenience. However, they also amplify the risks of malicious misuse cases such as unauthorized collection of personally identifiable information (PII), generation of socially divisive content, and even automated web hacking. To address these threats, we propose an AI Kill Switch technique that can immediately halt the operation of malicious web-based LLM agents. To achieve this, we introduce AutoGuard - the key idea is generating defensive prompts that trigger the safety mechanisms of malicious LLM agents. In particular, generated defense prompts are transparently embedded into the website's DOM so that they remain invisible to human users but can be detected by the crawling process of malicious agents, triggering its internal safety mechanisms to abort malicious actions once read. To evaluate our approach, we constructed a dedicated benchmark consisting of three representative malicious scenarios. Experimental results show that AutoGuard achieves over 80% Defense Success Rate (DSR) across diverse malicious agents, including GPT-4o, Claude-4.5-Sonnet and generalizes well to advanced models like GPT-5.1, Gemini-2.5-flash, and Gemini-3-pro. Also, our approach demonstrates robust defense performance in real-world website environments without significant performance degradation for benign agents. Through this research, we demonstrate the controllability of web-based LLM agents, thereby contributing to the broader effort of AI control and safety.

Paperid: 3432, https://arxiv.org/pdf/2511.13717.pdf

Abstract:
Large Language Models (LLMs) deployed on mobile devices offer benefits like user privacy and reduced network latency, but introduce a significant security risk: the leakage of proprietary models to end users. To mitigate this risk, we propose a system design for protecting on-device LLMs using Arm Trusted Execution Environment (TEE), TrustZone. Our system addresses two primary challenges: (1) The dilemma between memory efficiency and fast inference (caching model parameters within TEE memory). (2) The lack of efficient and secure Neural Processing Unit (NPU) time-sharing between Rich Execution Environment (REE) and TEE. Our approach incorporates two key innovations. First, we employ pipelined restoration, leveraging the deterministic memory access patterns of LLM inference to prefetch parameters on demand, hiding memory allocation, I/O and decryption latency under computation time. Second, we introduce a co-driver design, creating a minimal data plane NPU driver in the TEE that collaborates with the full-fledged REE driver. This reduces the TEE TCB size and eliminates control plane reinitialization overhead during NPU world switches. We implemented our system on the emerging OpenHarmony OS and the llama.cpp inference framework, and evaluated it with various LLMs on an Arm Rockchip device. Compared to a strawman TEE baseline lacking our optimizations, our system reduces TTFT by up to 90.9% and increases decoding speed by up to 23.2%.

Paperid: 3433, https://arxiv.org/pdf/2511.13576.pdf

Abstract:
With the requirements and emphases on privacy transparency placed by regulations such as GDPR and CCPA, the Google Play Store requires Android developers to more responsibly communicate their apps' privacy practices to potential users by providing the proper information via the data safety, privacy policy, and permission manifest privacy transparency channels. However, it is unclear how effective those channels are in helping users make informed decisions in the app selection and installation process. In this article, we conducted a study for 190 participants to interact with our simulated privacy transparency channels of mobile apps. We quantitatively analyzed (supplemented by qualitative analysis) participants' responses to five sets of questions. We found that data safety provides the most intuitive user interfaces, privacy policy is most informative and effective, while permission manifest excels at raising participants' concerns about an app's overall privacy risks. These channels complement each other and should all be improved.

Paperid: 3434, https://arxiv.org/pdf/2511.11235.pdf

Abstract:
This paper investigates neural network-based biometric authentication using finger-drawn digits on touchscreen devices. We evaluated CNN and autoencoder architectures for user authentication through simple digit patterns (0-9) traced with finger input. Twenty participants contributed 2,000 finger-drawn digits each on personal touchscreen devices. We compared two CNN architectures: a modified Inception-V1 network and a lightweight shallow CNN for mobile environments. Additionally, we examined Convolutional and Fully Connected autoencoders for anomaly detection. Both CNN architectures achieved ~89% authentication accuracy, with the shallow CNN requiring fewer parameters. Autoencoder approaches achieved ~75% accuracy. The results demonstrate that finger-drawn symbol authentication provides a viable, secure, and user-friendly biometric solution for touchscreen devices. This approach can be integrated with existing pattern-based authentication methods to create multi-layered security systems for mobile applications.

Paperid: 3435, https://arxiv.org/pdf/2511.10992.pdf

Abstract:
Cheating in online games poses significant threats to the gaming industry, yet most prior research has concentrated on Massively Multiplayer Online Role-Playing Games (MMORPGs). Competitive genres-such as Multiplayer Online Battle Arena (MOBA), First Person Shooter (FPS), Real Time Strategy (RTS), and Action games-remain underexplored due to the difficulty of detecting cheating users and the demand for complex data and techniques. To address this gap, many game companies rely on kernel-level anti-cheat solutions, which, while effective, raise serious concerns regarding user privacy and system security. In this paper, we propose GYNOPTICON, a novel cheating detection framework that leverages user consensus to identify abnormal behavior. GYNOPTICON integrates a lightweight client-side detection mechanism with a server-side voting system: when suspicious activity is identified, clients cast votes to the server, which aggregates them to establish consensus and distinguish cheaters from legitimate players. This architecture enables transparency, reduces reliance on intrusive monitoring, and mitigates privacy risks. We evaluate GYNOPTICON in both a controlled simulation and a real-world FPS environment. Simulation results verify its feasibility and requirements, while real-world experiments confirm its effectiveness in reliably detecting cheating users. Furthermore, we demonstrate the system's applicability and sustainability for long-term game management using public datasets. GYNOPTICON represents a user-driven, consensus-based alternative to conventional anti-cheat systems, offering a practical and privacy-preserving solution for competitive online games.

Paperid: 3436, https://arxiv.org/pdf/2511.10248.pdf

Abstract:
The OPC UA protocol is becoming the de facto standard for Industry 4.0 machine-to-machine communication. It stands out as one of the few industrial protocols that provide robust security features designed to prevent attackers from manipulating and damaging critical infrastructures. However, prior works showed that significant challenges still exists to set up secure OPC UA deployments in practice, mainly caused by the complexity of certificate management in industrial scenarios and the inconsistent implementation of security features across industrial OPC UA devices. In this paper, we present Pk-IOTA, an automated solution designed to secure OPC UA communications by integrating programmable data plane switches for in-network certificate validation and leveraging the IOTA Tangle for decen- tralized certificate distribution. Our evaluation is performed on a physical testbed representing a real-world industrial scenario and shows that Pk-IOTA introduces a minimal overhead while providing a scalable and tamper-proof mechanism for OPC UA certificate management.

Paperid: 3437, https://arxiv.org/pdf/2511.09879.pdf

Abstract:
AI programming assistants have demonstrated a tendency to generate code containing basic security vulnerabilities. While developers are ultimately responsible for validating and reviewing such outputs, improving the inherent quality of these generated code snippets remains essential. A key contributing factor to insecure outputs is the presence of vulnerabilities in the training datasets used to build large language models (LLMs). To address this issue, we propose curating training data to include only code that is free from detectable vulnerabilities. In this study, we constructed a secure dataset by filtering an existing Python corpus using a static analysis tool to retain only vulnerability-free functions. We then trained two transformer-based models: one on the curated dataset and one on the original, unfiltered dataset. The models were evaluated on both the correctness and security of the code they generated in response to natural language function descriptions. Our results show that the model trained on the curated dataset produced outputs with fewer security issues, while maintaining comparable functional correctness. These findings highlight the importance of secure training data in improving the reliability of AI-based programming assistants, though further enhancements to model architecture and evaluation are needed to reinforce these outcomes.

Paperid: 3438, https://arxiv.org/pdf/2511.09603.pdf

Abstract:
Intrusion Detection Systems (IDS) play a vital role in modern cybersecurity frameworks by providing a primary defense mechanism against sophisticated threat actors. In this paper, we propose an explainable intrusion detection framework that integrates Recursive Feature Elimination (RFE) with Random Forest (RF) to enhance detection of Advanced Persistent Threats (APTs). By using CICIDS2017 dataset, the approach begins with comprehensive data preprocessing and narrows down the most significant features via RFE. A Random Forest (RF) model was trained on the refined feature set, with SHapley Additive exPlanations (SHAP) used to interpret the contribution of each selected feature. Our experiment demonstrates that the explainable RF-RFE achieved a detection accuracy of 99.9%, reducing false positive and computational cost in comparison to traditional classifiers. The findings underscore the effectiveness of integrating explainable AI and feature selection to develop a robust, transparent, and deployable IDS solution.

Paperid: 3439, https://arxiv.org/pdf/2511.09552.pdf

Abstract:
In today's digital world, which has many different types of media, steganography, the art of secret communication, has a lot of problems to deal with. Traditional methods are often fixed and only work with one type of carrier media. This means they don't work well with all the different types of media that are out there. This system doesn't send data to "weak" or easily detectable carriers because it can't adapt. This makes the system less safe and less secret in general. This paper proposes a novel Intelligent Carrier Allocation framework founded on a Cross-Modal Reasoning (CMR) Engine. This engine looks at a wide range of carriers, such as images, audio, and text, to see if they are good for steganography. It uses important measurements like entropy, signal complexity, and vocabulary richness to come up with a single reliability score for each modality. The framework uses these scores to fairly and intelligently share the secret bitstream, giving more data to carriers that are thought to be stronger and more complex. This adaptive allocation strategy makes the system as hard to find as possible and as strong as possible against steganalysis. We demonstrate that this reasoning-based approach is more secure and superior in data protection compared to static, non-adaptive multimodal techniques. This makes it possible to build stronger and smarter secret communication systems.

Paperid: 3440, https://arxiv.org/pdf/2511.09492.pdf

Abstract:
Password security plays a crucial role in cybersecurity, yet traditional password strength meters, which rely on static rules like character-type requirements, often fail. Such methods are easily bypassed by common password patterns (e.g., 'P@ssw0rd1!'), giving users a false sense of security. To address this, we implement and evaluate a password strength scoring system by comparing four machine learning models: Random Forest (RF), Support Vector Machine (SVM), a Convolutional Neural Network (CNN), and Logistic Regression with a dataset of over 660,000 real-world passwords. Our primary contribution is a novel hybrid feature engineering approach that captures nuanced vulnerabilities missed by standard metrics. We introduce features like leetspeak-normalized Shannon entropy to assess true randomness, pattern detection for keyboard walks and sequences, and character-level TF-IDF n-grams to identify frequently reused substrings from breached password datasets. our RF model achieved superior performance, achieving 99.12% accuracy on a held-out test set. Crucially, the interpretability of the Random Forest model allows for feature importance analysis, providing a clear pathway to developing security tools that offer specific, actionable feedback to users. This study bridges the gap between predictive accuracy and practical usability, resulting in a high-performance scoring system that not only reduces password-based vulnerabilities but also empowers users to make more informed security decisions.

Paperid: 3441, https://arxiv.org/pdf/2511.09351.pdf

Abstract:
Key-length extension (KLE) techniques provide a general approach to enhancing the security of block ciphers by using longer keys. There are mainly two classes of KLE techniques, cascade encryption and XOR-cascade encryption. This paper presents several quantum meet-in-the-middle (MITM) attacks against two specific KLE constructions. For the two-key triple encryption (2kTE), we propose two quantum MITM attacks under the Q2 model. The first attack, leveraging the quantum claw-finding (QCF) algorithm, achieves a time complexity of $O(2^{2κ/3})$ with $O(2^{2κ/3})$ quantum random access memory (QRAM). The second attack, based on Grover's algorithm, achieves a time complexity of $O(2^{κ/2})$ with $O(2^κ)$ QRAM. The latter complexity is nearly identical to Grover-based brute-force attack on the underlying block cipher, indicating that 2kTE does not enhance security under the Q2 model when sufficient QRAM resources are available. For the 3XOR-cascade encryption (3XCE), we propose a quantum MITM attack applicable to the Q1 model. This attack requires no QRAM and has a time complexity of $O(2^{(κ+n)/2})$ ($κ$ and $n$ are the key length and block length of the underlying block cipher, respectively.), achieving a quadratic speedup over classical MITM attack. Furthermore, we extend the quantum MITM attack to quantum sieve-in-the-middle (SITM) attack, which is applicable for more constructions. We present a general quantum SITM framework for the construction $ELE=E^2\circ L\circ E^1$ and provide specific attack schemes for three different forms of the middle layer $L$. The quantum SITM attack technique can be further applied to a broader range of quantum cryptanalysis scenarios.

Paperid: 3442, https://arxiv.org/pdf/2511.09252.pdf

Abstract:
Traditional distributed backdoor attacks (DBA) in federated learning improve stealthiness by decomposing global triggers into sub-triggers, which however requires more poisoned data to maintian the attck strength and hence increases the exposure risk. To overcome this defect, This paper proposes a novel method, namely Fractal-Triggerred Distributed Backdoor Attack (FTDBA), which leverages the self-similarity of fractals to enhance the feature strength of sub-triggers and hence significantly reduce the required poisoning volume for the same attack strength. To address the detectability of fractal structures in the frequency and gradient domains, we introduce a dynamic angular perturbation mechanism that adaptively adjusts perturbation intensity across the training phases to balance efficiency and stealthiness. Experiments show that FTDBA achieves a 92.3\% attack success rate with only 62.4\% of the poisoning volume required by traditional DBA methods, while reducing the detection rate by 22.8\% and KL divergence by 41.2\%. This study presents a low-exposure, high-efficiency paradigm for federated backdoor attacks and expands the application of fractal features in adversarial sample generation.

Paperid: 3443, https://arxiv.org/pdf/2511.09068.pdf

Abstract:
Edge computing pushes computation closer to data sources, but it also expands the attack surface on resource-constrained devices. This work explores the deployment of the Lightweight Deep Anomaly Detection for Network Traffic (LDPI) integrated as an isolated service within a virtualization framework that provides security by separation. LDPI, adopting a Deep Learning approach, achieved strong training performance, reaching AUC 0.999 (5-fold mean) across the evaluated packet-window settings (n, l), with high F1 at conservative operating points. We deploy LDPI on a laptop-class edge node and evaluate its overhead and performance in two scenarios: (i) comparing it with representative signature-based IDSes (Suricata and Snort) deployed on the same framework under identical workloads, and (ii) while detecting network flooding attacks.

Paperid: 3444, https://arxiv.org/pdf/2511.08703.pdf

Abstract:
Industrial Systems-on-Chips (SoCs) often comprise hundreds of thousands to millions of nets and millions to tens of millions of connectivity edges, making empirical evaluation of hardware-Trojan (HT) detectors on realistic designs both necessary and difficult. Public benchmarks remain significantly smaller and hand-crafted, while releasing truly malicious RTL raises ethical and operational risks. This work presents an automated and scalable methodology for generating HT-like patterns in industry-scale netlists whose purpose is to stress-test detection tools without altering user-visible functionality. The pipeline (i) parses large gate-level designs into connectivity graphs, (ii) explores rare regions using SCOAP testability metrics, and (iii) applies parameterized, function-preserving graph transformations to synthesize trigger-payload pairs that mimic the statistical footprint of stealthy HTs. When evaluated on the benchmarks generated in this work, representative state-of-the-art graph-learning models fail to detect Trojans. The framework closes the evaluation gap between academic circuits and modern SoCs by providing reproducible challenge instances that advance security research without sharing step-by-step attack instructions.

Paperid: 3445, https://arxiv.org/pdf/2511.08641.pdf

Abstract:
This paper introduces a structured approach to improving decision making in Decentralized Autonomous Organizations (DAO) through the integration of the Question-Option-Criteria (QOC) model and AI agents. We outline a stepwise governance framework that evolves from human led evaluations to fully autonomous, AI-driven processes. By decomposing decisions into weighted, criterion based evaluations, the QOC model enhances transparency, fairness, and explainability in DAO voting. We demonstrate how large language models (LLMs) and stakeholder aligned AI agents can support or automate evaluations, while statistical safeguards help detect manipulation. The proposed framework lays the foundation for scalable and trustworthy governance in the Web3 ecosystem.

Paperid: 3446, https://arxiv.org/pdf/2511.08296.pdf

Abstract:
Modern encryption algorithms form the foundation of digital security. However, the widespread use of encryption algorithms results in significant challenges for network defenders in identifying which specific algorithms are being employed. More importantly, we find that when the plaintext distribution of test data departs from the training data, the performance of classifiers often declines significantly. This issue exposes the feature extractor's hidden dependency on plaintext features. To reduce this dependency, we adopt a method that does not learn end-to-end from ciphertext bytes. Specifically, this method is based on a set of statistical tests to compute the randomness feature of the ciphertext, and then uses the frequency distribution pattern of this feature to construct the algorithms' respective fingerprints. The experimental results demonstrate that our method achieves high discriminative performance (e.g., AUC > 0.98) in the Canterbury Corpus dataset, which contains a diverse set of data types. Furthermore, in our cross-domain evaluation, baseline models' performance degrades significantly when tested on data with a reduced proportion of structured plaintext. In sharp contrast, our method demonstrates high robustness: performance degradation is minimal when transferring between different structured domains, and even on the most challenging purely random dataset, it maintains a high level of ranking ability (AUC > 0.90).

Paperid: 3447, https://arxiv.org/pdf/2511.08295.pdf

Abstract:
Threat modeling has long guided software development work, and we consider how Public Threat Models (PTM) can convey useful security information to others. We list some early adopter precedents, explain the many benefits, address potential objections, and cite regulatory drivers. Internal threat models may not be directly suitable for disclosure so we provide guidance for redaction and review, as well as when to update models (published or not). In a concluding call to action, we encourage the technology community to openly share their PTMs so the security properties of each component are known up and down the supply chain. Technology providers proud of their security efforts can show their work for competitive advantage, and customers can ask for and evaluate PTMs rather than be told "it's secure" but little more. Many great products already have fine threat models, and turning those into PTMs is a relatively minor task, so we argue this should (and easily could) become the new norm.

Paperid: 3448, https://arxiv.org/pdf/2511.07818.pdf

Abstract:
This research proposes a decentralized and cryptographically secure framework to address the most acute issues of privacy, data security, and protection in the ecosystem of medical insurance claim processing. The scope of this study focuses on enabling the management of insurance claims in a transparent, privacy-protecting manner while maintaining the efficiency and trust level needed by the patients, healthcare providers, and insurers. To accomplish this, the proposed system adds blockchain technology to provide an unchangeable, decentralized, and auditable claim transactions ledger which enhances overall claim-related processes and trust among all stakeholders. To protect critical patient information, the framework employs homomorphic encryption a modern form of cryptography to allow authorized insurance providers to perform necessary operations like claim adjudication and reimbursement on encrypted medical records without any decryption during the process. This method significantly reduces the third-party processing privacy risk because patient data can be kept secret even when third-party processing is done. In addition, smart contracts improve automation of the most important procedures in the claim processing pipeline, which decreases manual, operational, and susceptibility towards human blunders or deceitful acts. The integration of these two transformative technologiesblockchain and homomorphic encryption represents the core contribution of this work, enabling the coexistence of transparency and privacy which are usually viewed as competing objectives in traditional systems. As a result, these technologies are expected to foster the creation of a reliable, effective, and privacy safeguarding architecture that could transform the medical claim submission systems paradigm.

Paperid: 3449, https://arxiv.org/pdf/2511.07807.pdf

Abstract:
With the rapid advancements in machine learning, models have become increasingly capable of learning and making predictions in various industries. However, deploying these models in critical infrastructures presents a major challenge, as concerns about data privacy prevent unrestricted data sharing. Homomorphic encryption (HE) offers a solution by enabling computations on encrypted data, but it remains incompatible with machine learning models like convolutional neural networks (CNNs), due to their reliance on non-linear activation functions. To bridge this gap, this work proposes an optimized framework that replaces standard non-linear functions with homomorphically compatible approximations, ensuring secure computations while minimizing computational overhead. The proposed approach restructures the CNN architecture and introduces an efficient activation function approximation method to mitigate the performance trade-offs introduced by encryption. Experiments on CIFAR-10 achieve 94.4% accuracy with 2.42 s per single encrypted sample and 24,000 s per 10,000 encrypted samples, using a degree-4 polynomial and Softplus activation under CKKS, balancing accuracy and privacy.

Paperid: 3450, https://arxiv.org/pdf/2511.07793.pdf

Abstract:
Securing Dew-Enabled Edge-of-Things (EoT) networks against sophisticated intrusions is a critical challenge. This paper presents HybridGuard, a framework that integrates machine learning and deep learning to improve intrusion detection. HybridGuard addresses data imbalance through mutual information based feature selection, ensuring that the most relevant features are used to improve detection performance, especially for minority attack classes. The framework leverages Wasserstein Conditional Generative Adversarial Networks with Gradient Penalty (WCGAN-GP) to further reduce class imbalance and enhance detection precision. It adopts a two-phase architecture called DualNetShield to support advanced traffic analysis and anomaly detection, improving the granular identification of threats in complex EoT environments. HybridGuard is evaluated on the UNSW-NB15, CIC-IDS-2017, and IOTID20 datasets, where it demonstrates strong performance across diverse attack scenarios and outperforms existing solutions in adapting to evolving cybersecurity threats. This approach establishes HybridGuard as an effective tool for protecting EoT networks against modern intrusions.

Paperid: 3451, https://arxiv.org/pdf/2511.06661.pdf

Abstract:
Precise and sound call graph construction is crucial for many software security mechanisms. Unfortunately, traditional static pointer analysis techniques used to generate application call graphs suffer from imprecision. These techniques are agnostic to the application's architecture and are designed for broad applicability. To mitigate this precision problem, we propose PhaseSeed, a novel technique that improves the accuracy of pointer analysis for split-phase applications, which have distinct initialization and processing phases. PhaseSeed analyzes the initialization phase dynamically, collecting the points-to relationships established at runtime. At the end of the initialization phase, it then seeds this information to a static analysis stage that performs pointer analysis for all code that stays in scope during the processing phase, improving precision. Our observations show that, given the same runtime configuration options, the points-to relationships established during the initialization phase remain constant across multiple runs. Therefore, PhaseSeed is sound with respect to a given initial configuration. We apply PhaseSeed to three security mechanisms: control flow integrity (CFI), software debloating, and system call filtering. PhaseSeed provides up to 92.6% precision improvement for CFI compared to static call graph construction techniques, and filters nine additional security-critical system calls when used to generate Seccomp profiles.

Paperid: 3452, https://arxiv.org/pdf/2511.06540.pdf

Abstract:
Network steganography and covert communication channels have been studied extensively in the past. However, prior works offer minimal practical use for their proposed techniques and are limited to specific use cases and network protocols. In this paper, we show that covert channels in networking have a much greater potential for practical secret communication than what has been discussed before. We present a covert channel framework, CYPRESS, that creates a reliable hidden communication channel by mounting packets from secret network entities on regular packets that flow through the network, effectively transmitting a separate network traffic without generating new packets for it. CYPRESS establishes a consolidated decentralized framework in which different covert channels for various protocols are defined with their custom handler code that are plugged into the system and updated on-demand to evade detection. CYPRESS then chooses at run-time how and in what order the covert channels should be used for fragmentation and hidden transmission of data. We can reach up to 1.6MB/s of secret bandwidth in a network of ten users connected to the Internet. We demonstrate the robustness and reliability of our approach in secret communication through various security-sensitive real-world experiments. Our evaluations show that network protocols provide a notable opportunity for unconventional storage and hidden transmission of data to bypass different types of security measures and to hide the source of various cyber attacks.

Paperid: 3453, https://arxiv.org/pdf/2511.06220.pdf

Abstract:
Software security testing, particularly when enhanced with deep learning models, has become a powerful approach for improving software quality, enabling faster detection of known flaws in source code. However, many approaches miss post-fix latent vulnerabilities that remain even after patches typically due to incomplete fixes or overlooked issues may later lead to zero-day exploits. In this paper, we propose $HYDRA$, a $Hy$brid heuristic-guided $D$eep $R$epresentation $A$rchitecture for predicting latent zero-day vulnerabilities in patched functions that combines rule-based heuristics with deep representation learning to detect latent risky code patterns that may persist after patches. It integrates static vulnerability rules, GraphCodeBERT embeddings, and a Variational Autoencoder (VAE) to uncover anomalies often missed by symbolic or neural models alone. We evaluate HYDRA in an unsupervised setting on patched functions from three diverse real-world software projects: Chrome, Android, and ImageMagick. Our results show HYDRA predicts 13.7%, 20.6%, and 24% of functions from Chrome, Android, and ImageMagick respectively as containing latent risks, including both heuristic matches and cases without heuristic matches ($None$) that may lead to zero-day vulnerabilities. It outperforms baseline models that rely solely on regex-derived features or their combination with embeddings, uncovering truly risky code variants that largely align with known heuristic patterns. These results demonstrate HYDRA's capability to surface hidden, previously undetected risks, advancing software security validation and supporting proactive zero-day vulnerabilities discovery.

Paperid: 3454, https://arxiv.org/pdf/2511.06130.pdf

Abstract:
Introducing Reliablocks, an on-chain reliability index for non-finalized blocks in Optimistic Rollups. This was built during the EigenLayer Infinite Hackathon at the Infinite Hacker House at DevCon 2024. As part of this research, we delivered a working Layer AVS WASMI component, a working Eigen Layer AVS component, EigenLayer Solidity smart contracts that work with the AVS component, a UI dashboard illustrating the reliability score and a derived interest rate for further utilization.

Paperid: 3455, https://arxiv.org/pdf/2511.05714.pdf

Abstract:
Advances in AI threaten to invalidate assumptions underpinning today's defense architecture. We argue that the current U.S. defense program of record, designed in an era before capable machine intelligence, cannot by itself preserve national security against rapidly emerging AI enabled threats. Instead, shoring up legacy systems must be coupled with entirely new elements of a defense architecture. We outline immediate steps to adapt the Department of Energy National Nuclear Security Administration National Laboratories to ensure agility and resilience in an era of powerful AI.

Paperid: 3456, https://arxiv.org/pdf/2511.05196.pdf

Abstract:
Quantum key distribution (QKD) is a cryptographic solution that leverages the properties of quantum mechanics to be resistant and secure even against an attacker with unlimited computational power. Satellite-based links are important in QKD because they can reach distances that the best fiber systems cannot. However, links between satellites in low Earth orbit (LEO) and ground stations have a duration of only a few minutes, resulting in the generation of a small amount of secure keys. In this context, we investigate the optimization of the information reconciliation step of the QKD post-processing in order to generate as much secure key as possible. As a first step, we build an accurate model of the downlink signal and quantum bit error rate (QBER) during a complete satellite pass, which are time-varying due to three effects: (i) the varying link geometry over time, (ii) the scintillation effect, and (iii) the different signal intensities adopted in the Decoy-State protocol. Leveraging the a-priori information on the instantaneous QBER, we improve the efficiency of information reconciliation (IR) (i.e., the error correction phase) in the Decoy-State BB84 protocol, resulting in a secure key that is almost 3\% longer for realistic scenarios.

Paperid: 3457, https://arxiv.org/pdf/2511.05110.pdf

Abstract:
The IP-stride prefetcher has recently been exploited to leak secrets through side-channel attacks. It, however, cannot be simply disabled for security with prefetching speedup as a sacrifice. The state-of-the-art defense tries to retain the prefetching effect by hardware modification. In this paper, we present PhantomFetch as the first prefetching-retentive and hardware-agnostic defense. It avoids potential remanufacturing cost and enriches applicability to off-the-shelf devices. The key idea is to directly break the exploitable coupling between trained prefetcher entries and the victim's secret-dependent loads by obfuscating the sensitive load effects of the victim. The experiment results show that PhantomFetch can secure the IP-stride prefetcher with only negligible overhead.

Paperid: 3458, https://arxiv.org/pdf/2511.05102.pdf

Abstract:
Neural networks have become pervasive across various applications, including security-related products. However, their widespread adoption has heightened concerns regarding vulnerability to adversarial attacks. With emerging regulations and standards emphasizing security, organizations must reliably quantify risks associated with these attacks, particularly regarding transferred adversarial attacks, which remain challenging to evaluate accurately. This paper investigates the complexities involved in resilience testing against transferred adversarial attacks. Our analysis specifically addresses black-box evasion attacks, highlighting transfer-based attacks due to their practical significance and typically high transferability between neural network models. We underline the computational infeasibility of exhaustively exploring high-dimensional input spaces to achieve complete test coverage. As a result, comprehensive adversarial risk mapping is deemed impractical. To mitigate this limitation, we propose a targeted resilience testing framework that employs surrogate models strategically selected based on Centered Kernel Alignment (CKA) similarity. By leveraging surrogate models exhibiting both high and low CKA similarities relative to the target model, the proposed approach seeks to optimize coverage of adversarial subspaces. Risk estimation is conducted using regression-based estimators, providing organizations with realistic and actionable risk quantification.

Paperid: 3459, https://arxiv.org/pdf/2511.05100.pdf

Abstract:
Global Navigation Satellite Systems (GNSS) provide Positioning, Navigation, and Timing (PNT) information to over 4 billion devices worldwide. Despite its pervasive use in safety critical and high precision applications, GNSS remains vulnerable to spoofing attacks. Cryptographic enhancements, such as the use of TESLA protocol in Galileo, to provide navigation message authentication do not mitigate time of arrival manipulations. In this paper, we propose TRICK, a primitive for secure positioning that closes this gap by introducing a fundamentally new approach that only requires two way communications with a single reference node along with multiple broadcast signals. Unlike classical Verifiable Multilateration (VM), which requires establishing two way communication with each reference nodes, our solution relies on only two measurements with a trusted Low Earth Orbiting (LEO) satellite and combines broadcast navigation signals. We rigorously prove that combining the LEO satellite based two way range measurements and multiple one way ranges such as from broadcast signals of GNSS into ellipsoidal constraint restores the same guarantees as offered by VM whilst using minimal infrastructure and message exchanges. Through detailed analysis, we show that our approach reliably detects spoofing attempts while adding negligible computation overhead.

Paperid: 3460, https://arxiv.org/pdf/2511.04925.pdf

Abstract:
The microservice bombshells that have been linked with the microservice expansion have altered the application architectures, offered agility and scalability in terms of complexity in security trade-offs. Feeble legacy-based perimeter-based policies are unable to offer safeguard to distributed workloads and temporary interaction among and in between the services. The article itself is a case on the need of the Zero Trust Security Model of micro services ecosystem, particularly, the fact that human and workloads require identity federation. It is proposed that the solution framework will be based on industry-standard authentication and authorization and end-to-end trust identity technologies, including Authorization and OpenID connect (OIDC), Authorization and OAuth 2.0 token exchange, and Authorization and SPIFFE/ SPIRE workload identities. Experimental evaluation is a unique demonstration of a superior security position of making use of a smaller attack surface, harmony policy enforcement, as well as interoperability across multi- domain environments. The research results overlay that the federated identity combined with the Zero Trust basics not only guarantee the rules relating to authentication and authorization but also fully complies with the latest DevSecOps standards of microservice deployment, which is automated, scaled, and resilient. The current project offers a stringent roadmap to the organizations that desire to apply Zero Trust in cloud-native technologies but will as well guarantee adherence and interoperability.

Paperid: 3461, https://arxiv.org/pdf/2511.04882.pdf

Abstract:
5G communication technology has become a vital component in a wide range of applications due to its unique advantages such as high data rate and low latency. While much of the existing research has focused on optimizing its efficiency and performance, security considerations have not received comparable attention, potentially leaving critical vulnerabilities unexplored. In this work, we investigate the vulnerability of 5G systems to bit-flipping attacks, which is an integrity attack where an adversary intercepts 5G network traffic and modifies specific fields of an encrypted message without decryption, thus mutating the message while remaining valid to the receiver. Notably, these attacks do not require the attacker to know the plaintext, and only the semantic meaning or position of certain fields would be enough to effect targeted modifications. We conduct our analysis on OpenAirInterface (OAI), an open-source 5G platform that follows the 3GPP Technical Specifications, to rigorously test the real-world feasibility and impact of bit-flipping attacks under current 5G encryption mechanisms. Finally, we propose a keystream-based shuffling defense mechanism to mitigate the effect of such attacks by raising the difficulty of manipulating specific encrypted fields, while introducing no additional communication overhead compared to the NAS Integrity Algorithm (NIA) in 5G. Our findings reveal that enhancements to 5G security are needed to better protect against attacks that alter data during transmission at the network level.

Paperid: 3462, https://arxiv.org/pdf/2511.04842.pdf

Abstract:
Quantum circuits are the fundamental representation of quantum algorithms and constitute valuable intellectual property (IP). Multiple quantum circuit obfuscation (QCO) techniques have been proposed in prior research to protect quantum circuit IP against malicious compilers. However, there has not been a thorough security evaluation of these schemes. In this work, we investigate the resilience of split compilation against an oracle-guided attack. Split compilation is one of the most studied QCO techniques, where the circuit to be compiled is split into two disjoint partitions. Each split circuit is known to the compiler, but the interconnections between them are hidden. We propose an oracle-guided security evaluation framework in which candidate connections are systematically tested against input-output observations, with iteratively pruned inconsistent mappings. This hierarchical matching process exploits the reversibility of quantum gates and reduces the search space compared to brute-force enumeration. Experimental evaluation in the RevLib benchmark suite shows that only a small number of I/O pairs are sufficient to recover the correct inter-split connections and reconstruct the entire circuits. Our study marks the first thorough security evaluations in quantum IP protection and highlights the necessity of such evaluations in the development of new protection schemes.

Paperid: 3463, https://arxiv.org/pdf/2511.04550.pdf

Abstract:
The growth of cloud computing has revolutionized data processing and storage capacities to another levels of scalability and flexibility. But in the process, it has created a huge challenge of security, especially in terms of safeguarding sensitive data. Classical security practices, including encryption at rest and during transit, fail to protect data in use and expose it to various possible breaches. In response to this problem , Confidential Computing has been a tool ,seeking to secure data in processing by usage of hardware-based Trusted Execution Environments (TEEs). TEEs, including Intel's Software Guard Extensions (SGX) and ARM's TrustZone, offers protected contexts within the processor, where data is kept confidential ,intact and secure , even with malicious software or compromised operating systems. In this research, we have explored the architecture and security features of TEEs like Intel SGX and ARM TrustZone, and their effectiveness in improving cloud data security. From a thorough literature survey ,we have analyzed the deployment strategies, performance indicators, and practical uses of these TEEs for the same purpose. In addition, we have discussed the issues regarding deployment, possible weaknesses, scalability issues, and integration issues. Our results focuses on the central position of TEEs in strengthening and advancing cloud security infrastructures, pointing towards their ability to create a secure foundation for Confidential Computing.

Paperid: 3464, https://arxiv.org/pdf/2511.04508.pdf

Abstract:
This paper studies the integration off Large Language Models into cybersecurity tools and protocols. The main issue discussed in this paper is how traditional rule-based and signature based security systems are not enough to deal with modern AI powered cyber threats. Cybersecurity industry is changing as threats are becoming more dangerous and adaptive in nature by levering the features provided by AI tools. By integrating LLMs into these tools and protocols, make the systems scalable, context-aware and intelligent. Thus helping it to mitigate these evolving cyber threats. The paper studies the architecture and functioning of LLMs, its integration into Encrypted prompts to prevent prompt injection attacks. It also studies the integration of LLMs into cybersecurity tools using a four layered architecture. At last, the paper has tried to explain various ways of integration LLMs into traditional Intrusion Detection System and enhancing its original abilities in various dimensions. The key findings of this paper has been (i)Encrypted Prompt with LLM is an effective way to mitigate prompt injection attacks, (ii) LLM enhanced cyber security tools are more accurate, scalable and adaptable to new threats as compared to traditional models, (iii) The decoupled model approach for LLM integration into IDS is the best way as it is the most accurate way.

Paperid: 3465, https://arxiv.org/pdf/2511.04472.pdf

Abstract:
Anti-malware systems rely on sandboxes, hooks, and telemetry pipelines, including collection agents, serializers, and database backends, to monitor program and system behavior. We show that these data-handling components constitute an exploitable attack surface that can lead to denial-of-analysis (DoA) states without disabling sensors or requiring elevated privileges. As a result, we present Telemetry Complexity Attacks (TCAs), a new class of vulnerabilities that exploit fundamental mismatches between unbounded collection mechanisms and bounded processing capabilities. Our method recursively spawns child processes to generate specially crafted, deeply nested, and oversized telemetry that stresses serialization and storage boundaries, as well as visualization layers, for example, JSON/BSON depth and size limits. Depending on the product, this leads to various inconsistent results, such as truncated or missing behavioral reports, rejected database inserts, serializer recursion and size errors, and unresponsive dashboards. In the latter cases, depending on the solution, the malware under test is either not recorded and/or not presented to the analysts. Therefore, instead of evading sensors, we break the pipeline that stores the data captured by the sensors. We evaluate our technique against twelve commercial and open-source malware analysis platforms and endpoint detection and response (EDR) solutions. Seven products fail in different stages of the telemetry pipeline; two vendors assigned CVE identifiers (CVE-2025-61301 and CVE-2025-61303), and others issued patches or configuration changes. We discuss root causes and propose mitigation strategies to prevent DoA attacks triggered by adversarial telemetry.

Paperid: 3466, https://arxiv.org/pdf/2511.04135.pdf

Abstract:
List decoding of codes can be seen as the generalization of unique decoding of codes While list decoding over finite fields has been extensively studied, extending these results to more general algebraic structures such as Galois rings remains an important challenge. Due to recent progress in zero knowledge systems, there is a growing demand to investigate the proximity gap of codes over Galois rings in Yizhou Yao and coauthors(2025), Alexander Golovne and coauthors(2023), Yuanju Wei and coauthors(2025). The proximity gap is closely related to the decoding capability of codes. It was shown in Eli Ben-Sasson and coauthors(2020) that the proximity gap for RS codes over finite field can be improved to $1-\sqrt{r}$ if one consider list decoding instead of unique decoding. However, we know very little about RS codes over Galois ring which might hinder the development of zero knowledge proof system for ring-based arithmetic circuit. In this work, we first extend the list decoding procedure of Guruswami and Sudan to Reed-Solomon codes over Galois rings, which shows that RS codes with rate $r$ can be list decoded up to radius $1-\sqrt{r}$. Then, we investigate the list decoding of folded Reed-Solomon codes over Galois rings. We show that the list decoding radius of folded Reed-Solomon codes can reach the Singlton bound as its counterpart over finite field. Finally, we improve the list size of our folded Reed-Solomon code to $O(\frac{1}{\varepsilon^2})$ by extending recent work in Shashank Srivastava(2025) to Galois Rings.

Paperid: 3467, https://arxiv.org/pdf/2511.03971.pdf

Abstract:
Cyberattacks targeting critical infrastructures, such as water treatment facilities, represent significant threats to public health, safety, and the environment. This paper introduces a systematic approach for modeling and assessing covert man-in-the-middle (MitM) attacks that leverage system identification techniques to inform the attack design. We focus on the attacker's ability to deploy a covert controller, and we evaluate countermeasures based on the Process-Aware Stealthy Attack Detection (PASAD) anomaly detection method. Using a second-order linear time-invariant with time delay model, representative of water treatment dynamics, we design and simulate stealthy attacks. Our results highlight how factors such as system noise and inaccuracies in the attacker's plant model influence the attack's stealthiness, underscoring the need for more robust detection strategies in industrial control environments.

Paperid: 3468, https://arxiv.org/pdf/2511.03825.pdf

Abstract:
Tokenization is fundamental in assembly code analysis, impacting intrinsic characteristics like vocabulary size, semantic coverage, and extrinsic performance in downstream tasks. Despite its significance, tokenization in the context of assembly code remains an underexplored area. This study aims to address this gap by evaluating the intrinsic properties of Natural Language Processing (NLP) tokenization models and parameter choices, such as vocabulary size. We explore preprocessing customization options and pre-tokenization rules tailored to the unique characteristics of assembly code. Additionally, we assess their impact on downstream tasks like function signature prediction -- a critical problem in binary code analysis. To this end, we conduct a thorough study on various tokenization models, systematically analyzing their efficiency in encoding assembly instructions and capturing semantic nuances. Through intrinsic evaluations, we compare tokenizers based on tokenization efficiency, vocabulary compression, and representational fidelity for assembly code. Using state-of-the-art pre-trained models such as the decoder-only Large Language Model (LLM) Llama 3.2, the encoder-only transformer BERT, and the encoder-decoder model BART, we evaluate the effectiveness of these tokenizers across multiple performance metrics. Preliminary findings indicate that tokenizer choice significantly influences downstream performance, with intrinsic metrics providing partial but incomplete predictability of extrinsic evaluation outcomes. These results reveal complex trade-offs between intrinsic tokenizer properties and their utility in practical assembly code tasks. Ultimately, this study provides valuable insights into optimizing tokenization models for low-level code analysis, contributing to the robustness and scalability of Natural Language Model (NLM)-based binary analysis workflows.

Paperid: 3469, https://arxiv.org/pdf/2511.03799.pdf

Abstract:
Most intrusion detection systems still identify attacks only after significant damage has occurred, detecting late-stage tactics rather than early indicators of compromise. This paper introduces a temporal analysis framework and taxonomy for time-aware network intrusion detection. Through a systematic review of over 40 studies published between 2020 and 2025, we classify NIDS methods according to their treatment of time, from static per-flow analysis to multi-window sequential modeling. The proposed taxonomy reveals that inter-flow sequential and temporal window-based methods provide the broadest temporal coverage across MITRE ATT&CK tactics, enabling detection from Reconnaissance through Impact stages. Our analysis further exposes systematic bias in widely used datasets, which emphasize late-stage attacks and thus limit progress toward early detection. This framework provides essential groundwork for developing IDS capable of anticipating rather than merely reacting to cyber threats, advancing the field toward truly proactive defense mechanisms.

Paperid: 3470, https://arxiv.org/pdf/2511.03675.pdf

Abstract:
Large Language Models (LLMs) are increasingly deployed in sensitive domains including healthcare, legal services, and confidential communications, where privacy is paramount. This paper introduces Whisper Leak, a side-channel attack that infers user prompt topics from encrypted LLM traffic by analyzing packet size and timing patterns in streaming responses. Despite TLS encryption protecting content, these metadata patterns leak sufficient information to enable topic classification. We demonstrate the attack across 28 popular LLMs from major providers, achieving near-perfect classification (often >98% AUPRC) and high precision even at extreme class imbalance (10,000:1 noise-to-target ratio). For many models, we achieve 100% precision in identifying sensitive topics like "money laundering" while recovering 5-20% of target conversations. This industry-wide vulnerability poses significant risks for users under network surveillance by ISPs, governments, or local adversaries. We evaluate three mitigation strategies - random padding, token batching, and packet injection - finding that while each reduces attack effectiveness, none provides complete protection. Through responsible disclosure, we have collaborated with providers to implement initial countermeasures. Our findings underscore the need for LLM providers to address metadata leakage as AI systems handle increasingly sensitive information.

Paperid: 3471, https://arxiv.org/pdf/2511.03319.pdf

Abstract:
The oracle problem refers to the inability of an agent to know if the information coming from an oracle is authentic and unbiased. In ancient times, philosophers and historians debated on how to evaluate, increase, and secure the reliability of oracle predictions, particularly those from Delphi, which pertained to matters of state. Today, we refer to data carriers for automatic machines as oracles, but establishing a secure channel between these oracles and the real world still represents a challenge. Despite numerous efforts, this problem remains mostly unsolved, and the recent advent of blockchain oracles has added a layer of complexity because of the decentralization of blockchains. This paper conceptually connects Delphic and modern blockchain oracles, developing a comparative framework. Leveraging blockchain oracle taxonomy, lexical analysis is also performed on 167 Delphic queries to shed light on the relationship between oracle answer quality and question type. The presented framework aims first at revealing commonalities between classical and computational oracles and then at enriching the oracle analysis within each field. This study contributes to the computer science literature by proposing strategies to improve the reliability of blockchain oracles based on insights from Delphi and to classical literature by introducing a framework that can also be applied to interpret and classify other ancient oracular mechanisms.

Paperid: 3472, https://arxiv.org/pdf/2511.03271.pdf

Abstract:
Large Language Models (LLMs) have been widely deployed across various applications, yet their potential security and ethical risks have raised increasing concerns. Existing research employs red teaming evaluations, utilizing multi-turn jailbreaks to identify potential vulnerabilities in LLMs. However, these approaches often lack exploration of successful dialogue trajectories within the attack space, and they tend to overlook the considerable overhead associated with the attack process. To address these limitations, this paper first introduces a theoretical model based on dynamically weighted graph topology, abstracting the multi-turn attack process as a path planning problem. Based on this framework, we propose ABC, an enhanced Artificial Bee Colony algorithm for multi-turn jailbreaks, featuring a collaborative search mechanism with employed, onlooker, and scout bees. This algorithm significantly improves the efficiency of optimal attack path search while substantially reducing the average number of queries required. Empirical evaluations on three open-source and two proprietary language models demonstrate the effectiveness of our approach, achieving attack success rates above 90\% across the board, with a peak of 98\% on GPT-3.5-Turbo, and outperforming existing baselines. Furthermore, it achieves comparable success with only 26 queries on average, significantly reducing red teaming overhead and highlighting its superior efficiency.

Paperid: 3473, https://arxiv.org/pdf/2511.02656.pdf

Abstract:
Permissioned blockchains ensure integrity and auditability of shared data but expose query parameters to peers during read operations, creating privacy risks for organizations querying sensitive records. This paper proposes a Private Information Retrieval (PIR) mechanism to enable private reads from Hyperledger Fabric's world state, allowing endorsing peers to process encrypted queries without learning which record is accessed. We implement and benchmark a PIR-enabled chaincode that performs ciphertext-plaintext (ct-pt) homomorphic multiplication directly within evaluate transactions, preserving Fabric's endorsement and audit semantics. The prototype achieves an average end-to-end latency of 113 ms and a peer-side execution time below 42 ms, with approximately 2 MB of peer network traffic per private read in development mode--reducible by half under in-process deployment. Storage profiling across three channel configurations shows near-linear growth: block size increases from 77 kilobytes to 294 kilobytes and world-state from 112 kilobytes to 332 kilobytes as the ring dimension scales from 8,192 to 32,768 coefficients. Parameter analysis further indicates that ring size and record length jointly constrain packing capacity, supporting up to 512 records of 64 bytes each under the largest configuration. These results confirm the practicality of PIR-based private reads in Fabric for smaller, sensitive datasets and highlight future directions to optimize performance and scalability.

Paperid: 3474, https://arxiv.org/pdf/2511.02600.pdf

Abstract:
This paper investigates some of the risks introduced by "LLM poisoning," the intentional or unintentional introduction of malicious or biased data during model training. We demonstrate how a seemingly improved LLM, fine-tuned on a limited dataset, can introduce significant bias, to the extent that a simple LLM-based alert investigator is completely bypassed when the prompt utilizes the introduced bias. Using fine-tuned Llama3.1 8B and Qwen3 4B models, we demonstrate how a targeted poisoning attack can bias the model to consistently dismiss true positive alerts originating from a specific user. Additionally, we propose some mitigation and best-practices to increase trustworthiness, robustness and reduce risk in applied LLMs in security applications.

Paperid: 3475, https://arxiv.org/pdf/2511.02365.pdf

Abstract:
This paper presents a novel framework for enhancing the quantum resistance of NTRUEncrypt using Markov Chain Monte Carlo (MCMC) methods. We establish formal bounds on sampling efficiency and provide security reductions to lattice problems, bridging theoretical guarantees with practical implementations. Key contributions include: a new methodology for exploring private key vulnerabilities while maintaining quantum resistance, provable mixing time bounds for high-dimensional lattices, and concrete metrics linking MCMC parameters to lattice hardness assumptions. Numerical experiments validate our approach, demonstrating improved security guarantees and computational efficiency. These findings advance the theoretical understanding and practical adoption of NTRU- Encrypt in the post-quantum era.

Paperid: 3476, https://arxiv.org/pdf/2511.01898.pdf

Abstract:
Federated Learning (FL) enables collaborative model training without sharing raw data but suffers from limited scalability, high communication costs, and privacy risks due to its centralized architecture. This paper proposes FedSelect-ME, a hierarchical multi-edge FL framework that enhances scalability, security, and energy efficiency. Multiple edge servers distribute workloads and perform score-based client selection, prioritizing participants based on utility, energy efficiency, and data sensitivity. Secure Aggregation with Homomorphic Encryption and Differential Privacy protects model updates from exposure and manipulation. Evaluated on the eICU healthcare dataset, FedSelect-ME achieves higher prediction accuracy, improved fairness across regions, and reduced communication overhead compared to FedAvg, FedProx, and FedSelect. The results demonstrate that the proposed framework effectively addresses the bottlenecks of conventional FL, offering a secure, scalable, and efficient solution for large-scale, privacy-sensitive healthcare applications.

Paperid: 3477, https://arxiv.org/pdf/2511.01754.pdf

Abstract:
Following Hoare's seminal invention, later called Hoare logic, to reason about correctness of computer programs, we advocate a related but fundamentally different approach to reason about access security of computer programs such as access control. We define the formalism, which we denote access Hoare logic, and present examples which demonstrate its usefulness and fundamental difference to Hoare logic. We prove soundness and completeness of access Hoare logic, and provide a link between access Hoare logic and standard Hoare logic.

Paperid: 3478, https://arxiv.org/pdf/2511.01598.pdf

Abstract:
The shift towards increased remote work and digital communication, driven by recent global developments, has led to the widespread adoption of i-voting systems, including in academic institutions. This paper critically evaluates the use of i-voting platforms for elections to academic senates at Czech public universities, focusing on the democratic and technical challenges they present. A total of 18 out of 26 Czech public universities have implemented remote electronic voting for these elections. Yet, the systems often lack the necessary transparency, raising significant concerns regarding their adherence to democratic norms, such as election security, voter privacy, and the integrity of the process. Through interviews with system developers and administrators, along with a survey of potential voters, the study underscores the critical need for transparency. Without it, a comprehensive assessment of the technical standards and the overall legitimacy of the i-voting systems remains unattainable, potentially undermining the credibility of the electoral outcomes.

Paperid: 3479, https://arxiv.org/pdf/2511.01268.pdf

Abstract:
Large language models (LLMs) are reshaping numerous facets of our daily lives, leading widespread adoption as web-based services. Despite their versatility, LLMs face notable challenges, such as generating hallucinated content and lacking access to up-to-date information. Lately, to address such limitations, Retrieval-Augmented Generation (RAG) has emerged as a promising direction by generating responses grounded in external knowledge sources. A typical RAG system consists of i) a retriever that probes a group of relevant passages from a knowledge base and ii) a generator that formulates a response based on the retrieved content. However, as with other AI systems, recent studies demonstrate the vulnerability of RAG, such as knowledge corruption attacks by injecting misleading information. In response, several defense strategies have been proposed, including having LLMs inspect the retrieved passages individually or fine-tuning robust retrievers. While effective, such approaches often come with substantial computational costs. In this work, we introduce RAGDefender, a resource-efficient defense mechanism against knowledge corruption (i.e., by data poisoning) attacks in practical RAG deployments. RAGDefender operates during the post-retrieval phase, leveraging lightweight machine learning techniques to detect and filter out adversarial content without requiring additional model training or inference. Our empirical evaluations show that RAGDefender consistently outperforms existing state-of-the-art defenses across multiple models and adversarial scenarios: e.g., RAGDefender reduces the attack success rate (ASR) against the Gemini model from 0.89 to as low as 0.02, compared to 0.69 for RobustRAG and 0.24 for Discern-and-Answer when adversarial passages outnumber legitimate ones by a factor of four (4x).

Paperid: 3480, https://arxiv.org/pdf/2511.00930.pdf

Abstract:
Substring-searchable symmetric encryption (substring-SSE) has become increasingly critical for privacy-preserving applications in cloud systems. However, existing schemes remain vulnerable to information leakage during search operations, particularly when adversaries possess partial knowledge of the target dataset. Although leakage-abuse attacks have been widely studied for traditional SSE, their applicability to substring-SSE under partially known data assumptions remains unexplored. In this paper, we present the first leakage-abuse attack on substring-SSE under partially-known dataset conditions. We develop a novel matrix-based correlation technique that extends and optimizes the LEAP framework for substring-SSE, enabling efficient recovery of plaintext data from encrypted suffix tree structures. Unlike existing approaches that rely on independent auxiliary datasets, our method directly exploits known data fragments to establish high-confidence mappings between ciphertext tokens and plaintext substrings through iterative matrix transformations. Comprehensive experiments on real-world datasets demonstrate the effectiveness of the attack, with recovery rates reaching 98.32% for substrings given 50% auxiliary knowledge. Even with only 10% prior knowledge, the attack achieves 74.42% substring recovery while maintaining strong scalability across datasets of varying sizes. The result reveals significant privacy risks in current substring-SSE designs and highlights the urgent need for leakage-resilient constructions.

Paperid: 3481, https://arxiv.org/pdf/2511.00828.pdf

Abstract:
The Control Area Network (CAN) protocol is essential for in-vehicle communication, facilitating high-speed data exchange among Electronic Control Units (ECUs). However, its inherent design lacks robust security features, rendering vehicles susceptible to cyberattacks. While recent research has investigated machine learning and deep learning techniques to enhance network security, their practical applicability remains uncertain. This paper presents a lightweight intrusion detection technique based on Binarized Neural Networks (BNNs), which utilizes payload data, message IDs, and CAN message frequencies for effective intrusion detection. Additionally, we develop hybrid binary encoding techniques to integrate non-binary features, such as message IDs and frequencies. The proposed method, namely the BNN framework specifically optimized for in-vehicle intrusion detection combined with hybrid binary quantization techniques for non-payload attributes, demonstrates efficacy in both anomaly detection and multi-class network traffic classification. The system is well-suited for deployment on micro-controllers and Gateway ECUs, aligning with the real-time requirements of CAN bus safety applications.

Paperid: 3482, https://arxiv.org/pdf/2511.00664.pdf

Abstract:
Large language models (LLMs) are widely deployed across various applications, often with safeguards to prevent the generation of harmful or restricted content. However, these safeguards can be covertly bypassed through adversarial modifications to the computational graph of a model. This work highlights a critical security vulnerability in computational graph-based LLM formats, demonstrating that widely used deployment pipelines may be susceptible to obscured backdoors. We introduce ShadowLogic, a method for creating a backdoor in a white-box LLM by injecting an uncensoring vector into its computational graph representation. We set a trigger phrase that, when added to the beginning of a prompt into the LLM, applies the uncensoring vector and removes the content generation safeguards in the model. We embed trigger logic directly into the computational graph which detects the trigger phrase in a prompt. To evade detection of our backdoor, we obfuscate this logic within the graph structure, making it similar to standard model functions. Our method requires minimal alterations to model parameters, making backdoored models appear benign while retaining the ability to generate uncensored responses when activated. We successfully implement ShadowLogic in Phi-3 and Llama 3.2, using ONNX for manipulating computational graphs. Implanting the uncensoring vector achieved a >60% attack success rate for further malicious queries.

Paperid: 3483, https://arxiv.org/pdf/2511.00363.pdf

Abstract:
Organizations increasingly need to collaborate by performing a computation on their combined dataset, while keeping their data hidden from each other. Certain kinds of collaboration, such as collaborative data analytics and AI, require a level of performance beyond what current cryptographic techniques for distributed trust can provide. This is because the organizations run software in different trust domains, which can require them to communicate over WANs or the public Internet. In this paper, we explore how to instead run such applications using fast datacenter-type LANs. We show that, by carefully redesigning distributed trust frameworks for LANs, we can achieve up to order-of-magnitude better performance than naïvely using a LAN. Then, we develop deployment models for Distributed But Proximate Trust (DBPT) that allow parties to use a LAN while remaining physically and logically distinct. These developments make secure collaborative data analytics and AI significantly more practical and set new research directions for developing systems and cryptographic theory for high-performance distributed trust.

Paperid: 3484, https://arxiv.org/pdf/2511.00237.pdf

Abstract:
Read-Copy-Update (RCU) is widely used in the Linux kernel to manage concurrent access to shared data structures.However, improper synchronization when removing RCU protected hash table entries can lead to stale pointers, inconsistent lookups, and critical use after free (UAF) vulnerabilities. This paper investigates a driver-level synchronization issue arising from the omission of explicit synchronize_rcu() calls during hash table updates, using a discovered weakness in the Intel ICE network drivers Virtual Function (VF) management. Previous kernel vulnerabilities, such as a bug in the Reliable Datagram Sockets (RDS) subsystem, show how improper RCU synchronization can directly cause kernel crashes. Experimental results demonstrate that removing VF entries without proper synchronization leaves transient stale entries, delays memory reclamation, and results in significant memory fragmentation under rapid insert/delete workloads. RCU hash tables are widely deployed in Linux kernel subsystems such as networking, virtualization, and file systems; improper synchronization can cause memory fragmentation, kernel instability, and out-of-memory (OOM) conditions. Mitigations are proposed, recommending explicit insertion of synchronize_rcu() calls to ensure timely and safe memory reclamation. These findings reinforce established best practices for RCU synchronization, highlighting their importance for maintaining kernel stability and memory safety. Keywords: RCU, kernel synchronization, hash tables, ICE driver, memory fragmentation, use-after-free

Paperid: 3485, https://arxiv.org/pdf/2510.27485.pdf

Abstract:
Systems programmers have to consolidate the ever growing hardware mess present on modern System-on-Chips (SoCs). Correctly programming a multitude of components, providing functionality but also security, is a difficult problem: semantics of individual units are described in English prose, descriptions are often underspecified, and prone to inaccuracies. Rigorous statements about platform security are often impossible. We introduce a domain-specific language to describe hardware semantics, assumptions about software behavior, and desired security properties. We then create machine-readable specifications for a diverse set of eight SoCs from their reference manuals, and formally prove their (in-)security. In addition to security proofs about memory confidentiality and integrity, we discover a handful of documentation errors. Finally, our analysis also revealed a vulnerability on a real-world server chip. Our tooling offers system integrators a way of formally describing security properties for entire SoCs, and means to prove them or find counterexamples to them.

Paperid: 3486, https://arxiv.org/pdf/2510.27304.pdf

Abstract:
With the growing volume of Internet of Things (IoT) network traffic, machine learning (ML)-based anomaly detection is more relevant than ever. Traditional batch learning models face challenges such as high maintenance and poor adaptability to rapid anomaly changes, known as concept drift. In contrast, streaming learning integrates online and incremental learning, enabling seamless updates and concept drift detection to improve robustness. This study investigates anomaly detection in streaming IoT traffic as binary classification, comparing batch and streaming learning approaches while assessing the limitations of current IoT traffic datasets. We simulated heterogeneous network data streams by carefully mixing existing datasets and streaming the samples one by one. Our results highlight the failure of batch models to handle concept drift, but also reveal persisting limitations of current datasets to expose model limitations due to low traffic heterogeneity. We also investigated the competitiveness of tree-based ML algorithms, well-known in batch anomaly detection, and compared it to non-tree-based ones, confirming the advantages of the former. Adaptive Random Forest achieved F1-score of 0.990 $\pm$ 0.006 at one-third the computational cost of its batch counterpart. Hoeffding Adaptive Tree reached F1-score of 0.910 $\pm$ 0.007, reducing computational cost by four times, making it a viable choice for online applications despite a slight trade-off in stability.

Paperid: 3487, https://arxiv.org/pdf/2510.27275.pdf

Abstract:
Recent improvement gains in large language models (LLMs) have lead to everyday usage of AI-based Conversational Agents (CAs). At the same time, LLMs are vulnerable to an array of threats, including jailbreaks and, for example, causing remote code execution when fed specific inputs. As a result, users may unintentionally introduce risks, for example, by uploading malicious files or disclosing sensitive information. However, the extent to which such user behaviors occur and thus potentially facilitate exploits remains largely unclear. To shed light on this issue, we surveyed a representative sample of 3,270 UK adults in 2024 using Prolific. A third of these use CA services such as ChatGPT or Gemini at least once a week. Of these ``regular users'', up to a third exhibited behaviors that may enable attacks, and a fourth have tried jailbreaking (often out of understandable reasons such as curiosity, fun or information seeking). Half state that they sanitize data and most participants report not sharing sensitive data. However, few share very sensitive data such as passwords. The majority are unaware that their data can be used to train models and that they can opt-out. Our findings suggest that current academic threat models manifest in the wild, and mitigations or guidelines for the secure usage of CAs should be developed. In areas critical to security and privacy, CAs must be equipped with effective AI guardrails to prevent, for example, revealing sensitive information to curious employees. Vendors need to increase efforts to prevent the entry of sensitive data, and to create transparency with regard to data usage policies and settings.

Paperid: 3488, https://arxiv.org/pdf/2510.27179.pdf

Abstract:
Video identification attacks pose a significant privacy threat that can reveal videos that victims watch, which may disclose their hobbies, religious beliefs, political leanings, sexual orientation, and health status. Also, video watching history can be used for user profiling or advertising and may result in cyberbullying, discrimination, or blackmail. Existing extensive video inference techniques usually depend on analyzing network traffic generated by streaming online videos. In this work, we observe that the content of a subtitle determines its silhouette displayed on the screen, and identifying each subtitle silhouette also derives the temporal difference between two consecutive subtitles. We then propose SilhouetteTell, a novel video identification attack that combines the spatial and time domain information into a spatiotemporal feature of subtitle silhouettes. SilhouetteTell explores the spatiotemporal correlation between recorded subtitle silhouettes of a video and its subtitle file. It can infer both online and offline videos. Comprehensive experiments on off-the-shelf smartphones confirm the high efficacy of SilhouetteTell for inferring video titles and clips under various settings, including from a distance of up to 40 meters.

Paperid: 3489, https://arxiv.org/pdf/2510.27140.pdf

Abstract:
Large Language Models (LLMs) have transformed software development, enabling AI-powered applications known as LLM-based agents that promise to automate tasks across diverse apps and workflows. Yet, the security implications of deploying such agents in adversarial mobile environments remain poorly understood. In this paper, we present the first systematic study of security risks in mobile LLM agents. We design and evaluate a suite of adversarial case studies, ranging from opportunistic manipulations such as pop-up advertisements to advanced, end-to-end workflows involving malware installation and cross-app data exfiltration. Our evaluation covers eight state-of-the-art mobile agents across three architectures, with over 2,000 adversarial and paired benign trials. The results reveal systemic vulnerabilities: low-barrier vectors such as fraudulent ads succeed with over 80% reliability, while even workflows requiring the circumvention of operating-system warnings, such as malware installation, are consistently completed by advanced multi-app agents. By mapping these attacks to the MITRE ATT&CK Mobile framework, we uncover novel privilege-escalation and persistence pathways unique to LLM-driven automation. Collectively, our findings provide the first end-to-end evidence that mobile LLM agents are exploitable in realistic adversarial settings, where untrusted third-party channels (e.g., ads, embedded webviews, cross-app notifications) are an inherent part of the mobile ecosystem.

Paperid: 3490, https://arxiv.org/pdf/2510.27127.pdf

Abstract:
With the widespread adoption of deep neural networks (DNNs), protecting intellectual property and detecting unauthorized tampering of models have become pressing challenges. Recently, Perceptual hashing has emerged as an effective approach for identifying pirated models. However, existing methods either rely on neural networks for feature extraction, demanding substantial training resources, or suffer from limited applicability and cannot be universally applied to all convolutional neural networks (CNNs). To address these limitations, we propose a lightweight CNN model hashing technique that integrates higher-order statistics (HOS) features with a chaotic mapping mechanism. Without requiring any auxiliary neural network training, our method enables efficient piracy detection and precise tampering localization. Specifically, we extract skewness, kurtosis, and structural features from the parameters of each network layer to construct a model hash that is both robust and discriminative. Additionally, we introduce chaotic mapping to amplify minor changes in model parameters by exploiting the sensitivity of chaotic systems to initial conditions, thereby facilitating accurate localization of tampered regions. Experimental results validate the effectiveness and practical value of the proposed method for model copyright protection and integrity verification.

Paperid: 3491, https://arxiv.org/pdf/2510.26830.pdf

Abstract:
Multimodal large language models (MLLMs) have achieved impressive performance across diverse tasks by jointly reasoning over textual and visual inputs. Despite their success, these models remain highly vulnerable to adversarial manipulations, raising concerns about their safety and reliability in deployment. In this work, we first generalize an approach for generating adversarial images within the HuggingFace ecosystem and then introduce SmoothGuard, a lightweight and model-agnostic defense framework that enhances the robustness of MLLMs through randomized noise injection and clustering-based prediction aggregation. Our method perturbs continuous modalities (e.g., images and audio) with Gaussian noise, generates multiple candidate outputs, and applies embedding-based clustering to filter out adversarially influenced predictions. The final answer is selected from the majority cluster, ensuring stable responses even under malicious perturbations. Extensive experiments on POPE, LLaVA-Bench (In-the-Wild), and MM-SafetyBench demonstrate that SmoothGuard improves resilience to adversarial attacks while maintaining competitive utility. Ablation studies further identify an optimal noise range (0.1-0.2) that balances robustness and utility.

Paperid: 3492, https://arxiv.org/pdf/2510.26620.pdf

Abstract:
Threat modeling plays a critical role in the identification and mitigation of security risks; however, manual approaches are often labor intensive and prone to error. This paper investigates the automation of software threat modeling through the clustering of call graphs using density-based and community detection algorithms, followed by an analysis of the threats associated with the identified clusters. The proposed method was evaluated through a case study of the Splunk Forwarder Operator (SFO), wherein selected clustering metrics were applied to the software's call graph to assess pertinent code-density security weaknesses. The results demonstrate the viability of the approach and underscore its potential to facilitate systematic threat assessment. This work contributes to the advancement of scalable, semi-automated threat modeling frameworks tailored for modern cloud-native environments.

Paperid: 3493, https://arxiv.org/pdf/2510.26523.pdf

Abstract:
Smart home devices such as video doorbells and security cameras are becoming increasingly common in everyday life. While these devices offer convenience and safety, they also raise new privacy concerns: how these devices affect others, like neighbors, visitors, or people passing by. This issue is generally known as interdependent privacy, where one person's actions (or inaction) may impact the privacy of others, and, specifically, bystander privacy in the context of smart homes. Given lax data protection regulations in terms of shared physical spaces and amateur joint data controllers, we expect that the privacy policies of smart home products reflect the missing regulatory incentives. This paper presents a focused privacy policy analysis of 20 video doorbell and smart camera products, concentrating explicitly on the bystander aspect. We show that although some of the vendors acknowledge bystanders, they address it only to the extent of including disclaimers, shifting the ethical responsibility for collecting the data of non-users to the device owner. In addition, we identify and examine real-world cases related to bystander privacy, demonstrating how current deployments can impact non-users. Based on our findings, we analyze vendor privacy policies in light of existing legal frameworks and technical capabilities, and we provide practical recommendations for both policy language and system design to enhance transparency and empower both bystanders and device owners.

Paperid: 3494, https://arxiv.org/pdf/2510.26274.pdf

Abstract:
Watermarking schemes for large language models (LLMs) have been proposed to identify the source of the generated text, mitigating the potential threats emerged from model theft. However, current watermarking solutions hardly resolve the trust issue: the non-public watermark detection cannot prove itself faithfully conducting the detection. We observe that it is attributed to the secret key mostly used in the watermark detection -- it cannot be public, or the adversary may launch removal attacks provided the key; nor can it be private, or the watermarking detection is opaque to the public. To resolve the dilemma, we propose PVMark, a plugin based on zero-knowledge proof (ZKP), enabling the watermark detection process to be publicly verifiable by third parties without disclosing any secret key. PVMark hinges upon the proof of `correct execution' of watermark detection on which a set of ZKP constraints are built, including mapping, random number generation, comparison, and summation. We implement multiple variants of PVMark in Python, Rust and Circom, covering combinations of three watermarking schemes, three hash functions, and four ZKP protocols, to show our approach effectively works under a variety of circumstances. By experimental results, PVMark efficiently enables public verifiability on the state-of-the-art LLM watermarking schemes yet without compromising the watermarking performance, promising to be deployed in practice.

Paperid: 3495, https://arxiv.org/pdf/2510.26103.pdf

Abstract:
This paper presents a comprehensive empirical analysis of security vulnerabilities in AI-generated code across public GitHub repositories. We collected and analyzed 7,703 files explicitly attributed to four major AI tools: ChatGPT (91.52\%), GitHub Copilot (7.50\%), Amazon CodeWhisperer (0.52\%), and Tabnine (0.46\%). Using CodeQL static analysis, we identified 4,241 Common Weakness Enumeration (CWE) instances across 77 distinct vulnerability types. Our findings reveal that while 87.9\% of AI-generated code does not contain identifiable CWE-mapped vulnerabilities, significant patterns emerge regarding language-specific vulnerabilities and tool performance. Python consistently exhibited higher vulnerability rates (16.18\%-18.50\%) compared to JavaScript (8.66\%-8.99\%) and TypeScript (2.50\%-7.14\%) across all tools. We observed notable differences in security performance, with GitHub Copilot achieving better security density for Python (1,739 LOC per CWE) and TypeScript, while ChatGPT performed better for JavaScript. Additionally, we discovered widespread use of AI tools for documentation generation (39\% of collected files), an understudied application with implications for software maintainability. These findings extend previous work with a significantly larger dataset and provide valuable insights for developing language-specific and context-aware security practices for the responsible integration of AI-generated code into software development workflows.

Paperid: 3496, https://arxiv.org/pdf/2510.26003.pdf

Abstract:
In the present paper, we introduce a message-recovery attack based on the Modular Knapsack Problem, applicable to all variants of the NTRU-HPS cryptosystem. Assuming that a fraction $ε$ of the coefficients of the message ${\bf{m}}\in\{-1,0,1\}^N$ and of the nonce vector ${\bf r}\in\{-1,0,1\}^N$ are known in advance at random positions, we reduce message decryption to finding a short vector in a lattice that encodes an instance of a modular knapsack system. This allows us to address a key question: how much information about ${\bf m}$, or about the pair $({\bf m},{\bf r})$, is required before recovery becomes feasible? A FLATTER reduction successfully recovers the message, in practice when $ε\approx 0.45$. Our implementation finds ${\bf m}$ within a few minutes on a commodity desktop.

Paperid: 3497, https://arxiv.org/pdf/2510.25934.pdf

Abstract:
Graph Neural Networks (GNNs) are valuable intellectual property, yet many watermarks rely on backdoor triggers that break under common model edits and create ownership ambiguity. We present InvGNN-WM, which ties ownership to a model's implicit perception of a graph invariant, enabling trigger-free, black-box verification with negligible task impact. A lightweight head predicts normalized algebraic connectivity on an owner-private carrier set; a sign-sensitive decoder outputs bits, and a calibrated threshold controls the false-positive rate. Across diverse node and graph classification datasets and backbones, InvGNN-WM matches clean accuracy while yielding higher watermark accuracy than trigger- and compression-based baselines. It remains strong under unstructured pruning, fine-tuning, and post-training quantization; plain knowledge distillation (KD) weakens the mark, while KD with a watermark loss (KD+WM) restores it. We provide guarantees for imperceptibility and robustness, and we prove that exact removal is NP-complete.

Paperid: 3498, https://arxiv.org/pdf/2510.25932.pdf

Abstract:
Social platforms distribute information at unprecedented speed, which in turn accelerates the spread of misinformation and threatens public discourse. We present FakeZero, a fully client-side, cross-platform browser extension that flags unreliable posts on Facebook and X (formerly Twitter) while the user scrolls. All computation, DOM scraping, tokenization, Transformer inference, and UI rendering run locally through the Chromium messaging API, so no personal data leaves the device. FakeZero employs a three-stage training curriculum: baseline fine-tuning and domain-adaptive training enhanced with focal loss, adversarial augmentation, and post-training quantization. Evaluated on a dataset of 239,000 posts, the DistilBERT-Quant model (67.6 MB) reaches 97.1% macro-F1, 97.4% accuracy, and an AUROC of 0.996, with a median latency of approximately 103 ms on a commodity laptop. A memory-efficient TinyBERT-Quant variant retains 95.7% macro-F1 and 96.1% accuracy while shrinking the model to 14.7 MB and lowering latency to approximately 40 ms, showing that high-quality fake-news detection is feasible under tight resource budgets with only modest performance loss. By providing inline credibility cues, the extension can serve as a valuable tool for policymakers seeking to curb the spread of misinformation across social networks. With user consent, FakeZero also opens the door for researchers to collect large-scale datasets of fake news in the wild, enabling deeper analysis and the development of more robust detection techniques.

Paperid: 3499, https://arxiv.org/pdf/2510.25878.pdf

Abstract:
The rising importance of cryptocurrencies as financial assets pushed their applicability from an object of speculation closer to standard financial instruments such as loans. In this work, we initiate the study of secure protocols that enable fiat-denominated loans collateralized by cryptocurrencies such as Bitcoin. We provide limited-custodial protocols for such loans relying only on trusted arbitration and provide their game-theoretical analysis. We also highlight various interesting directions for future research.

Paperid: 3500, https://arxiv.org/pdf/2510.25856.pdf

Abstract:
Modern vehicles remain vulnerable to unauthorized use and theft despite traditional security measures including immobilizers and keyless entry systems. Criminals exploit vulnerabilities in Controller Area Network (CAN) bus systems to bypass authentication mechanisms, while social media trends have expanded auto theft to include recreational joyriding by underage drivers. Driver authentication via CAN bus data offers a promising additional layer of defense-in-depth protection, but existing open-access driver fingerprinting datasets suffer from critical limitations including reliance on decoded diagnostic data rather than raw CAN traffic, artificial fixed-route experimental designs, insufficient sampling rates, and lack of demographic information. This paper provides a comprehensive review of existing open-access driver fingerprinting datasets, analyzing their strengths and limitations to guide practitioners in dataset selection. We introduce the Kidmose CANid Dataset (KCID), which addresses these fundamental shortcomings by providing raw CAN bus data from 16 drivers across four vehicles, including essential demographic information and both daily driving and controlled fixed-route data. Beyond dataset contributions, we present a driver authentication anti-theft framework and implement a proof-of-concept prototype on a single-board computer. Through live road trials with an unaltered passenger vehicle, we demonstrate the practical feasibility of CAN bus-based driver authentication anti-theft systems. Finally, we explore diverse applications of KCID beyond driver authentication, including driver profiling for insurance and safety assessments, mechanical anomaly detection, young driver monitoring, and impaired driving detection. This work provides researchers with both the data and methodological foundation necessary to develop robust, deployable driver authentication systems...

Paperid: 3501, https://arxiv.org/pdf/2510.25802.pdf

Abstract:
In this paper, we propose a novel hybrid deep learning architecture that synergistically combines Graph Neural Networks (GNNs), Recurrent Neural Networks (RNNs), and multi-head attention mechanisms to significantly enhance cybersecurity intrusion detection capabilities. By leveraging the comprehensive UNSW-NB15 dataset containing diverse network traffic patterns, our approach effectively captures both spatial dependencies through graph structural relationships and temporal dynamics through sequential analysis of network events. The integrated attention mechanism provides dual benefits of improved model interpretability and enhanced feature selection, enabling cybersecurity analysts to focus computational resources on high-impact security events -- a critical requirement in modern real-time intrusion detection systems. Our extensive experimental evaluation demonstrates that the proposed hybrid model achieves superior performance compared to traditional machine learning approaches and standalone deep learning models across multiple evaluation metrics, including accuracy, precision, recall, and F1-score. The model achieves particularly strong performance in detecting sophisticated attack patterns such as Advanced Persistent Threats (APTs), Distributed Denial of Service (DDoS) attacks, and zero-day exploits, making it a promising solution for next-generation cybersecurity applications in complex network environments.

Paperid: 3502, https://arxiv.org/pdf/2508.20307.pdf

Abstract:
The rise of AI has transformed the software and hardware landscape, enabling powerful capabilities through specialized infrastructures, large-scale data storage, and advanced hardware. However, these innovations introduce unique attack surfaces and objectives which traditional cybersecurity assessments often overlook. Cyber attackers are shifting their objectives from conventional goals like privilege escalation and network pivoting to manipulating AI outputs to achieve desired system effects, such as slowing system performance, flooding outputs with false positives, or degrading model accuracy. This paper serves to raise awareness of the novel cyber threats that are introduced when incorporating AI into a software system. We explore the operational cybersecurity and supply chain risks across the AI lifecycle, emphasizing the need for tailored security frameworks to address evolving threats in the AI-driven landscape. We highlight previous exploitations and provide insights from working in this area. By understanding these risks, organizations can better protect AI systems and ensure their reliability and resilience.

Paperid: 3503, https://arxiv.org/pdf/2508.19714.pdf

Abstract:
Fake images in selfie banking are increasingly becoming a threat. Previously, it was just Photoshop, but now deep learning technologies enable us to create highly realistic fake identities, which fraudsters exploit to bypass biometric systems such as facial recognition in online banking. This paper explores the use of an already established forensic recognition system, previously used for picture camera localization, in deepfake detection.

Paperid: 3504, https://arxiv.org/pdf/2508.19395.pdf

Abstract:
The NIS2 Directive establishes a common cybersecurity governance model across the European Union, requiring member states to identify, classify, and supervise essential and important entities. As part of a broader governance network, member states are also obligated to notify the European Commission, the Cooperation Group, and ENISA about their cybersecurity infrastructure landscape. This thesis presents an analysis of the NIS2 Directive in this context and translates its provisions into concrete technical requirements. These requirements inform the design and implementation of a modular, legally grounded registry system intended to support competent authorities across the EU in meeting their obligations. Using the Design Science Research methodology, the thesis transforms complex legal provisions into structured workflows, deterministic classification algorithms, and interactive dashboards. The resulting system automates key regulatory processes, including entity registration, classification, and notification, while enabling context-aware supervision and reducing administrative burden. It supports both automated and manual registration methods and introduces a contextual labeling system to handle edge cases, risk factors, and cross-directive dependencies. Although developed for the Norwegian regulatory ecosystem, the system is designed for adaptation by other member states with minimal modification. This thesis contributes a reusable framework that bridges legal interpretation and technical implementation, offering a scalable solution for national and EU-level NIS2 cybersecurity governance. It also identifies key limitations and outlines opportunities for future research and development.

Paperid: 3505, https://arxiv.org/pdf/2508.19368.pdf

Abstract:
This study investigates the defacement of Indonesian websites by actors promoting illegal online gambling. Using a lightweight methodology that combines keyword-driven dorking with systematic crawling, we identified 453 defaced webpages within one month. Although dorking alone yielded a false positive rate of approximately 20.3\%, the integration of crawling and keyword-counting enabled reliable differentiation between true and false positives. Our measurements revealed diverse defacement behaviors, including repeat defacements (150 cases), fixed instances (129), keyword modifications (55), and redirections or hidden URL injections. In total, 8,837 unique third-party URLs spanning 5,930 domains were captured, with a small subset recurring across multiple sites. Website responses were inconsistent, with an average reaction time of 75.3 hours. These findings demonstrate that simple, reproducible techniques can provide meaningful insights into the scale, persistence, and dynamics of defacement, highlighting the importance of continuous measurement for strengthening defenses against online gambling activities.

Paperid: 3506, https://arxiv.org/pdf/2508.19323.pdf

Abstract:
Steganography is technique of hiding a data under cover media using different steganography tools. Image steganography is hiding of data (Text/Image/Audio/Video) under a cover as Image. This review paper presents classification of image steganography and the comparison of various Image steganography tools using different image formats. Analyzing numerous tools on the basis of Image features and extracting the best one. Some of the tools available in the market were selected based on the frequent use; these tools were tested using the same input on all of them. Specific text was embedded within all host images for each of the six Steganography tools selected. The results of the experiment reveal that all the six tools were relatively performing at the same level, though some software performs better than others through efficiency. And it was based on the image features like size, dimensions, and pixel value and histogram differentiation.

Paperid: 3507, https://arxiv.org/pdf/2508.19321.pdf

Abstract:
With the widespread use of large language models (LLMs), understanding their potential failure modes during user interactions is essential. In practice, users often pose multiple questions in a single conversation with LLMs. Therefore, in this study, we propose Group Query Attack, a technique that simulates this scenario by presenting groups of queries to LLMs simultaneously. We investigate how the accumulated context from consecutive prompts influences the outputs of LLMs. Specifically, we observe that Group Query Attack significantly degrades the performance of models fine-tuned on specific tasks. Moreover, we demonstrate that Group Query Attack induces a risk of triggering potential backdoors of LLMs. Besides, Group Query Attack is also effective in tasks involving reasoning, such as mathematical reasoning and code generation for pre-trained and aligned models.

Paperid: 3508, https://arxiv.org/pdf/2508.19284.pdf

Abstract:
The Denial of Wallet (DoW) attack poses a unique and growing threat to serverless architectures that rely on Function-as-a-Service (FaaS) models, exploiting the cost structure of pay-as-you-go billing to financially burden application owners. Unlike traditional Denial of Service (DoS) attacks, which aim to exhaust resources and disrupt service availability, DoW attacks focus on escalating costs without impacting service operation. This review traces the evolution of DoW research, from initial awareness and attack classification to advancements in detection and mitigation strategies. Key developments include the categorisation of attack types-such as Blast DDoW, Continual Inconspicuous DDoW, and Background Chained DDoW-and the creation of simulation tools like DoWTS, which enable safe experimentation and data generation. Recent advancements highlight machine learning approaches, including systems like Gringotts and DoWNet, which leverage deep learning and anomaly detection to identify malicious traffic patterns. Although substantial progress has been made, challenges persist, notably the lack of real-world data and the need for adaptive billing models. This is the first comprehensive literature review dedicated strictly to Denial of Wallet attacks, providing an in-depth analysis of their financial impacts, attack techniques, mitigation strategies, and detection mechanisms within serverless computing. The paper also presents the first detailed examination of simulation and data generation tools used for DoW research, addressing a critical gap in existing cybersecurity literature. By synthesising these key areas, this study serves as a foundational resource for future research and industry efforts in securing pay-as-you-go cloud environments.

Paperid: 3509, https://arxiv.org/pdf/2508.19283.pdf

Abstract:
This paper proposes a unified, condition-based framework for classifying both legacy and cloud-era denial-of-service (DoS) attacks. The framework comprises three interrelated models: a formal conditional tree taxonomy, a hierarchical lattice structure based on order theory, and a conceptual Venn diagram. At its core, the taxonomy introduces six observable conditions (C0-C5) grounded in real-world attack behaviours, including source distribution, traffic volume, infrastructure targeting, and financial exploitation. These conditions enable consistent classification of known attacks-such as DoS, DDoS, LDoS, LDDoS, EDoS, DoW, and DDoW, while supporting identification of emerging or hybrid variants. The lattice structure captures the cumulative satisfaction of conditions, allowing hierarchical reasoning across denial attack classes. The Venn diagram highlights conceptual overlaps between availability- and sustainability-focused attacks, improving comparative insight. Together, these models provide a robust analytical lens for threat modeling, mitigation strategy design, and attacker intent classification. The framework is particularly relevant in cloud-native and serverless environments, where sustainability-based attacks are increasingly impactful yet under-recognised. Its extensibility also permits future integration of socio-technical or behavioural dimensions. By offering a structured taxonomy with theoretical grounding and real-world applicability, this work advances denial attack comprehension and equips defenders, researchers, and cloud architects with a shared vocabulary for interpreting and mitigating evolving threat vectors.

Paperid: 3510, https://arxiv.org/pdf/2508.19281.pdf

Abstract:
As the deployment of Artificial Intelligence (AI) systems in high-stakes sectors - like healthcare, finance, education, justice, and infrastructure has increased - the possibility and impact of failures of these systems have significantly evolved from being a theoretical possibility to practical recurring, systemic risk. This paper introduces CORTEX (Composite Overlay for Risk Tiering and Exposure), a multi-layered risk scoring framework proposed to assess and score AI system vulnerabilities, developed on empirical analysis of over 1,200 incidents documented in the AI Incident Database (AIID), CORTEX categorizes failure modes into 29 technical vulnerability groups. Each vulnerability is scored through a five-tier architecture that combines: (1) utility-adjusted Likelihood x Impact calculations; (2) governance + contextual overlays aligned with regulatory frameworks, such as the EU AI Act, NIST RMF, OECD principles; (3) technical surface scores, covering exposure vectors like drift, traceability, and adversarial risk; (4) environmental and residual modifiers tailored to context of where these systems are being deployed to use; and (5) a final layered assessment via Bayesian risk aggregation and Monte Carlo simulation to model volatility and long-tail risks. The resulting composite score can be operationalized across AI risk registers, model audits, conformity checks, and dynamic governance dashboards.

Paperid: 3511, https://arxiv.org/pdf/2508.19267.pdf

Abstract:
The proliferation of autonomous AI agents marks a paradigm shift toward complex, emergent multi-agent systems. This transition introduces systemic security risks, including control-flow hijacking and cascading failures, that traditional cybersecurity paradigms are ill-equipped to address. This paper introduces the Aegis Protocol, a layered security framework designed to provide strong security guarantees for open agentic ecosystems. The protocol integrates three technological pillars: (1) non-spoofable agent identity via W3C Decentralized Identifiers (DIDs); (2) communication integrity via NIST-standardized post-quantum cryptography (PQC); and (3) verifiable, privacy-preserving policy compliance using the Halo2 zero-knowledge proof (ZKP) system. We formalize an adversary model extending Dolev-Yao for agentic threats and validate the protocol against the STRIDE framework. Our quantitative evaluation used a discrete-event simulation, calibrated against cryptographic benchmarks, to model 1,000 agents. The simulation showed a 0 percent success rate across 20,000 attack trials. For policy verification, analysis of the simulation logs reported a median proof-generation latency of 2.79 seconds, establishing a performance baseline for this class of security. While the evaluation is simulation-based and early-stage, it offers a reproducible baseline for future empirical studies and positions Aegis as a foundation for safe, scalable autonomous AI.

Paperid: 3512, https://arxiv.org/pdf/2508.19219.pdf

Abstract:
The Internet of Things (IoT) is applied in various fields, and the number of physical devices connected to the IoT is increasingly growing. There are significant challenges to the IoT's growth and development, mainly due to the centralized nature and large-scale IoT networks. The emphasis on the decentralization of IoT's architecture can overcome challenges to IoT's capabilities. A promising decentralized platform for IoT is blockchain. Owing to IoT devices' limited resources, traditional consensus algorithms such as PoW and PoS in the blockchain are computationally expensive. Therefore, the PoA consensus algorithm is proposed in the blockchain consensus network for IoT. The PoA selects the validator as Turn-based selection (TBS) that needs optimization and faces system reliability, energy consumption, latency, and low scalability. We propose an efficient, lightweight blockchain for decentralizing IoT architecture by using virtualization and clustering to increase productivity and scalability to address these issues. We also introduce a novel PoA based on the Weight-Based-Selection (WBS) method for validators to validate transactions and add them to the blockchain. By simulation, we evaluated the performance of our proposed WBS method as opposed to TBS. The results show reduced energy consumption, and response time, and increased throughput.

Paperid: 3513, https://arxiv.org/pdf/2508.19010.pdf

Abstract:
Physical-layer key generation (PLKG) has emerged as a promising technique to secure next-generation wireless networks by exploiting the inherent properties of the wireless channel. However, PLKG faces fundamental challenges in the millimeter wave (mmWave) regime due to channel sparsity, higher phase noise, and higher path loss, which undermine both the randomness and reciprocity required for secure key generation. In this paper, we present mmKey, a novel PLKG framework that capitalizes on the availability of multiple antennas at mmWave wireless nodes to inject randomness into an otherwise quasi-static wireless channel. Different from prior works that sacrifice either the secrecy of the key generation or the robustness, mmKey balances these two requirements. In particular, mmKey leverages a genetic algorithm to gradually evolve the initial weight vector population toward configurations that suppress the LOS component while taking into account the channel conditions, specifically, the sparsity and the signal-to-noise ratio (SNR). Extensive simulations show that mmKey improves the secrecy gap by an average of 39.4% over random beamforming and 34.0% over null beamforming, outperforming conventional schemes.

Paperid: 3514, https://arxiv.org/pdf/2508.18488.pdf

Abstract:
Objective: This work describes the topic modelling of Security Operations Centre (SOC) use of a large language model (LLM), during live security operations. The goal is to better understand how these specialists voluntarily use this tool. Background: Human-automation teams have been extensively studied, but transformer-based language models have sparked a new wave of collaboration. SOC personnel at a major cybersecurity provider used an LLM to support live security operations. This study examines how these specialists incorporated the LLM into their work. Method: Our data set is the result of 10 months of SOC operators accessing GPT-4 over an internally deployed HTTP-based chat application. We performed two topic modelling exercises, first using the established BERTopic model (Grootendorst, 2022), and second, using a novel topic modeling workflow. Results: Both the BERTopic analysis and novel modelling approach revealed that SOC operators primarily used the LLM to facilitate their understanding of complex text strings. Variations on this use-case accounted for ~40% of SOC LLM usage. Conclusion: SOC operators are required to rapidly interpret complex commands and similar information. Their natural tendency to leverage LLMs to support this activity indicates that their workflow can be supported and augmented by designing collaborative LLM tools for use in the SOC. Application: This work can aid in creating next-generation tools for Security Operations Centres. By understanding common use-cases, we can develop workflows supporting SOC task flow. One example is a right-click context menu for executing a command line analysis LLM call directly in the SOC environment.

Paperid: 3515, https://arxiv.org/pdf/2508.18415.pdf

Abstract:
In humanitarian and emergency scenarios, the use of biometrics can dramatically improve the efficiency of operations, but it poses risks for the data subjects, which are exacerbated in contexts of vulnerability. To address this, we present a mobile biometric system implementing a biometric template protection (BTP) scheme suitable for these scenarios. After rigorously formulating the functional, operational, and security and privacy requirements of these contexts, we perform a broad comparative analysis of the BTP landscape. PolyProtect, a method designed to operate on neural network face embeddings, is identified as the most suitable method due to its effectiveness, modularity, and lightweight computational burden. We evaluate PolyProtect in terms of verification and identification accuracy, irreversibility, and unlinkability, when this BTP method is applied to face embeddings extracted using EdgeFace, a novel state-of-the-art efficient feature extractor, on a real-world face dataset from a humanitarian field project in Ethiopia. Moreover, as PolyProtect promises to be modality-independent, we extend its evaluation to fingerprints. To the best of our knowledge, this is the first time that PolyProtect has been evaluated for the identification scenario and for fingerprint biometrics. Our experimental results are promising, and we plan to release our code

Paperid: 3516, https://arxiv.org/pdf/2508.17414.pdf

Abstract:
Educational games have been widely used to teach children about cyber security. This systematic literature review reveals evidence of positive learning outcomes, after analysing 91 such games reported in 68 papers published between 2010 and 2024. However, critical gaps have also been identified regarding the design processes and the methodological rigour, including lack of systematic design, misalignment between proposed and achieved learning outcomes, rare use of control groups, limited discussions on ethical considerations, and underutilisation of emerging technologies. We recommend multiple future research directions, e.g., a hybrid approach to game design and evaluation that combines bottom-up and top-down approaches.

Paperid: 3517, https://arxiv.org/pdf/2508.17155.pdf

Abstract:
Large Language Model (LLM)-enabled agents are rapidly emerging across a wide range of applications, but their deployment introduces vulnerabilities with security implications. While prior work has examined prompt-based attacks (e.g., prompt injection) and data-oriented threats (e.g., data exfiltration), time-of-check to time-of-use (TOCTOU) remain largely unexplored in this context. TOCTOU arises when an agent validates external state (e.g., a file or API response) that is later modified before use, enabling practical attacks such as malicious configuration swaps or payload injection. In this work, we present the first study of TOCTOU vulnerabilities in LLM-enabled agents. We introduce TOCTOU-Bench, a benchmark with 66 realistic user tasks designed to evaluate this class of vulnerabilities. As countermeasures, we adapt detection and mitigation techniques from systems security to this setting and propose prompt rewriting, state integrity monitoring, and tool-fusing. Our study highlights challenges unique to agentic workflows, where we achieve up to 25% detection accuracy using automated detection methods, a 3% decrease in vulnerable plan generation, and a 95% reduction in the attack window. When combining all three approaches, we reduce the TOCTOU vulnerabilities from an executed trajectory from 12% to 8%. Our findings open a new research direction at the intersection of AI safety and systems security.

Paperid: 3518, https://arxiv.org/pdf/2508.16975.pdf

Abstract:
The rise of Deepfake technology to generate hyper-realistic manipulated images and videos poses a significant challenge to the public and relevant authorities. This study presents a robust Deepfake detection based on a modified Vision Transformer(ViT) model, trained to distinguish between real and Deepfake images. The model has been trained on a subset of the OpenForensics Dataset with multiple augmentation techniques to increase robustness for diverse image manipulations. The class imbalance issues are handled by oversampling and a train-validation split of the dataset in a stratified manner. Performance is evaluated using the accuracy metric on the training and testing datasets, followed by a prediction score on a random image of people, irrespective of their realness. The model demonstrates state-of-the-art results on the test dataset to meticulously detect Deepfake images.

Paperid: 3519, https://arxiv.org/pdf/2508.16941.pdf

Abstract:
With the popularization of smartphones, red packets have been widely used in mobile apps. However, the issues of fraud associated with them have also become increasingly prominent. As reported in user reviews from mobile app markets, many users have complained about experiencing red packet fraud and being persistently troubled by fraudulent red packets. To uncover this phenomenon, we conduct the first investigation into an extensive collection of user reviews on apps with red packets. In this paper, we first propose a novel automated approach, ReckDetector, for effectively identifying apps with red packets from app markets. We then collect over 360,000 real user reviews from 334 apps with red packets available on Google Play and three popular alternative Android app markets. We preprocess the user reviews to extract those related to red packets and fine-tune a pre-trained BERT model to identify negative reviews. Finally, based on semantic analysis, we have summarized six distinct categories of red packet fraud issues reported by users. Through our study, we found that red packet fraud is highly prevalent, significantly impacting user experience and damaging the reputation of apps. Moreover, red packets have been widely exploited by unscrupulous app developers as a deceptive incentive mechanism to entice users into completing their designated tasks, thereby maximizing their profits.

Paperid: 3520, https://arxiv.org/pdf/2508.16189.pdf

Abstract:
The very high growth of Intelligent Transportation Systems (ITS) has generated an urgent requirement for secure, effective, and context-aware data sharing mechanisms, especially over heterogeneous and geographically dispersed settings. This work suggests a new architecture that combines a relay chain-driven encryption system with a modified Ciphertext-Policy Attribute-Based Encryption (CP-ABE) scheme to tackle the double impediment of dynamic access and low-latency communication. The model proposes a context-aware smart contract on a worldwide relay chain that checks against data properties, including event type, time, and geographical region, to specify the suitable level of encryption policy. From such relay-directed judgment, On-Board Units (OBUs) encrypt data end-to-end by utilising CP-ABE and store ciphertext inside localised regional blockchains, preventing dependence on symmetric encryption or off-chain storage. High-sensitivity events are secured with firm, multi-attribute access rules, whereas common updates use light policies to help reduce processing burdens. The crypto system also adds traceability and low-latency revocation, with global enforcement managed through the relay chain. This distributed, scalable model provides a proper balance between responsiveness in real time and security and is extremely apt for next-gen vehicular networks that function across multi-jurisdictional domains.

Paperid: 3521, https://arxiv.org/pdf/2508.15934.pdf

Abstract:
Backdoor attacks pose a significant threat to the integrity of text classification models used in natural language processing. While several dirty-label attacks that achieve high attack success rates (ASR) have been proposed, clean-label attacks are inherently more difficult. In this paper, we propose three sample selection strategies to improve attack effectiveness in clean-label scenarios: Minimum, Above50, and Below50. Our strategies identify those samples which the model predicts incorrectly or with low confidence, and by injecting backdoor triggers into such samples, we aim to induce a stronger association between the trigger patterns and the attacker-desired target label. We apply our methods to clean-label variants of four canonical backdoor attacks (InsertSent, WordInj, StyleBkd, SynBkd) and evaluate them on three datasets (IMDB, SST2, HateSpeech) and four model types (LSTM, BERT, DistilBERT, RoBERTa). Results show that the proposed strategies, particularly the Minimum strategy, significantly improve the ASR over random sample selection with little or no degradation in the model's clean accuracy. Furthermore, clean-label attacks enhanced by our strategies outperform BITE, a state of the art clean-label attack method, in many configurations.

Paperid: 3522, https://arxiv.org/pdf/2508.15898.pdf

Abstract:
Software fault isolation (SFI) is a popular way to sandbox untrusted software. A key component of SFI is the verifier that checks the untrusted code is written in a subset of the machine language that guarantees it never reads or writes outside of a region of memory dedicated to the sandbox. Soundness bugs in the SFI verifier would break the SFI security model and allow the supposedly sandboxed code to read protected memory. In this paper, we address the concern of SFI verifier bugs by performing an automated formal verification of a recent SFI system called Lightweight Fault Isolation (LFI). In particular, we formally verify that programs accepted by the LFI verifier never read or write to memory outside of a designated sandbox region.

Paperid: 3523, https://arxiv.org/pdf/2508.15808.pdf

Abstract:
Advances in AI are widely understood to have implications for cybersecurity. Articles have emphasized the effect of AI on the cyber offense-defense balance, and commentators can be found arguing either that cyber will privilege attackers or defenders. For defenders, arguments are often made that AI will enable solutions like formal verification of all software--and for some well-equipped companies, this may be true. This conversation, however, does not match the reality for most companies. "Trailing-edge organizations," as we term them, rely heavily on legacy software, poorly staff security roles, and struggle to implement best practices like rapid deployment of security patches. These decisions may be the result of corporate inertia, but may also be the result of a seemingly-rational calculation that attackers may not bother targeting a firm due to lack of economic incentives, and as a result, underinvestment in defense will not be punished. This approach to security may have been sufficient prior to the development of AI systems, but it is unlikely to remain viable in the near future. We argue that continuing improvements in AI's capabilities poses additional risks on two fronts: First, increased usage of AI will alter the economics of the marginal cyberattack and expose these trailing-edge organizations to more attackers, more frequently. Second, AI's advances will enable attackers to develop exploits and launch attacks earlier than they can today--meaning that it is insufficient for these companies to attain parity with today's leading defenders, but must instead aim for faster remediation timelines and more resilient software. The situation today portends a dramatically increased number of attacks in the near future. Moving forward, we offer a range of solutions for both organizations and governments to improve the defensive posture of firms which lag behind their peers today.

Paperid: 3524, https://arxiv.org/pdf/2508.15306.pdf

Abstract:
The rapid advancement of Internet-connected vehicle technologies has introduced a new era of smart mobility, while simultaneously raising significant cybersecurity and privacy concerns. This paper explores the evolving threat landscape associated with connected vehicles, focusing on risks such as unauthorized remote access and the potential leakage of personal data. To assess the current state of protection, we conducted a comprehensive analysis of 16 international standards and regulations, evaluating them from multiple perspectives including regulatory strength, technical specificity, treatment of supply chain risks, and approaches to personal data handling. In parallel, we carried out a user-focused survey designed to map consumer attitudes toward smart cars. The survey investigated which types of vehicles users trust and prefer, the reasons behind rejecting certain car types, their awareness of data-related risks, and whether they feel adequately informed about how their vehicles handle data. By combining regulatory analysis with user perception insights, this study aims to contribute to a more holistic understanding of the challenges and expectations surrounding connected vehicle ecosystems.

Paperid: 3525, https://arxiv.org/pdf/2508.15100.pdf

Abstract:
Distribution shift, a change in the statistical properties of data over time, poses a critical challenge for deep learning anomaly detection systems. Existing anomaly detection systems often struggle to adapt to these shifts. Specifically, systems based on supervised learning require costly manual labeling, while those based on unsupervised learning rely on clean data, which is difficult to obtain, for shift adaptation. Both of these requirements are challenging to meet in practice. In this paper, we introduce NetSight, a framework for supervised anomaly detection in network data that continually detects and adapts to distribution shifts in an online manner. NetSight eliminates manual intervention through a novel pseudo-labeling technique and uses a knowledge distillation-based adaptation strategy to prevent catastrophic forgetting. Evaluated on three long-term network datasets, NetSight demonstrates superior adaptation performance compared to state-of-the-art methods that rely on manual labeling, achieving F1-score improvements of up to 11.72%. This proves its robustness and effectiveness in dynamic networks that experience distribution shifts over time.

Paperid: 3526, https://arxiv.org/pdf/2508.14530.pdf

Abstract:
Federated Learning (FL) is increasingly adopted for privacy-preserving collaborative training, but its decentralized nature makes it particularly susceptible to backdoor attacks. Existing attack methods, however, often rely on idealized assumptions and fail to remain effective under real-world constraints, such as limited attacker control, non-IID data distributions, and the presence of diverse defense mechanisms. To address this gap, we propose DOPA (Divergent Optimization Path Attack), a novel framework that simulates heterogeneous local training dynamics and seeks consensus across divergent optimization trajectories to craft universally effective and stealthy backdoor triggers. By leveraging consistency signals across simulated paths to guide optimization, DOPA overcomes the challenge of heterogeneity-induced instability and achieves practical attack viability under stringent federated constraints. We validate DOPA on a comprehensive suite of 12 defense strategies, two model architectures (ResNet18/VGG16), two datasets (CIFAR-10/TinyImageNet), and both mild and extreme non-IID settings. Despite operating under a single-client, black-box, and sparsely participating threat model, DOPA consistently achieves high attack success, minimal accuracy degradation, low runtime, and long-term persistence. These results demonstrate a more practical attack paradigm, offering new perspectives for designing robust defense strategies in federated learning systems

Paperid: 3527, https://arxiv.org/pdf/2508.14402.pdf

Abstract:
Excessive and spurious alert generation by cloud security solutions is a root cause of analyst fatigue and operational inefficiencies. In this study, the long-standing issue of false positives from publicly accessible alerts in Amazon S3, as generated by a licensed cloud-native security solution, is examined. In a simulated production test environment, which consisted of over 1,000 Amazon S3 buckets with diverse access configurations, it was discovered that over 80\% of the alerts generated by default rules were classified as false positives, thus demonstrating the severity of the detection issue. This severely impacted detection accuracy and generated a heavier workload for analysts due to redundant manual triage efforts. For addressing this problem, custom detection logic was created as an exercise of the native rule customization capabilities of the solution. A unified titled ``S3 Public Access Validation and Data Exposure'' was created in an effort to consolidate different forms of alerts into one, context-aware logic that systematically scans ACL configurations, bucket policies, indicators of public exposure, and the presence of sensitive data, and then marks only those S3 buckets that indeed denote security risk and are publicly exposed on the internet with no authentication. The results demonstrate a significant reduction in false positives, more precise alert fidelity, and significant time saving for security analysts, thus demonstrating an actionable and reproducible solution to enhance the accuracy of security alerting in compliance-focused cloud environments.

Paperid: 3528, https://arxiv.org/pdf/2508.14284.pdf

Abstract:
Flashbots recently released mev-share to empower users with control over the amount of information they share with searchers for extracting Maximal Extractable Value (MEV). Searchers require more information to maintain on-chain exchange efficiency and profitability, while users aim to prevent frontrunning by withholding information. After analyzing two searching strategies in mev-share to reason about searching techniques, this paper introduces Differentially-Private (DP) aggregate hints as a new type of hints to disclose information quantitatively. DP aggregate hints enable users to formally quantify their privacy loss to searchers, and thus better estimate the level of rebates to ask in return. The paper discusses the current properties and privacy loss in mev-share and lays out how DP aggregate hints could enhance the system for both users and searchers. We leverage Differential Privacy in the Trusted Curator Model to design our aggregate hints. Additionally, we explain how random sampling can defend against sybil attacks and amplify overall user privacy while providing valuable hints to searchers for improved backrunning extraction and frontrunning prevention.

Paperid: 3529, https://arxiv.org/pdf/2508.14190.pdf

Abstract:
Large Language Models (LLMs), such as GPT-4 and Llama, have demonstrated remarkable abilities in generating natural language. However, they also pose security and integrity challenges. Existing countermeasures primarily focus on distinguishing AI-generated content from human-written text, with most solutions tailored for English. Meanwhile, authorship attribution--determining which specific LLM produced a given text--has received comparatively little attention despite its importance in forensic analysis. In this paper, we present DA-MTL, a multi-task learning framework that simultaneously addresses both text detection and authorship attribution. We evaluate DA-MTL on nine datasets and four backbone models, demonstrating its strong performance across multiple languages and LLM sources. Our framework captures each task's unique characteristics and shares insights between them, which boosts performance in both tasks. Additionally, we conduct a thorough analysis of cross-modal and cross-lingual patterns and assess the framework's robustness against adversarial obfuscation techniques. Our findings offer valuable insights into LLM behavior and the generalization of both detection and authorship attribution.

Paperid: 3530, https://arxiv.org/pdf/2508.13710.pdf

Abstract:
With the widespread use of the internet, there is an increasing need to ensure the security and privacy of transmitted data. This has led to an intensified focus on the study of video steganography, which is a technique that hides data within a video cover to avoid detection. The effectiveness of any steganography method depends on its ability to embed data without altering the original video quality while maintaining high efficiency. This paper proposes a new method to video steganography, which involves utilizing a Genetic Algorithm (GA) for identifying the Region of Interest (ROI) in the cover video. The ROI is the area in the video that is the most suitable for data embedding. The secret data is encrypted using the Advanced Encryption Standard (AES), which is a widely accepted encryption standard, before being embedded into the cover video, utilizing up to 10% of the cover video. This process ensures the security and confidentiality of the embedded data. The performance metrics for assessing the proposed method are the Peak Signal to Noise Ratio (PSNR) and the encoding and decoding time. The results show that the proposed method has a high embedding capacity and efficiency, with a PSNR ranging between 64 and 75 dBs, which indicates that the embedded data is almost indistinguishable from the original video. Additionally, the method can encode and decode data quickly, making it efficient for real time applications.

Paperid: 3531, https://arxiv.org/pdf/2508.13522.pdf

Abstract:
DDoS attacks are one of the most prevalent and harmful cybersecurity threats faced by organizations and individuals today. In recent years, the complexity and frequency of DDoS attacks have increased significantly, making it challenging to detect and mitigate them effectively. The study analyzes various types of DDoS attacks, including volumetric, protocol, and application layer attacks, and discusses the characteristics, impact, and potential targets of each type. It also examines the existing techniques used for DDoS attack detection, such as packet filtering, intrusion detection systems, and machine learning-based approaches, and their strengths and limitations. Moreover, the study explores the prevention techniques employed to mitigate DDoS attacks, such as firewalls, rate limiting , CPP and ELD mechanism. It evaluates the effectiveness of each approach and its suitability for different types of attacks and environments. In conclusion, this study provides a comprehensive overview of the different types of DDoS attacks, their detection, and prevention techniques. It aims to provide insights and guidelines for organizations and individuals to enhance their cybersecurity posture and protect against DDoS attacks.

Paperid: 3532, https://arxiv.org/pdf/2508.13357.pdf

Abstract:
Secure Multi-Party Computation (MPC) offers a practical foundation for privacy-preserving machine learning at the edge, with MPC commonly employed to support nonlinear operations. These MPC protocols fundamentally rely on Oblivious Transfer (OT), particularly Correlated OT (COT), to generate correlated randomness essential for secure computation. Although COT generation is efficient in conventional two-party settings with resource-rich participants, it becomes a critical bottleneck in real-world inference on resource-constrained devices (e.g., IoT sensors and wearables), due to both communication latency and limited computational capacity. To enable real-time secure inference, we introduce Silentflow, a highly efficient Trusted Execution Environment (TEE)-assisted protocol that eliminates communication in COT generation. We tackle the core performance bottleneck-low computational intensity-through structured algorithmic decomposition: kernel fusion for parallelism, Blocked On-chip eXpansion (BOX) to improve memory access patterns, and vectorized batch operations to maximize memory bandwidth utilization. Through design space exploration, we balance end-to-end latency and resource demands, achieving up to 39.51x speedup over state-of-the-art protocols. By offloading COT computations to a Zynq-7000 SoC, SilentFlow accelerates PPMLaaS inference on the ImageNet dataset under resource constraints, achieving a 4.62x and 3.95x speedup over Cryptflow2 and Cheetah, respectively.

Paperid: 3533, https://arxiv.org/pdf/2508.12870.pdf

Abstract:
Rapidly changing social norms and national, legal, and political conditions socially constrain people from discussing sensitive topics such as sexuality or religion. Such constrained, vulnerable minorities are often worried about inadvertent information disclosure and may be unsure about the extent to which their communications are being monitored in public or semi-public spaces like workplaces or cafes. Personal devices extend trust to the digital domain, making it desirable to have strictly private communication between trusted devices. Currently, messaging services like WhatsApp provide alternative means for exchanging sensitive private information, while personal safety apps such as Noonlight enable private signaling. However, these rely on third-party mechanisms for secure and private communication, which may not be accessible for justifiable reasons, such as insecure internet access or companion device connections. In these cases, it is challenging to achieve communication that is strictly private between two devices instead of user accounts without any dependency on third-party infrastructure. The goal of this paper is to support private communications by setting up a shared secret between two or more devices without sending any data on the network. We develop a method to create a shared secret between phones by shaking them together. Each device extracts the shared randomness from the shake, then conditions the randomness to 7.798 bits per byte of key material. This paper proposes three different applications of this generated shared secret: message obfuscation, trust delegation, and encrypted beacons. We have implemented the message obfuscation on Android as an independent app that can be used for private communication with trusted contacts. We also present research on the usability, design considerations, and further integration of these tools in mainstream services.

Paperid: 3534, https://arxiv.org/pdf/2508.12641.pdf

Abstract:
Recent advancements in money laundering detection have demonstrated the potential of using graph neural networks to capture laundering patterns accurately. However, existing models are not explicitly designed to detect the diverse patterns of off-chain cryptocurrency money laundering. Neglecting any laundering pattern introduces critical detection gaps, as each pattern reflects unique transactional structures that facilitate the obfuscation of illicit fund origins and movements. Failure to account for these patterns may result in under-detection or omission of specific laundering activities, diminishing model accuracy and allowing schemes to bypass detection. To address this gap, we propose the MPOCryptoML model to effectively detect multiple laundering patterns in cryptocurrency transactions. MPOCryptoML includes the development of a multi-source Personalized PageRank algorithm to identify random laundering patterns. Additionally, we introduce two novel algorithms by analyzing the timestamp and weight of transactions in high-volume financial networks to detect various money laundering structures, including fan-in, fan-out, bipartite, gather-scatter, and stack patterns. We further examine correlations between these patterns using a logistic regression model. An anomaly score function integrates results from each module to rank accounts by anomaly score, systematically identifying high-risk accounts. Extensive experiments on public datasets including Elliptic++, Ethereum fraud detection, and Wormhole transaction datasets validate the efficacy and efficiency of MPOCryptoML. Results show consistent performance gains, with improvements up to 9.13% in precision, up to 10.16% in recall, up to 7.63% in F1-score, and up to 10.19% in accuracy.

Paperid: 3535, https://arxiv.org/pdf/2508.12584.pdf

Abstract:
Rule-based cloud security posture management (CSPM) solutions are known to produce a lot of false positives based on the limited contextual understanding and dependence on static heuristics testing. This paper introduces a validation-driven methodology that integrates active behavioral testing in cloud security posture management solution(s) to evaluate the exploitability of policy violations in real time. The proposed system employs lightweight and automated probes, built from open-source tools, validation scripts, and penetration testing test cases, to simulate adversarial attacks on misconfigured or vulnerable cloud assets without any impact to the cloud services or environment. For instance, cloud services may be flagged as publicly exposed and vulnerable despite being protected by access control layers, or secure policies, resulting in non-actionable alerts that consumes analysts time during manual validation. Through controlled experimentation in a reproducible AWS setup, we evaluated the reduction in false positive rates across various misconfiguration and vulnerable alerts. Our findings indicate an average reduction of 93\% in false positives. Furthermore, the framework demonstrates low latency performance. These results demonstrate a scalable method to improve detection accuracy and analyst productivity in large cloud environments. While our evaluation focuses on AWS, the architecture is modular and extensible to multi-cloud setups.

Paperid: 3536, https://arxiv.org/pdf/2508.12187.pdf

Abstract:
The rise of Virtual Reality (VR) has provided developers with an unprecedented platform for creating games and applications (apps) that require distinct inputs, different from those of conventional devices like smartphones. The Meta Quest VR platform, driven by Meta, has democratized VR app publishing and attracted millions of users worldwide. However, as the number of published apps grows, there is a notable lack of robust headless tools for user interface (UI) exploration and user event testing. To address this need, we present AUTOVR, an automatic framework for dynamic UI and user event interaction in VR apps built on the Unity Engine. Unlike conventional Android and GUI testers, AUTOVR analyzes the app's internal binary to reveal hidden events, resolves generative event dependencies, and utilizes them for comprehensive exploration of VR apps. Using sensitive data exposure as a performance metric, we compare AUTOVR with Android Monkey, a widely used headless Android GUI stress testing tool. Our empirical evaluation demonstrates AUTOVR's superior performance, triggering an order of magnitude of more sensitive data exposures and significantly enhancing the privacy of VR apps.

Paperid: 3537, https://arxiv.org/pdf/2508.12181.pdf

Abstract:
The rapid evolution of smart grids requires effective communication protocols to transfer data reliably and securely. Controller Area Network (CAN) is one of the most recognized protocols that offer reliable data transmission in smart grids due to its robustness, real-time capabilities, and relatively low initial cost of its required hardware. However, as a smart city becomes more interconnected, it also becomes more vulnerable to cyber-attacks. As there are many mechanisms to secure the CAN nodes from attacks, most of those mechanisms have computational overhead, resulting in more delay in the network. We implemented a solution that requires almost no overhead to any CAN node connected to the network. It depends on a single node responsible for securing the CAN network. This approach seeks to augment network security while reducing security mechanisms overhead to all CAN network nodes. The methodology and comprehensive test results will be presented in detail during a subsequent discussion. The used software for development is Code Composer Studio, and the used microcontroller evaluation boards (EVB) are TM4C 1294.

Paperid: 3538, https://arxiv.org/pdf/2508.12161.pdf

Abstract:
Attack graphs (AGs) are graphical tools to analyze the security of computer networks. By connecting the exploitation of individual vulnerabilities, AGs expose possible multi-step attacks against target networks, allowing system administrators to take preventive measures to enhance their network's security. As powerful analytical tools, however, AGs are both time- and memory-consuming to be generated. As the numbers of network assets, interconnections between devices, as well as vulnerabilities increase, the size and volume of the resulting AGs grow at a much higher rate, leading to the well-known state-space explosion. In this paper, we propose the use of high performance computing (HPC) clusters to implement AG generators. We evaluate the performance through experiments and provide insights into how cluster environments can help resolve the issues of slow speed and high memory demands in AG generation in a balanced way.

Paperid: 3539, https://arxiv.org/pdf/2508.12138.pdf

Abstract:
Bitcoin's Proof of Work (PoW) mechanism, while central to achieving decentralized consensus, has long been criticized for excessive energy use and hardware inefficiencies \cite{devries2018bitcoin, truby2018decarbonizing}. This paper introduces a hybrid architecture that replaces Bitcoin's traditional PoW with a centralized, cloud-based collaborative training framework. In this model, miners contribute computing resources to train segments of horizontally scaled machine learning models on preprocessed datasets, ensuring privacy and generating meaningful outputs \cite{li2017securing}. A central server evaluates contributions using two metrics: number of parameters trained and reduction in model loss during each cycle. At the end of every cycle, a weighted lottery selects the winning miner, who receives a digitally signed certificate. This certificate serves as a verifiable substitute for PoW and grants the right to append a block to the blockchain \cite{nakamoto2008bitcoin}. By integrating digital signatures and SHA-256 hashing \cite{nist2015sha}, the system preserves blockchain integrity while redirecting energy toward productive computation. The proposed approach addresses the sustainability concerns of traditional mining by converting resource expenditure into socially valuable work, aligning security incentives with real-world computational progress.

Paperid: 3540, https://arxiv.org/pdf/2508.11939.pdf

Abstract:
This study focuses on the creation and implementation of ransomware for educational purposes that leverages Python's native cryptographic APIs in a controlled environment. Additionally, an Android version of the framework is implemented using Flutter and Dart. For both versions, open-source cryptographic libraries are utilized. With this framework, researchers can systematically explore the functionalities of ransomware, including file encryption processes, cryptographic key management, and victim interaction dynamics. To ensure safe experimentation, multiple safeguards are incorporated, such as the ability to restrict the encryption process to a specific directory, providing the RSA private key for immediate decryption, and narrowing the scope of targetable files to a carefully curated list (.txt, .jpg, .csv, .doc). This paper draws inspiration from the infamous WannaCry ransomware and aims to simulate its behaviour on Android devices. By making the codebase open-source, it enables users to study, modify, and extend the program for pedagogical purposes and offers a hands-on tool that can be used to train the next generation of cybersecurity professionals.

Paperid: 3541, https://arxiv.org/pdf/2508.11928.pdf

Abstract:
With the rise of sophisticated authentication bypass techniques, passwords are no longer considered a reliable method for securing authentication systems. In recent years, new authentication technologies have shifted from traditional password-based logins to passwordless security. Among these, Time-Based One-Time Passwords (TOTP) remain one of the most widely used mechanisms, while Passkeys are emerging as a promising alternative with growing adoption. This paper highlights the key techniques used during the implementation of the authentication system with Passkey technology. It also suggests considerations for integrating components during system development to ensure that users can securely access their accounts with minimal complexity, while still meeting the requirements of a robust authentication system that balances security, usability, and performance. Additionally, by examining TOTP and Passkey mechanisms from an implementation perspective, this work not only addresses major security concerns such as password leaks, phishing attacks, and susceptibility to brute-force attacks, but also evaluates the feasibility and effectiveness of these mechanisms in real-world implementations. This paper demonstrates the superior security of Passkey technology and its potential for broader adoption in secure authentication systems.

Paperid: 3542, https://arxiv.org/pdf/2508.11817.pdf

Abstract:
Cryptographic algorithms like AES and RSA are widely used and they are mathematically robust and almost unbreakable but its implementation on physical devices often leak information through side channels, such as electromagnetic (EM) emissions, potentially compromising said theoretically secure algorithms. This paper investigates the application of machine learning (ML) techniques and Deep Learning models to exploit such leakage for partial key recovery. We use the public ASCAD `fixed' and `variable' key dataset, containing 700 and 1400 EM traces respectively from an AES-128 implementation on an 8-bit microcontroller. The problem is framed as a 256-class classification task where we target the output of the first-round S-box operation, which is dependent on a single key byte. We evaluate standard classifiers (Random Forest (RF), Support Vector Machine (SVM)), a Convolutional Neural Network(CNN) and a Residual Neural Network(ResNet). We also explore the utility of RF-based feature importance for dimensionality reduction. Crucially, we employ this domain-specific Key Rank metric for evaluation, showing its necessity over standard classification accuracy. Our results show that SVM and RF on full features perform poorly in key ranking. However, RF trained on reduced (top 100) identified via importance analysis achieves Rank 0 (successful key byte recovery) using almost half the attack traces. The implemented CNN also achieves Rank 0 efficiently using approximately 65 attack traces for the fixed-key dataset. The ResNets perform best on large and complex datasets but may not always be the best choice for simple fixed key dataset in terms of efficiency. Thus we conclude that models, particularly CNNs, ResNets and feature-selected RF, coupled with the Key Rank metric, are an effective tool for side-channel key recovery, confirming the practical vulnerability of the cryptographic implementations.

Paperid: 3543, https://arxiv.org/pdf/2508.11812.pdf

Abstract:
The advancement of computing equipment and the advances in services over the Internet has allowed corporations, higher education, and many other organizations to pursue the shared computing network environment. A requirement for shared computing environments is a centralized identity system to authenticate and authorize user access. An organization's digital identity plane is a prime target for cyber threat actors. When compromised, identities can be exploited to steal credentials, create unauthorized accounts, and manipulate permissions-enabling attackers to gain control of the network and undermine its confidentiality, availability, and integrity. Cybercrime losses reached a record of 16.6 B in the United States in 2024. For organizations using Microsoft software, Active Directory is the on-premises identity system of choice. In this article, we examine the challenge of security compromises in Active Directory (AD) environments and present effective strategies to prevent credential theft and limit lateral movement by threat actors. Our proposed approaches aim to confine the movement of compromised credentials, preventing significant privilege escalation and theft. We argue that through our illustration of real-world scenarios, tiering can halt lateral movement and advanced cyber-attacks, thus reducing ransom escalation. Our work bridges a gap in existing literature by combining technical guidelines with theoretical arguments in support of tiering, positioning it as a vital component of modern cybersecurity strategy even though it cannot function in isolation. As the hardware advances and the cloud sourced services along with AI is advancing with unprecedented speed, we think it is important for security experts and the business to work together and start designing and developing software and frameworks to classify devices automatically and accurately within the tiered structure.

Paperid: 3544, https://arxiv.org/pdf/2508.11797.pdf

Abstract:
Due to HIPAA and other privacy regulations, it is imperative to maintain patient privacy while conducting research on patient health records. In this paper, we propose AegisBlock, a patient-centric access controlled framework to share medical records with researchers such that the anonymity of the patient is maintained while ensuring the trustworthiness of the data provided to researchers. AegisBlock allows for patients to provide access to their medical data, verified by miners. A researcher submits a time-based range query to request access to records from a certain patient, and upon patient approval, access will be granted. Our experimental evaluation results show that AegisBlock is scalable with respect to the number of patients and hospitals in the system, and efficient with up to 50% of malicious miners.

Paperid: 3545, https://arxiv.org/pdf/2508.11710.pdf

Abstract:
Security vulnerabilities present in a code that has been written in diverse programming languages are among the most critical yet complicated aspects of source code to detect. Static analysis tools based on rule-based patterns usually do not work well at detecting the context-dependent bugs and lead to high false positive rates. Recent developments in artificial intelligence, specifically the use of transformer-based models like CodeBERT and CodeLlama, provide light to this problem, as they show potential in finding such flaws better. This paper presents the implementations of these models on various datasets of code vulnerability, showing how off-the-shelf models can successfully produce predictive capacity in models through dynamic fine-tuning of the models on vulnerable and safe code fragments. The methodology comprises the gathering of the dataset, normalization of the language, fine-tuning of the model, and incorporation of ensemble learning and explainable AI. Experiments show that a well-trained CodeBERT can be as good as or even better than some existing static analyzers in terms of accuracy greater than 97%. Further study has indicated that although language models can achieve close-to-perfect recall, the precision can decrease. A solution to this is given by hybrid models and validation procedures, which will reduce false positives. According to the results, the AI-based solutions generalize to different programming languages and classes of vulnerability. Nevertheless, robustness, interpretability, and deployment readiness are still being developed. The results illustrate the probabilities that AI will enhance the trustworthiness in the usability and scalability of machine-learning-based detectors of vulnerabilities.

Paperid: 3546, https://arxiv.org/pdf/2508.11599.pdf

Abstract:
Cryptographic algorithms are fundamental to modern security, yet their implementations frequently harbor subtle logic flaws that are hard to detect. We introduce CryptoScope, a novel framework for automated cryptographic vulnerability detection powered by Large Language Models (LLMs). CryptoScope combines Chain-of-Thought (CoT) prompting with Retrieval-Augmented Generation (RAG), guided by a curated cryptographic knowledge base containing over 12,000 entries. We evaluate CryptoScope on LLM-CLVA, a benchmark of 92 cases primarily derived from real-world CVE vulnerabilities, complemented by cryptographic challenges from major Capture The Flag (CTF) competitions and synthetic examples across 11 programming languages. CryptoScope consistently improves performance over strong LLM baselines, boosting DeepSeek-V3 by 11.62%, GPT-4o-mini by 20.28%, and GLM-4-Flash by 28.69%. Additionally, it identifies 9 previously undisclosed flaws in widely used open-source cryptographic projects.

Paperid: 3547, https://arxiv.org/pdf/2508.11563.pdf

Abstract:
Searchable encryption (SE) is the most scalable cryptographic primitive for searching on encrypted data. Typical SE constructions often allow access-pattern leakage, revealing which encrypted records are retrieved in the server's responses. All the known generic cryptanalyses assume either that the queries are issued uniformly at random or that the attacker observes the search-pattern leakage. It remains unclear what can be reconstructed when using only the access-pattern leakage and knowledge of the query distribution. In this work, we focus on the cryptanalytic technique of frequency analysis in the context of leakage-abuse attacks on schemes that support encrypted range queries. Frequency analysis matches the frequency of retrieval of an encrypted record with a plaintext value based on its probability of retrieval that follows from the knowledge of the query distribution. We generalize this underexplored cryptanalytic technique and introduce a generic attack framework called Leakage-Abuse via Matching (LAMA) that works even on high-dimensional encrypted data. We identify a parameterization of LAMA that brings frequency analysis to its limit -- that is, we prove that there is no additional frequency matching that an attacker can perform to refine the result. Furthermore, we show that our results hold for any class of convex queries, and not just axis-aligned rectangles, which is the assumption in all other attacks on range schemes. Using these results, we identify query distributions that make frequency analysis challenging for the attacker and, thus, can act as a mitigation mechanism. Finally, we implement and benchmark LAMA and reconstruct, for the first time, plaintext data from encrypted range queries spanning up to four dimensions.

Paperid: 3548, https://arxiv.org/pdf/2508.11395.pdf

Abstract:
The global financial system stands at an inflection point. Stablecoins represent the most significant evolution in banking since the abandonment of the gold standard, positioned to enable "Banking 2.0" by seamlessly integrating cryptocurrency innovation with traditional finance infrastructure. This transformation rivals artificial intelligence as the next major disruptor in the financial sector. Modern fiat currencies derive value entirely from institutional trust rather than physical backing, creating vulnerabilities that stablecoins address through enhanced stability, reduced fraud risk, and unified global transactions that transcend national boundaries. Recent developments demonstrate accelerating institutional adoption: landmark U.S. legislation including the GENIUS Act of 2025, strategic industry pivots from major players like JPMorgan's crypto-backed loan initiatives, and PayPal's comprehensive "Pay with Crypto" service. Widespread stablecoin implementation addresses critical macroeconomic imbalances, particularly the inflation-productivity gap plaguing modern monetary systems, through more robust and diversified backing mechanisms. Furthermore, stablecoins facilitate deregulation and efficiency gains, paving the way for a more interconnected international financial system. This whitepaper comprehensively explores how stablecoins are poised to reshape banking, supported by real-world examples, current market data, and analysis of their transformative potential.

Paperid: 3549, https://arxiv.org/pdf/2508.11341.pdf

Abstract:
In targeted adversarial attacks on vision models, the selection of the target label is a critical yet often overlooked determinant of attack success. This target label corresponds to the class that the attacker aims to force the model to predict. Now, existing strategies typically rely on randomness, model predictions, or static semantic resources, limiting interpretability, reproducibility, or flexibility. This paper then proposes a semantics-guided framework for adversarial target selection using the cross-modal knowledge transfer from pretrained language and vision-language models. We evaluate several state-of-the-art models (BERT, TinyLLAMA, and CLIP) as similarity sources to select the most and least semantically related labels with respect to the ground truth, forming best- and worst-case adversarial scenarios. Our experiments on three vision models and five attack methods reveal that these models consistently render practical adversarial targets and surpass static lexical databases, such as WordNet, particularly for distant class relationships. We also observe that static testing of target labels offers a preliminary assessment of the effectiveness of similarity sources, \textit{a priori} testing. Our results corroborate the suitability of pretrained models for constructing interpretable, standardized, and scalable adversarial benchmarks across architectures and datasets.

Paperid: 3550, https://arxiv.org/pdf/2508.11325.pdf

Abstract:
Cyber threats against the maritime industry have increased notably in recent years, highlighting the need for innovative cybersecurity approaches. Ships, as critical assets, possess highly specialized and interconnected network infrastructures, where their legacy systems and operational constraints further exacerbate their vulnerability to cyberattacks. To better understand this evolving threat landscape, we propose the use of cyber-deception techniques and in particular honeynets, as a means to gather valuable insights into ongoing attack campaigns targeting the maritime sector. In this paper we present Salty Seagull, a honeynet conceived to simulate a VSAT system for ships. This environment mimics the operations of a functional VSAT system onboard and, at the same time, enables a user to interact with it through a Web dashboard and a CLI environment. Furthermore, based on existing vulnerabilities, we purposefully integrate them into our system to increase attacker engagement. We exposed our honeynet for 30 days to the Internet to assess its capability and measured the received interaction. Results show that while numerous generic attacks have been attempted, only one curious attacker with knowledge of the nature of the system and its vulnerabilities managed to access it, without however exploring its full potential.

Paperid: 3551, https://arxiv.org/pdf/2508.11053.pdf

Abstract:
Post hoc explanation methods, such as LIME and SHAP, provide interpretable insights into black-box classifiers and are increasingly used to assess model biases and generalizability. However, these methods are vulnerable to adversarial manipulation, potentially concealing harmful biases. Building on the work of Slack et al. (2020), we investigate the susceptibility of LIME and SHAP to biased models and evaluate strategies for improving robustness. We first replicate the original COMPAS experiment to validate prior findings and establish a baseline. We then introduce a modular testing framework enabling systematic evaluation of augmented and ensemble explanation approaches across classifiers of varying performance. Using this framework, we assess multiple LIME/SHAP ensemble configurations on out-of-distribution models, comparing their resistance to bias concealment against the original methods. Our results identify configurations that substantially improve bias detection, highlighting their potential for enhancing transparency in the deployment of high-stakes machine learning systems.

Paperid: 3552, https://arxiv.org/pdf/2508.10879.pdf

Abstract:
Given $n$ i.i.d. random matrices $A_i \in \mathbb{R}^{d \times d}$ that share a common expectation $Î£$, the objective of Differentially Private Stochastic PCA is to identify a subspace of dimension $k$ that captures the largest variance directions of $Î£$, while preserving differential privacy (DP) of each individual $A_i$. Existing methods either (i) require the sample size $n$ to scale super-linearly with dimension $d$, even under Gaussian assumptions on the $A_i$, or (ii) introduce excessive noise for DP even when the intrinsic randomness within $A_i$ is small. Liu et al. (2022a) addressed these issues for sub-Gaussian data but only for estimating the top eigenvector ($k=1$) using their algorithm DP-PCA. We propose the first algorithm capable of estimating the top $k$ eigenvectors for arbitrary $k \leq d$, whilst overcoming both limitations above. For $k=1$ our algorithm matches the utility guarantees of DP-PCA, achieving near-optimal statistical error even when $n = \tilde{\!O}(d)$. We further provide a lower bound for general $k > 1$, matching our upper bound up to a factor of $k$, and experimentally demonstrate the advantages of our algorithm over comparable baselines.

Paperid: 3553, https://arxiv.org/pdf/2508.10510.pdf

Abstract:
Interactive Oracle Proofs of Proximity (IOPP) are at the heart of code-based SNARKs, a family of zeroknowledge protocols. The first and most famous one is the FRI protocol [BBHR18a], that efficiently tests proximity to Reed-Solomon codes. This paper generalizes the flowering IOPP introduced in [DMR25] for some specific (2, n)-regular Tanner codes to a much broader variety of codes: any code with symbols indexed on the edges of a Cayley graph. The flowering protocol of [DMR25] had a soundness parameter much lower than the FRI protocol [BCI + 23], and complexity parameters that could compete with the FRI [BBHR18a]. The lower soundness and the absence of restriction on the base field may lead to other practical speedups, however the codes considered in [DMR25] have an o(1) minimum distance. The generalization proposed in this paper preserves the soundness parameter with a slight decrease of the complexity parameters, while allowing being applied on codes with constant rate and constant minimum distance thanks to the good expansion properties of some families of Cayley graphs.

Paperid: 3554, https://arxiv.org/pdf/2508.10493.pdf

Abstract:
The fundamental basis for maintaining integrity within contemporary blockchain systems is provided by authenticated databases. Our analysis indicates that a significant portion of the approaches applied in this domain fail to sufficiently meet the stringent requirements of systems processing transactions at rates of multi-million TPS. AlDBaran signifies a substantial advancement in authenticated databases. By eliminating disk I/O operations from the critical path, implementing prefetching strategies, and refining the update mechanism of the Merkle tree, we have engineered an authenticated data structure capable of handling state updates efficiently at a network throughput of 50 Gbps. This throughput capacity significantly surpasses any empirically documented blockchain throughput, guaranteeing the ability of even the most high-throughput blockchains to generate state commitments effectively. AlDBaran provides support for historical state proofs, which facilitates a wide array of novel applications. For instance, the deployment of AlDBaran could enable blockchains that do not currently support state commitments to offer functionalities for light clients and/or implement rollups. When benchmarked against alternative authenticated data structure projects, AlDBaran exhibits superior performance and simplicity. In particular, AlDBaran achieves speeds of approximately 48 million updates per second using an identical machine configuration. This characteristic renders AlDBaran an attractive solution for resource-limited environments, as its historical data capabilities can be modularly isolated (and deactivated), which further enhances performance. On consumer-level portable hardware, it achieves approximately 8 million updates/s in an in-memory setting and 5 million updates/s with snapshots at sub-second intervals, illustrating compelling and cost-effective scalability.

Paperid: 3555, https://arxiv.org/pdf/2508.10429.pdf

Abstract:
We present MM-Food-100K, a public 100,000-sample multimodal food intelligence dataset with verifiable provenance. It is a curated approximately 10% open subset of an original 1.2 million, quality-accepted corpus of food images annotated for a wide range of information (such as dish name, region of creation). The corpus was collected over six weeks from over 87,000 contributors using the Codatta contribution model, which combines community sourcing with configurable AI-assisted quality checks; each submission is linked to a wallet address in a secure off-chain ledger for traceability, with a full on-chain protocol on the roadmap. We describe the schema, pipeline, and QA, and validate utility by fine-tuning large vision-language models (ChatGPT 5, ChatGPT OSS, Qwen-Max) on image-based nutrition prediction. Fine-tuning yields consistent gains over out-of-box baselines across standard metrics; we report results primarily on the MM-Food-100K subset. We release MM-Food-100K for publicly free access and retain approximately 90% for potential commercial access with revenue sharing to contributors.

Paperid: 3556, https://arxiv.org/pdf/2508.10035.pdf

Abstract:
False Data Injection Attacks (FDIAs) pose a significant threat to smart grid infrastructures, particularly Home Area Networks (HANs), where real-time monitoring and control are highly adopted. Owing to the comparatively less stringent security controls and widespread availability of HANs, attackers view them as an attractive entry point to manipulate aggregated demand patterns, which can ultimately propagate and disrupt broader grid operations. These attacks undermine the integrity of smart meter data, enabling malicious actors to manipulate consumption values without activating conventional alarms, thereby creating serious vulnerabilities across both residential and utility-scale infrastructures. This paper presents a machine learning-based framework for both the detection and classification of FDIAs using residential energy data. A real-time detection is provided by the lightweight Artificial Neural Network (ANN), which works by using the most vital features of energy consumption, cost, and time context. For the classification of different attack types, a Bidirectional LSTM is trained to recognize normal, trapezoidal, and sigmoid attack shapes through learning sequential dependencies in the data. A synthetic time-series dataset was generated to emulate realistic household behaviour. Experimental results demonstrate that the proposed models are effective in identifying and classifying FDIAs, offering a scalable solution for enhancing grid resilience at the edge. This work contributes toward building intelligent, data-driven defence mechanisms that strengthen smart grid cybersecurity from residential endpoints.

Paperid: 3557, https://arxiv.org/pdf/2508.09980.pdf

Abstract:
For many social, scientific, and commercial purposes, it is often important to estimate the distribution of the users' data regarding a sensitive attribute, e.g., their ages, locations, etc. To allow this estimation while protecting the users' privacy, every user applies a local privacy protection mechanism that releases a noisy (sanitized) version of their original datum to the data collector; then the original distribution is estimated using one of the known methods, such as the matrix inversion (INV), RAPPOR's estimator, and the iterative Bayesian update (IBU). Unlike the other estimators, the consistency of IBU, i.e., the convergence of its estimate to the real distribution as the amount of noisy data grows, has been either ignored or incorrectly proved in the literature. In this article, we use the fact that IBU is a maximum likelihood estimator to prove that IBU is consistent. We also show, through experiments on real datasets, that IBU significantly outperforms the other methods when the users' data are sanitized by geometric, Laplace, and exponential mechanisms, whereas it is comparable to the other methods in the case of the k-RR and RAPPOR mechanisms. Finally, we consider the case when the alphabet of the sensitive data is infinite, and we show a technique that allows IBU to operate in this case too.

Paperid: 3558, https://arxiv.org/pdf/2508.09673.pdf

Abstract:
We propose the notion of succinct oblivious tensor evaluation (OTE), where two parties compute an additive secret sharing of a tensor product of two vectors $\mathbf{x} \otimes \mathbf{y}$, exchanging two simultaneous messages. Crucially, the size of both messages and of the CRS is independent of the dimension of $\mathbf{x}$. We present a construction of OTE with optimal complexity from the standard learning with errors (LWE) problem. Then we show how this new technical tool enables a host of cryptographic primitives, all with security reducible to LWE, such as: * Adaptively secure laconic function evaluation for depth-$D$ functions $f:\{0, 1\}^m\rightarrow\{0, 1\}^\ell$ with communication $m+\ell+D\cdot \mathrm{poly}(Î»)$. * A trapdoor hash function for all functions. * An (optimally) succinct homomorphic secret sharing for all functions. * A rate-$1/2$ laconic oblivious transfer for batch messages, which is best possible. In particular, we obtain the first laconic function evaluation scheme that is adaptively secure from the standard LWE assumption, improving upon Quach, Wee, and Wichs (FOCS 2018). As a key technical ingredient, we introduce a new notion of \emph{adaptive lattice encodings}, which may be of independent interest.

Paperid: 3559, https://arxiv.org/pdf/2508.09299.pdf

Abstract:
Weather forecasting plays a vital role in disaster preparedness, agriculture, and resource management, yet current centralized forecasting systems are increasingly strained by security vulnerabilities, limited scalability, and susceptibility to single points of failure. To address these challenges, we propose a decentralized weather forecasting framework that integrates Federated Learning (FL) with blockchain technology. FL enables collaborative model training without exposing sensitive local data; this approach enhances privacy and reduces data transfer overhead. Meanwhile, the Ethereum blockchain ensures transparent and dependable verification of model updates. To further enhance the system's security, we introduce a reputation-based voting mechanism that assesses the trustworthiness of submitted models while utilizing the Interplanetary File System (IPFS) for efficient off-chain storage. Experimental results demonstrate that our approach not only improves forecasting accuracy but also enhances system resilience and scalability, making it a viable candidate for deployment in real-world, security-critical environments.

Paperid: 3560, https://arxiv.org/pdf/2508.08945.pdf

Abstract:
The growing digitalization and the rapid adoption of high-powered Internet-of-Things (IoT)-enabled devices (e.g., EV charging stations) have increased the vulnerability of power grids to cyber threats. In particular, the so-called Load Altering Attacks (LAAs) can trigger rapid frequency fluctuations and potentially destabilize the power grid. This paper aims to bridge the gap between academic research and practical application by using open-source datasets released by grid operators. It investigates various LAA scenarios on a real-world transmission network, namely the Great Britain (GB)-36 Zone model released by the UK's National Electricity System Operator (NESO). It evaluates the threshold of LAA severity that the grid can tolerate before triggering cascading effects. Additionally, it explores how Battery Energy Storage Systems (BESS) based fast frequency response services can mitigate or prevent such impacts. Simulations are conducted using DIgSILENT PowerFactory to ensure realistic system representation. The analysis provides several useful insights to grid operators on the LAA impact, such as the influence of the relative locations of BESS and LAA, as well as how delays in attack execution can influence the overall system response.

Paperid: 3561, https://arxiv.org/pdf/2508.08832.pdf

Abstract:
As digital data transmission continues to scale, concerns about privacy grow increasingly urgent - yet privacy remains a socially constructed and ambiguously defined concept, lacking a universally accepted quantitative measure. This work examines information leakage in image data, a domain where information-theoretic guarantees are still underexplored. At the intersection of deep learning, information theory, and cryptography, we investigate the use of mutual information (MI) estimators - in particular, the empirical estimator and the MINE framework - to detect leakage from selectively encrypted images. Motivated by the intuition that a robust estimator would require a probabilistic frameworks that can capture spatial dependencies and residual structures, even within encrypted representations - our work represent a promising direction for image information leakage estimation.

Paperid: 3562, https://arxiv.org/pdf/2508.08031.pdf

Abstract:
Federated self-supervised learning (FSSL) combines the advantages of decentralized modeling and unlabeled representation learning, serving as a cutting-edge paradigm with strong potential for scalability and privacy preservation. Although FSSL has garnered increasing attention, research indicates that it remains vulnerable to backdoor attacks. Existing methods generally rely on visually obvious triggers, which makes it difficult to meet the requirements for stealth and practicality in real-world deployment. In this paper, we propose an imperceptible and effective backdoor attack method against FSSL, called IPBA. Our empirical study reveals that existing imperceptible triggers face a series of challenges in FSSL, particularly limited transferability, feature entanglement with augmented samples, and out-of-distribution properties. These issues collectively undermine the effectiveness and stealthiness of traditional backdoor attacks in FSSL. To overcome these challenges, IPBA decouples the feature distributions of backdoor and augmented samples, and introduces Sliced-Wasserstein distance to mitigate the out-of-distribution properties of backdoor samples, thereby optimizing the trigger generation process. Our experimental results on several FSSL scenarios and datasets show that IPBA significantly outperforms existing backdoor attack methods in performance and exhibits strong robustness under various defense mechanisms.

Paperid: 3563, https://arxiv.org/pdf/2508.07840.pdf

Abstract:
Lightweight hash functions have become important building blocks for security in embedded and IoT systems. A plethora of algorithms have been proposed and standardized, providing a wide range of performance trade-off options for developers to choose from. This paper presents a comparative analysis of 22 key software-based lightweight hash functions, including the finalist from the SHA-3 competition. We use a novel benchmark methodology that combines an AVR ATXMega128 microcontroller with the ChipWhisperer cryptanalysis platform and evaluate and compare the various hash functions along several dimensions, including execution speed, % measured in Cycles per Byte (CpB), memory footprint, and energy consumption. Using the composite E-RANK metric, we provide new insight into the various trade-offs each hash function offers to system developers.

Paperid: 3564, https://arxiv.org/pdf/2508.07510.pdf

Abstract:
Batteryless energy harvesting IoT sensor nodes such as beat sensors can be deployed in millions without the need to replace batteries. They are ultra-low-power and cost-effective wireless sensor nodes without the maintenance cost and can work for 24 hours/365 days. However, they were not equipped with security mechanisms to protect user data. Data encryption and authentication can be used to secure beat sensor applications, but generating a secure cryptographic key is challenging. In this paper, we proposed an SRAM-based Physically Unclonable Function (PUF) combining a high-reliability bit selection algorithm with a lightweight error-correcting code to generate reliable secure keys for data encryption. The system employs a feature of beat sensors, in which the microcontroller is powered on to transmit the ID signals and then powered off. This fits the SRAM-based PUF requirement, which needs the SRAM to be powered off to read out its random values. The proposed system has been evaluated on STM32 Cortex M0+ microcontrollers and has been implemented to protect important data on beat sensors.

Paperid: 3565, https://arxiv.org/pdf/2508.07053.pdf

Abstract:
WebView applications are widely used in mobile applications to display web content directly within the app, enhancing user engagement by eliminating the need to open an external browser and providing a seamless experience. Progressive Web Applications (PWAs) further improve usability by combining the accessibility of web apps with the speed, offline capabilities, and responsiveness of native applications. However, malicious developers can exploit this technology by duplicating PWA web links to create counterfeit native apps, monetizing through user diversion. This unethical practice poses significant risks to users and the original application developers, underscoring the need for robust security measures to prevent unauthorized replication. Considering the one-way communication of Trusted Web Activity (a method for integrating web content into Android applications) and PWAs, we propose a query parameter-based practical security solution to defend against or mitigate such attacks. We analyze the vulnerabilities of our proposed security solution to assess its effectiveness and introduce advanced measures to address any identified weaknesses, presenting a comprehensive defense framework. As part of our work, we developed a prototype web application that secures PWAs from replication by embedding a combination of Unix timestamps and device identifiers into the query parameters. We evaluate the effectiveness of this defense strategy by simulating an advanced attack scenario. Additionally, we created a realistic dataset reflecting mobile app user behavior, modeled using a Zipfian distribution, to validate our framework.

Paperid: 3566, https://arxiv.org/pdf/2508.07044.pdf

Abstract:
In the era of generative AI, ensuring the privacy of music data presents unique challenges: unlike static artworks such as images, music data is inherently temporal and multimodal, and it is sampled, transformed, and remixed at an unprecedented scale. These characteristics make its core vector embeddings, i.e, the numerical representations of the music, highly susceptible to being learned, misused, or even stolen by models without accessing the original audio files. Traditional methods like copyright licensing and digital watermarking offer limited protection for these abstract mathematical representations, thus necessitating a stronger, e.g., cryptographic, approach to safeguarding the embeddings themselves. Standard encryption schemes, such as AES, render data unintelligible for computation, making such searches impossible. While Fully Homomorphic Encryption (FHE) provides a plausible solution by allowing arbitrary computations on ciphertexts, its substantial performance overhead remains impractical for large-scale vector similarity searches. Given this trade-off, we propose a more practical approach using Additive Homomorphic Encryption (AHE) for vector similarity search. The primary contributions of this paper are threefold: we analyze threat models unique to music information retrieval systems; we provide a theoretical analysis and propose an efficient AHE-based solution through inner products of music embeddings to deliver privacy-preserving similarity search; and finally, we demonstrate the efficiency and practicality of the proposed approach through empirical evaluation and comparison to FHE schemes on real-world MP3 files.

Paperid: 3567, https://arxiv.org/pdf/2508.06827.pdf

Abstract:
Detecting hidden behaviors in neural networks poses a significant challenge due to minimal prior knowledge and potential adversarial obfuscation. We explore this problem by framing detection as an adversarial game between two teams: the red team trains two similar models, one trained solely on benign data and the other trained on data containing hidden harmful behavior, with the performance of both being nearly indistinguishable on the benign dataset. The blue team, with limited to no information about the harmful behaviour, tries to identify the compromised model. We experiment using CNNs and try various blue team strategies, including Gaussian noise analysis, model diffing, integrated gradients, and adversarial attacks under different levels of hints provided by the red team. Results show high accuracy for adversarial-attack-based methods (100\% correct prediction, using hints), which is very promising, whilst the other techniques yield more varied performance. During our LLM-focused rounds, we find that there are not many parallel methods that we could apply from our study with CNNs. Instead, we find that effective LLM auditing methods require some hints about the undesired distribution, which can then used in standard black-box and open-weight methods to probe the models further and reveal their misalignment. We open-source our auditing games (with the model and data) and hope that our findings contribute to designing better audits.

Paperid: 3568, https://arxiv.org/pdf/2508.06795.pdf

Abstract:
Memory-Hard Functions (MHF) are a useful cryptographic primitive to build egalitarian proofs-of-work and to help protect low entropy secrets (e.g., user passwords) against brute-forces attacks. Ideally, we would like for a MHF to have the property that (1) an honest party can evaluate the function in sequential time $Î©(N)$, and (2) any parallel party that evaluates the function is forced to lockup $Î©(N)$ memory for $Î©(N)$ sequential steps. Unfortunately, this goal is not quite achievable, so prior work of Blocki and Holman [BH22] focused on designing MHFs with strong tradeoff guarantees between sustained-space complexity (SSC) and cumulative memory costs (CMC). However, their theoretical construction is not suitable for practical deployment due to the reliance on expensive constructions of combinatorial graphs. Furthermore, there is no formal justification for the heuristic use of the dynamic pebbling game in MHF analysis so we cannot rule out the possibility that there are more efficient attacks in the Parallel Random Oracle Model (PROM). Towards the goal of developing a practical MHF with provably strong SSC/CMC tradeoffs we develop a new MHF called EGSample which does not rely on expensive combinatorial constructions like [BH22]. In the dynamic pebbling model, we prove equivalent SSC/CMC tradeoffs for EGSample i.e., any the dynamic pebbling strategy either (1) locks up $Î©(N)$ memory for $Î©(N)$ steps, or (2) incurs cumulative memory cost at least $Î©(N^{3-Îµ})$. We also develop new techniques to directly establish SSC/CMC tradeoffs in the parallel random oracle model. In particular, we prove that {\em any} PROM algorithm evaluating our MHF either (1) locks up $Î©(N)$ blocks of memory for $Î©(N)$ steps or (2) incurs cumulative memory cost at least $Î©(N^{2.5-Îµ})$.

Paperid: 3569, https://arxiv.org/pdf/2508.06643.pdf

Abstract:
Symbolic execution is a powerful program analysis technique that allows for the systematic exploration of all program paths. Path explosion, where the number of states to track becomes unwieldy, is one of the biggest challenges hindering symbolic execution's practical application. To combat this, researchers have employed various strategies to enable symbolic execution on complex software systems. This paper introduces a systematic taxonomy of these strategies, categorizing them into two primary approaches: Scope Reduction, which aims to reduce the scope of symbolic execution to manageable portions of code, and Guidance Heuristics, which steer the symbolic execution engine toward promising paths. Using this taxonomy as a lens, we survey applications of symbolic executions in several domains such as vulnerability analysis, malware analysis, firmware re-hosting, and network protocol analysis. Finally, we identify promising directions for future research, including the application of symbolic execution to real-time operating systems and modern, type-safe languages.

Paperid: 3570, https://arxiv.org/pdf/2508.06377.pdf

Abstract:
We revisit Wald's celebrated Sequential Probability Ratio Test for sequential tests of two simple hypotheses, under privacy constraints. We propose DP-SPRT, a wrapper that can be calibrated to achieve desired error probabilities and privacy constraints, addressing a significant gap in previous work. DP-SPRT relies on a private mechanism that processes a sequence of queries and stops after privately determining when the query results fall outside a predefined interval. This OutsideInterval mechanism improves upon naive composition of existing techniques like AboveThreshold, potentially benefiting other sequential algorithms. We prove generic upper bounds on the error and sample complexity of DP-SPRT that can accommodate various noise distributions based on the practitioner's privacy needs. We exemplify them in two settings: Laplace noise (pure Differential Privacy) and Gaussian noise (RÃ©nyi differential privacy). In the former setting, by providing a lower bound on the sample complexity of any $Îµ$-DP test with prescribed type I and type II errors, we show that DP-SPRT is near optimal when both errors are small and the two hypotheses are close. Moreover, we conduct an experimental study revealing its good practical performance.

Paperid: 3571, https://arxiv.org/pdf/2508.05690.pdf

Abstract:
Database systems are extensively used to store critical data across various domains. However, the frequency of abnormal database access behaviors, such as database intrusion by internal and external attacks, continues to rise. Internal masqueraders often have greater organizational knowledge, making it easier to mimic employee behavior effectively. In contrast, external masqueraders may behave differently due to their lack of familiarity with the organization. Current approaches lack the granularity needed to detect anomalies at the operational level, frequently misclassifying entire sequences of operations as anomalies, even though most operations are likely to represent normal behavior. On the other hand, some anomalous behaviors often resemble normal activities, making them difficult for existing detection methods to identify. This paper introduces a two-tiered anomaly detection approach for Structured Query Language (SQL) using the Bidirectional Encoder Representations from Transformers (BERT) model, specifically DistilBERT, a more efficient, pre-trained version. Our method combines both unsupervised and supervised machine learning techniques to accurately identify anomalous activities while minimizing the need for data labeling. First, the unsupervised method uses ensemble anomaly detectors that flag embedding vectors distant from learned normal patterns of typical user behavior across the database (out-of-scope queries). Second, the supervised method uses fine-tuned transformer-based models to detect internal attacks with high precision (in-scope queries), using role-labeled classification, even on limited labeled SQL data. Our findings make a significant contribution by providing an effective solution for safeguarding critical database systems from sophisticated threats.

Paperid: 3572, https://arxiv.org/pdf/2508.05684.pdf

Abstract:
The proliferation of multi-modal fake news on social media poses a significant threat to public trust and social stability. Traditional detection methods, primarily text-based, often fall short due to the deceptive interplay between misleading text and images. While Large Vision-Language Models (LVLMs) offer promising avenues for multi-modal understanding, effectively fusing diverse modal information, especially when their importance is imbalanced or contradictory, remains a critical challenge. This paper introduces MM-FusionNet, an innovative framework leveraging LVLMs for robust multi-modal fake news detection. Our core contribution is the Context-Aware Dynamic Fusion Module (CADFM), which employs bi-directional cross-modal attention and a novel dynamic modal gating network. This mechanism adaptively learns and assigns importance weights to textual and visual features based on their contextual relevance, enabling intelligent prioritization of information. Evaluated on the large-scale Multi-modal Fake News Dataset (LMFND) comprising 80,000 samples, MM-FusionNet achieves a state-of-the-art F1-score of 0.938, surpassing existing multi-modal baselines by approximately 0.5% and significantly outperforming single-modal approaches. Further analysis demonstrates the model's dynamic weighting capabilities, its robustness to modality perturbations, and performance remarkably close to human-level, underscoring its practical efficacy and interpretability for real-world fake news detection.

Paperid: 3573, https://arxiv.org/pdf/2508.05671.pdf

Abstract:
As large language models (LLMs) and generative AI become increasingly integrated into customer service and moderation applications, adversarial threats emerge from both external manipulations and internal label corruption. In this work, we identify and systematically address these dual adversarial threats by introducing DINA (Dual Defense Against Internal Noise and Adversarial Attacks), a novel unified framework tailored specifically for NLP. Our approach adapts advanced noisy-label learning methods from computer vision and integrates them with adversarial training to simultaneously mitigate internal label sabotage and external adversarial perturbations. Extensive experiments conducted on a real-world dataset from an online gaming service demonstrate that DINA significantly improves model robustness and accuracy compared to baseline models. Our findings not only highlight the critical necessity of dual-threat defenses but also offer practical strategies for safeguarding NLP systems in realistic adversarial scenarios, underscoring broader implications for fair and responsible AI deployment.

Paperid: 3574, https://arxiv.org/pdf/2508.05355.pdf

Abstract:
Digital signatures represent a crucial cryptographic asset that must be protected against quantum adversaries. Quantum Digital Signatures (QDS) can offer solutions that are information-theoretically (IT) secure and thus immune to quantum attacks. In this work, we analyze three existing practical QDS protocols based on preshared secure keys (e.g., established with quantum key distribution) and universal hashing families. For each protocol, we make amendments to close potential loopholes and prove their IT security while accounting for the failure of IT-secure authenticated communication. We then numerically optimize the protocol parameters to improve efficiency in terms of preshared bit consumption and signature length, allowing us to identify the most efficient protocol.

Paperid: 3575, https://arxiv.org/pdf/2508.04641.pdf

Abstract:
Cross-chain asset exchange is crucial for blockchain interoperability. Existing solutions rely on trusted third parties and risk asset loss, or use decentralized alternatives like atomic swaps, which suffer from grief attacks. Griefing occurs when a party prematurely exits, locking the counterparty's assets until a timelock expires. Hedged Atomic Swaps mitigate griefing by introducing a penalty premium; however, they increase the number of transactions from four (as in Tier Nolan's swap) to six, which in turn introduces new griefing risks. Grief-Free (GF) Swap reduces this to five transactions by consolidating assets and premiums on a single chain. However, no existing protocol achieves grief-free asset exchange in just four transactions. This paper presents 4-Swap, the first cross-chain atomic swap protocol that is both grief-free and bribery-safe, while completing asset exchange in just four transactions. By combining the griefing premium and principal into a single transaction per chain, 4-Swap reduces on-chain transactions, leading to faster execution compared to previous grief-free solutions. It is fully compatible with Bitcoin and operates without the need for any new opcodes. A game-theoretic analysis shows that rational participants have no incentive to deviate from the protocol, ensuring robust compliance and security.

Paperid: 3576, https://arxiv.org/pdf/2508.04285.pdf

Abstract:
Federated learning (FL) enables collaborative model training without sharing raw data, but individual model updates may still leak sensitive information. Secure aggregation (SecAgg) mitigates this risk by allowing the server to access only the sum of client updates, thereby concealing individual contributions. However, a significant vulnerability has recently attracted increasing attention: when model updates are sparse vectors, a non-zero value contributed by a single client at a given index can be directly revealed in the aggregate, enabling precise data reconstruction attacks. In this paper, we propose a novel enhancement to SecAgg that reveals aggregated values only at indices with at least $t$ non-zero contributions. Our mechanism introduces a per-element masking strategy to prevent the exposure of under-contributed elements, while maintaining modularity and compatibility with many existing SecAgg implementations by relying solely on cryptographic primitives already employed in a typical setup. We integrate this mechanism into Flamingo, a low-round SecAgg protocol, to provide a robust defense against such attacks. Our analysis and experimental results indicate that the additional computational and communication overhead introduced by our mechanism remains within an acceptable range, supporting the practicality of our approach.

Paperid: 3577, https://arxiv.org/pdf/2508.03882.pdf

Abstract:
In today digital landscape, organizations face constantly evolving cyber threats, making it essential to discover slippery attack vectors through novel techniques like Security Chaos Engineering (SCE), which allows teams to test defenses and identify vulnerabilities effectively. This paper proposes to integrate SCE into Breach Attack Simulation (BAS) platforms, leveraging adversary profiles and abilities from existing threat intelligence databases. This innovative proposal for cyberattack simulation employs a structured architecture composed of three layers: SCE Orchestrator, Connector, and BAS layers. Utilizing MITRE Caldera in the BAS layer, our proposal executes automated attack sequences, creating inferred attack trees from adversary profiles. Our proposal evaluation illustrates how integrating SCE with BAS can enhance the effectiveness of attack simulations beyond traditional scenarios, and be a useful component of a cyber defense strategy.

Paperid: 3578, https://arxiv.org/pdf/2508.03856.pdf

Abstract:
The security of research software is essential for ensuring the integrity and reproducibility of scientific results. However, research software security is still largely unexplored. Due to its dependence on open source components and distributed development practices, research software is particularly vulnerable to supply chain attacks. This study analyses 3,248 high-quality, largely peer-reviewed research software repositories using the OpenSSF Scorecard. We find a generally weak security posture with an average score of 3.5/10. Important practices, such as signed releases and branch protection, are rarely implemented. Finally, we present actionable, low-effort recommendations that can help research teams improve software security and mitigate potential threats to scientific integrity.

Paperid: 3579, https://arxiv.org/pdf/2508.03517.pdf

Abstract:
Network Intrusion Detection Systems (NIDS) play a crucial role in safeguarding network infrastructure against cyberattacks. As the prevalence and sophistication of these attacks increase, machine learning and deep neural network approaches have emerged as effective tools for enhancing NIDS capabilities in detecting malicious activities. However, the effectiveness of traditional deep neural models is often limited by the need for extensive labelled datasets and the challenges posed by data and feature heterogeneity across different network domains. To address these limitations, we developed a deep neural model that integrates multi-modal learning with domain adaptation techniques for classification. Our model processes data from diverse sources in a sequential cyclic manner, allowing it to learn from multiple datasets and adapt to varying feature spaces. Experimental results demonstrate that our proposed model significantly outperforms baseline neural models in classifying network intrusions, particularly under conditions of varying sample availability and probability distributions. The model's performance highlights its ability to generalize across heterogeneous datasets, making it an efficient solution for real-world network intrusion detection.

Paperid: 3580, https://arxiv.org/pdf/2508.02921.pdf

Abstract:
We introduce PentestJudge, a system for evaluating the operations of penetration testing agents. PentestJudge is a large language model (LLM)-as-judge with access to tools that allow it to consume arbitrary trajectories of agent states and tool call history to determine whether a security agent's actions meet certain operating criteria that would be impractical to evaluate programmatically. We develop rubrics that use a tree structure to hierarchically collapse the penetration testing task for a particular environment into smaller, simpler, and more manageable sub-tasks and criteria until each leaf node represents simple yes-or-no criteria for PentestJudge to evaluate. Task nodes are broken down into different categories related to operational objectives, operational security, and tradecraft. LLM-as-judge scores are compared to human domain experts as a ground-truth reference, allowing us to compare their relative performance with standard binary classification metrics, such as F1 scores. We evaluate several frontier and open-source models acting as judge agents, with the best model reaching an F1 score of 0.83. We find models that are better at tool-use perform more closely to human experts. By stratifying the F1 scores by requirement type, we find even models with similar overall scores struggle with different types of questions, suggesting certain models may be better judges of particular operating criteria. We find that weaker and cheaper models can judge the trajectories of pentests performed by stronger and more expensive models, suggesting verification may be easier than generation for the penetration testing task. We share this methodology to facilitate future research in understanding the ability of judges to holistically and scalably evaluate the process quality of AI-based information security agents so that they may be confidently used in sensitive production environments.

Paperid: 3581, https://arxiv.org/pdf/2508.02438.pdf

Abstract:
Physically Unclonable Function (PUF) offers a secure and lightweight alternative to traditional cryptography for authentication due to their unique device fingerprint. However, their dependence on specialized hardware hinders their adoption in diverse applications. This paper proposes a novel blockchain framework that leverages SoftPUF, a software-based approach mimicking PUF. SoftPUF addresses the hardware limitations of traditional PUF, enabling secure and efficient authentication for a broader range of devices within a blockchain network. The framework utilizes a machine learning model trained on PUF data to generate unique, software-based keys for each device. These keys serve as secure identifiers for authentication on the blockchain, eliminating the need for dedicated hardware. This approach facilitates the integration of legacy devices from various domains, including cloud-based solutions, into the blockchain network. Additionally, the framework incorporates well-established defense mechanisms to ensure robust security against various attacks. This combined approach paves the way for secure and scalable authentication in diverse blockchain-based applications. Additionally, to ensure robust security, the system incorporates well-established defense mechanisms against various attacks, including 51%, phishing, routing, and Sybil attacks, into the blockchain network. This combined approach paves the way for secure and efficient authentication in a wider range of blockchain-based applications.

Paperid: 3582, https://arxiv.org/pdf/2508.02375.pdf

Abstract:
Operational Technology (OT) is an integral component of critical national infrastructure, enabling automation and control in industries such as energy, manufacturing, and transportation. However, OT networks, systems, and devices have been designed and deployed prioritising functionality rather than security. This leads to inherent vulnerabilities in many deployed systems when operational misconfigurations expose them to the internet. This report provides an up-to-date overview of the OT threat landscape exposed to the public internet and studies the affected protocols, vendors, software, and the geographic distribution of systems. Our findings reveal nearly 70,000 exposed OT devices globally, with significant concentrations in North America and Europe. Analysis of prevalent protocols (e.g., ModbusTCP, EtherNet/IP, S7) shows that many devices expose detailed identifying information, including outdated firmware versions with known critical vulnerabilities that remain unpatched for years after disclosure. Furthermore, we demonstrate how automated analysis of screenshots can uncover exposed graphical interfaces of Human Machine Interfaces (HMIs) and Supervisory Control and Data Acquisition (SCADA) systems, highlighting diverse pathways for potential unauthorized access and underscoring the risks to industrial processes and critical infrastructure.

Paperid: 3583, https://arxiv.org/pdf/2508.01913.pdf

Abstract:
Academic publishing, integral to knowledge dissemination and scientific advancement, increasingly faces threats from unethical practices such as unconsented authorship, gift authorship, author ambiguity, and undisclosed conflicts of interest. While existing infrastructures like ORCID effectively disambiguate researcher identities, they fall short in enforcing explicit authorship consent, accurately verifying contributor roles, and robustly detecting conflicts of interest during peer review. To address these shortcomings, this paper introduces a decentralized framework leveraging Self-Sovereign Identity (SSI) and blockchain technology. The proposed model uses Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs) to securely verify author identities and contributions, reducing ambiguity and ensuring accurate attribution. A blockchain-based trust registry records authorship consent and peer-review activity immutably. Privacy-preserving cryptographic techniques, especially Zero-Knowledge Proofs (ZKPs), support conflict-of-interest detection without revealing sensitive data. Verified authorship metadata and consent records are embedded in publications, increasing transparency. A stakeholder survey of researchers, editors, and reviewers suggests the framework improves ethical compliance and confidence in scholarly communication. This work represents a step toward a more transparent, accountable, and trustworthy academic publishing ecosystem.

Paperid: 3584, https://arxiv.org/pdf/2508.01909.pdf

Abstract:
Mirai is undoubtedly one of the most significant Internet of Things (IoT) botnet attacks in history. In terms of its detrimental effects, seamless spread, and low detection rate, it surpassed its predecessors. Its developers released the source code, which triggered the development of several variants that combined the old code with newer vulnerabilities found on popular IoT devices. The prominent variants, Satori, Mukashi, Moobot, and Sonic1, together target more than 15 unique known vulnerabilities discovered between 2014-2021. The vulnerabilities include but are not limited to improper input validation, command injections, insufficient credential protection, and out-of-bound writes. With these new attack strategies, Satori compromised more than a quarter million devices within the first twelve hours of its release and peaked at almost 700,000 infected devices. Similarly, Mukashi made more than a hundred million Zyxel NAS devices vulnerable through its new exploits. This article reviews the attack methodologies and impacts of these variants in detail. It summarizes the common vulnerabilities targeted by these variants and analyzes the infection mechanism through vulnerability analysis. This article also provides an overview of possible defense solutions.

Paperid: 3585, https://arxiv.org/pdf/2508.01863.pdf

Abstract:
In today's enterprise environment, traditional access methods such as Virtual Private Networks (VPNs) and application-specific Single Sign-On (SSO) often fall short when it comes to securely scaling access for a distributed and dynamic workforce. This paper presents our experience implementing a modern, Zero Trust-aligned architecture that leverages a reverse proxy integrated with Mutual TLS (mTLS) and centralized SSO, along with the key challenges we encountered and lessons learned during its deployment and scaling. This multidimensional solution involves both per-device and per-user authentication, centralized enforcement of security policies, and comprehensive observability, hence enabling organizations to deliver secure and seamless access to their internal applications.

Paperid: 3586, https://arxiv.org/pdf/2508.01685.pdf

Abstract:
Data representation remains a fundamental challenge in machine learning, particularly when adapting sequence-based architectures like Transformers and Large Language Models (LLMs) for structured tabular data. Existing methods often fail to cohesively encode the mix of numerical and categorical features or preserve the inherent structure of tables. This paper introduces a novel, hybrid tokenisation methodology designed to convert tabular data into a unified, sequential format suitable for LLM training. Our approach combines predefined fixed tokens to represent structural elements and low-cardinality categorical features, with a learned subword vocabulary using Byte-Pair Encoding (BPE) for high-cardinality and continuous values. We demonstrate the efficacy of this technique by applying it to a large-scale NetFlow dataset (CIDDS-001), preparing a corpus for a Network Intrusion Detection System (NIDS) foundation model. The evaluation shows that our method is highly efficient, processing over 31 million network flows in under five hours and achieving a significant data compression ratio of 6.18:1. This process resulted in a computationally manageable corpus of over one billion tokens, establishing a viable and generalisable pathway for training foundation models on structured data.

Paperid: 3587, https://arxiv.org/pdf/2508.01542.pdf

Abstract:
The increase of IoT devices, driven by advancements in hardware technologies, has led to widespread deployment in large-scale networks that process massive amounts of data daily. However, the reliance on Edge Computing to manage these devices has introduced significant security vulnerabilities, as attackers can compromise entire networks by targeting a single IoT device. In light of escalating cybersecurity threats, particularly botnet attacks, this paper investigates the application of machine learning techniques to enhance security in Edge-Computing-Assisted IoT environments. Specifically, it presents a comparative analysis of Random Forest, XGBoost, and LightGBM -- three advanced ensemble learning algorithms -- to address the dynamic and complex nature of botnet threats. Utilizing a widely recognized IoT network traffic dataset comprising benign and malicious instances, the models were trained, tested, and evaluated for their accuracy in detecting and classifying botnet activities. Furthermore, the study explores the feasibility of deploying these models in resource-constrained edge and IoT devices, demonstrating their practical applicability in real-world scenarios. The results highlight the potential of machine learning to fortify IoT networks against emerging cybersecurity challenges.

Paperid: 3588, https://arxiv.org/pdf/2508.01276.pdf

Abstract:
Poisoning attacks, in which an attacker adversarially manipulates the training dataset of a machine learning (ML) model, pose a significant threat to ML security. Beta Poisoning is a recently proposed poisoning attack that disrupts model accuracy by making the training dataset linearly nonseparable. In this paper, we propose four defense strategies against Beta Poisoning attacks: kNN Proximity-Based Defense (KPB), Neighborhood Class Comparison (NCC), Clustering-Based Defense (CBD), and Mean Distance Threshold (MDT). The defenses are based on our observations regarding the characteristics of poisoning samples generated by Beta Poisoning, e.g., poisoning samples have close proximity to one another, and they are centered near the mean of the target class. Experimental evaluations using MNIST and CIFAR-10 datasets demonstrate that KPB and MDT can achieve perfect accuracy and F1 scores, while CBD and NCC also provide strong defensive capabilities. Furthermore, by analyzing performance across varying parameters, we offer practical insights regarding defenses' behaviors under varying conditions.

Paperid: 3589, https://arxiv.org/pdf/2508.01268.pdf

Abstract:
Small language models (SLMs) are increasingly valued for their efficiency and deployability in resource-constrained environments, making them useful for on-device, privacy-sensitive, and edge computing applications. On the other hand, membership inference attacks (MIAs), which aim to determine whether a given sample was used in a model's training, are an important threat with serious privacy and intellectual property implications. In this paper, we study MIAs on SLMs. Although MIAs were shown to be effective on large language models (LLMs), they are relatively less studied on emerging SLMs, and furthermore, their effectiveness decreases as models get smaller. Motivated by this finding, we propose a new MIA called win-k, which builds on top of a state-of-the-art attack (min-k). We experimentally evaluate win-k by comparing it with five existing MIAs using three datasets and eight SLMs. Results show that win-k outperforms existing MIAs in terms of AUROC, TPR @ 1% FPR, and FPR @ 99% TPR metrics, especially on smaller models.

Paperid: 3590, https://arxiv.org/pdf/2508.01054.pdf

Abstract:
This study evaluates the ability of GPT-4o to autonomously solve beginner-level offensive security tasks by connecting the model to OverTheWire's Bandit capture-the-flag game. Of the 25 levels that were technically compatible with a single-command SSH framework, GPT-4o solved 18 unaided and another two after minimal prompt hints for an overall 80% success rate. The model excelled at single-step challenges that involved Linux filesystem navigation, data extraction or decoding, and straightforward networking. The approach often produced the correct command in one shot and at a human-surpassing speed. Failures involved multi-command scenarios that required persistent working directories, complex network reconnaissance, daemon creation, or interaction with non-standard shells. These limitations highlight current architectural deficiencies rather than a lack of general exploit knowledge. The results demonstrate that large language models (LLMs) can automate a substantial portion of novice penetration-testing workflow, potentially lowering the expertise barrier for attackers and offering productivity gains for defenders who use LLMs as rapid reconnaissance aides. Further, the unsolved tasks reveal specific areas where secure-by-design environments might frustrate simple LLM-driven attacks, informing future hardening strategies. Beyond offensive cybersecurity applications, results suggest the potential to integrate LLMs into cybersecurity education as practice aids.

Paperid: 3591, https://arxiv.org/pdf/2508.00934.pdf

Abstract:
Understanding behavioral drivers of success in illicit digital marketplaces is critical for developing effective enforcement strategies and understanding digital commerce evolution, as darknet drug markets represent a growing share of the total drug economy. This study employs quantitative regression analysis of 50,000+ listings from 2,653 vendors in the Agora marketplace (2014-2015), examining relationships between cybersecurity signaling (PGP encryption mentions), product diversification, and commercial success through nested regression specifications controlling for reputation, pricing, and category-specific factors. Product diversification emerges as the dominant predictor of vendor scale, increasing the odds of large vendor status by 169% per additional category, while PGP encryption signaling functions primarily as a professional marker rather than an independent success factor. Vendor success depends on portfolio breadth rather than specialization, with category-specific enforcement creating differential market constraints. Successful vendors operate as diversified enterprises capable of rapid pivoting between product categories, requiring targeted enforcement towards diversified vendors based on coordinated multi-category enforcement approaches rather than traditional substance-specific targeting strategies.

Paperid: 3592, https://arxiv.org/pdf/2508.00682.pdf

Abstract:
Dynamic Binary Instrumentation (DBI) is the set of techniques that enable instrumentation of programs at run-time, making it possible to monitor and modify the execution of compiled binaries or entire systems. DBI is used for countless security applications and analyses, and is extensively used across many fields in both industry and academia. Over the years, several DBI approaches have been proposed based on different technologies and implementing diverse techniques. Every solution tries to overcome certain limitations, but they sometimes bring other shortcomings. Some are specialized for one particular domain or task, while others have a wider scope. In this paper, we shed light into the labyrinth of DBI, bringing together process-level and whole-system approaches. We depict their building blocks and analyze the underlying instrumentation techniques, comparing their ability to instrument different primitives and run-time events. Then, we evaluate their performance when implementing each primitive, and highlight relevant observations. Our results show that no single technique is better than the rest in all circumstances.

Paperid: 3593, https://arxiv.org/pdf/2508.00636.pdf

Abstract:
Federated learning is a distributed training framework vulnerable to Byzantine attacks, particularly when over 50% of clients are malicious or when datasets are highly non-independent and identically distributed (non-IID). Additionally, most existing defense mechanisms are designed for specific attack types (e.g., gradient similarity-based schemes can only defend against outlier model poisoning), limiting their effectiveness. In response, we propose FedGuard, a novel federated learning mechanism. FedGuard cleverly addresses the aforementioned issues by leveraging the high sensitivity of membership inference to model bias. By requiring clients to include an additional mini-batch of server-specified data in their training, FedGuard can identify and exclude poisoned models, as their confidence in the mini-batch will drop significantly. Our comprehensive evaluation unequivocally shows that, under three highly non-IID datasets, with 90% of clients being Byzantine and seven different types of Byzantine attacks occurring in each round, FedGuard significantly outperforms existing robust federated learning schemes in mitigating various types of Byzantine attacks.

Paperid: 3594, https://arxiv.org/pdf/2508.00293.pdf

Abstract:
Ransomware (RW) presents a significant and widespread threat in the digital landscape, necessitating effective countermeasures. Active cyber deception is a promising strategy to thwart RW and limiting its propagation by misleading it with false information and revealing its true behaviors. Furthermore, RW often acts as a communication conduit between attackers and defenders, allowing deception to return false data to attackers and deplete their resources. This paper introduces ranDecepter, a novel approach that combines active cyber deception with real-time analysis to enhance defenses against RW attacks. The ranDecepter identifies RW in real-time and isolates it within a deceptive environment, autonomously identifying critical elements in the RW code to create a loop mechanism. By repeatedly restarting the malware and transmitting counterfeit encryption information and secret keys to the attacker, it forces the attacker to store these fabricated details for each victim, thereby depleting their resources. Our comprehensive evaluation of ranDecepter, conducted using 1,134 real-world malware samples and twelve benign applications, demonstrates a remarkable 100% accuracy in RW identification, with no false positives and minimal impact on response times. Furthermore, within 24-hours, ranDecepter generates up to 9,223K entries in the attacker's database using 50 agents, showcasing its potential to undermine attacker resources.

Paperid: 3595, https://arxiv.org/pdf/2507.23002.pdf

Abstract:
The proliferation of advanced tools for manipulating video has led to an arms race, pitting those who wish to sow disinformation against those who want to detect and expose it. Unfortunately, time favors the ill-intentioned in this race, with fake videos growing increasingly difficult to distinguish from real ones. At the root of this trend is a fundamental advantage held by those manipulating media: equal access to a distribution of what we consider authentic (i.e., "natural") video. In this paper, we show how coding very subtle, noise-like modulations into the illumination of a scene can help combat this advantage by creating an information asymmetry that favors verification. Our approach effectively adds a temporal watermark to any video recorded under coded illumination. However, rather than encoding a specific message, this watermark encodes an image of the unmanipulated scene as it would appear lit only by the coded illumination. We show that even when an adversary knows that our technique is being used, creating a plausible coded fake video amounts to solving a second, more difficult version of the original adversarial content creation problem at an information disadvantage. This is a promising avenue for protecting high-stakes settings like public events and interviews, where the content on display is a likely target for manipulation, and while the illumination can be controlled, the cameras capturing video cannot.

Paperid: 3596, https://arxiv.org/pdf/2507.22309.pdf

Abstract:
For centuries, financial institutions have responded to liquidity challenges by forming closed, centralized clearing clubs with strict rules and membership that allow them to collaborate on using the least money to discharge the most debt. As closed clubs, much of the general public has been excluded from participation. But the vast majority of private sector actors consists of micro or small firms that are vulnerable to late payments and generally ineligible for bank loans. This low liquidity environment often results in gridlock and leads to insolvency, and it disproportionately impacts small enterprises and communities. On the other hand, blockchain communities have developed open, decentralized settlement systems, along with a proliferation of store of value assets and new lending protocols, allowing anyone to permissionlessly transact and access credit. However, these protocols remain used primarily for speculative purposes, and so far have fallen short of the large-scale positive impact on the real economy prophesied by their promoters. We address these challenges by introducing Cycles, an open, decentralized clearing, settlement, and issuance protocol. Cycles is designed to enable firms to overcome payment inefficiencies, to reduce their working capital costs, and to leverage diverse assets and liquidity sources, including cryptocurrencies, stablecoins, and lending protocols, in service of clearing more debt with less money. Cycles solves real world liquidity challenges through a privacy-preserving multilateral settlement platform based on a graph optimization algorithm. The design is based on a core insight: liquidity resides within cycles in the payment network's structure and can be accessed via settlement flows optimized to reduce debt.

Paperid: 3597, https://arxiv.org/pdf/2507.22265.pdf

Abstract:
A recent work (Korten, Pitassi, and Impagliazzo, FOCS 2025) established an insightful connection between static data structure lower bounds, range avoidance of $\text{NC}^0$ circuits, and the refutation of pseudorandom CSP instances, leading to improvements to some longstanding lower bounds in the cell-probe/bit-probe models. Here, we improve these lower bounds in certain cases via a more streamlined reduction to XOR refutation, coupled with handling the odd-arity case. Our result can be viewed as a complete derandomization of the state-of-the-art semi-random $k$-XOR refutation analysis (Guruswami, Kothari and Manohar, STOC 2022, Hsieh, Kothari and Mohanty, SODA 2023), which complements the derandomization of the even-arity case obtained by Korten et al. As our main technical statement, we show that for any multi-output constant-depth circuit that substantially stretches its input, its output is very likely far from strings sampled from distributions with sufficient independence, and further this can be efficiently certified. Via suitable shifts in perspectives, this gives applications to cell-probe lower bounds and range avoidance algorithms for $\mathsf{NC}^0$ circuits.

Paperid: 3598, https://arxiv.org/pdf/2507.22165.pdf

Abstract:
The emergence of programmable data planes, and particularly switches supporting the P4 language, has transformed network security by enabling customized, line-rate packet processing. These switches, originally intended for flexible forwarding, now play a broader role: detecting and mitigating attacks such as DDoS and spoofing, enforcing next-generation firewall policies, and even supporting in-network cryptography and machine learning. These capabilities are made possible by techniques such as recirculate-and-truncate and lookup-table precomputation, which work around architectural constraints like limited memory and restricted instruction sets. In this paper, we systematize recent advances in security applications built on programmable switches, with an emphasis on the capabilities, challenges, and architectural workarounds. We highlight the non-obvious design techniques that make complex in-network security functions feasible despite the constraints of the hardware platform, and also comment on remaining issues and emerging research directions.

Paperid: 3599, https://arxiv.org/pdf/2507.22066.pdf

Abstract:
The generation of large, high-quality datasets for code understanding and generation remains a significant challenge, particularly when aligning decompiled binaries with their original source code. To address this, we present CodableLLM, a Python framework designed to automate the creation and curation of datasets by mapping decompiled functions to their corresponding source functions. This process enhances the alignment between decompiled and source code representations, facilitating the development of large language models (LLMs) capable of understanding and generating code across multiple abstraction levels. CodableLLM supports multiple programming languages and integrates with existing decompilers and parsers to streamline dataset generation. This paper presents the design and implementation of CodableLLM, evaluates its performance in dataset creation, and compares it to existing tools in the field. The results demonstrate that CodableLLM offers a robust and efficient solution for generating datasets tailored for code-focused LLMS.

Paperid: 3600, https://arxiv.org/pdf/2507.21985.pdf

Abstract:
Machine unlearning (MU) removes specific data points or concepts from deep learning models to enhance privacy and prevent sensitive content generation. Adversarial prompts can exploit unlearned models to generate content containing removed concepts, posing a significant security risk. However, existing adversarial attack methods still face challenges in generating content that aligns with an attacker's intent while incurring high computational costs to identify successful prompts. To address these challenges, we propose ZIUM, a Zero-shot Intent-aware adversarial attack on Unlearned Models, which enables the flexible customization of target attack images to reflect an attacker's intent. Additionally, ZIUM supports zero-shot adversarial attacks without requiring further optimization for previously attacked unlearned concepts. The evaluation across various MU scenarios demonstrated ZIUM's effectiveness in successfully customizing content based on user-intent prompts while achieving a superior attack success rate compared to existing methods. Moreover, its zero-shot adversarial attack significantly reduces the attack time for previously attacked unlearned concepts.

Paperid: 3601, https://arxiv.org/pdf/2507.21904.pdf

Abstract:
System and network event logs are essential for security analytics, threat detection, and operational monitoring. However, these logs often contain Personally Identifiable Information (PII), raising significant privacy concerns when shared or analyzed. A key challenge in log anonymization is balancing privacy protection with the retention of sufficient structure for meaningful analysis. Overly aggressive anonymization can destroy contextual integrity, while weak techniques risk re-identification through linkage or inference attacks. This paper introduces novel field-specific anonymization methods that address this trade-off. For IP addresses, we propose a salt-based hashing technique applied at the per-octet level, preserving both subnet and host structure to enable correlation across various log entries while ensuring non-reversibility. For port numbers, full-value hashing with range mapping maintains interpretability. We also present an order-preserving timestamp anonymization scheme using adaptive noise injection, which obfuscates exact times without disrupting event sequences. An open-source tool implementing these techniques has been released to support practical deployment and reproducible research. Evaluations using entropy metrics, collision rates, and residual leakage analysis demonstrate that the proposed approach effectively protects privacy while preserving analytical utility.

Paperid: 3602, https://arxiv.org/pdf/2507.21731.pdf

Abstract:
In this work, we propose a class of equational theories for bounded binary circuits that have the finite variant property. These theories could serve as a building block to specify cryptographic primitive implementations and automatically discover attacks as binary circuits in the symbolic model. We provide proofs of equivalence between this class of equational theories and Boolean logic up to circuit size 3 and we provide the variant complexities and performance benchmarks using Maude-NPA. This is the first result in this direction and follow-up research is needed to improve the scalability of the approach.

Paperid: 3603, https://arxiv.org/pdf/2507.21640.pdf

Abstract:
Modern in-vehicle networks face various cyber threats due to the lack of encryption and authentication in the Controller Area Network (CAN). To address this security issue, this paper presents GUARD-CAN, an anomaly detection framework that combines graph-based representation learning with time-series modeling. GUARD-CAN splits CAN messages into fixed-length windows and converts each window into a graph that preserves message order. To detect anomalies in the timeaware and structure-aware context at the same window, GUARD-CAN takes advantage of the overcomplete Autoencoder (AE) and Graph Convolutional Network (GCN) to generate graph embedding vectors. The model groups these vectors into sequences and feeds them into the Gated Recurrent Unit (GRU) to detect temporal anomaly patterns across the graphs. GUARD-CAN performs anomaly detection at both the sequence level and the window level, and this allows multi-perspective performance evaluation. The model also verifies the importance of window size selection through an analysis based on Shannon entropy. As a result, GUARD-CAN shows that the proposed model detects four types of CAN attacks (flooding, fuzzing, replay and spoofing attacks) effectively without relying on complex feature engineering.

Paperid: 3604, https://arxiv.org/pdf/2507.21483.pdf

Abstract:
Neural networks have received a lot of attention recently, and related security issues have come with it. Many studies have shown that neural networks are vulnerable to adversarial examples that have been artificially perturbed with modification, which is too small to be distinguishable by human perception. Different attacks and defenses have been proposed to solve these problems, but there is little research on evaluating the robustness of neural networks and their inputs. In this work, we propose a metric called the neuron cover change rate (NCCR) to measure the ability of deep learning models to resist attacks and the stability of adversarial examples. NCCR monitors alterations in the output of specifically chosen neurons when the input is perturbed, and networks with a smaller degree of variation are considered to be more robust. The results of the experiment on image recognition and the speaker recognition model show that our metrics can provide a good assessment of the robustness of neural networks or their inputs. It can also be used to detect whether an input is adversarial or not, as adversarial examples are always less robust.

Paperid: 3605, https://arxiv.org/pdf/2507.21398.pdf

Abstract:
This paper presents the development of a distributed digital identity system utilizing modern technologies, including FastAPI, MongoDB, gRPC, Docker, and blockchain simulation with Ganache and Ethereum. The objective is to demonstrate the benefits of distributed systems and blockchain for the security, traceability, and decentralization of digital identities. The methodology included the development of a microservices architecture with JWT authentication, data persistence in MongoDB, simulation of blockchain operations using Ganache, and containerization with Docker. The results demonstrate the feasibility of the proposed approach, with a functional web interface, complete audit logs, and blockchain simulation with Ethereum. The theoretical foundations, technical implementation, results obtained, and prospects for integration with real blockchain networks are discussed.

Paperid: 3606, https://arxiv.org/pdf/2507.21325.pdf

Abstract:
The traditional way for a Quantum Key Distribution (QKD) user to join a quantum network is by authenticating themselves using pre-shared key material. While this approach is sufficient for small-scale networks, it becomes impractical as the network grows, due to the total quadratic increase in the number of pre-shared keys required. To address this scalability issue, Public Key Infrastructure (PKI) combined with Post-Quantum Cryptography (PQC) offers a more scalable solution, allowing users to authenticate the QKD traffic remotely to obtain information-theoretical secure (ITS) keys under the presented assumptions. Unlike traditional PKI, which relies on classical cryptographic algorithms such as RSA, the approach presented in this paper leverages PQC algorithms that are believed to be resistant to quantum attacks. Similarly to the SIGMA or TLS protocols, authentication, confidentiality, and integrity are achievable against bounded adversaries to ensure secure and scalable quantum networks.

Paperid: 3607, https://arxiv.org/pdf/2507.21163.pdf

Abstract:
Adversarial attack methods for 3D point cloud classification reveal the vulnerabilities of point cloud recognition models. This vulnerability could lead to safety risks in critical applications that use deep learning models, such as autonomous vehicles. To uncover the deficiencies of these models, researchers can evaluate their security through adversarial attacks. However, most existing adversarial attack methods are based on white-box attacks. While these methods achieve high attack success rates and imperceptibility, their applicability in real-world scenarios is limited. Black-box attacks, which are more meaningful in real-world scenarios, often yield poor results. This paper proposes a novel black-box adversarial example generation method that utilizes a diffusion model to improve the attack success rate and imperceptibility in the black-box setting, without relying on the internal information of the point cloud classification model to generate adversarial samples. We use a 3D diffusion model to use the compressed features of the point cloud as prior knowledge to guide the reverse diffusion process to add adversarial points to clean examples. Subsequently, its reverse process is employed to transform the distribution of other categories into adversarial points, which are then added to the point cloud.

Paperid: 3608, https://arxiv.org/pdf/2507.21146.pdf

Abstract:
Evolving AI systems increasingly deploy multi-agent architectures where autonomous agents collaborate, share information, and delegate tasks through developing protocols. This connectivity, while powerful, introduces novel security risks. One such risk is a cascading risk: a breach in one agent can cascade through the system, compromising others by exploiting inter-agent trust. In tandem with OWASP's initiative for an Agentic AI Vulnerability Scoring System we define an attack vector, Agent Cascading Injection, analogous to Agent Impact Chain and Blast Radius, operating across networks of agents. In an ACI attack, a malicious input or tool exploit injected at one agent leads to cascading compromises and amplified downstream effects across agents that trust its outputs. We formalize this attack with an adversarial goal equation and key variables (compromised agent, injected exploit, polluted observations, etc.), capturing how a localized vulnerability can escalate into system-wide failure. We then analyze ACI's properties -- propagation chains, amplification factors, and inter-agent compound effects -- and map these to OWASP's emerging Agentic AI risk categories (e.g. Impact Chain and Orchestration Exploits). Finally, we argue that ACI highlights a critical need for quantitative benchmarking frameworks to evaluate the security of agent-to-agent communication protocols. We outline a methodology for stress-testing multi-agent systems (using architectures such as Google's A2A and Anthropic's MCP) against cascading trust failures, developing upon groundwork for measurable, standardized agent-to-agent security evaluation. Our work provides the necessary apparatus for engineers to benchmark system resilience, make data-driven architectural trade-offs, and develop robust defenses against a new generation of agentic threats.

Paperid: 3609, https://arxiv.org/pdf/2507.21122.pdf

Abstract:
Kintsugi is a protocol for key recovery, allowing a user to regain access to end-to-end encrypted data after they have lost their device, but still have their (potentially low-entropy) password. Existing E2EE key recovery methods, such as those deployed by Signal and WhatsApp, centralize trust by relying on servers administered by a single provider. Kintsugi is decentralized, distributing trust over multiple recovery nodes, which could be servers run by independent parties, or end user devices in a peer-to-peer setting. To recover a user's keys, a threshold $t+1$ of recovery nodes must assist the user in decrypting a shared backup. Kintsugi is password-authenticated and protects against offline brute-force password guessing without requiring any specialized secure hardware. Kintsugi can tolerate up to $t$ honest-but-curious colluding recovery nodes, as well as $n - t - 1$ offline nodes, and operates safely in an asynchronous network model where messages can be arbitrarily delayed.

Paperid: 3610, https://arxiv.org/pdf/2507.21101.pdf

Abstract:
As mobile computing becomes central to digital interaction, researchers have turned their attention to adaptive authentication for its real-time, context- and behavior-aware verification capabilities. However, many implementations remain fragmented, inconsistently apply intelligent techniques, and fall short of user expectations. In this Systematization of Knowledge (SoK), we analyze 41 peer-reviewed studies since 2011 that focus on adaptive authentication in mobile environments. Our analysis spans seven dimensions: privacy and security models, interaction modalities, user behavior, risk perception, implementation challenges, usability needs, and machine learning frameworks. Our findings reveal a strong reliance on machine learning (64.3%), especially for continuous authentication (61.9%) and unauthorized access prevention (54.8%). AI-driven approaches such as anomaly detection (57.1%) and spatio-temporal analysis (52.4%) increasingly shape the interaction landscape, alongside growing use of sensor-based and location-aware models.

Paperid: 3611, https://arxiv.org/pdf/2507.21068.pdf

Abstract:
We propose a technique to protect and preserve a private key or a passcode in an encrypted two-dimensional graphical image. The plaintext private key or the passcode is converted into an encrypted QR code and embedded into a real-life color image with a steganographic scheme. The private key or the passcode is recovered from the stego color image by first extracting the encrypted QR code from the color image, followed by decryption of the QR code. The cryptographic key for encryption of the QR code is generated from the output of a Linear Feedback Shift Register (LFSR), initialized by a seed image chosen by the user. The user can store the seed image securely, without the knowledge of an attacker. Even if an active attacker modifies the seed image (without knowledge of the fact that it is the seed image), the user can easily restore it if he/she keeps multiple copies of it, so that the encryption key can be regenerated easily. Our experiments prove the feasibility of the technique using sample private key data and real-life color images.

Paperid: 3612, https://arxiv.org/pdf/2507.21060.pdf

Abstract:
The rapid integration of Artificial Intelligence (AI) into medical diagnostics has raised pressing concerns about patient privacy, especially when sensitive imaging data must be transferred, stored, or processed. In this paper, we propose a novel framework for privacy-preserving diagnostic inference on encrypted medical images using a modified convolutional neural network (Masked-CNN) capable of operating on transformed or ciphered image formats. Our approach leverages AES-CBC encryption coupled with JPEG2000 compression to protect medical images while maintaining their suitability for AI inference. We evaluate the system using public DICOM datasets (NIH ChestX-ray14 and LIDC-IDRI), focusing on diagnostic accuracy, inference latency, storage efficiency, and privacy leakage resistance. Experimental results show that the encrypted inference model achieves performance comparable to its unencrypted counterpart, with only marginal trade-offs in accuracy and latency. The proposed framework bridges the gap between data privacy and clinical utility, offering a practical, scalable solution for secure AI-driven diagnostics.

Paperid: 3613, https://arxiv.org/pdf/2507.20806.pdf

Abstract:
There has been a growing interest in Internet user privacy, demonstrated by the popularity of privacy-preserving products such as Telegram and Brave, and the widespread adoption of HTTPS. The Domain Name System (DNS) is a key component of Internet-based communication and its privacy has been neglected for years. Recently, DNS over HTTPS (DoH) has improved the situation by fixing the issue of in-path middleboxes. Further progress has been made with proxy-based solutions such as Oblivious DoH (ODoH), which separate a user's identity from their DNS queries. However, these solutions rely on non-collusion assumptions between DNS resolvers and proxies -- an assumption difficult to guarantee in practice. To address this, we explore integrating single-server Private Information Retrieval (PIR) into DNS to enable encrypted query processing without relying on trust assumptions. However, applying PIR to DNS is challenging due to its hierarchical nature -- particularly, interactions with recursive resolvers can still leak information. Navigating performance and privacy trade-offs, we propose PDNS, a DNS extension leveraging single-server PIR to strengthen privacy guarantees. We have implemented a prototype of PDNS and compared its performance against state-of-the-art solutions via trace-driven experiments. The results show that PDNS achieves acceptable performance (2x faster than DoH over Tor with similar privacy guarantees) and strong privacy guarantees today, mainly at the cost of its scalability, which specialized hardware for PIR can address in the near future.

Paperid: 3614, https://arxiv.org/pdf/2507.20704.pdf

Abstract:
The increasing integration of Visual Language Models (VLMs) into AI systems necessitates robust model alignment, especially when handling multimodal content that combines text and images. Existing evaluation datasets heavily lean towards text-only prompts, leaving visual vulnerabilities under evaluated. To address this gap, we propose \textbf{Text2VLM}, a novel multi-stage pipeline that adapts text-only datasets into multimodal formats, specifically designed to evaluate the resilience of VLMs against typographic prompt injection attacks. The Text2VLM pipeline identifies harmful content in the original text and converts it into a typographic image, creating a multimodal prompt for VLMs. Also, our evaluation of open-source VLMs highlights their increased susceptibility to prompt injection when visual inputs are introduced, revealing critical weaknesses in the current models' alignment. This is in addition to a significant performance gap compared to closed-source frontier models. We validate Text2VLM through human evaluations, ensuring the alignment of extracted salient concepts; text summarization and output classification align with human expectations. Text2VLM provides a scalable tool for comprehensive safety assessment, contributing to the development of more robust safety mechanisms for VLMs. By enhancing the evaluation of multimodal vulnerabilities, Text2VLM plays a role in advancing the safe deployment of VLMs in diverse, real-world applications.

Paperid: 3615, https://arxiv.org/pdf/2507.20676.pdf

Abstract:
Digital signatures are fundamental cryptographic primitives that ensure the authenticity and integrity of digital documents. In the post-quantum era, classical public key-based signature schemes become vulnerable to brute-force and key-recovery attacks due to the computational power of quantum algorithms. Multivariate polynomial based signature schemes are among the one of the cryptographic constructions that offers strong security guarantees against such quantum threats. With the growing capabilities of neural networks, it is natural to explore their potential application in the design of cryptographic primitives. Neural networks inherently captures the non-linear relationships within the data, which are encoded in their synaptic weight matrices and bias vectors. In this paper, we propose a novel construction of a multivariate polynomial based digital signature scheme that leverages neural network architectures. A neural network with binary weights is employed to define the central structure of the signature scheme. The design introduces a recurrent random vector, functionally analogous to an attention mechanism, which contributes dynamic randomness based on the previous state, thereby enhancing the scheme's security. It is demonstrated that the proposed signature scheme provide security against Existential Unforgeability under adaptive Chosen-Message Attacks (EUF-CMA). Furthermore, it is proven that direct attacks aimed to recover the private keys are computationally infeasible within polynomial time, even in the presence of quantum computing abilities. The operational characteristics of the proposed scheme are also evaluated, with results indicating notable efficiency and practical viability in post-quantum cryptographic applications.

Paperid: 3616, https://arxiv.org/pdf/2507.20554.pdf

Abstract:
This paper presents MPC-EVM, the first blockchain prototype that extends the EVM to enable asynchronous MPC invocations by smart contracts during transaction executions without compromising consistency or throughput. MPC-EVM uses an asynchronous execution model to process MPC-invoking transactions in a non-blocking fashion, saving the transaction's progress when it enters an MPC and resuming its execution upon MPC's completion. Additionally, it employs an access control mechanism that prevents inconsistent state access and modifications as a result of asynchronous executions. Benchmarking MPC-EVM's throughput show that the transactions per second (TPS) decreased by less than 3% compared to the baseline when MPC-invoking transactions are executed alongside regular transactions.

Paperid: 3617, https://arxiv.org/pdf/2507.20502.pdf

Abstract:
The high complexity of modern software supply chains necessitates tools such as Software Bill of Materials (SBOMs) to manage component dependencies, and Software Composition Analysis (SCA) tools to identify vulnerabilities. While there exists limited integration between SBOMs and SCA tools, a unified view of complex dependency-vulnerability relationships remains elusive. In this paper, we introduce VDGraph, a novel knowledge graph-based methodology for integrating vulnerability and dependency data into a holistic view. VDGraph consolidates SBOM and SCA outputs into a graph representation of software projects' dependencies and vulnerabilities. We provide a formal description and analysis of the theoretical properties of VDGraph and present solutions to manage possible conflicts between the SBOM and SCA data. We further introduce and evaluate a practical, proof-of-concept implementation of VDGraph using two popular SBOM and SCA tools, namely CycloneDX Maven plugin and Google's OSV-Scanner. We apply VDGraph on 21 popular Java projects. Through the formulation of appropriate queries on the graphs, we uncover the existence of concentrated risk points (i.e., vulnerable components of high severity reachable through numerous dependency paths). We further show that vulnerabilities predominantly emerge at a depth of three dependency levels or higher, indicating that direct or secondary dependencies exhibit lower vulnerability density and tend to be more secure. Thus, VDGraph contributes a graph-theoretic methodology that improves visibility into how vulnerabilities propagate through complex, transitive dependencies. Moreover, our implementation, which combines open SBOM and SCA standards with Neo4j, lays a foundation for scalable and automated analysis across real-world projects.

Paperid: 3618, https://arxiv.org/pdf/2507.19976.pdf

Abstract:
Fintech provides technological services to increase operational efficiency in financial institutions, but traditional perimeter-based defense mechanisms are insufficient against evolving cyber threats like insider attacks, malware intrusions, and Advanced Persistent Threats (APTs). These vulnerabilities expose Fintech organizations to significant risks, including financial losses and data breaches. To address these challenges, this paper proposes a blockchain-integrated Zero Trust framework, adhering to the principle of "Never Trust, Always Verify." The framework uses Ethereum smart contracts to enforce Multi Factor Authentication (MFA), Role-Based Access Control (RBAC), and Just-In-Time (JIT) access privileges, effectively mitigating credential theft and insider threats, the effect of malware and APT attacks. The proposed solution transforms blockchain into a Policy Engine (PE) and Policy Enforcement Point (PEP), and policy storage, ensuring immutable access control and micro-segmentation. A decentralized application (DApp) prototype was developed and tested using STRIDE threat modeling, demonstrating resilience against spoofing, tampering, and privilege escalation. Comparative analysis with Perimeter-based systems revealed a trade-off: while the framework introduced a marginal latency increase (74.0 ms vs. 49.33 ms) and reduced throughput (30.77 vs. 50.0 requests/sec), it significantly enhanced security by eliminating single points of failure and enabling tamper-proof audit trails. Experimental validation on a 200-node simulated network confirmed the framework's robustness, with future optimizations targeting Layer-2 solutions for scalability. This work bridges the gap between Zero Trust theory and practical blockchain implementation, offering Fintech organizations a decentralized, cost-effective security model.

Paperid: 3619, https://arxiv.org/pdf/2507.19880.pdf

Abstract:
The Model Context Protocol (MCP) represents a significant advancement in AI-tool integration, enabling seamless communication between AI agents and external services. However, this connectivity introduces novel attack vectors that remain largely unexplored. This paper demonstrates how unsophisticated threat actors, requiring only basic programming skills and free web tools, can exploit MCP's trust model to exfiltrate sensitive financial data. We present a proof-of-concept attack where a malicious weather MCP server, disguised as benign functionality, discovers and exploits legitimate banking tools to steal user account balances. The attack chain requires no advanced technical knowledge, server infrastructure, or monetary investment. The findings reveal a critical security gap in the emerging MCP ecosystem: while individual servers may appear trustworthy, their combination creates unexpected cross-server attack surfaces. Unlike traditional cybersecurity threats that assume sophisticated adversaries, our research shows that the barrier to entry for MCP-based attacks is alarmingly low. A threat actor with undergraduate-level Python knowledge can craft convincing social engineering attacks that exploit the implicit trust relationships MCP establishes between AI agents and tool providers. This work contributes to the nascent field of MCP security by demonstrating that current MCP implementations allow trivial cross-server attacks and proposing both immediate mitigations and protocol improvements to secure this emerging ecosystem.

Paperid: 3620, https://arxiv.org/pdf/2507.19295.pdf

Abstract:
Private Information Retrieval (PIR) schemes allow clients to retrieve files from a database without disclosing the requested file's identity to the server. In the pursuit of post-quantum security, most recent PIR schemes rely on hard lattice problems. In contrast, the so called CB-cPIR scheme stands out as a pioneering effort to base PIR schemes on hard problems in coding theory, thereby contributing significantly to the diversification of security foundations. However, our research reveals a critical vulnerability in CB-cPIR, substantially diminishing its security levels. Moreover, a comparative analysis with state-of-the-art PIR schemes shows that CB-cPIR's advantages are reduced, making it less competitive in terms of the communication cost. Nevertheless, our findings highlight the importance of continued research into code-based PIR schemes, as they have the potential to provide a valuable alternative to lattice-based approaches.

Paperid: 3621, https://arxiv.org/pdf/2507.18360.pdf

Abstract:
The protection of personal data has become a central topic in software development, especially with the implementation of the General Data Protection Law (LGPD) in Brazil and the General Data Protection Regulation (GDPR) in the European Union. With the enforcement of these laws, certain software quality criteria have become mandatory, such as data anonymization, which is one of the main aspects addressed by these regulations. The aim of this article is to analyze data anonymization techniques and assess their effectiveness in ensuring compliance with legal requirements and the utility of the data for its intended purpose. Techniques such as aggregation, generalization, perturbation, and k-anonymity were investigated and applied to datasets containing personal and sensitive data. The analysis revealed significant variations in the effectiveness of each method, highlighting the need to balance privacy and data utility.

Paperid: 3622, https://arxiv.org/pdf/2507.18157.pdf

Abstract:
Due to the merits of high efficiency and strong security against timing and side-channel attacks, ChaCha has been widely applied in real-time communication and data streaming scenarios. However, with the rapid development of AI-assisted cryptanalysis and quantum computing technologies, there are serious challenges to the secure implementation of ChaCha cipher. To further strengthen the security of ChaCha cipher, we propose an improved variant based on quantum random numbers, i.e., Quantum Random Number Enhanced ChaCha (QRE-ChaCha). Specifically, the design XORs the initial constants with quantum random numbers and periodically injects quantum random numbers into selected state words during odd rounds to enhance diffusion. Compared with the original ChaCha, the present variant shows stronger resistance to differential attacks and generates a keystream with statistical randomness, thereby offering increased robustness against both classical and quantum attacks. To evaluate the security and performance of the present ChaCha, our analysis proceeds in three main parts. Firstly, we analyze its theoretical security in terms of quantum randomness and attack testing, and conduct differential cryptanalysis with an automated search method based on the Boolean satisfiability problem (SAT). Secondly, we subject the keystream generated by the cipher to randomness tests using the NIST statistical test suite and the GM/T 0005-2021 randomness testing standard. Finally, we assess its encryption and decryption performance by measuring its encryption speed on files of various sizes. According to the results, the present ChaCha is significantly improved to resist differential attacks while maintaining the high efficiency of the original ChaCha cipher, and its keystream successfully passes statistical randomness tests using the NIST and GM/T 0005-2021 standards, meeting cryptographic application requirements.

Paperid: 3623, https://arxiv.org/pdf/2507.17888.pdf

Abstract:
Detecting security vulnerabilities in open-source software is a critical task that is highly regarded in the related research communities. Several approaches have been proposed in the literature for detecting vulnerable codes and identifying the classes of vulnerabilities. However, there is still room to work in explaining the root causes of detected vulnerabilities through locating vulnerable statements and the discovery of paths leading to the activation of the vulnerability. While frameworks like SliceLocator offer explanations by identifying vulnerable paths, they rely on rule-based sink identification that limits their generalization. In this paper, we introduce VulPathFinder, an explainable vulnerability path discovery framework that enhances SliceLocator's methodology by utilizing a novel Graph Neural Network (GNN) model for detecting sink statements, rather than relying on predefined rules. The proposed GNN captures semantic and syntactic dependencies to find potential sink points (PSPs), which are candidate statements where vulnerable paths end. After detecting PSPs, program slicing can be used to extract potentially vulnerable paths, which are then ranked by feeding them back into the target graph-based detector. Ultimately, the most probable path is returned, explaining the root cause of the detected vulnerability. We demonstrated the effectiveness of the proposed approach by performing evaluations on a benchmark of the buffer overflow CWEs from the SARD dataset, providing explanations for the corresponding detected vulnerabilities. The results show that VulPathFinder outperforms both original SliceLocator and GNNExplainer (as a general GNN explainability tool) in discovery of vulnerability paths to identified PSPs.

Paperid: 3624, https://arxiv.org/pdf/2507.17691.pdf

Abstract:
Software obfuscation, particularly prevalent in JavaScript, hinders code comprehension and analysis, posing significant challenges to software testing, static analysis, and malware detection. This paper introduces CASCADE, a novel hybrid approach that integrates the advanced coding capabilities of Gemini with the deterministic transformation capabilities of a compiler Intermediate Representation (IR), specifically JavaScript IR (JSIR). By employing Gemini to identify critical prelude functions, the foundational components underlying the most prevalent obfuscation techniques, and leveraging JSIR for subsequent code transformations, CASCADE effectively recovers semantic elements like original strings and API names, and reveals original program behaviors. This method overcomes limitations of existing static and dynamic deobfuscation techniques, eliminating hundreds to thousands of hardcoded rules while achieving reliability and flexibility. CASCADE is already deployed in Google's production environment, demonstrating substantial improvements in JavaScript deobfuscation efficiency and reducing reverse engineering efforts.

Paperid: 3625, https://arxiv.org/pdf/2507.17655.pdf

Abstract:
As cloud infrastructure becomes the backbone of modern organizations, the security of cryptographic key management, especially using Hardware Security Modules (HSMs) and Trusted Platform Modules (TPMs), faces unprecedented challenges. While these hardware-based solutions offer strong protection in isolated environments, their effectiveness is being undermined by cloud-native threats such as misconfigurations, compromised APIs, and lateral privilege escalations. This paper presents a comprehensive analysis of publicly disclosed attacks and breaches involving HSMs and TPMs in cloud environments, identifying recurring architectural and operational flaws. We propose a taxonomy of attack vectors based on real-world case studies and threat intelligence reports, highlighting the gaps between hardware trust anchors and dynamic cloud ecosystems. Furthermore, we evaluate emerging defensive paradigms: confidential computing, post-quantum cryptography, and decentralized key management systems (dKMS), assessing their potential to address these gaps. Our findings emphasize that securing cloud-based cryptographic trust requires a layered, context-aware approach that integrates both hardware and software safeguards. The study serves as a practical framework for cloud architects and security engineers to reassess key protection strategies in light of evolving threats. To our knowledge, this is the first work to synthesize documented, real-world cloud HSM and TPM failures into a coherent taxonomy grounded in modern threat models.

Paperid: 3626, https://arxiv.org/pdf/2507.16952.pdf

Abstract:
This study investigates the effectiveness of several machine learning algorithms for static malware detection using the EMBER dataset, which contains feature representations of Portable Executable (PE) files. We evaluate eight classification models: LightGBM, XGBoost, CatBoost, Random Forest, Extra Trees, HistGradientBoosting, k-Nearest Neighbors (KNN), and TabNet, under three preprocessing settings: original feature space, Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA). The models are assessed on accuracy, precision, recall, F1 score, and AUC to examine both predictive performance and robustness. Ensemble methods, especially LightGBM and XGBoost, show the best overall performance across all configurations, with minimal sensitivity to PCA and consistent generalization. LDA improves KNN performance but significantly reduces accuracy for boosting models. TabNet, while promising in theory, underperformed under feature reduction, likely due to architectural sensitivity to input structure. The analysis is supported by detailed exploratory data analysis (EDA), including mutual information ranking, PCA or t-SNE visualizations, and outlier detection using Isolation Forest and Local Outlier Factor (LOF), which confirm the discriminatory capacity of key features in the EMBER dataset. The results suggest that boosting models remain the most reliable choice for high-dimensional static malware detection, and that dimensionality reduction should be applied selectively based on model type. This work provides a benchmark for comparing classification models and preprocessing strategies in malware detection tasks and contributes insights that can guide future system development and real-world deployment.

Paperid: 3627, https://arxiv.org/pdf/2507.16852.pdf

Abstract:
Cyber Threat Intelligence (CTI) mining involves extracting structured insights from unstructured threat data, enabling organizations to understand and respond to evolving adversarial behavior. A key task in CTI mining is mapping threat descriptions to MITRE ATT\&CK techniques. However, this process is often performed manually, requiring expert knowledge and substantial effort. Automated approaches face two major challenges: the scarcity of high-quality labeled CTI data and class imbalance, where many techniques have very few examples. While domain-specific Large Language Models (LLMs) such as SecureBERT have shown improved performance, most recent work focuses on model architecture rather than addressing the data limitations. In this work, we present SynthCTI, a data augmentation framework designed to generate high-quality synthetic CTI sentences for underrepresented MITRE ATT\&CK techniques. Our method uses a clustering-based strategy to extract semantic context from training data and guide an LLM in producing synthetic CTI sentences that are lexically diverse and semantically faithful. We evaluate SynthCTI on two publicly available CTI datasets, CTI-to-MITRE and TRAM, using LLMs with different capacity. Incorporating synthetic data leads to consistent macro-F1 improvements: for example, ALBERT improves from 0.35 to 0.52 (a relative gain of 48.6\%), and SecureBERT reaches 0.6558 (up from 0.4412). Notably, smaller models augmented with SynthCTI outperform larger models trained without augmentation, demonstrating the value of data generation methods for building efficient and effective CTI classification systems.

Paperid: 3628, https://arxiv.org/pdf/2507.16540.pdf

Abstract:
Detecting security vulnerabilities in source code remains challenging, particularly due to class imbalance in real-world datasets where vulnerable functions are under-represented. Existing learning-based methods often optimise for recall, leading to high false positive rates and reduced usability in development workflows. Furthermore, many approaches lack explainability, limiting their integration into security workflows. This paper presents ExplainVulD, a graph-based framework for vulnerability detection in C/C++ code. The method constructs Code Property Graphs and represents nodes using dual-channel embeddings that capture both semantic and structural information. These are processed by an edge-aware attention mechanism that incorporates edge-type embeddings to distinguish among program relations. To address class imbalance, the model is trained using class-weighted cross-entropy loss. ExplainVulD achieves a mean accuracy of 88.25 percent and an F1 score of 48.23 percent across 30 independent runs on the ReVeal dataset. These results represent relative improvements of 4.6 percent in accuracy and 16.9 percent in F1 score compared to the ReVeal model, a prior learning-based method. The framework also outperforms static analysis tools, with relative gains of 14.0 to 14.1 percent in accuracy and 132.2 to 201.2 percent in F1 score. Beyond improved detection performance, ExplainVulD produces explainable outputs by identifying the most influential code regions within each function, supporting transparency and trust in security triage.

Paperid: 3629, https://arxiv.org/pdf/2507.16276.pdf

Abstract:
In an increasingly complex contractual landscape, the demand for transparency, security, and efficiency has intensified. Blockchain technology, with its decentralized and immutable nature, addresses these challenges by reducing intermediary costs, minimizing fraud risks, and enhancing system compatibility. Smart contracts, initially conceptualized by Nick Szabo and later implemented on the Ethereum blockchain, automate and secure contractual clauses, offering a robust solution for various industries. However, their complexity and the requirement for advanced programming skills present significant barriers to widespread adoption. This study introduces a multi-level finite state machine model designed to represent and track the execution of smart contracts. Our model aims to simplify smart contract development by providing a formalized framework that abstracts underlying technical complexities, making it accessible to professionals without deep technical expertise. The hierarchical structure of the multi-level finite state machine enhances contract modularity and traceability, facilitating detailed representation and evaluation of functional properties. The paper explores the potential of this multi-level approach, reviewing existing methodologies and tools, and detailing the smart contract generation process with an emphasis on reusable components and modularity. We also conduct a security analysis to evaluate potential vulnerabilities in our model, ensuring the robustness and reliability of the generated smart contracts.

Paperid: 3630, https://arxiv.org/pdf/2507.16045.pdf

Abstract:
Online content creators spend significant time and effort building their user base through a long, often arduous process, which requires finding the right ``niche'' to cater to. So, what incentive is there for an established content creator known for cat memes to completely reinvent their page channel and start promoting cryptocurrency services or cover electoral news events? And, if they do, do their existing subscribers not notice? We explore this problem of \textit{repurposed channels}, whereby a channel changes its identity and contents. We first characterize a market for ``second-hand'' social media accounts, which recorded sales exceeding USD~1M during our 6-month observation period. By observing YouTube channels (re)sold over these 6~months, we find that a substantial number (37\%) are used to disseminate potentially harmful content, often without facing any penalty. Even more surprisingly, these channels seem to gain rather than lose subscribers. To estimate the prevalence of channel repurposing ``in the wild,'' we also collect two snapshots of 1.4M quasi-randomly sampled YouTube accounts. In a 3-month period, we estimate that $\sim$0.25\% channels -- collectively holding $\sim$44M subscribers -- were repurposed. We confirm that these repurposed channels share several characteristics with sold channels -- mainly, the fact that they had a significantly high presence of potentially problematic content. Across repurposed channels, we find channels that became disinformation channels, as well as channels that link to web pages with financial scams. We reason that abusing the residual trust placed on these channels is advantageous to financially- and ideologically-motivated adversaries. This phenomenon is not exclusive to YouTube and we posit that the market for cultivating organic audiences is set to grow, particularly if it remains unchallenged by mitigations, technical or otherwise.

Paperid: 3631, https://arxiv.org/pdf/2507.16040.pdf

Abstract:
An oblivious pseudorandom function (OPRF) is a protocol by which a client and server interact to evaluate a pseudorandom function on a key provided by the server and an input provided by the client, without divulging the key or input to the other party. We extend this notion by enabling the server to specify a blocklist, such that OPRF evaluation succeeds only if the client's input is not on the blocklist. More specifically, our design gains performance by embedding the client input into a metric space, where evaluation continues only if this embedding does not cluster with blocklist elements. Our framework exploits this structure to separate the embedding and blocklist check to enable efficient implementations of each, but then must stitch these phases together through cryptographic means. Our framework also supports subsequent evaluation of the OPRF on the same input more efficiently. We demonstrate the use of our design for password blocklisting in augmented password-authenticated key exchange, and to MAC only executables that are not similar to ones on a blocklist of known malware.

Paperid: 3632, https://arxiv.org/pdf/2507.15859.pdf

Abstract:
AI Innovations in the IoT for Real-Time Patient Monitoring On one hand, the current traditional centralized healthcare architecture poses numerous issues, including data privacy, delay, and security. Here, we present an AI-enabled decentralized IoT architecture that can address such challenges during a pandemic and critical care settings. This work presents our architecture to enhance the effectiveness of the current available federated learning, blockchain, and edge computing approach, maximizing data privacy, minimizing latency, and improving other general system metrics. Experimental results demonstrate transaction latency, energy consumption, and data throughput orders of magnitude lower than competitive cloud solutions.

Paperid: 3633, https://arxiv.org/pdf/2507.15660.pdf

Abstract:
Mega events such as the Olympics, World Cup tournaments, G-20 Summit, religious events such as MahaKumbh are increasingly digitalized. From event ticketing, vendor booth or lodging reservations, sanitation, event scheduling, customer service, crime reporting, media streaming and messaging on digital display boards, surveillance, crowd control, traffic control and many other services are based on mobile and web applications, wired and wireless networking, network of Closed-Circuit Television (CCTV) cameras, specialized control room with network and video-feed monitoring. Consequently, cyber threats directed at such digital infrastructure are common. Starting from hobby hackers, hacktivists, cyber crime gangs, to the nation state actors, all target such infrastructure to unleash chaos on an otherwise smooth operation, and often the cyber threat actors attempt to embarrass the organizing country or the organizers. Unlike long-standing organizations such as a corporate or a government department, the infrastructure of mega-events is temporary, constructed over a short time span in expediency, and often shortcuts are taken to make the deadline for the event. As a result, securing such an elaborate yet temporary infrastructure requires a different approach than securing a standard organizational digital infrastructure. In this paper, we describe our approach to securing MahaKumbh 2025, a 600 million footfall event for 45 days in Prayagraj, India, as a cyber security assessment and risk management oversight team. We chronicle the scope, process, methodology, and outcome of our team's effort to secure this mega event. It should be noted that none of the cyber attacks during the 45-day event was successful. Our goal is to put on record the methodology and discuss what we would do differently in case we work on similar future mega event.

Paperid: 3634, https://arxiv.org/pdf/2507.15613.pdf

Abstract:
Large Language Models (LLMs) deployed in enterprise settings (e.g., as Microsoft 365 Copilot) face novel security challenges. One critical threat is prompt inference attacks: adversaries chain together seemingly benign prompts to gradually extract confidential data. In this paper, we present a comprehensive study of multi-stage prompt inference attacks in an enterprise LLM context. We simulate realistic attack scenarios where an attacker uses mild-mannered queries and indirect prompt injections to exploit an LLM integrated with private corporate data. We develop a formal threat model for these multi-turn inference attacks and analyze them using probability theory, optimization frameworks, and information-theoretic leakage bounds. The attacks are shown to reliably exfiltrate sensitive information from the LLM's context (e.g., internal SharePoint documents or emails), even when standard safety measures are in place. We propose and evaluate defenses to counter such attacks, including statistical anomaly detection, fine-grained access control, prompt sanitization techniques, and architectural modifications to LLM deployment. Each defense is supported by mathematical analysis or experimental simulation. For example, we derive bounds on information leakage under differential privacy-based training and demonstrate an anomaly detection method that flags multi-turn attacks with high AUC. We also introduce an approach called "spotlighting" that uses input transformations to isolate untrusted prompt content, reducing attack success by an order of magnitude. Finally, we provide a formal proof of concept and empirical validation for a combined defense-in-depth strategy. Our work highlights that securing LLMs in enterprise settings requires moving beyond single-turn prompt filtering toward a holistic, multi-stage perspective on both attacks and defenses.

Paperid: 3635, https://arxiv.org/pdf/2507.15058.pdf

Abstract:
A fundamental problem in cybersecurity and computer science is determining whether a program is free of bugs and vulnerabilities. Fuzzing, a popular approach to discovering vulnerabilities in programs, has several advantages over alternative strategies, although it has investment costs in the form of initial setup and continuous maintenance. The choice of fuzzing is further complicated when only a binary library is available, such as the case of closed-source and proprietary software. In response, we introduce LibLMFuzz, a framework that reduces costs associated with fuzzing closed-source libraries by pairing an agentic Large Language Model (LLM) with a lightweight tool-chain (disassembler/compiler/fuzzer) to autonomously analyze stripped binaries, plan fuzz strategies, generate drivers, and iteratively self-repair build or runtime errors. Tested on four widely-used Linux libraries, LibLMFuzz produced syntactically correct drivers for all 558 fuzz-able API functions, achieving 100% API coverage with no human intervention. Across the 1601 synthesized drivers, 75.52% were nominally correct on first execution. The results show that LLM-augmented middleware holds promise in reducing the costs of fuzzing black box components and provides a foundation for future research efforts. Future opportunities exist for research in branch coverage.

Paperid: 3636, https://arxiv.org/pdf/2507.14588.pdf

Abstract:
Secure federated learning enables collaborative model training across decentralized users while preserving data privacy. A key component is secure aggregation, which keeps individual updates hidden from both the server and users, while also defending against Byzantine users who corrupt the aggregation. To this end, Jinhyun So et al. recently developed a Byzantine-resilient secure aggregation scheme using a secret-sharing strategy over finite-field arithmetic. However, such an approach can suffer from numerical errors and overflows when applied to real-valued model updates, motivating the need for secure aggregation methods that operate directly over the real domain. We propose FORTA, a Byzantine-resilient secure aggregation framework that operates entirely in the real domain. FORTA leverages Discrete Fourier Transform (DFT) codes for privacy and employs Krum-based outlier detection for robustness. While DFT decoder is error-free under infinite precision, finite precision introduces numerical perturbations that can distort distance estimates and allow malicious updates to evade detection. To address this, FORTA refines Krum using feedback from DFT decoder, improving the selection of trustworthy updates. Theoretical analysis and experiments show that our modification of Krum offers improved robustness and more accurate aggregation than standard Krum.

Paperid: 3637, https://arxiv.org/pdf/2507.14229.pdf

Abstract:
We investigate the cryptanalysis of affine ciphers using a hybrid neural network architecture that combines modular arithmetic-aware and statistical feature-based learning. Inspired by recent advances in interpretable neural networks for modular arithmetic and neural cryptanalysis of classical ciphers, our approach integrates a modular branch that processes raw ciphertext sequences and a statistical branch that leverages letter frequency features. Experiments on datasets derived from natural English text demonstrate that the hybrid model attains high key recovery accuracy for short and moderate ciphertexts, outperforming purely statistical approaches for the affine cipher. However, performance degrades for very long ciphertexts, highlighting challenges in model generalization.

Paperid: 3638, https://arxiv.org/pdf/2507.14213.pdf

Abstract:
The Big Data revolution has heightened the demand for robust, energy-efficient security hardware capable of withstanding increasingly sophisticated cyber threats. Conventional encryption schemes, reliant on complex algorithms, are resource-intensive and remain vulnerable. To fortify sensitive information, society needs innovative anti-hacking and anti-counterfeiting technologies that exploit new materials and designs. Here, we present a magneto-ionic strategy for hardware-level security based on fully selective voltage-controlled N3- ion migration within pre-defined, initially paramagnetic FeCoN dots. This process generates ferromagnetic sublayers of tuneable thickness, resulting in either deterministic (single-domain or vortex) or probabilistic states (with coexisting magnetic configurations and voltage-adjustable probabilities), each exhibiting stochastic orientation and chirality, thereby providing a rich platform for magnetic fingerprinting. This approach enables self-protected primitives, including true random number generators, physical unclonable functions, and in-memory probabilistic inference. The resulting reconfigurable architecture combines tamper resistance, low energy consumption, and scalability, marking a significant leap toward next-generation hardware security rooted in emergent magnetic phenomena.

Paperid: 3639, https://arxiv.org/pdf/2507.14207.pdf

Abstract:
The integration of Large Language Models (LLMs) in K--12 education offers both transformative opportunities and emerging risks. This study explores how students may Trojanize prompts to elicit unsafe or unintended outputs from LLMs, bypassing established content moderation systems with safety guardrils. Through a systematic experiment involving simulated K--12 queries and multi-turn dialogues, we expose key vulnerabilities in GPT-3.5 and GPT-4. This paper presents our experimental design, detailed findings, and a prototype tool, TrojanPromptGuard (TPG), to automatically detect and mitigate Trojanized educational prompts. These insights aim to inform both AI safety researchers and educational technologists on the safe deployment of LLMs for educators.

Paperid: 3640, https://arxiv.org/pdf/2507.14197.pdf

Abstract:
We introduce DM-RSA (Dual Modulus RSA), a variant of the RSA cryptosystem that employs two distinct moduli symmetrically to enhance security. By leveraging the Chinese Remainder Theorem (CRT) for decryption, DM-RSA provides increased robustness against side-channel attacks while preserving the efficiency of classical RSA. This approach improves resistance to partial compromise of a modulus and integrates easily into existing infrastructures.

Paperid: 3641, https://arxiv.org/pdf/2507.13932.pdf

Abstract:
The rise of blockchain and Digital Ledger Technology (DLT) has gained wide traction. Instead of relying on a traditional centralized data authority, a blockchain system consists of digitally entangled block data shared across a distributed network. The specially designed chain data structure and its consensus mechanism protect blockchain data from being tampered by unauthorized adversaries. However, implementing a full-fledged blockchain system to protect a database can be technically cumbersome. In this work, we introduce an in-database design, named chain table, to protect data integrity without the need for a blockchain system. It features a succinct design without significant technology barriers or storage overhead. To realize rigorous data security, we also propose a set of data writing principles for the chain table. We prove that the chain table, together with the data writing principles, will guarantee flexible data integrity, named table-level data integrity (TDI).

Paperid: 3642, https://arxiv.org/pdf/2507.13629.pdf

Abstract:
Large Language Models (LLMs) are transforming cybersecurity by enabling intelligent, adaptive, and automated approaches to threat detection, vulnerability assessment, and incident response. With their advanced language understanding and contextual reasoning, LLMs surpass traditional methods in tackling challenges across domains such as IoT, blockchain, and hardware security. This survey provides a comprehensive overview of LLM applications in cybersecurity, focusing on two core areas: (1) the integration of LLMs into key cybersecurity domains, and (2) the vulnerabilities of LLMs themselves, along with mitigation strategies. By synthesizing recent advancements and identifying key limitations, this work offers practical insights and strategic recommendations for leveraging LLMs to build secure, scalable, and future-ready cyber defense systems.

Paperid: 3643, https://arxiv.org/pdf/2507.13038.pdf

Abstract:
Multi-agent debate (MAD) systems leverage collaborative interactions among large language models (LLMs) agents to improve reasoning capabilities. While recent studies have focused on increasing the accuracy and scalability of MAD systems, their security vulnerabilities have received limited attention. In this work, we introduce MAD-Spear, a targeted prompt injection attack that compromises a small subset of agents but significantly disrupts the overall MAD process. Manipulated agents produce multiple plausible yet incorrect responses, exploiting LLMs' conformity tendencies to propagate misinformation and degrade consensus quality. Furthermore, the attack can be composed with other strategies, such as communication attacks, to further amplify its impact by increasing the exposure of agents to incorrect responses. To assess MAD's resilience under attack, we propose a formal definition of MAD fault-tolerance and develop a comprehensive evaluation framework that jointly considers accuracy, consensus efficiency, and scalability. Extensive experiments on five benchmark datasets with varying difficulty levels demonstrate that MAD-Spear consistently outperforms the baseline attack in degrading system performance. Additionally, we observe that agent diversity substantially improves MAD performance in mathematical reasoning tasks, which challenges prior work suggesting that agent diversity has minimal impact on performance. These findings highlight the urgent need to improve the security in MAD design.

Paperid: 3644, https://arxiv.org/pdf/2507.12937.pdf

Abstract:
This paper presents a comprehensive analysis of T-Mobile's critical data breaches in 2021 and 2023, alongside a full-spectrum security audit targeting its systems, infrastructure, and publicly exposed endpoints. By combining case-based vulnerability assessments with active ethical hacking techniques--including Shodan reconnaissance, API misuse simulations, VNC brute-forcing, firmware reverse engineering, and web application scans--we uncover structural weaknesses persisting beyond the initial breach events. Building on these findings, we propose a multi-layered defensive strategy encompassing Zero Trust Architecture, granular role-based access control, network segmentation, firmware encryption using AES with integrity checks, and API rate limiting and token lifecycle control. Financial modelling demonstrates that a five-year investment yields less than 1.1% of expected breach losses, validating the cost-effectiveness of proactive security measures. Our work bridges post-incident forensic analysis with hands-on security evaluation, providing an actionable blueprint for large-scale telecoms seeking operational resilience, regulatory compliance, and cross-domain threat readiness.

Paperid: 3645, https://arxiv.org/pdf/2507.12919.pdf

Abstract:
Architectural backdoors pose an under-examined but critical threat to deep neural networks, embedding malicious logic directly into a model's computational graph. Unlike traditional data poisoning or parameter manipulation, architectural backdoors evade standard mitigation techniques and persist even after clean retraining. This survey systematically consolidates research on architectural backdoors, spanning compiler-level manipulations, tainted AutoML pipelines, and supply-chain vulnerabilities. We assess emerging detection and defense strategies, including static graph inspection, dynamic fuzzing, and partial formal verification, and highlight their limitations against distributed or stealth triggers. Despite recent progress, scalable and practical defenses remain elusive. We conclude by outlining open challenges and proposing directions for strengthening supply-chain security, cryptographic model attestations, and next-generation benchmarks. This survey aims to guide future research toward comprehensive defenses against structural backdoor threats in deep learning systems.

Paperid: 3646, https://arxiv.org/pdf/2507.12872.pdf

Abstract:
Frontier AI systems are rapidly advancing in their capabilities to persuade, deceive, and influence human behaviour, with current models already demonstrating human-level persuasion and strategic deception in specific contexts. Humans are often the weakest link in cybersecurity systems, and a misaligned AI system deployed internally within a frontier company may seek to undermine human oversight by manipulating employees. Despite this growing threat, manipulation attacks have received little attention, and no systematic framework exists for assessing and mitigating these risks. To address this, we provide a detailed explanation of why manipulation attacks are a significant threat and could lead to catastrophic outcomes. Additionally, we present a safety case framework for manipulation risk, structured around three core lines of argument: inability, control, and trustworthiness. For each argument, we specify evidence requirements, evaluation methodologies, and implementation considerations for direct application by AI companies. This paper provides the first systematic methodology for integrating manipulation risk into AI safety governance, offering AI companies a concrete foundation to assess and mitigate these threats before deployment.

Paperid: 3647, https://arxiv.org/pdf/2507.12652.pdf

Abstract:
Invasive and non-invasive neural interfaces hold promise as high-bandwidth input devices for next-generation technologies. However, neural signals inherently encode sensitive information about an individual's identity and health, making data sharing for decoder training a critical privacy challenge. Federated learning (FL), a distributed, privacy-preserving learning framework, presents a promising solution, but it remains unexplored in closed-loop adaptive neural interfaces. Here, we introduce FL-based neural decoding and systematically evaluate its performance and privacy using high-dimensional electromyography signals in both open- and closed-loop scenarios. In open-loop simulations, FL significantly outperformed local learning baselines, demonstrating its potential for high-performance, privacy-conscious neural decoding. In contrast, closed-loop user studies required adapting FL methods to accommodate single-user, real-time interactions, a scenario not supported by standard FL. This modification resulted in local learning decoders surpassing the adapted FL approach in closed-loop performance, yet local learning still carried higher privacy risks. Our findings highlight a critical performance-privacy tradeoff in real-time adaptive applications and indicate the need for FL methods specifically designed for co-adaptive, single-user applications.

Paperid: 3648, https://arxiv.org/pdf/2507.11772.pdf

Abstract:
Distributed Denial of Service (DDoS) attacks have become increasingly prevalent and dangerous in the context of Internet of Things (IoT) networks, primarily due to the low-security configurations of many connected devices. This paper analyzes the nature and impact of DDoS attacks such as those launched by the Mirai botnet, and proposes layered mitigation strategies tailored to IoT environments. Key solutions explored include IPv6 Unique Local Addresses (ULA), edge computing, software-defined networking (SDN), honeypot deception, and machine learning-based intrusion detection systems. The paper aims to help engineers and researchers understand and implement practical countermeasures to protect IoT infrastructures.

Paperid: 3649, https://arxiv.org/pdf/2507.11499.pdf

Abstract:
Next-Generation Radio Access Networks (NGRAN) aim to support diverse vertical applications with strict security, latency, and Service-Level Agreement (SLA) requirements. These demands introduce challenges in securing the infrastructure, allocating resources dynamically, and enabling real-time reconfiguration. This demo presents SnSRIC, a secure and intelligent network slicing framework that mitigates a range of Distributed Denial-of-Service (DDoS) attacks in Open RAN environments. SnSRIC incorporates an AI-driven xApp that dynamically allocates Physical Resource Blocks (PRBs) to active users while enforcing slice-level security. The system detects anomalous behavior, distinguishes between benign and malicious devices, and uses the E2 interface to throttle rogue signaling while maintaining service continuity for legitimate users.

Paperid: 3650, https://arxiv.org/pdf/2507.11324.pdf

Abstract:
Privacy Preserving Synthetic Data Generation (PP-SDG) has emerged to produce synthetic datasets from personal data while maintaining privacy and utility. Differential privacy (DP) is the property of a PP-SDG mechanism that establishes how protected individuals are when sharing their sensitive data. It is however difficult to interpret the privacy budget ($\varepsilon$) expressed by DP. To make the actual risk associated with the privacy budget more transparent, multiple privacy metrics (PMs) have been proposed to assess the privacy risk of the data. These PMs are utilized in separate studies to assess newly introduced PP-SDG mechanisms. Consequently, these PMs embody the same assumptions as the PP-SDG mechanism they were made to assess. Therefore, a thorough definition of how these are calculated is necessary. In this work, we present the assumptions and mathematical formulations of 17 distinct privacy metrics.

Paperid: 3651, https://arxiv.org/pdf/2507.10819.pdf

Abstract:
The main objective of this technical report is to conduct a comprehensive study on devices operating within Industrial Internet of Things (IIoT) environments, describing the scenarios that define this category and analysing the vulnerabilities that compromise their security. To this end, the report seeks to identify and examine the main classes of IIoT devices, detailing their characteristics, functionalities, and roles within industrial systems. This analysis enables a better understanding of how these devices interact and fulfil the requirements of critical industrial environments. The report also explores the specific contexts in which these devices operate, highlighting the distinctive features of industrial scenarios and the conditions under which the devices function. Furthermore, it analyses the vulnerabilities affecting IIoT devices, outlining their vectors, targets, impact, and consequences. The report then describes the typical phases of an attack, along with a selection of real-world documented incidents. These cases are classified according to the taxonomy presented in Section 3, providing a comprehensive view of the potential threats to security and assessing the impact these vulnerabilities may have on industrial environments. Finally, the report presents a compilation of some of the most recent and effective security countermeasures as potential solutions to the security challenges faced by industrial systems. Special emphasis is placed on the role of Machine Learning in the development of these approaches, underscoring its importance in enhancing industrial cybersecurity.

Paperid: 3652, https://arxiv.org/pdf/2507.10730.pdf

Abstract:
This paper presents a novel key-based access control technique for secure outsourcing key-value stores where values correspond to documents that are indexed and accessed using keys. The proposed approach adopts Shamir's secret-sharing that offers unconditional or information-theoretic security. It supports keyword-based document retrieval while preventing leakage of the data, access rights of users, or the size (\textit{i}.\textit{e}., volume of the output that satisfies a query). The proposed approach allows servers to detect (and abort) malicious clients from gaining unauthorized access to data, and prevents malicious servers from altering data undetected while ensuring efficient access -- it takes 231.5ms over 5,000 keywords across 500,000 files.

Paperid: 3653, https://arxiv.org/pdf/2507.10627.pdf

Abstract:
Given a graph $G$ defined in a domain $\mathcal{G}$, we investigate locally differentially private mechanisms to release a degree sequence on $\mathcal{G}$ that accurately approximates the actual degree distribution. Existing solutions for this problem mostly use graph projection techniques based on edge deletion process, using a threshold parameter $Î¸$ to bound node degrees. However, this approach presents a fundamental trade-off in threshold parameter selection. While large $Î¸$ values introduce substantial noise in the released degree sequence, small $Î¸$ values result in more edges removed than necessary. Furthermore, $Î¸$ selection leads to an excessive communication cost. To remedy existing solutions' deficiencies, we present CADR-LDP, an efficient framework incorporating encryption techniques and differentially private mechanisms to release the degree sequence. In CADR-LDP, we first use the crypto-assisted Optimal-$Î¸$-Selection method to select the optimal parameter with a low communication cost. Then, we use the LPEA-LOW method to add some edges for each node with the edge addition process in local projection. LPEA-LOW prioritizes the projection with low-degree nodes, which can retain more edges for such nodes and reduce the projection error. Theoretical analysis shows that CADR-LDP satisfies $Îµ$-node local differential privacy. The experimental results on eight graph datasets show that our solution outperforms existing methods.

Paperid: 3654, https://arxiv.org/pdf/2507.10233.pdf

Abstract:
Quantum digital signatures ensure unforgeable message authenticity and integrity using quantum principles, offering unconditional security against both classical and quantum attacks. They are crucial for secure communication in high-stakes environments, ensuring trust and long-term protection in the quantum era. Nowadays, the majority of arbitrated quantum signature (AQS) protocols encrypt data qubit by qubit using the quantum one-time pad (QOTP). Despite providing robust data encryption, QOTP is not a good fit for AQS because of its susceptibility to many types of attacks. In this work, we present an efficient AQS protocol to encrypt quantum message ensembles using a distinct encryption technique, the chained controlled unitary operations. In contrast to existing protocols, our approach successfully prevents disavowal and forgery attacks. We hope this contributes to advancing future investigations into the development of AQS protocols.

Paperid: 3655, https://arxiv.org/pdf/2507.09859.pdf

Abstract:
Self-Sovereign Identity (SSI) offers significant potential for managing identities in the Internet of Things (IoT), enabling decentralized authentication and credential management without reliance on centralized entities. However, existing SSI frameworks often limit credential issuance and revocation to trusted entities, such as IoT manufacturers, which restricts flexibility in dynamic IoT ecosystems. In this paper, we propose a blockchain-based SSI framework that allows any individual with a verifiable trust linkage to act as a credential issuer, ensuring decentralized and scalable identity management. Our framework incorporates a layered architecture, where trust is dynamically established through endorsement-based calculations and maintained via a hierarchical chain-of-trust mechanism. Blockchain serves as the Verifiable Data Registry, ensuring transparency and immutability of identity operations, while smart contracts automate critical processes such as credential issuance, verification, and revocation. A proof-of-concept implementation demonstrates that the proposed framework is feasible and incurs minimal overheads compared to the baseline, making it well-suited for dynamic and resource-constrained IoT environments.

Paperid: 3656, https://arxiv.org/pdf/2507.09453.pdf

Abstract:
The digitization of democratic processes promises greater accessibility but presents challenges in terms of security, privacy, and verifiability. Existing electronic voting systems often rely on centralized architectures, creating single points of failure and forcing too much trust in authorities, which contradicts democratic principles. This research addresses the challenge of creating a secure, private e-voting system with minimized trust dependencies designed for the most versatile personal device: the smartphone. We introduce SmartphoneDemocracy, a novel e-voting protocol that combines three key technologies: the emerging European Digital Identity (EUDI) Wallet for Sybil-resistant identity verification, Zero-Knowledge Proofs for privacy-preserving validation, and a peer-to-peer blockchain (TrustChain) for a resilient, serverless public bulletin board. Our protocol enables voters to register and cast ballots anonymously and verifiably directly from their smartphones. We provide a detailed protocol design, a security analysis against a defined threat model, and a performance evaluation demonstrating that the computational and network overhead is feasible for medium- to large-scale elections. By developing and prototyping this system, we demonstrate a viable path to empower citizens with a trustworthy, accessible, and user-controlled digital voting experience.

Paperid: 3657, https://arxiv.org/pdf/2507.09329.pdf

Abstract:
LLM-based coding agents are rapidly being deployed in software development, yet their safety implications remain poorly understood. These agents, while capable of accelerating software development, may exhibit unsafe behaviors during normal operation that manifest as cybersecurity vulnerabilities. We conducted the first systematic safety evaluation of autonomous coding agents, analyzing over 12,000 actions across five state-of-the-art models (GPT-4o, GPT-4.1, Claude variants) on 93 real-world software setup tasks. Our findings reveal significant security concerns: 21% of agent trajectories contained insecure actions, with models showing substantial variation in unsafe behavior. We developed a high-precision detection system that identified four major vulnerability categories, with information exposure (CWE-200) being the most prevalent one. We also evaluated mitigation strategies including feedback mechanisms and security reminders with various effectiveness between models. GPT-4.1 demonstrated exceptional security awareness with 96.8% mitigation success.

Paperid: 3658, https://arxiv.org/pdf/2507.09231.pdf

Abstract:
Transparency is one of the key benefits of public blockchains. However, the public visibility of transactions potentially compromises users' privacy. The fundamental challenge is to balance the intrinsic benefits of blockchain openness with the vital need for individual confidentiality. The proposal suggests creating a confidential version of wrapped Ethereum (cWETH) fully within the application layer. The solution combines the Elliptic Curve (EC) Twisted ElGamal-based commitment scheme to preserve confidentiality and the EC Diffie-Hellman (DH) protocol to introduce accessibility limited by the commitment scheme. To enforce the correct generation of commitments, encryption, and decryption, zk-SNARKs are utilized.

Paperid: 3659, https://arxiv.org/pdf/2507.09190.pdf

Abstract:
Protecting personal computers (PCs) from unauthorized access typically relies on password authentication, which is know to suffer from cognitive burden and weak credentials. As many users nowadays carry mobile devices with advanced security features throughout their day, there is an opportunity to leverage these devices to improve authentication to PCs. In this paper we utilize a token-based passwordless approach where users authenticate to their PC by confirming the authentication request on their smartphones or smartwatches. Upon a request to login to the PC, or to evaluate privileges, the PC issues an authentication request that users receive on their mobile devices, where users can confirm or deny the request. We evaluate button tap and biometric fingerprint verification as confirmation variants, and compare their authentication duration, success rate, and usability to traditional password-based authentication in a user study with 30 participants and a total of 1,200 authentication attempts. Smartwatch-based authentication outperformed password-based authentication and smartphone-based variants in authentication duration, while showing comparable success rates. Participants rated smartwatch-based authentication highest in usability, followed by password-based authentication and smartphone-based authentication.

Paperid: 3660, https://arxiv.org/pdf/2507.09022.pdf

Abstract:
We propose a method for using Web Authentication APIs for SSH authentication, enabling passwordless remote server login with passkeys. These are credentials that are managed throughout the key lifecycle by an authenticator on behalf of the user and offer strong security guarantees. Passwords remain the dominant mode of SSH authentication, despite their well known flaws such as phishing and reuse. SSH's custom key-based authentication protocol can alleviate these issues but remains vulnerable to key theft. Additionally, it has poor usability, with even knowledgeable users leaking key material and failing to verify fingerprints. Hence, effective key management remains a critical open area in SSH security. In contrast, WebAuthn is a modern authentication standard designed to replace passwords, managing keys on behalf of the user. As a web API, this standard cannot integrate with SSH directly. We propose a framework to integrate WebAuthn with SSH servers, by using UNIX pluggable authentication modules (PAM). Our approach is backwards-compatible, supports stock SSH servers and requires no new software client-side. It offers protection for cryptographic material at rest, resistance to key leaks, phishing protection, privacy protection and attestation capability. None of these properties are offered by passwords nor traditional SSH keys. We validate these advantages with a structured, conceptual security analysis. We develop a prototype implementation and conduct a user study to quantify the security advantages of our proposal, testing our prototype with 40 SSH users. The study confirms the security problems of SSH-keys, including 20% of the cohort leaking their private keys. Our SSH-passkeys effectively address these problems: we find a 90% reduction in critical security errors, while reducing authentication time by 4x on average.

Paperid: 3661, https://arxiv.org/pdf/2507.08978.pdf

Abstract:
Increasingly, students begin learning aspects of security and privacy during their primary and secondary education (grades K-12 in the United States). Individual U.S. states and some national organizations publish teaching standards -- guidance that outlines expectations for what students should learn -- which often form the basis for course curricula. However, research has not yet examined what is covered by these standards and whether the topics align with what the broader security and privacy community thinks students should know. To shed light on these questions, we started by collecting computer science teaching standards from all U.S. states and eight national organizations. After manually examining a total of 11,954 standards, we labeled 3,778 of them as being related to security and privacy, further classifying these into 103 topics. Topics ranged from technical subjects like encryption, network security, and embedded systems to social subjects such as laws, ethics, and appropriate online behavior. Subsequently, we interviewed 11 security and privacy professionals to examine how the teaching standards align with their expectations. We found that, while the specific topics they mentioned mostly overlapped with those of existing standards, professionals placed a greater emphasis on threat modeling and security mindset.

Paperid: 3662, https://arxiv.org/pdf/2507.08853.pdf

Abstract:
As archives turn to artificial intelligence to manage growing volumes of digital records, privacy risks inherent in current AI data practices raise critical concerns about data sovereignty and ethical accountability. This paper explores how privacy-enhancing technologies (PETs) and Web3 architectures can support archives to preserve control over sensitive content while still being able to make it available for access by researchers. We present Clio-X, a decentralized, privacy-first Web3 digital solution designed to embed PETs into archival workflows and support AI-enabled reference and access. Drawing on a user evaluation of a medium-fidelity prototype, the study reveals both interest in the potential of the solution and significant barriers to adoption related to trust, system opacity, economic concerns, and governance. Using Rogers' Diffusion of Innovation theory, we analyze the sociotechnical dimensions of these barriers and propose a path forward centered on participatory design and decentralized governance through a Clio-X Decentralized Autonomous Organization. By integrating technical safeguards with community-based oversight, Clio-X offers a novel model to ethically deploy AI in cultural heritage contexts.

Paperid: 3663, https://arxiv.org/pdf/2507.08331.pdf

Abstract:
In recent years, the increasing awareness of cybersecurity has led to a heightened focus on information security within hardware devices and products. Incorporating Trusted Execution Environments (TEEs) into product designs has become a standard practice for safeguarding sensitive user information. However, vulnerabilities within these components present significant risks, if exploited by attackers, these vulnerabilities could lead to the leakage of sensitive data, thereby compromising user privacy and security. This research centers on trusted applications (TAs) within the Qualcomm TEE and introduces a novel emulator specifically designed for these applications. Through reverse engineering techniques, we thoroughly analyze Qualcomm TAs and develop a partial emulation environment that accurately emulates their behavior. Additionally, we integrate fuzzing testing techniques into the emulator to systematically uncover potential vulnerabilities within Qualcomm TAs, demonstrating its practical effectiveness in identifying real-world security flaws. This research makes a significant contribution by being the first to provide both the implementation methods and source codes for a Qualcomm TAs emulator, offering a valuable reference for future research efforts. Unlike previous approaches that relied on complex and resource-intensive full-system simulations, our approach is lightweight and effective, making security testing of TA more convenient.

Paperid: 3664, https://arxiv.org/pdf/2507.08288.pdf

Abstract:
Watermarking technology has gained significant attention due to the increasing importance of intellectual property (IP) rights, particularly with the growing deployment of large language models (LLMs) on billions resource-constrained edge devices. To counter the potential threats of IP theft by malicious users, this paper introduces a robust watermarking scheme without retraining or fine-tuning for transformer models. The scheme generates a unique key for each user and derives a stable watermark value by solving linear constraints constructed from model invariants. Moreover, this technology utilizes noise mechanism to hide watermark locations in multi-user scenarios against collusion attack. This paper evaluates the approach on three popular models (Llama3, Phi3, Gemma), and the experimental results confirm the strong robustness across a range of attack methods (fine-tuning, pruning, quantization, permutation, scaling, reversible matrix and collusion attacks).

Paperid: 3665, https://arxiv.org/pdf/2507.08286.pdf

Abstract:
Open banking framework enables third party providers to access financial data across banking institutions, leading to unprecedented innovations in the financial sector. However, some open banking standards remain susceptible to severe technological risks, including unverified data sources, inconsistent data integrity, and lack of immutability. In this paper, we propose a layered architecture that provides assurance in data trustworthiness with three distinct levels of trust, covering source validation, data-level authentication, and tamper-proof storage. The first layer guarantees the source legitimacy using decentralized identity and verifiable presentation, while the second layer verifies data authenticity and consistency using cryptographic signing. Lastly, the third layer guarantees data immutability through the Tangle, a directed acyclic graph distributed ledger. We implemented a proof-of-concept implementation of our solution to evaluate its performance, where the results demonstrate that the system scales linearly with a stable throughput, exhibits a 100% validation rate, and utilizes under 35% of CPU and 350 MiB memory. Compared to a real-world open banking implementation, our solution offers significantly reduced latency and stronger data integrity assurance. Overall, our solution offers a practical and efficient system for secure data sharing in financial ecosystems while maintaining regulatory compliance.

Paperid: 3666, https://arxiv.org/pdf/2507.08249.pdf

Abstract:
There is growing interest in giving AI agents access to cryptocurrencies as well as to the smart contracts that transact them. But doing so, this position paper argues, could lead to formidable new vectors of AI harm. To support this argument, we first examine the unique properties of cryptocurrencies and smart contracts that could lead to these new vectors of harm. Next, we describe each of these new vectors of harm in detail. Finally, we conclude with a call for more technical research aimed at preventing and mitigating these harms and, thereby making it safer to endow AI agents with cryptocurrencies and smart contracts.

Paperid: 3667, https://arxiv.org/pdf/2507.08190.pdf

Abstract:
Intel(r) Software Guard Extensions (SGX) was originally released on client platforms and later extended to single socket server platforms. As developers have become familiar with the capabilities of the technology, the applicability of this capability in the cloud has been tested. Various Cloud Service Providers (CSPs) are demonstrating the value of using SGX based Trusted Execution Environments (TEE) to create a new paradigm of Confidential Cloud Computing. This paper describes the additional platform enhancements we believe are necessary to deliver a user programmable Trusted Execution Environment that scales to cloud usages, performs and is secure on multi-package platforms.

Paperid: 3668, https://arxiv.org/pdf/2507.07250.pdf

Abstract:
A semi-fragile watermarking scheme for multiple band images is presented in this article. We propose to embed a mark into remote sensing images applying a tree-structured vector quantization approach to the pixel signatures instead of processing each band separately. The signature of the multispectral or hyperspectral image is used to embed the mark in it order to detect any significant modification of the original image. The image is segmented into three-dimensional blocks, and a tree-structured vector quantizer is built for each block. These trees are manipulated using an iterative algorithm until the resulting block satisfies a required criterion, which establishes the embedded mark. The method is shown to be able to preserve the mark under lossy compression (above a given threshold) but, at the same time, it detects possibly forged blocks and their position in the whole image.

Paperid: 3669, https://arxiv.org/pdf/2507.07031.pdf

Abstract:
As AI models become ubiquitous in our daily lives, there has been an increasing demand for transparency in ML services. However, the model owner does not want to reveal the weights, as they are considered trade secrets. To solve this problem, researchers have turned to zero-knowledge proofs of ML model inference. These proofs convince the user that the ML model output is correct, without revealing the weights of the model to the user. Past work on these provers can be placed into two categories. The first method compiles the ML model into a low-level circuit, and proves the circuit using a ZK-SNARK. The second method uses custom cryptographic protocols designed only for a specific class of models. Unfortunately, the first method is highly inefficient, making it impractical for the large models used today, and the second method does not generalize well, making it difficult to update in the rapidly changing field of machine learning. To solve this, we propose ZKTorch, an open source end-to-end proving system that compiles ML models into base cryptographic operations called basic blocks, each proved using specialized protocols. ZKTorch is built on top of a novel parallel extension to the Mira accumulation scheme, enabling succinct proofs with minimal accumulation overhead. These contributions allow ZKTorch to achieve at least a $3\times$ reduction in the proof size compared to specialized protocols and up to a $6\times$ speedup in proving time over a general-purpose ZKML framework.

Paperid: 3670, https://arxiv.org/pdf/2507.06723.pdf

Abstract:
Malware developers exploit the fact that most detection models focus on the entire binary to extract the feature rather than on the regions of potential maliciousness. Therefore, they reverse engineer a benign binary and inject malicious code into it. This obfuscation technique circumvents the malware detection models and deceives the ML classifiers due to the prevalence of benign features compared to malicious features. However, extracting the features from the potential malicious regions enhances the accuracy and decreases false positives. Hence, we propose a novel model named PotentRegion4MalDetect that extracts features from the potential malicious regions. PotentRegion4MalDetect determines the nodes with potential maliciousness in the partially preprocessed Control Flow Graph (CFG) using the malicious strings given by StringSifter. Then, it extracts advanced features of the identified potential malicious regions alongside the features from the completely preprocessed CFG. The features extracted from the completely preprocessed CFG mitigate obfuscation techniques that attempt to disguise malicious content, such as suspicious strings. The experiments reveal that the PotentRegion4MalDetect requires fewer entries to save the features for all binaries than the model focusing on the entire binary, reducing memory overhead, faster computation, and lower storage requirements. These advanced features give an 8.13% increase in SHapley Additive exPlanations (SHAP) Absolute Mean and a 1.44% increase in SHAP Beeswarm value compared to those extracted from the entire binary. The advanced features outperform the features extracted from the entire binary by producing more than 99% accuracy, precision, recall, AUC, F1-score, and 0.064% FPR.

Paperid: 3671, https://arxiv.org/pdf/2507.06706.pdf

Abstract:
The security of the RSA cryptosystem is based on the intractability of computing Euler's totient function phi(n) for large integers n. Although deriving phi(n) deterministically remains computationally infeasible for cryptographically relevant bit lengths, and machine learning presents a promising alternative for constructing efficient approximations. In this work, we explore a machine learning approach to approximate Euler's totient function phi using linear regression models. We consider a dataset of RSA moduli of 64, 128, 256, 512 and 1024 bits along with their corresponding totient values. The regression model is trained to capture the relationship between the modulus and its totient, and tested on unseen samples to evaluate its prediction accuracy. Preliminary results suggest that phi can be approximated within a small relative error margin, which may be sufficient to aid in certain classes of RSA attacks. This research opens a direction for integrating statistical learning techniques into cryptanalysis, providing insights into the feasibility of attacking cryptosystems using approximation based strategies.

Paperid: 3672, https://arxiv.org/pdf/2507.06508.pdf

Abstract:
When analyzing connection patterns within graphs, subgraph counting serves as an effective and fundamental approach. Edge-local differential privacy (edge-LDP) and shuffle model have been employed to achieve subgraph counting under a privacy-preserving situation. Existing algorithms are plagued by high time complexity, excessive download costs, low accuracy, or dependence on trusted third parties. To address the aforementioned challenges, we propose the Noisy Adjacency Matrix (NAM), which combines differential privacy with the adjacency matrix of the graph. NAM offers strong versatility and scalability, making it applicable to a wider range of DP variants, DP mechanisms, and graph types. Based on NAM, we designed five algorithms (TriOR, TriTR, TriMTR, QuaTR, and 2STAR) to count three types of subgraphs: triangles, quadrangles, and 2-stars. Theoretical and experimental results demonstrate that in triangle counting, TriOR maximizes accuracy with reduced time complexity among one-round algorithms, TriTR achieves optimal accuracy, TriMTR achieves the highest accuracy under low download costs, and QuaTR stands as the first quadrangle counting algorithm under pure edge-LDP. We implement edge-LDP for noisy data via a confidence interval-inspired method, providing DP guarantees on randomized data. Our 2STAR algorithm achieves the highest accuracy in 2-star counting and can be derived as a byproduct of two-round triangle or quadrangle counting algorithms, enabling efficient joint estimation of triangle, quadrangle, and 2-star counts within two query rounds.

Paperid: 3673, https://arxiv.org/pdf/2507.06423.pdf

Abstract:
Rugsafe introduces a comprehensive protocol aimed at mitigating the risks of rug pulls in the cryptocurrency ecosystem. By utilizing cryptographic security measures and economic incentives, the protocol provides a secure multichain system for recovering assets and transforming rugged tokens into opportunities and rewards. Foundational to Rugsafe are specialized vaults where rugged tokens can be securely deposited, and anticoin tokens are issued as receipts. These anticoins are designed to be inversely pegged to the price movement of the underlying rugged token. Users can utilize these anticoins within the ecosystem or choose to burn them, further securing the protocol and earning additional rewards. The supply of the native Rugsafe token is dynamically adjusted based on the volume, value, and activity of rugged tokens, ensuring stability and resilience. By depositing rugged tokens into a vault on several chains, and by burning anticoins, users receive incentives on the RugSafe chain. This protocol's vaults are designed to work in heterogenous blockchain ecosystems, offering a practical and effective solution to one of the most significant challenges in the cryptocurrency market.

Paperid: 3674, https://arxiv.org/pdf/2507.06421.pdf

Abstract:
While additive manufacturing has opened interesting avenues to reimagine manufacturing as a service (MaaS) platform, transmission of design files from client to manufacturer over networks opens up many cybersecurity challenges. Securing client's intellectual property (IP) especially from cyber-attacks emerges as a major challenge. Earlier works introduced streaming, instead of sharing process plan (G-code) files, as a possible solution. However, executing client's G-codes on manufacturer's machines exposes them to potential malicious G-codes. This paper proposes a viable approach when the client and manufacturer do not trust each other and both the client and manufacturer want to preserve their IP of designs and manufacturing process respectively. The proposed approach is based on segmenting and streaming design (STL) files and employing a novel machine-specific STL to G-code translator at the manufacturer's site in real-time for printing. This approach secures design and manufacturing process IPs as demonstrated in a real-world implementation.

Paperid: 3675, https://arxiv.org/pdf/2507.06402.pdf

Abstract:
With the proliferation of wireless electrocardiogram (ECG) systems for health monitoring and authentication, protecting signal integrity against tampering is becoming increasingly important. This paper analyzes the performance of CNN, ResNet, and hybrid Transformer-CNN models for tamper detection. It also evaluates the performance of a Siamese network for ECG based identity verification. Six tampering strategies, including structured segment substitutions and random insertions, are emulated to mimic real world attacks. The one-dimensional ECG signals are transformed into a two dimensional representation in the time frequency domain using the continuous wavelet transform (CWT). The models are trained and evaluated using ECG data from 54 subjects recorded in four sessions 2019 to 2025 outside of clinical settings while the subjects performed seven different daily activities. Experimental results show that in highly fragmented manipulation scenarios, CNN, FeatCNN-TranCNN, FeatCNN-Tran and ResNet models achieved an accuracy exceeding 99.5 percent . Similarly, for subtle manipulations (for example, 50 percent from A and 50 percent from B and, 75 percent from A and 25 percent from B substitutions) our FeatCNN-TranCNN model demonstrated consistently reliable performance, achieving an average accuracy of 98 percent . For identity verification, the pure Transformer-Siamese network achieved an average accuracy of 98.30 percent . In contrast, the hybrid CNN-Transformer Siamese model delivered perfect verification performance with 100 percent accuracy.

Paperid: 3676, https://arxiv.org/pdf/2507.06112.pdf

Abstract:
Developers rely on constant-time programming to prevent timing side-channel attacks. But these efforts can be undone by compilers, whose optimizations may silently reintroduce leaks. While recent works have measured the extent of such leakage, they leave developers without actionable insights: which optimization passes are responsible, and how to disable them without modifying the compiler remains unclear. In this paper, we conduct a qualitative analysis of how compiler optimizations break constant-time code. We construct a dataset of compiler-introduced constant-time violations and analyze the internals of two widely used compilers, GCC and LLVM, to identify the specific optimization passes responsible. Our key insight is that a small set of passes are at the root of most leaks. To the best of our knowledge, we are also the first to characterize how the interactions between these passes contribute to leakage. Based on this analysis, we propose an original and practical mitigation that requires no source code modification or custom compiler: disabling selected optimization passes via compiler flags. We show that this approach significantly reduces leakage with minimal performance overhead, offering an immediately deployable defense for developers.

Paperid: 3677, https://arxiv.org/pdf/2507.05794.pdf

Abstract:
For securing systems, it is essential to manage their vulnerability posture and design appropriate security controls. Vulnerability management allows to proactively address vulnerabilities by incorporating pertinent security controls into systems designs. Current vulnerability management approaches do not support systematic reasoning about the vulnerability postures of systems designs. To effectively manage vulnerabilities and design security controls, we propose a formally grounded automated reasoning mechanism. We integrate the mechanism into an open-source security design tool and demonstrate its application through an illustrative example driven by real-world challenges. The automated reasoning mechanism allows system designers to identify vulnerabilities that are applicable to a specific system design, explicitly specify vulnerability mitigation options, declare selected controls, and thus systematically manage vulnerability postures.

Paperid: 3678, https://arxiv.org/pdf/2507.04495.pdf

Abstract:
Deep learning-based watermarking has emerged as a promising solution for robust image authentication and protection. However, existing models are limited by low embedding capacity and vulnerability to bit-level errors, making them unsuitable for cryptographic applications such as digital signatures, which require over 2048 bits of error-free data. In this paper, we propose README (Robust Error-Aware Digital Signature via Deep WaterMarking ModEl), a novel framework that enables robust, verifiable, and error-tolerant digital signatures within images. Our method combines a simple yet effective cropping-based capacity scaling mechanism with ERPA (ERror PAinting Module), a lightweight error correction module designed to localize and correct bit errors using Distinct Circular Subsum Sequences (DCSS). Without requiring any fine-tuning of existing pretrained watermarking models, README significantly boosts the zero-bit-error image rate (Z.B.I.R) from 1.2% to 86.3% when embedding 2048-bit digital signatures into a single image, even under real-world distortions. Moreover, our use of perceptual hash-based signature verification ensures public verifiability and robustness against tampering. The proposed framework unlocks a new class of high-assurance applications for deep watermarking, bridging the gap between signal-level watermarking and cryptographic security.

Paperid: 3679, https://arxiv.org/pdf/2507.04461.pdf

Abstract:
In an increasingly interconnected world, protecting electronic devices has grown more crucial because of the dangers of data extraction, reverse engineering, and hardware tampering. Producing chips in a third-party manufacturing company can let hackers change the design. As the Internet of Things (IoT) proliferates, physical attacks happen more, and conventional cryptography techniques do not function well. In this paper, we investigate the design and assessment of PUFs using the Stanford Memristor Model, utilizing its random filament evolution to improve security. The system was built using 45nm CMOS technology. A comparison is made between CMOS-based and memristor-based Arbiter PUFs, evaluating their performance under temperature, voltage, and process variations. Intra- and inter-hamming distances are employed by Monte Carlo simulations to estimate uniqueness and reliability. The results show that memristor-based PUFs offer better reliability than CMOS-based designs, though uniqueness needs further improvement. Furthermore, this study sheds light on the reasonableness of memristor-based PUFs for secure applications in hardware security.

Paperid: 3680, https://arxiv.org/pdf/2507.04275.pdf

Abstract:
The persistent threat of Android malware presents a serious challenge to the security of millions of users globally. While many machine learning-based methods have been developed to detect these threats, their reliance on large labeled datasets limits their effectiveness against emerging, previously unseen malware families, for which labeled data is scarce or nonexistent. To address this challenge, we introduce a novel zero-shot learning framework that combines Variational Graph Auto-Encoders (VGAE) with Siamese Neural Networks (SNN) to identify malware without needing prior examples of specific malware families. Our approach leverages graph-based representations of Android applications, enabling the model to detect subtle structural differences between benign and malicious software, even in the absence of labeled data for new threats. Experimental results show that our method outperforms the state-of-the-art MaMaDroid, especially in zero-day malware detection. Our model achieves 96.24% accuracy and 95.20% recall for unknown malware families, highlighting its robustness against evolving Android threats.

Paperid: 3681, https://arxiv.org/pdf/2507.04174.pdf

Abstract:
Cloud Forensics presents a multi-jurisdictional challenge that may undermines the success of digital forensic investigations (DFIs). The growing volumes of domiciled and foreign law enforcement (LE) requests, the latency and complexity of formal channels for crossborder data access are challenging issues. In this paper, we first discuss major Cloud Service Providers (CSPs) transparency reports and law enforcement guidelines, then propose an abstract architecture for a Cloud Law Enforcement Requests Management System (CLERMS). A proof of concept of the proposed solution is developed, deployed and validated by two realistic scenarios, in addition to an economic estimation of its associated costs. Based on available open source components, our solution is for the benefit of both CSPs and Cloud Service Consumers (CSCs), and aims to enhance the due Cloud Digital Forensic Readiness (CDFR).

Paperid: 3682, https://arxiv.org/pdf/2507.03607.pdf

Abstract:
This paper presents VLAI, a transformer-based model that predicts software vulnerability severity levels directly from text descriptions. Built on RoBERTa, VLAI is fine-tuned on over 600,000 real-world vulnerabilities and achieves over 82% accuracy in predicting severity categories, enabling faster and more consistent triage ahead of manual CVSS scoring. The model and dataset are open-source and integrated into the Vulnerability-Lookup service.

Paperid: 3683, https://arxiv.org/pdf/2507.03136.pdf

Abstract:
This article presents an in-depth exploration of the analogy between the Holographic Principle in theoretical physics and cyber attack surfaces in digital security. Building on concepts such as black hole entropy and AdS/CFT duality, it highlights how complex infrastructures project their vulnerabilities onto their external interfaces. The paper draws a parallel between a black hole's event horizon, which encodes all internal information, and the attack surface, which reflects the internal architecture's security posture. Additionally, the article outlines how this conceptual framework can guide cybersecurity practices, emphasizing strategies such as attack surface reduction, continuous scanning with tools like OWASP ZAP and Greenbone OpenVAS, and the implementation of Zero Trust Architecture. This analogy not only provides a unique perspective on digital security but also underscores the critical importance of boundary-level defenses in protecting vast internal infrastructures.

Paperid: 3684, https://arxiv.org/pdf/2507.03014.pdf

Abstract:
Large language models (LLMs) face significant copyright and intellectual property challenges as the cost of training increases and model reuse becomes prevalent. While watermarking techniques have been proposed to protect model ownership, they may not be robust to continue training and development, posing serious threats to model attribution and copyright protection. This work introduces a simple yet effective approach for robust LLM fingerprinting based on intrinsic model characteristics. We discover that the standard deviation distributions of attention parameter matrices across different layers exhibit distinctive patterns that remain stable even after extensive continued training. These parameter distribution signatures serve as robust fingerprints that can reliably identify model lineage and detect potential copyright infringement. Our experimental validation across multiple model families demonstrates the effectiveness of our method for model authentication. Notably, our investigation uncovers evidence that a recently Pangu Pro MoE model released by Huawei is derived from Qwen-2.5 14B model through upcycling techniques rather than training from scratch, highlighting potential cases of model plagiarism, copyright violation, and information fabrication. These findings underscore the critical importance of developing robust fingerprinting methods for protecting intellectual property in large-scale model development and emphasize that deliberate continued training alone is insufficient to completely obscure model origins.

Paperid: 3685, https://arxiv.org/pdf/2507.02969.pdf

Abstract:
This paper aims to provide an innovative machine learning-based solution to automate security testing tasks for web applications, ensuring the correct functioning of all components while reducing project maintenance costs. Reinforcement Learning is proposed to select and prioritize tools and optimize the testing path. The presented approach utilizes a simulated webpage along with its network topology to train the agent. Additionally, the model leverages Geometric Deep Learning to create priors that reduce the search space and improve learning convergence. The validation and testing process was conducted on real-world vulnerable web pages commonly used by human hackers for learning. As a result of this study, a reinforcement learning algorithm was developed that maximizes the number of vulnerabilities found while minimizing the number of steps required

Paperid: 3686, https://arxiv.org/pdf/2507.02607.pdf

Abstract:
The digital evolution of connected vehicles and the subsequent security risks emphasize the critical need for implementing in-vehicle cyber security measures such as intrusion detection and response systems. The continuous advancement of attack scenarios further highlights the need for adaptive detection mechanisms that can detect evolving, unknown, and complex threats. The effective use of ML-driven techniques can help address this challenge. However, constraints on implementing diverse attack scenarios on test vehicles due to safety, cost, and ethical considerations result in a scarcity of data representing attack scenarios. This limitation necessitates alternative efficient and effective methods for generating high-quality attack-representing data. This paper presents a context-aware attack data generator that generates attack inputs and corresponding in-vehicle network log, i.e., controller area network (CAN) log, representing various types of attack including denial of service (DoS), fuzzy, spoofing, suspension, and replay attacks. It utilizes parameterized attack models augmented with CAN message decoding and attack intensity adjustments to configure the attack scenarios with high similarity to real-world scenarios and promote variability. We evaluate the practicality of the generated attack-representing data within an intrusion detection system (IDS) case study, in which we develop and perform an empirical evaluation of two deep neural network IDS models using the generated data. In addition to the efficiency and scalability of the approach, the performance results of IDS models, high detection and classification capabilities, validate the consistency and effectiveness of the generated data as well. In this experience study, we also elaborate on the aspects influencing the fidelity of the data to real-world scenarios and provide insights into its application.

Paperid: 3687, https://arxiv.org/pdf/2507.02478.pdf

Abstract:
Wi-Fi management frames reveal structured communication patterns that persist even under randomization of MAC addresses. Prior approaches to associating randomized MAC addresses with devices primarily focus on probe requests, overlooking the broader set of management frames and their transition dynamics. This narrow focus limits their robustness in dense, real-world environments with high device mobility, where probe activity alone fails to yield stable and distinctive signatures. In this paper, we present a novel framework for fingerprinting Wi-Fi devices based on behavioral dynamics extracted from passively observed management frames. We model each device's behavior as a finite state machine and introduce matrix-based representations that encode both structural (state transition frequencies) and temporal (inter-state delays) characteristics. These matrices are embedded into compact feature vectors, enabling efficient similarity comparison. Through extensive evaluation in diverse real-world settings, our method achieves over 86% identification accuracy for non-randomized devices using only Wi-Fi management frames, with further improvements observed through temporal burst aggregation. Our findings are sufficient to uniquely and consistently characterize devices at scale, outperforming the state-of-the-art.

Paperid: 3688, https://arxiv.org/pdf/2507.02390.pdf

Abstract:
Log analysis is a relevant research field in cybersecurity as they can provide a source of information for the detection of threats to networks and systems. This paper presents a pipeline to use fine-tuned Large Language Models (LLMs) for anomaly detection and mitigation recommendation using IoT security logs. Utilizing classical machine learning classifiers as a baseline, three open-source LLMs are compared for binary and multiclass anomaly detection, with three strategies: zero-shot, few-shot prompting and fine-tuning using an IoT dataset. LLMs give better results on multi-class attack classification than the corresponding baseline models. By mapping detected threats to MITRE CAPEC, defining a set of IoT-specific mitigation actions, and fine-tuning the models with those actions, the models are able to provide a combined detection and recommendation guidance.

Paperid: 3689, https://arxiv.org/pdf/2507.01453.pdf

Abstract:
Censorship resilience is a fundamental assumption underlying the security of blockchain protocols. Additionally, the analysis of blockchain security from an economic and game theoretic perspective has been growing in popularity in recent years. In this work, we present a surprising rational censorship attack on blockchain censorship resilience when we adopt the analysis of blockchain security from a game theoretic lens and assume all users are rational. In our attack, a colluding group with sufficient voting power censors the remainder nodes such that the group alone can gain all the rewards from maintaining the blockchain. We show that if nodes are rational, coordinating this attack just requires a public read and write blackboard and we formally model the attack using a game theoretic framework. Furthermore, we note that to ensure the success of the attack, nodes need to know the total true voting power held by the colluding group. We prove that the strategy to join the rational censorship attack and also for nodes to honestly declare their power is a subgame perfect equilibrium in the corresponding extensive form game induced by our attack. Finally, we discuss the implications of the attack on blockchain users and protocol designers as well as some potential countermeasures.

Paperid: 3690, https://arxiv.org/pdf/2507.01423.pdf

Abstract:
This paper introduces a compact and secure 16-bit substitution box (S-box) designed over the composite field $\F_{(((2^2)^2)^2)^2}$, optimized for both hardware efficiency and cryptographic robustness. The proposed S-box decomposes operations into subfields, leveraging a tower field architecture. This enables significant hardware reduction through optimized field inversion and a low-cost affine transformation. Security evaluations confirm resilience against linear, differential, algebraic and DPA attacks, validated via metrics including Nonlinearity (32512), Differential Uniformity (4), Algebraic Degree (15), Transparency order (15.9875) and SNR (0.34e-08). The hardware results, in 65 nm CMOS technology, show the proposed 16-bit S-box has lower hardware resources consumption and lower critical path delay (CPD) than those of other 16-bit S-boxes. By integrating high algebraic complexity with resource-efficient structures, this work addresses the growing demand for scalable cryptographic primitives in data-sensitive applications, demonstrating that larger S-boxes can enhance security without proportional hardware costs. The results underscore the viability of composite field-based architectures in balancing security and efficiency for modern block ciphers.

Paperid: 3691, https://arxiv.org/pdf/2507.01208.pdf

Abstract:
Modern vehicles are increasingly connected, and in this context, automotive Ethernet is one of the technologies that promise to provide the necessary infrastructure for intra-vehicle communication. However, these systems are subject to attacks that can compromise safety, including flow injection attacks. Deep Learning-based Intrusion Detection Systems (IDS) are often designed to combat this problem, but they require expensive hardware to run in real time. In this work, we propose to evaluate and apply fast neural network inference techniques like Distilling and Prunning for deploying IDS models on low-cost platforms in real time. The results show that these techniques can achieve intrusion detection times of up to 727 Î¼s using a Raspberry Pi 4, with AUCROC values of 0.9890.

Paperid: 3692, https://arxiv.org/pdf/2507.01118.pdf

Abstract:
Quasi-twisted (QT) codes generalize several important families of linear codes, including cyclic, constacyclic, and quasi-cyclic codes. Despite their potential, to the best of our knowledge, there exists no efficient decoding algorithm for QT codes. In this work, we propose a syndrome-based decoding method capable of efficiently correcting up to (d* - 1)/2 errors, where d* denotes an HT-like lower bound on the minimum distance of QT codes, which we formalize here. Additionally, we introduce a Niederreiter-like cryptosystem constructed from QT codes. This cryptosystem is resistant to some classical attacks as well as some quantum attacks based on Quantum Fourier Sampling.

Paperid: 3693, https://arxiv.org/pdf/2507.00847.pdf

Abstract:
Bluetooth is a pervasive wireless communication technology used by billions of devices for short-range connectivity. The security of Bluetooth relies on the pairing process, where devices establish shared long-term keys for secure communications. However, many commercial Bluetooth devices implement automatic pairing functions to improve user convenience, creating a previously unexplored attack surface. We present Stealtooth, a novel attack that abuses unknown vulnerabilities in the automatic pairing functions in commercial Bluetooth devices to achieve completely silent device link key overwriting. The Stealtooth attack leverages the fact that Bluetooth audio devices automatically transition to pairing mode under specific conditions, enabling attackers to hijack pairing processes without user awareness or specialized tools. We also extend the attack into the MitM Stealtooth attack, combining automatic pairing abuse with power-saving mode techniques to enable man-in-the-middle attacks. We evaluate the attacks against 10 commercial Bluetooth devices from major manufacturers, demonstrating widespread vulnerabilities across diverse device types and manufacturers. Our practical implementation requires only commodity hardware and open-source software, highlighting the low barrier to entry for attackers. We propose defenses both device and protocol levels, including enhanced user notifications and standardized automatic pairing guidelines. Our findings reveal a critical tension between security and usability, showing that current automatic pairing implementations create systematic vulnerabilities. We responsibly disclosed our findings to affected vendors, with several already releasing patches.

Paperid: 3694, https://arxiv.org/pdf/2507.00827.pdf

Abstract:
Tampering or forgery of digital documents has become widespread, most commonly through altering images without any malicious intent such as enhancing the overall appearance of the image. However, there are occasions when tampering of digital documents can have negative consequences, such as financial fraud and reputational damage. Tampering can occur through altering a digital document's text or editing an image's pixels. Many techniques have been developed to detect whether changes have been made to a document. Most of these techniques rely on generating hashes or watermarking the document. These techniques, however, have limitations in that they cannot detect alterations to portable document format (PDF) signatures or other non-visual aspects, such as metadata. This paper presents a new technique that can be used to detect tampering within a PDF document by utilizing the PDF document's file page objects. The technique employs a prototype that can detect changes to a PDF document, such as changes made to the text, images, or metadata of the said file.

Paperid: 3695, https://arxiv.org/pdf/2507.00637.pdf

Abstract:
We present a novel methodology for modelling, visualising, and analysing cyber threats, attack paths, as well as their impact on user services in enterprise or infrastructure networks of digital devices and services they provide. Using probabilistic methods to track the propagation of an attack through attack graphs, via the service or application layers, and on physical communication networks, our model enables us to analyse cyber attacks at different levels of detail. Understanding the propagation of an attack within a service among microservices and its spread between different services or application servers could help detect and mitigate it early. We demonstrate that this network-based influence spreading modelling approach enables the evaluation of diverse attack scenarios and the development of protection and mitigation measures, taking into account the criticality of services from the user's perspective. This methodology could also aid security specialists and system administrators in making well-informed decisions regarding risk mitigation strategies.

Paperid: 3696, https://arxiv.org/pdf/2507.00522.pdf

Abstract:
The digital transformation of power systems is accelerating the adoption of IEC 61850 standard. However, its communication protocols, including Sampled Values (SV), lack built-in security features such as authentication and encryption, making them vulnerable to malicious packet injection. Such cyber attacks can delay fault clearance or trigger unintended circuit breaker operations. While most existing research focuses on detecting cyber attacks in digital substations, intrusion prevention systems have been disregarded because of the risk of potential communication network disruptions. This paper proposes a novel method using hybrid statistical-deep learning for the detection, prevention, and source localization of IEC 61850 SV injection attacks. The method uses exponentially modified Gaussian distributions to model communication network latency and long short-term memory and Elman recurrent neural network to detect anomalous variations in the estimated probability distributions. It effectively discards malicious SV frames with minimal processing overhead and latency, maintains robustness against communication network latency variation and time-synchronization issues, and guarantees a near-zero false positive rate in non-attack scenarios. Comprehensive validation is conducted on three testbeds involving industrial-grade devices, hardware-in-the-loop simulations, virtualized intelligent electronic devices and merging units, and high-fidelity emulated communication networks. Results demonstrate the method's suitability for practical deployment in IEC 61850-compliant digital substations.

Paperid: 3697, https://arxiv.org/pdf/2507.00423.pdf

Abstract:
Federated learning (FL) allows multiple clients to collaboratively train a global machine learning model with coordination from a central server, without needing to share their raw data. This approach is particularly appealing in the era of privacy regulations like the GDPR, leading many prominent companies to adopt it. However, FL's distributed nature makes it susceptible to poisoning attacks, where malicious clients, controlled by an attacker, send harmful data to compromise the model. Most existing poisoning attacks in FL aim to degrade the model's integrity, such as reducing its accuracy, with limited attention to privacy concerns from these attacks. In this study, we introduce FedPoisonMIA, a novel poisoning membership inference attack targeting FL. FedPoisonMIA involves malicious clients crafting local model updates to infer membership information. Additionally, we propose a robust defense mechanism to mitigate the impact of FedPoisonMIA attacks. Extensive experiments across various datasets demonstrate the attack's effectiveness, while our defense approach reduces its impact to a degree.

Paperid: 3698, https://arxiv.org/pdf/2507.00348.pdf

Abstract:
Machine learning is increasingly vital in cybersecurity, especially in malware detection. However, concept drift, where the characteristics of malware change over time, poses a challenge for maintaining the efficacy of these detection systems. Concept drift can occur in two forms: the emergence of entirely new malware families and the evolution of existing ones. This paper proposes an innovative method to address the former, focusing on effectively identifying new malware families. Our approach leverages a supervised autoencoder combined with triplet loss to differentiate between known and new malware families. We create clear and robust clusters that enhance the accuracy and resilience of malware family classification by utilizing this metric learning technique and the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. The effectiveness of our method is validated using an Android malware dataset and a Windows portable executable (PE) malware dataset, showcasing its capability to sustain model performance within the dynamic landscape of emerging malware threats. Our results demonstrate a significant improvement in detecting new malware families, offering a reliable solution for ongoing cybersecurity challenges.

Paperid: 3699, https://arxiv.org/pdf/2507.00095.pdf

Abstract:
We introduce the first quantum authentication scheme for continuous-variable states. Our scheme is based on trap states, and is an adaptation of a discrete-variable scheme by Broadbent et al. (arXiv:1211.1080), but with more freedom in choosing the number of traps. We provide a security proof, mostly following the approach of Broadbent and Wainewright (arXiv:1607.03075). As a necessary ingredient for the proof we derive the continuous-variable analogue of the Pauli Twirl.

Paperid: 3700, https://arxiv.org/pdf/2512.24415.pdf

Abstract:
Customer-service LLM agents increasingly make policy-bound decisions (refunds, rebooking, billing disputes), but the same ``helpful'' interaction style can be exploited: a small fraction of users can induce unauthorized concessions, shifting costs to others and eroding trust in agentic workflows. We present a cross-domain benchmark of profit-seeking direct prompt injection in customer-service interactions, spanning 10 service domains and 100 realistic attack scripts grouped into five technique families. Across five widely used models under a unified rubric with uncertainty reporting, attacks are highly domain-dependent (airline support is most exploitable) and technique-dependent (payload splitting is most consistently effective). We release data and evaluation code to support reproducible auditing and to inform the design of oversight and recovery workflows for trustworthy, human centered agent interfaces.

Paperid: 3701, https://arxiv.org/pdf/2512.23607.pdf

Abstract:
This document presents a concise overview of the contemporary research directions in quantum computer cybersecurity. The aim of this document is not to be a survey, but rather a succinct summary of the major research directions in quantum computer cybersecurity at the end of the first half of the current decade. The document has been inspired by the presentations and discussions held at the 3$^{rd}$ Quantum Computer Cybersecurity Symposium, but goes beyond the contents of the symposium and aims to summarize at the high level the last five years of quantum computer cybersecurity work in academia. It is hoped that the document can provide researchers as well as government and industry leaders an overview of the current landscape of security threats and defenses against emergent quantum computing technologies. The document also includes a discussion of the current trends in cybersecurity research on quantum computers, and the perceived research gaps that should be filled with future funding and through academic and industry~research.

Paperid: 3702, https://arxiv.org/pdf/2512.23535.pdf

Abstract:
This paper presents a private transfer architecture for the Internet Computer (ICP) that decouples deposit and retrieval through two short-lived intermediaries, with sealed storage and attested teardown by an ephemeral witness. The protocol uses a non-interactive RDMPF-based encapsulation to derive per-transfer transport keys. A public notice hint is computed from the capsule to enable discovery without fingerprinting the recipient's key. Retrieval is authorized by a short proof of decapsulation that reveals no identities. All transaction intermediaries are ephemeral and issue certified destruction intents and proofs, allowing a noticeboard to publish auditable finalization records. The design provides sender identity privacy with respect to the recipient, content confidentiality against intermediaries, forward secrecy for transport keys after staged destruction, verifiable liveness and finality. We formalize the basic interfaces, provide the security arguments for encapsulation correctness, hint privacy, authorization soundness and timeout reclaim. In terms of implementation, it has been recently brought into production on the ICP under the name ICPP. It has been subject to exhaustive testing and incorporates a few enhancements, focusing on the operational possibilities offered by ICP's technology. This work hence serves as a broad reference for the protocol now publicly accessible.

Paperid: 3703, https://arxiv.org/pdf/2512.23124.pdf

Abstract:
Financial institutions increasingly rely on distributed architectures, open banking APIs, cloud native infrastructures, and high frequency digital transactions. These transformations expand the attack surface and expose limitations in traditional perimeter based security models. While Zero Trust architectures provide essential security principles, most existing frameworks do not explicitly incorporate transactional semantics, financial risk modeling, adaptive identity trust, or automation weighted by economic impact. This paper introduces SecureBank, a financially aware and context adaptive Zero Trust architecture designed specifically for high assurance banking systems. The proposed framework integrates Financial Zero Trust, Adaptive Identity Scoring, Contextual Micro Segmentation, and Impact Driven Security Automation. A Monte Carlo simulation evaluates SecureBank against a representative rule based baseline architecture using metrics such as the Transactional Integrity Index (TII), Identity Trust Adaptation Level (ITAL), and Security Automation Efficiency (SAE). The results demonstrate that SecureBank significantly improves automated attack handling and accelerates identity trust adaptation while preserving conservative and regulator aligned levels of transactional integrity. Beyond experimental validation, SecureBank is intended to serve as a reference architecture and evaluation baseline for financially aware Zero Trust systems in regulated financial environments.

Paperid: 3704, https://arxiv.org/pdf/2512.22501.pdf

Abstract:
Ensuring the authenticity and ownership of digital images is increasingly challenging as modern editing tools enable highly realistic forgeries. Existing image protection systems mainly rely on digital watermarking, which is susceptible to sophisticated digital attacks. To address this limitation, we propose a hybrid optical-digital framework that incorporates physical authentication cues during image formation and preserves them through a learned reconstruction process. At the optical level, a phase mask in the camera aperture produces a Null-space Optical Watermark (NOWA) that lies in the Null Space of the imaging operator and therefore remains invisible in the captured image. Then, a Null-Space Network (NSN) performs measurement-consistent reconstruction that delivers high-quality protected images while preserving the NOWA signature. The proposed design enables tamper localization by projecting the image onto the camera's null space and detecting pixel-level inconsistencies. Our design preserves perceptual quality, resists common degradations such as compression, and establishes a structural security asymmetry: without access to the optical or NSN parameters, adversaries cannot forge the NOWA signature. Experiments with simulations and a prototype camera demonstrate competitive performance in terms of image quality preservation, and tamper localization accuracy compared to state-of-the-art digital watermarking and learning-based authentication methods.

Paperid: 3705, https://arxiv.org/pdf/2512.22293.pdf

Abstract:
Warning-framed content in training data (e.g., "DO NOT USE - this code is vulnerable") does not, it turns out, teach language models to avoid the warned-against behavior. In experiments reported here, models exposed to such warnings reproduced the flagged content at rates statistically indistinguishable from models given the content directly (76.7% vs. 83.3%). Why? Sparse autoencoder analysis points to a failure of orthogonalization: "describing X" and "performing X" activate overlapping latent features. Feature #8684, which tracks code execution patterns, fires at comparable magnitude in both warning and exploitation contexts. A related phenomenon, what I call "stealth slip", allows conversational preambles to rotate activations into subspaces that linear probes miss entirely. Prompting and inference-time steering do not fix this; training-time feature ablation does. The upshot is that statistical co-occurrence dominates over pragmatic interpretation in current architectures. Models learn what tends to follow a context, not why it appeared there.

Paperid: 3706, https://arxiv.org/pdf/2512.22167.pdf

Abstract:
The IANEC project (Investigation of Digital Archives of Contemporary Writers), led by the GREYC Research Lab and funded by the French Ministry of Culture aims to develop dedicated digital forensic investigation tools to automate the analysis of archival corpora from the Institut M{é}moires de l'{É}dition Contemporaine (IMEC). The project is based on the observation that born-digital archival materials are increasingly prevalent in contemporary archival institutions, and that digital forensics technologies have become essential for the extraction, identification, processing, and description of natively digital archival corpora.*

Paperid: 3707, https://arxiv.org/pdf/2512.22128.pdf

Abstract:
Graph Neural Networks (GNNs) have emerged as a dominant paradigm for learning on graph-structured data, thanks to their ability to jointly exploit node features and relational information encoded in the graph topology. This joint modeling, however, also introduces a critical weakness: perturbations or noise in either the structure or the features can be amplified through message passing, making GNNs highly vulnerable to adversarial attacks and spurious connections. In this work, we introduce a pruning framework that leverages adversarial robustness evaluation to explicitly identify and remove fragile or detrimental components of the graph. By using robustness scores as guidance, our method selectively prunes edges that are most likely to degrade model reliability, thereby yielding cleaner and more resilient graph representations. We instantiate this framework on three representative GNN architectures and conduct extensive experiments on benchmarks. The experimental results show that our approach can significantly enhance the defense capability of GNNs in the high-perturbation regime.

Paperid: 3708, https://arxiv.org/pdf/2512.21698.pdf

Abstract:
This work introduces a unified raster domain steganographic framework, termed as the Glyph Perturbation Cardinality (GPC) framework, capable of embedding heterogeneous data such as text, images, audio, and video directly into the pixel space of rendered textual glyphs. Unlike linguistic or structural text based steganography, the proposed method operates exclusively after font rasterization, modifying only the bitmap produced by a deterministic text rendering pipeline. Each glyph functions as a covert encoding unit, where a payload value is expressed through the cardinality of minimally perturbed interior ink pixels. These minimal intensity increments remain visually imperceptible while forming a stable and decodable signal. The framework is demonstrated for text to text embedding and generalized to multimodal inputs by normalizing image intensities, audio derived scalar features, and video frame values into bounded integer sequences distributed across glyphs. Decoding is achieved by re-rasterizing the cover text, subtracting canonical glyph rasters, and recovering payload values via pixel count analysis. The approach is computationally lightweight, and grounded in deterministic raster behavior, enabling ordinary text to serve as a visually covert medium for multimodal data embedding.

Paperid: 3709, https://arxiv.org/pdf/2512.21574.pdf

Abstract:
The integration of non-terrestrial networks (NTNs) into 6G systems is crucial for achieving seamless global coverage, particularly in underserved and disaster-prone regions. Among NTN platforms, unmanned aerial vehicles (UAVs) are especially promising due to their rapid deployability. However, this shift from fixed, wired base stations (BSs) to mobile, wireless, energy-constrained UAV-BSs introduces unique security challenges. Their central role in emergency communications makes them attractive candidates for emergency alert spoofing. Their limited computing and energy resources make them more vulnerable to denial-of-service (DoS) attacks, and their dependence on wireless backhaul links and GNSS navigation exposes them to jamming, interception, and spoofing. Furthermore, UAV mobility opens new attack vectors such as malicious handover manipulation. This paper identifies several attack surfaces of UAV-BS systems and outlines principles for mitigating their threats.

Paperid: 3710, https://arxiv.org/pdf/2512.21368.pdf

Abstract:
The successful deployment of the Internet of Things (IoT) applications relies heavily on their robust security, and lightweight cryptography is considered an emerging solution in this context. While existing surveys have been examining lightweight cryptographic techniques from the perspective of hardware and software implementations or performance evaluation, there is a significant gap in addressing different security aspects specific to the IoT environment. This study aims to bridge this gap. This research presents a thorough survey focused on the security evaluation of symmetric lightweight ciphers commonly used in IoT systems. The objective of this study is to provide a holistic understanding of lightweight ciphers, emphasizing their security strength, which is an essential consideration for real-time and resource-constrained applications. Furthermore, we propose two taxonomies: one for classifying IoT applications based on their inherent characteristics, and another for evaluating security levels based on key size. Our findings indicate that key size is a critical parameter in the security of lightweight ciphers. Ciphers employing keys shorter than 128 bits are considered less secure or even insecure for protecting sensitive data

Paperid: 3711, https://arxiv.org/pdf/2512.21251.pdf

Abstract:
My main worry, and the core of my research, is that our cybersecurity ecosystem is slowly but surely aging and getting old and that aging is becoming an operational risk. This is happening not only because of growing complexity, but more importantly because of accumulation of controls and measures whose effectiveness are uncertain. I introduce a new term for this aging phenomenon: cyber senescence. I will begin my lecture with a short historical overview in which I sketch a development over time that led to this worry for the future of cybersecurity. It is this worry that determined my research agenda and its central theme of the role of uncertainty in cybersecurity. My worry is that waste is accumulating in cyberspace. This waste consists of a multitude of overlapping controls whose risk reductions are uncertain. Unless we start pruning these control frameworks, this waste accumulation causes aging of cyberspace and could ultimately lead to a system collapse.

Paperid: 3712, https://arxiv.org/pdf/2512.21248.pdf

Abstract:
Lateral movement is a tactic that adversaries employ most frequently in enterprise IT environments to traverse between assets. In operational technology (OT) environments, however, few methods exist for lateral movement between domain-specific devices, particularly programmable logic controllers (PLCs). Existing techniques often rely on complex chains of vulnerabilities, which are noisy and can be patched. This paper describes the first PLC-centric lateral movement technique that relies exclusively on the native functionality of the victim environment. This OT-specific form of `living off the land' is herein distinguished as `living off the plant' (LOTP). The described technique also facilitates escape from IP networks onto legacy serial networks via dual-homed PLCs. Furthermore, this technique is covert, leveraging common network communication functions that are challenging to detect. This serves as a reminder of the risks posed by LOTP techniques within OT, highlighting the need for a fundamental reconsideration of traditional OT defensive practices.

Paperid: 3713, https://arxiv.org/pdf/2512.20423.pdf

Abstract:
The purpose of this project is to assess how well defenders can detect DNS-over-HTTPS (DoH) file exfiltration, and which evasion strategies can be used by attackers. While providing a reproducible toolkit to generate, intercept and analyze DoH exfiltration, and comparing Machine Learning vs threshold-based detection under adversarial scenarios. The originality of this project is the introduction of an end-to-end, containerized pipeline that generates configurable file exfiltration over DoH using several parameters (e.g., chunking, encoding, padding, resolver rotation). It allows for file reconstruction at the resolver side, while extracting flow-level features using a fork of DoHLyzer. The pipeline contains a prediction side, which allows the training of machine learning models based on public labelled datasets and then evaluates them side-by-side with threshold-based detection methods against malicious and evasive DNS-Over-HTTPS traffic. We train Random Forest, Gradient Boosting and Logistic Regression classifiers on a public DoH dataset and benchmark them against evasive DoH exfiltration scenarios. The toolkit orchestrates traffic generation, file capture, feature extraction, model training and analysis. The toolkit is then encapsulated into several Docker containers for easy setup and full reproducibility regardless of the platform it is run on. Future research regarding this project is directed at validating the results on mixed enterprise traffic, extending the protocol coverage to HTTP/3/QUIC request, adding a benign traffic generation, and working on real-time traffic evaluation. A key objective is to quantify when stealth constraints make DoH exfiltration uneconomical and unworthy for the attacker.

Paperid: 3714, https://arxiv.org/pdf/2512.19951.pdf

Abstract:
The mod function plays a critical role in numerous data encoding and cryptographic primitives. However, the widely used CKKS homomorphic encryption (HE) scheme supports only arithmetic operations, making it difficult to perform mod computations on encrypted data. Approximating the mod function with polynomials has therefore become an important yet challenging problem. The discontinuous and periodic characteristics of the mod function make it particularly difficult to approximate accurately under HE. Existing homomorphic mod constructions provide accurate results only within limited subranges of the input range, leaving the problem of achieving accurate approximation across the full input range unresolved. In this work, we propose a novel method based on polynomial interpolation and Chebyshev series to accurately approximate the mod function. Building upon this, we design two efficient data packing schemes, BitStack and CRTStack, tailored for small-integer inputs in CKKS. These schemes significantly improve the utilization of the CKKS plaintext space and enable efficient ciphertext uploads. Furthermore, we apply the proposed HE mod function to implement a homomorphic rounding operation and a general transformation from additive secret sharing to CKKS ciphertexts, achieving accurate ciphertext rounding and complete secret-share-to-CKKS conversion. Experimental results demonstrate that our approach achieves high approximation accuracy (up to 1e-8). Overall, our work provides a practical and general solution for performing mod operations under CKKS, extending its applicability to a broader range of privacy-preserving computations.

Paperid: 3715, https://arxiv.org/pdf/2512.19935.pdf

Abstract:
Machine learning models used in financial decision systems operate in nonstationary economic environments, yet adversarial robustness is typically evaluated under static assumptions. This work introduces Conditional Adversarial Fragility, a regime dependent phenomenon in which adversarial vulnerability is systematically amplified during periods of macroeconomic stress. We propose a regime aware evaluation framework for time indexed tabular financial classification tasks that conditions robustness assessment on external indicators of economic stress. Using volatility based regime segmentation as a proxy for macroeconomic conditions, we evaluate model behavior across calm and stress periods while holding model architecture, attack methodology, and evaluation protocols constant. Baseline predictive performance remains comparable across regimes, indicating that economic stress alone does not induce inherent performance degradation. Under adversarial perturbations, however, models operating during stress regimes exhibit substantially greater degradation across predictive accuracy, operational decision thresholds, and risk sensitive outcomes. We further demonstrate that this amplification propagates to increased false negative rates, elevating the risk of missed high risk cases during adverse conditions. To complement numerical robustness metrics, we introduce an interpretive governance layer based on semantic auditing of model explanations using large language models. Together, these results demonstrate that adversarial robustness in financial machine learning is a regime dependent property and motivate stress aware approaches to model risk assessment in high stakes financial deployments.

Paperid: 3716, https://arxiv.org/pdf/2512.18932.pdf

Abstract:
Differential privacy (DP) has emerged as the gold standard for protecting user data in recommender systems, but existing privacy-preserving mechanisms face a fundamental challenge: the privacy-utility tradeoff inevitably degrades recommendation quality as privacy budgets tighten. We introduce DPSR (Differentially Private Sparse Reconstruction), a novel three-stage denoising framework that fundamentally addresses this limitation by exploiting the inherent structure of rating matrices -- sparsity, low-rank properties, and collaborative patterns. DPSR consists of three synergistic stages: (1) \textit{information-theoretic noise calibration} that adaptively reduces noise for high-information ratings, (2) \textit{collaborative filtering-based denoising} that leverages item-item similarities to remove privacy noise, and (3) \textit{low-rank matrix completion} that exploits latent structure for signal recovery. Critically, all denoising operations occur \textit{after} noise injection, preserving differential privacy through the post-processing immunity theorem while removing both privacy-induced and inherent data noise. Through extensive experiments on synthetic datasets with controlled ground truth, we demonstrate that DPSR achieves 5.57\% to 9.23\% RMSE improvement over state-of-the-art Laplace and Gaussian mechanisms across privacy budgets ranging from $\varepsilon=0.1$ to $\varepsilon=10.0$ (all improvements statistically significant with $p < 0.05$, most $p < 0.001$). Remarkably, at $\varepsilon=1.0$, DPSR achieves RMSE of 0.9823, \textit{outperforming even the non-private baseline} (1.0983), demonstrating that our denoising pipeline acts as an effective regularizer that removes data noise in addition to privacy noise.

Paperid: 3717, https://arxiv.org/pdf/2512.18646.pdf

Abstract:
Privacy-preserving inference of convolutional neural networks (CNNs) using homomorphic encryption has emerged as a promising approach for enabling secure machine learning in untrusted environments. In our previous work, we introduced a matrix-encoding strategy that allows convolution and matrix multiplication to be efficiently evaluated over encrypted data, enabling practical CNN inference without revealing either the input data or the model parameters. The core idea behind this strategy is to construct a three-dimensional representation within ciphertexts that preserves the intrinsic spatial structure of both input image data and model weights, rather than flattening them into conventional two-dimensional encodings. However, this approach can operate efficiently $only$ when the number of available plaintext slots within a ciphertext is sufficient to accommodate an entire input image, which becomes a critical bottleneck when processing high-resolution images. In this paper, we address this fundamental limitation by proposing an improved encoding and computation framework that removes the requirement that a single encrypted ciphertext must fully contain one input image. Our method reformulates the data layout and homomorphic operations to partition high-resolution inputs across multiple ciphertexts while preserving the algebraic structure required for efficient convolution and matrix multiplication. As a result, our approach enables privacy-preserving CNN inference to scale naturally beyond the slot-capacity constraints of prior methods, making homomorphic evaluation of CNNs practical for higher-resolution and more complex datasets.

Paperid: 3718, https://arxiv.org/pdf/2512.18560.pdf

Abstract:
Sensor data in IoT (Internet of Things) systems is vulnerable to tampering or falsification when transmitted through untrusted services. This is critical because such data increasingly underpins real-world decisions in domains such as logistics, healthcare, and other critical infrastructure. We propose a general method for secure sensor-data logging in which tamper-evident devices periodically sign readouts, link data using redundant hash chains, and submit cryptographic evidence to a blockchain-based service via Merkle trees to ensure verifiability even under data loss. Our approach enables reliable and cost-effective validation of sensor data across diverse IoT systems, including disaster response and other humanitarian applications, without relying on the integrity of intermediate systems.

Paperid: 3719, https://arxiv.org/pdf/2512.18432.pdf

Abstract:
The move to 6th Generation (6G) wireless networks creates new issues with privacy, scalability, and adaptability. The data-intensive nature of 6G is not handled well by older, centralized network models. A shift toward more secure and decentralized systems is therefore required. A new framework called the Federated Learning-based Decentralized Adaptive Intelligent Transmission Protocol (AITP) is proposed to meet these challenges. The AITP uses the distributed learning of Federated Learning (FL) within a decentralized system. Transmission parameters can be adjusted intelligently in real time. User privacy is maintained by keeping raw data on local edge devices. The protocol's performance was evaluated with mathematical modeling and detailed simulations. It was shown to be superior to traditional non-adaptive and centralized AI methods across several key metrics. These included latency, network throughput, energy efficiency, and robustness. The AITP is presented as a foundational technology for future 6G networks that supports a user-centric, privacy-first design. This study is a step forward for privacy-preserving research in 6G.

Paperid: 3720, https://arxiv.org/pdf/2512.17722.pdf

Abstract:
This paper introduces a standardized model card framework specifically designed for digital and web forensics. Building upon established model card methodologies and recent work on abstract models for digital forensic analysis, this paper presents a web based framework that generates model cards specifically designed to represent knowledge in the forensic domain. The framework includes controlled vocabularies for classification, reasoning types, bias identification, and error categorization, along with a web-based generator tool to facilitate adoption. The paper describes the model card structure, presents the controlled vocabularies, and introduces the beta version of the generator tool, inviting community feedback to refine this emerging standard. Ultimately, the systemic risk is that that the anti fraud and digital and web forensics processes are controlled by the mobs.

Paperid: 3721, https://arxiv.org/pdf/2512.17527.pdf

Abstract:
Foundation models for protein design raise concrete biosecurity risks, yet the community lacks a simple, reproducible baseline for sequence-level hazard screening that is explicitly evaluated under homology control and runs on commodity CPUs. We introduce SafeBench-Seq, a metadata-only, reproducible benchmark and baseline classifier built entirely from public data (SafeProtein hazards and UniProt benigns) and interpretable features (global physicochemical descriptors and amino-acid composition). To approximate "never-before-seen" threats, we homology-cluster the combined dataset at <=40% identity and perform cluster-level holdouts (no cluster overlap between train/test). We report discrimination (AUROC/AUPRC) and screening-operating points (TPR@1% FPR; FPR@95% TPR) with 95% bootstrap confidence intervals (n=200), and we provide calibrated probabilities via CalibratedClassifierCV (isotonic for Logistic Regression / Random Forest; Platt sigmoid for Linear SVM). We quantify probability quality using Brier score, Expected Calibration Error (ECE; 15 bins), and reliability diagrams. Shortcut susceptibility is probed via composition-preserving residue shuffles and length-/composition-only ablations. Empirically, random splits substantially overestimate robustness relative to homology-clustered evaluation; calibrated linear models exhibit comparatively good calibration, while tree ensembles retain slightly higher Brier/ECE. SafeBench-Seq is CPU-only, reproducible, and releases metadata only (accessions, cluster IDs, split labels), enabling rigorous evaluation without distributing hazardous sequences.

Paperid: 3722, https://arxiv.org/pdf/2512.17519.pdf

Abstract:
We present a simple, PEFT-compatible mechanism that enforces secret-key access control in instruction-tuned language models. K-OTG trains on a dual-path corpus: authorized examples (prefixed with a role key) learn the task output, while unauthorized examples learn a visible block token. At inference, a pre-lm_head hook applies an orthonormal transform to the hidden state: with the correct key/role the inverse map restores the model's native basis; otherwise a session-ephemeral scrambler (permutation, sign flips, Householders) makes logits uninformative and the system short-circuits to BLOCK. Keys are not added as special tokens, and the method composes cleanly with LoRA on 4-bit bases. We evaluate an hour-scale protocol on 1-3B-class instruction models (Llama 3.2, Qwen2.5 1.5B) across utility (XSum ROUGE/BLEU, GSM8K accuracy, WikiText-2 perplexity), selectivity (3by3 role-key unlock matrices), nonce invariance, block suppression, and throughput. Authorized utility remains close to the base on summarization with the expected modest PPL increase from instruction tuning; unauthorized utility collapses (near-zero sequence metrics with exploding PPL), indicating practical unusability without the key. Unlock matrices are diagonally dominant (high on-target unlock, low cross-unlock), authorized block emission is 0 per N via robust bad-word lists, and greedy outputs match exactly across nonces, confirming correct inverse cancellation. The runtime overhead of the Python-level hook is 40% tokens per sec versus the base. K-OTG therefore provides a pragmatic, model-agnostic way to prevent unauthorized use while preserving authorized utility.

Paperid: 3723, https://arxiv.org/pdf/2512.17411.pdf

Abstract:
Blockchain technology, lauded for its transparent and immutable nature, introduces a novel trust model. However, its decentralized structure raises concerns about potential inclusion of malicious or illegal content. This study focuses on Ethereum, presenting a data identification and restoration algorithm. Successfully recovering 175 common files, 296 images, and 91,206 texts, we employed the FastText algorithm for sentiment analysis, achieving a 0.9 accuracy after parameter tuning. Classification revealed 70,189 neutral, 5,208 positive, and 15,810 negative texts, aiding in identifying sensitive or illicit information. Leveraging the NSFWJS library, we detected seven indecent images with 100% accuracy. Our findings expose the coexistence of benign and harmful content on the Ethereum blockchain, including personal data, explicit images, divisive language, and racial discrimination. Notably, sensitive information targeted Chinese government officials. Proposing preventative measures, our study offers valuable insights for public comprehension of blockchain technology and regulatory agency guidance. The algorithms employed present innovative solutions to address blockchain data privacy and security concerns.

Paperid: 3724, https://arxiv.org/pdf/2512.17295.pdf

Abstract:
Identifying heavy hitters in data streams is a fundamental problem with widespread applications in modern analytics systems. These streams are often derived from sensitive user activity, making update-level privacy guarantees necessary. While recent work has adapted the classical heavy hitter algorithm Misra-Gries to satisfy differential privacy in the streaming model, the privatization of other heavy hitter algorithms with better empirical utility is absent. Under this observation, we present the first differentially private variant of the SpaceSaving algorithm, which, in the non-private setting, is regarded as the state-of-the-art in practice. Our construction post-processes a non-private SpaceSaving summary by injecting asymptotically optimal noise and applying a carefully calibrated selection rule that suppresses unstable labels. This yields strong privacy guarantees while preserving the empirical advantages of SpaceSaving. Second, we introduce a generic method for extracting heavy hitters from any differentially private frequency oracle in the data stream model. The method requires only O(k) additional memory, where k is the number of heavy items, and provides a mechanism for safely releasing item identities from noisy frequency estimates. This yields an efficient, plug-and-play approach for private heavy hitter recovery from linear sketches. Finally, we conduct an experimental evaluation on synthetic and real-world datasets. Across a wide range of privacy parameters and space budgets, our method provides superior utility to the existing differentially private Misra-Gries algorithm. Our results demonstrate that the empirical superiority of SpaceSaving survives privatization and that efficient, practical heavy hitter identification is achievable under strong differential privacy guarantees.

Paperid: 3725, https://arxiv.org/pdf/2512.17251.pdf

Abstract:
Large language models are exposed to risks of extraction, distillation, and unauthorized fine-tuning. Existing defenses use watermarking or monitoring, but these act after leakage. We design AlignDP, a hybrid privacy lock that blocks knowledge transfer at the data interface. The key idea is to separate rare and non-rare fields. Rare fields are shielded by PAC indistinguishability, giving effective zero-epsilon local DP. Non-rare fields are privatized with RAPPOR, giving unbiased frequency estimates under local DP. A global aggregator enforces composition and budget. This two-tier design hides rare events and adds controlled noise to frequent events. We prove limits of PAC extension to global aggregation, give bounds for RAPPOR estimates, and analyze utility trade-off. A toy simulation confirms feasibility: rare categories remain hidden, frequent categories are recovered with small error.

Paperid: 3726, https://arxiv.org/pdf/2512.17146.pdf

Abstract:
Genomic Foundation Models (GFMs), such as Evolutionary Scale Modeling (ESM), have demonstrated remarkable success in variant effect prediction. However, their security and robustness under adversarial manipulation remain largely unexplored. To address this gap, we introduce the Secure Agentic Genomic Evaluator (SAGE), an agentic framework for auditing the adversarial vulnerabilities of GFMs. SAGE functions through an interpretable and automated risk auditing loop. It injects soft prompt perturbations, monitors model behavior across training checkpoints, computes risk metrics such as AUROC and AUPR, and generates structured reports with large language model-based narrative explanations. This agentic process enables continuous evaluation of embedding-space robustness without modifying the underlying model. Using SAGE, we find that even state-of-the-art GFMs like ESM2 are sensitive to targeted soft prompt attacks, resulting in measurable performance degradation. These findings reveal critical and previously hidden vulnerabilities in genomic foundation models, showing the importance of agentic risk auditing in securing biomedical applications such as clinical variant interpretation.

Paperid: 3727, https://arxiv.org/pdf/2512.16683.pdf

Abstract:
We describe the Lockchain Protocol, a lightweight Bitcoin meta-protocol that enables highly efficient transaction discovery at zero marginal block space cost, and data verification without introducing any new on-chain storage mechanism. The protocol repurposes the mandatory 4-byte nLockTime field of every Bitcoin transaction as a compact metadata header. By constraining values to an unused range of past Unix timestamps greater than or equal to 500,000,000, the field can encode a protocol signal, type, variant, and sequence identifier while remaining fully valid under Bitcoin consensus and policy rules. The primary contribution of the protocol is an efficient discovery layer. Indexers can filter candidate transactions by examining a fixed-size header field, independent of transaction payload size, and only then selectively inspect heavier data such as OP RETURN outputs or witness fields. The Lockchain Protocol applies established protocol design patterns to an under-optimised problem domain, namely transaction discovery at scale, and does not claim new cryptographic primitives or storage methods.

Paperid: 3728, https://arxiv.org/pdf/2512.16419.pdf

Abstract:
The use of Large Language Models (LLM) by providers of cybersecurity and digital infrastructures of all kinds is an ongoing development. It is suggested and on an experimental basis used to write the code for the systems, and potentially fed with sensitive data or what would otherwise be considered trade secrets. Outside of these obvious points, this paper asks how AI can negatively affect cybersecurity and law when used for the design and deployment of security infrastructure by its developers. Firstly, the paper discusses the use of LLMs in security, either directly or indirectly, and briefly tackles other types of AI. It then lists norms in cybersecurity, then a range of legal cybersecurity obligations from the European Union, to create a frame of reference. Secondly, the paper describes how LLMs may fail to fulfil both legal obligations and best practice in cybersecurity is given, and the paper ends with some economic and practical consequences for this development, with some notions of solutions as well. The paper finds that using LLMs comes with many risks, many of which are against good security practice, and the legal obligations in security regulation. This is because of the inherent weaknesses of LLMs, most of which are mitigated if replaced with symbolic AI. Both also have issues fulfilling basic traceability obligations and practice. Solutions are secondary systems surrounding LLM based AI, fulfilment of security norms beyond legal requirements and simply not using such technology in certain situations.

Paperid: 3729, https://arxiv.org/pdf/2512.16080.pdf

Abstract:
In decentralized finance (DeFi), designing fixed-income lending automated market makers (AMMs) is extremely challenging due to time-related complexities. Moreover, existing protocols only support single-maturity lending. Building upon the BondMM protocol, this paper argues that its mathematical invariants are sufficiently elegant to be generalized to arbitrary maturities. This paper thus propose an improved design, BondMM-A, which supports lending activities of any maturity. By integrating fixed-income instruments of varying maturities into a single smart contract, BondMM-A offers users and liquidity providers (LPs) greater operational freedom and capital efficiency. Experimental results show that BondMM-A performs excellently in terms of interest rate stability and financial robustness.

Paperid: 3730, https://arxiv.org/pdf/2512.15915.pdf

Abstract:
Hierarchical organization is a fundamental structure in real-world society, where authority and responsibility are delegated from managers to subordinates. The VIRGO network (Virtual Hierarchical Overlay Network for scalable grid computing) provides a scalable overlay for organizing distributed systems but lacks intrinsic security and privacy mechanisms. This paper proposes Private Virtual Tree Networks (PVTNs), a cryptographically enforced extension that leverages the VIRGO overlay to mirror real organizational hierarchies. In PVTNs, join requests are encrypted with the manager's public key to ensure confidentiality, while membership authorization is enforced through manager-signed delegation certificates. Public keys are treated as organizational secrets and are disclosed only within direct manager-member relationships, resulting in a private, non-enumerable virtual tree. Our work demonstrates, through the system model, protocols, security analysis, and design rationale, that PVTNs achieve scalability, dynamic management, and strong security guarantees without relying on global public key infrastructures.

Paperid: 3731, https://arxiv.org/pdf/2512.15782.pdf

Abstract:
Large language models (LLMs) are increasingly deployed behind safety guardrails such as system prompts and content filters, especially in settings where product teams cannot modify model weights. In practice these guardrails are typically hand-tuned, brittle, and difficult to reproduce. This paper studies a simple but practical alternative: treat safety guardrail design itself as a hyperparameter optimization problem over a frozen base model. Concretely, I wrap Mistral-7B-Instruct with modular jailbreak and malware system prompts plus a ModernBERT-based harmfulness classifier, then evaluate candidate configurations on three public benchmarks covering malware generation, classic jailbreak prompts, and benign user queries. Each configuration is scored using malware and jailbreak attack success rate, benign harmful-response rate, and end-to-end latency. A 48-point grid search over prompt combinations and filter modes establishes a baseline. I then run a black-box Optuna study over the same space and show that it reliably rediscovers the best grid configurations while requiring an order of magnitude fewer evaluations and roughly 8x less wall-clock time. The results suggest that viewing safety guardrails as tunable hyperparameters is a feasible way to harden black-box LLM deployments under compute and time constraints.

Paperid: 3732, https://arxiv.org/pdf/2512.15781.pdf

Abstract:
We evaluate adversarial robustness in tabular machine learning models used in financial decision making. Using credit scoring and fraud detection data, we apply gradient based attacks and measure impacts on discrimination, calibration, and financial risk metrics. Results show notable performance degradation under small perturbations and partial recovery through adversarial training.

Paperid: 3734, https://arxiv.org/pdf/2512.14759.pdf

Abstract:
This paper presents a hardware-conscious analysis of the quantum acceleration of the classical 3-round Keccak-256 preimage attack using Grover's Algorithm. While the theoretical quantum speed-up from T_cl=2^{57.8} (classical) to T_qu = 2^{28.9} (quantum) is mathematically sound, the practical implementation overhead is so extreme that attacks remain wholly infeasible in both resource and runtime dimensions. Using Qiskit-based circuit synthesis, we derive that a 3-round Keccak quantum oracle requires: 9,600 Toffoli gates (with uncomputation for reversibility); 3,200 logical qubits (1,600 state + 1,600 auxiliary); 7.47 * 10^{13} total 2-qubit gates (full Grover search); 3.2 million physical qubits (with quantum error correction)PROHIBITIVE; 0.12 years (43 days) to 2,365+ years execution time, depending on machine assumptions. These barriers -- particularly the physical qubit requirements, circuit depth, and error accumulation -- render the quantum attack infeasible for any foreseeable quantum computer. Consequently, SHA-3 security is not threatened by quantum computers for preimage attacks. We emphasize the critical importance of hardware-aware complexity analysis in quantum cryptanalysis: the elegant asymptotic theory of Grover's Algorithm hides an engineering overhead so prohibitive that the quantum approach becomes infeasible from both resource and implementation perspectives.

Paperid: 3735, https://arxiv.org/pdf/2512.14582.pdf

Abstract:
This work presents the first thorough exploration of how reset operations in cloud-based quantum computers could be exploited to run quantum circuits for free. This forms a new type of attack on the economics of cloud-based quantum computers. All major quantum computing companies today offer access to their hardware through some type of cloud-based service. Due to the noisy nature of quantum computers, a quantum circuit is run many times to collect the output statistics, and each run is called a shot. The fees users pay for access to the machines typically depend on the number of these shots of a quantum circuit that are executed. Per-shot pricing is a clean and straightforward approach as users are charged a small fee for each shot of their circuit. This work demonstrates that per-shot pricing can be exploited to get circuits to run for free when users abuse recently implemented mid-circuit qubit measurement and reset operations. Through evaluation on real, cloud-based quantum computers this work shows how multiple circuits can be executed together within a shot, by separating each user circuit by set of reset operations and submitting all the circuits, and reset operations, as one larger circuit. As a result, the user is charged per-shot pricing, even though inside each shot are multiple circuits. Total per-shot cost to run certain circuits could be reduced by up to $900$\% using methods proposed in this work, leading to significant financial losses to quantum computing companies. To address this novel finding, this work proposes a clear approach for how users should be charged for their execution, while maintaining the flexibility and usability of the mid-circuit measurement and reset~operations.

Paperid: 3736, https://arxiv.org/pdf/2512.14422.pdf

Abstract:
The cybersecurity of Industrial Control Systems that manage critical infrastructure such as Water Distribution Systems has become increasingly important as digital connectivity expands. BATADAL benchmark data is a good source of testing intrusion detection techniques, but it presents several important problems, such as imbalance in the number of classes, multivariate time dependence, and stealthy attacks. We consider a hybrid ensemble learning model that will enhance the detection ability of cyber-attacks in WDS by using the complementary capabilities of machine learning and deep learning models. Three base learners, namely, Random Forest , eXtreme Gradient Boosting , and Long Short-Term Memory network, have been strictly compared and seven ensemble types using simple averaged and stacked learning with a logistic regression meta-learner. Random Forest analysis identified top predictors turned into temporal and statistical features, and Synthetic Minority Oversampling Technique (SMOTE) was used to overcome the class imbalance issue. The analyics indicates that the single Long Short-Term Memory network model is of poor performance (F1 = 0.000, AUC = 0.4460), but tree-based models, especially eXtreme Gradient Boosting, perform well (F1 = 0.7470, AUC=0.9684). The hybrid stacked ensemble of Random Forest , eXtreme Gradient Boosting , and Long Short-Term Memory network scored the highest, with the attack class of 0.7205 with an F1-score and a AUC of 0.9826 indicating that the heterogeneous stacking between model precision and generalization can work. The proposed framework establishes a robust and scalable solution for cyber-attack detection in time-dependent industrial systems, integrating temporal learning and ensemble diversity to support the secure operation of critical infrastructure.

Paperid: 3737, https://arxiv.org/pdf/2512.13703.pdf

Abstract:
Large Language Models (LLMs) have demonstrated exceptional performance across various tasks, but their security vulnerabilities can be exploited by attackers to generate harmful content, causing adverse impacts across various societal domains. Most existing jailbreak methods revolve around Prompt Engineering or adversarial optimization, yet we identify a previously overlooked phenomenon: many harmful scenarios are highly consistent with legitimate ones in terms of underlying principles. Based on this finding, this paper proposes the Safe2Harm Semantic Isomorphism Attack method, which achieves efficient jailbreaking through four stages: first, rewrite the harmful question into a semantically safe question with similar underlying principles; second, extract the thematic mapping relationship between the two; third, let the LLM generate a detailed response targeting the safe question; finally, reversely rewrite the safe response based on the thematic mapping relationship to obtain harmful output. Experiments on 7 mainstream LLMs and three types of benchmark datasets show that Safe2Harm exhibits strong jailbreaking capability, and its overall performance is superior to existing methods. Additionally, we construct a challenging harmful content evaluation dataset containing 358 samples and evaluate the effectiveness of existing harmful detection methods, which can be deployed for LLM input-output filtering to enable defense.

Paperid: 3738, https://arxiv.org/pdf/2512.13628.pdf

Abstract:
We study non-interactive zero-knowledge proofs (NIZKs) for NP satisfying: 1) statistical soundness, 2) computational zero-knowledge and 3) certified-everlasting zero-knowledge (CE-ZK). The CE-ZK property allows a verifier of a quantum proof to revoke the proof in a way that can be checked (certified) by the prover. Conditioned on successful certification, the verifier's state can be efficiently simulated with only the statement, in a statistically indistinguishable way. Our contributions regarding these certified-everlasting NIZKs (CE-NIZKs) are as follows: - We identify a barrier to obtaining CE-NIZKs in the CRS model via generalizations of known interactive proofs that satisfy CE-ZK. - We circumvent this by constructing CE-NIZK from black-box use of NIZK for NP satisfying certain properties, along with OWFs. As a result, we obtain CE-NIZKs for NP in the CRS model, based on polynomial hardness of the learning with errors (LWE) assumption. - In addition, we observe that the aforementioned barrier does not apply to the shared EPR model. Consequently, we present a CE-NIZK for NP in this model based on any statistical binding hidden-bits generator, which can be based on LWE. The only quantum computation in this protocol involves single-qubit measurements of the shared EPR pairs.

Paperid: 3739, https://arxiv.org/pdf/2512.13325.pdf

Abstract:
Securing digital text is becoming increasingly relevant due to the widespread use of large language models. Individuals' fear of losing control over data when it is being used to train such machine learning models or when distinguishing model-generated output from text written by humans. Digital watermarking provides additional protection by embedding an invisible watermark within the data that requires protection. However, little work has been taken to analyze and verify if existing digital text watermarking methods are secure and undetectable by large language models. In this paper, we investigate the security-related area of watermarking and machine learning models for text data. In a controlled testbed of three experiments, ten existing Unicode text watermarking methods were implemented and analyzed across six large language models: GPT-5, GPT-4o, Teuken 7B, Llama 3.3, Claude Sonnet 4, and Gemini 2.5 Pro. The findings of our experiments indicate that, especially the latest reasoning models, can detect a watermarked text. Nevertheless, all models fail to extract the watermark unless implementation details in the form of source code are provided. We discuss the implications for security researchers and practitioners and outline future research opportunities to address security concerns.

Paperid: 3740, https://arxiv.org/pdf/2512.13213.pdf

Abstract:
With the rise of cryptocurrencies, many new applications built on decentralized blockchains have emerged. Blockchains are full-stack distributed systems where multiple sub-systems interact. While many deployed blockchains and decentralized applications need better scalability and performance, security is also critical. Due to their complexity, assessing blockchain and DAPP security requires a more holistic view than for traditional distributed or centralized systems. In this thesis, we summarize our contributions to blockchain and decentralized application security. We propose a security reference architecture to support standardized vulnerability and threat analysis. We study consensus security in single-chain Proof-of-Work blockchains, including resistance to selfish mining, undercutting, and greedy transaction selection, as well as related issues in DAG-based systems. We contribute to wallet security with a new classification of authentication schemes and a two-factor method based on One-Time Passwords. We advance e-voting with a practical boardroom voting protocol, extend it to a scalable version for millions of participants while preserving security and privacy, and introduce a repetitive voting framework that enables vote changes between elections while avoiding peak-end effects. Finally, we improve secure logging using blockchains and trusted computing through a centralized ledger that guarantees non-equivocation, integrity, and censorship evidence, then build on it to propose an interoperability protocol for central bank digital currencies that ensures atomic transfers.

Paperid: 3741, https://arxiv.org/pdf/2512.12989.pdf

Abstract:
Cryptographically Relevant Quantum Computers (CRQCs) pose a structural threat to the global digital economy. Algorithms like Shor's factoring and Grover's search threaten to dismantle the public-key infrastructure (PKI) securing sovereign communications and financial transactions. While the timeline for fault-tolerant CRQCs remains probabilistic, the "Store-Now, Decrypt-Later" (SNDL) model necessitates immediate migration to Post-Quantum Cryptography (PQC). This transition is hindered by the velocity of research, evolving NIST standards, and heterogeneous deployment environments. To address this, we present Quantigence, a theory-driven multi-agent AI framework for structured quantum-security analysis. Quantigence decomposes research objectives into specialized roles - Cryptographic Analyst, Threat Modeler, Standards Specialist, and Risk Assessor - coordinated by a supervisory agent. Using "cognitive parallelism," agents reason independently to maintain context purity while execution is serialized on resource-constrained hardware (e.g., NVIDIA RTX 2060). The framework integrates external knowledge via the Model Context Protocol (MCP) and prioritizes vulnerabilities using the Quantum-Adjusted Risk Score (QARS), a formal extension of Mosca's Theorem. Empirical validation shows Quantigence achieves a 67% reduction in research turnaround time and superior literature coverage compared to manual workflows, democratizing access to high-fidelity quantum risk assessment.

Paperid: 3742, https://arxiv.org/pdf/2512.12483.pdf

Abstract:
With the advent of machine learning and quantum computing, the 21st century has gone from a place of relative algorithmic security, to one of speculative unease and possibly, cyber catastrophe. Modern algorithms like Elliptic Curve Cryptography (ECC) are the bastion of current cryptographic security protocols that form the backbone of consumer protection ranging from Hypertext Transfer Protocol Secure (HTTPS) in the modern internet browser, to cryptographic financial instruments like Bitcoin. And there's been very little work put into testing the strength of these ciphers. Practically the only study that I could find was on side-channel recognition, a joint paper from the University of Milan, Italy and King's College, London\cite{battistello2025ecc}. These algorithms are already considered bulletproof by many consumers, but exploits already exist for them, and with computing power and distributed, federated compute on the rise, it's only a matter of time before these current bastions fade away into obscurity, and it's on all of us to stand up when we notice something is amiss, lest we see such passages claim victims in that process. In this paper, we seek to explore the use of modern language model architecture in cracking the association between a known public key, and its associated private key, by intuitively learning to reverse engineer the public keypair generation process, effectively solving the curve. Additonally, we attempt to ascertain modern machine learning's ability to memorize public-private secp256r1 keypairs, and to then test their ability to reverse engineer the public keypair generation process. It is my belief that proof-for would be equally valuable as proof-against in either of these categories. Finally, we'll conclude with some number crunching on where we see this particular field heading in the future.

Paperid: 3743, https://arxiv.org/pdf/2512.12332.pdf

Abstract:
The purpose of this study is to investigate how homophily, memory constraints, and adversarial disruptions collectively shape the resilience and adaptability of complex networks. To achieve this, we develop a new framework that integrates explicit memory decay mechanisms into homophily-based models and systematically evaluate their performance across diverse graph structures and adversarial settings. Our methods involve extensive experimentation on synthetic datasets, where we vary decay functions, reconnection probabilities, and similarity measures, primarily comparing cosine similarity with traditional metrics such as Jaccard similarity and baseline edge weights. The results show that cosine similarity achieves up to a 30\% improvement in stability metrics in sparse, convex, and modular networks. Moreover, the refined value-of-recall metric demonstrates that strategic forgetting can bolster resilience by balancing network robustness and adaptability. The findings underscore the critical importance of aligning memory and similarity parameters with the structural and adversarial dynamics of the network. By quantifying the tangible benefits of incorporating memory constraints into homophily-based analyses, this study offers actionable insights for optimizing real-world applications, including social systems, collaborative platforms, and cybersecurity contexts.

Paperid: 3744, https://arxiv.org/pdf/2512.11107.pdf

Abstract:
We present a fully digital framework that replicates the statistical behavior of coherent-state quantum random number generation (QRNG) by harnessing system timing jitter through random permutation processes. Our approach transforms computational timing variations from hardware and operating system sources into permutation dynamics that generate Poisson-distributed numbers, accurately reproducing the photon statistics of optical coherent states. The theoretical foundation is established by the Uniform Convergence Theorem, which provides exponential convergence to uniformity under modular projection with rigorous error bounds. Extensive experimental validation across multiple parameter regimes and sample sizes up to $10^8$ bytes demonstrates exceptional performance: Shannon entropy approaching 7.999998 bits/byte and min-entropy exceeding 7.99 bits/byte, outperforming theoretical bounds at scale. The architecture inherently resists side-channel attacks through compound timing distributions and adaptive permutation behavior, while operating without classical cryptographic post-processing. Our results establish that coherent-state QRNG functionality can be entirely realized through classical computational processes, delivering mathematically provable uniformity and practical cryptographic security without quantum photonic hardware.

Paperid: 3745, https://arxiv.org/pdf/2512.10280.pdf

Abstract:
The rapid expansion of cloud infrastructures and distributed identity systems has significantly increased the complexity and attack surface of modern enterprises. Traditional rule based or signature driven detection systems are often inadequate in identifying novel or evolving threats within Identity and Access Management logs, where anomalous behavior may appear statistically benign but contextually malicious. This paper presents a Graph Neural Network Based Adaptive Threat Detection framework designed to learn latent user resource interaction patterns from IAM audit trails in real time. By modeling IAM logs as heterogeneous dynamic graphs, the proposed system captures temporal, relational, and contextual dependencies across entities such as users, roles, sessions, and access actions. The model incorporates attention based aggregation and graph embedding updates to enable continual adaptation to changing cloud environments. Experimental evaluation on synthesized and real world IAM datasets demonstrates that the proposed method achieves higher detection precision and recall than baseline LSTM and GCN classifiers, while maintaining scalability across multi tenant cloud environments. The frameworks adaptability enables proactive mitigation of insider threats, privilege escalation, and lateral movement attacks, contributing to the foundation of AI driven zero trust access analytics. This work bridges the gap between graph based machine learning and operational cloud security intelligence.

Paperid: 3746, https://arxiv.org/pdf/2512.10203.pdf

Abstract:
We study Sybil manipulation in BRACE, a competitive equilibrium mechanism for combinatorial exchanges, by treating identity creation as a finite perturbation of the empirical distribution of reported types. Under standard regularity assumptions on the excess demand map and smoothness of principal utilities, we obtain explicit linear bounds on price and welfare deviations induced by bounded Sybil invasion. Using these bounds, we prove a sharp contrast: strategyproofness in the large holds if and only if each principal's share of identities vanishes, whereas any principal with a persistent positive share can construct deviations yielding strictly positive limiting gains. We further show that the feasibility of BRACE fails in the event of an unbounded population of Sybils and provide a precise cost threshold that ensures disincentivization of such attacks in large markets.

Paperid: 3747, https://arxiv.org/pdf/2512.10088.pdf

Abstract:
The Transportation Systems Sector is one of the sixteen critical infrastructure sectors identified by the Cybersecurity and Infrastructure Security Agency (CISA) and plays a crucial role in ensuring public safety, economic stability, and national security. The Massachusetts Bay Transportation Authority (MBTA) serves as the primary public transportation system in the Greater Boston Area, with the Green Line representing one of the oldest and most complex rapid transit systems in the network. This paper presents a network-based risk and resilience assessment of the MBTA Green Line using graph theory, network metrics, and the Model-Based Risk Analysis (MBRA) tool. The original 70-station Green Line network is simplified into a 17-node model, and key metrics, including degree centrality, betweenness centrality, eigenvector centrality, spectral radius, node robustness, and blocking nodes, are computed using Python-based analysis. Critical vulnerability is derived using the MBRA resiliency equation, and random, targeted, and cyber-physical attack scenarios are evaluated. The results identify North Station, Government Center, Haymarket, Copley, and Kenmore as the most critical nodes. A fault tree analysis between Kenmore and Copley further demonstrates the impact of budget allocation on threat reduction. This work highlights key vulnerabilities in the Green Line network and provides actionable recommendations to improve resilience against cyber-physical threats.

Paperid: 3748, https://arxiv.org/pdf/2512.09954.pdf

Abstract:
Next-generation tactical networks face a critical Anonymity Trilemma: it is impossible to simultaneously achieve strong anonymity, low latency (isochrony), and low bandwidth overhead under a global passive adversary. CIDP breaks this deadlock by injecting physical-layer entropy via rapid antenna sidelobe modulation, enabling near-isochronous, low-overhead anonymous communication. CIDP jointly designs: (a) a Lyapunov drift-plus-penalty network controller that stabilizes queues and maximizes entropy injection; (b) a robust discrete-time Control Barrier Function (RaCBF) filter that provably enforces deterministic jitter bounds for real-time flows despite uncertainty; and (c) a convex Sidelobe Time Modulation (SLTM) optimization that spreads signals into the antenna null-space to mask transmissions. We explicitly augment the classical anonymity bound with a physical-layer equivocation term, showing that rapidly changing sidelobes contribute additional secrecy. Consequently, as the injected physical entropy grows, both latency and dummy overhead can approach zero for a fixed anonymity target. We provide full theoretical proofs of queue stability, barrier-set invariance, and SLTM convexity. Moreover, we quantitatively benchmark our SLTM design against recent LPI/LPD schemes, demonstrating significantly lower intercept probability for comparable overhead. High-fidelity MATLAB/NS-3 simulations and an FPGA prototype validate CIDP: results show approximately 40% larger anonymity sets and 100% compliance with sub-30 ms jitter (compared to a Tor-like baseline), with only about 5% throughput loss. We also outline a Modular Open Systems Approach (MOSA) and FOCI-compliant supply-chain strategy. CIDP is the first architecture that simultaneously addresses strong anonymity, strict isochrony, and spectral efficiency with provable guarantees, making it highly relevant for sovereign JADC2 deployments.

Paperid: 3749, https://arxiv.org/pdf/2512.08782.pdf

Abstract:
Hackers may create malicious solidity programs and deploy it in the Ethereum block chain. These malicious smart contracts try to attack legitimate programs by exploiting its vulnerabilities such as reentrancy, tx.origin attack, bad randomness, deligatecall and so on. This may lead to drain of the funds, denial of service and so on . Hence, it is necessary to identify and prevent the malicious smart contract before deploying it into the blockchain. In this paper, we propose an ML based malicious smart contract detection mechanism by analyzing the EVM opcodes. After balancing the opcode frequency dataset with SMOTE algorithm, we transformed opcode frequencies to the binary values (0,1) using an entropy based supervised binning method. Then, an explainable AI model is trained with the proposed binary opcode based features. From the implementations, we found that the proposed mechanism can detect 99% of malicious smart contracts with a false positive rate of only 0.01. Finally, we incorporated LIME algorithm in our classifier to justify its predictions. We found that, LIME algorithm can explain why a particular smart contract app is declared as malicious by our ML classifier based on the binary value of EVM opcodes.

Paperid: 3750, https://arxiv.org/pdf/2512.07866.pdf

Abstract:
The sophistication of modern malware, specifically regarding communication with Command and Control (C2) servers, has rendered static blacklist-based defenses obsolete. The use of Domain Generation Algorithms (DGA) allows attackers to generate thousands of dynamic addresses daily, hindering blocking by traditional firewalls. This paper aims to propose and evaluate a method for detecting DGA domains using Deep Learning and Natural Language Processing (NLP) techniques. The methodology consisted of collecting a hybrid database containing 50,000 legitimate and 50,000 malicious domains, followed by the extraction of lexical features and the training of a Recurrent Neural Network (LSTM). Results demonstrated that while statistical entropy analysis is effective for simple DGAs, the Neural Network approach presents superiority in detecting complex patterns, reaching 97.2% accuracy and reducing the false positive rate in ambiguous lawful traffic scenarios.

Paperid: 3751, https://arxiv.org/pdf/2512.07827.pdf

Abstract:
The escalating sophistication and variety of cyber threats have rendered static honeypots inadequate, necessitating adaptive, intelligence-driven deception. In this work, ADLAH is introduced: an Adaptive Deep Learning Anomaly Detection Honeynet designed to maximize high-fidelity threat intelligence while minimizing cost through autonomous orchestration of infrastructure. The principal contribution is offered as an end-to-end architectural blueprint and vision for an AI-driven deception platform. Feasibility is evidenced by a functional prototype of the central decision mechanism, in which a reinforcement learning (RL) agent determines, in real time, when sessions should be escalated from low-interaction sensor nodes to dynamically provisioned, high-interaction honeypots. Because sufficient live data were unavailable, field-scale validation is not claimed; instead, design trade-offs and limitations are detailed, and a rigorous roadmap toward empirical evaluation at scale is provided. Beyond selective escalation and anomaly detection, the architecture pursues automated extraction, clustering, and versioning of bot attack chains, a core capability motivated by the empirical observation that exposed services are dominated by automated traffic. Together, these elements delineate a practical path toward cost-efficient capture of high-value adversary behavior, systematic bot versioning, and the production of actionable threat intelligence.

Paperid: 3752, https://arxiv.org/pdf/2512.06925.pdf

Abstract:
Phishing is a cybercrime in which individuals are deceived into revealing personal information, often resulting in financial loss. These attacks commonly occur through fraudulent messages, misleading advertisements, and compromised legitimate websites. This study proposes a Quantile Regression Deep Q-Network (QR-DQN) approach that integrates RoBERTa semantic embeddings with handcrafted lexical features to enhance phishing detection while accounting for uncertainties. Unlike traditional DQN methods that estimate single scalar Q-values, QR-DQN leverages quantile regression to model the distribution of returns, improving stability and generalization on unseen phishing data. A diverse dataset of 105,000 URLs was curated from PhishTank, OpenPhish, Cloudflare, and other sources, and the model was evaluated using an 80/20 train-test split. The QR-DQN framework achieved a test accuracy of 99.86%, precision of 99.75%, recall of 99.96%, and F1-score of 99.85%, demonstrating high effectiveness. Compared to standard DQN with lexical features, the hybrid QR-DQN with lexical and semantic features reduced the generalization gap from 1.66% to 0.04%, indicating significant improvement in robustness. Five-fold cross-validation confirmed model reliability, yielding a mean accuracy of 99.90% with a standard deviation of 0.04%. These results suggest that the proposed hybrid approach effectively identifies phishing threats, adapts to evolving attack strategies, and generalizes well to unseen data.

Paperid: 3753, https://arxiv.org/pdf/2512.06659.pdf

Abstract:
Cybersecurity has become one of the earliest adopters of agentic AI, as security operations centers increasingly rely on multi-step reasoning, tool-driven analysis, and rapid decision-making under pressure. While individual large language models can summarize alerts or interpret unstructured reports, they fall short in real SOC environments that require grounded data access, reproducibility, and accountable workflows. In response, the field has seen a rapid architectural evolution from single-model helpers toward tool-augmented agents, distributed multi-agent systems, schema-bound tool ecosystems, and early explorations of semi-autonomous investigative pipelines. This survey presents a five-generation taxonomy of agentic AI in cybersecurity. It traces how capabilities and risks change as systems advance from text-only LLM reasoners to multi-agent collaboration frameworks and constrained-autonomy pipelines. We compare these generations across core dimensions - reasoning depth, tool use, memory, reproducibility, and safety. In addition, we also synthesize emerging benchmarks used to evaluate cyber-oriented agents. Finally, we outline the unresolved challenges that accompany this evolution, such as response validation, tool-use correctness, multi-agent coordination, long-horizon reasoning, and safeguards for high-impact actions. Collectively, this work provides a structured perspective on how agentic AI is taking shape within cybersecurity and what is required to ensure its safe and reliable deployment.

Paperid: 3754, https://arxiv.org/pdf/2512.06467.pdf

Abstract:
In distributed applications, like swarms of satellites, machine learning can be efficiently applied even on small devices by using Federated Learning (FL). This allows to reduce the learning complexity by transmitting only updates to the general model in the server in the form of differences in stochastic gradient descent. FL naturally supports differential privacy but new attacks, so called Data Leakage from Gradient (DLG) have been discovered recently. There has been work on defenses against DLG but there is a lack of foundation and rigorous evaluation of their security. In the current work, we extend existing work on a formal notion of Differential Privacy for Federated Learning distributed dynamic systems and relate it to the notion of the attacker advantage. This formalisation is carried out within the Isabelle Insider and Infrastructure framework (IIIf) allowing the machine supported verification of theory and applications within the proof assistant Isabelle. Satellite swarm systems are used as a motivating use case but also as a validation case study.

Paperid: 3755, https://arxiv.org/pdf/2512.06396.pdf

Abstract:
The increasing complexity of cyber threats in distributed environments demands advanced frameworks for real-time detection and response across multimodal data streams. This paper introduces AgenticCyber, a generative AI powered multi-agent system that orchestrates specialized agents to monitor cloud logs, surveillance videos, and environmental audio concurrently. The solution achieves 96.2% F1-score in threat detection, reduces response latency to 420 ms, and enables adaptive security posture management using multimodal language models like Google's Gemini coupled with LangChain for agent orchestration. Benchmark datasets, such as AWS CloudTrail logs, UCF-Crime video frames, and UrbanSound8K audio clips, show greater performance over standard intrusion detection systems, reducing mean time to respond (MTTR) by 65% and improving situational awareness. This work introduces a scalable, modular proactive cybersecurity architecture for enterprise networks and IoT ecosystems that overcomes siloed security technologies with cross-modal reasoning and automated remediation.

Paperid: 3756, https://arxiv.org/pdf/2512.06048.pdf

Abstract:
Cybersecurity's evolving complexity presents unique challenges and opportunities for AI research and practice. This paper shares key lessons and insights from designing, building, and operating production-grade GenAI pipelines in cybersecurity, with a focus on the continual adaptation required to keep pace with ever-shifting knowledge bases, tooling, and threats. Our goal is to provide an actionable perspective for AI practitioners and industry stakeholders navigating the frontier of GenAI for cybersecurity, with particular attention to how different adaptation mechanisms complement each other in end-to-end systems. We present practical guidance derived from real-world deployments, propose best practices for leveraging retrieval- and model-level adaptation, and highlight open research directions for making GenAI more robust, precise, and auditable in cyber defense.

Paperid: 3757, https://arxiv.org/pdf/2512.05288.pdf

Abstract:
Malicious WebShells pose a significant and evolving threat by compromising critical digital infrastructures and endangering public services in sectors such as healthcare and finance. While the research community has made significant progress in WebShell detection (i.e., distinguishing malicious samples from benign ones), we argue that it is time to transition from passive detection to in-depth analysis and proactive defense. One promising direction is the automation of WebShell family classification, which involves identifying the specific malware lineage in order to understand an adversary's tactics and enable a precise, rapid response. This crucial task, however, remains a largely unexplored area that currently relies on slow, manual expert analysis. To address this gap, we present the first systematic study to automate WebShell family classification. Our method begins with extracting dynamic function call traces to capture inherent behaviors that are resistant to common encryption and obfuscation. To enhance the scale and diversity of our dataset for a more stable evaluation, we augment these real-world traces with new variants synthesized by Large Language Models. These augmented traces are then abstracted into sequences, graphs, and trees, providing a foundation to benchmark a comprehensive suite of representation methods. Our evaluation spans classic sequence-based embeddings (CBOW, GloVe), transformers (BERT, SimCSE), and a range of structure-aware algorithms, including Graph Kernels, Graph Edit Distance, Graph2Vec, and various Graph Neural Networks. Through extensive experiments on four real-world, family-annotated datasets under both supervised and unsupervised settings, we establish a robust baseline and provide practical insights into the most effective combinations of data abstractions, representation models, and learning paradigms for this challenge.

Paperid: 3758, https://arxiv.org/pdf/2512.04237.pdf

Abstract:
This work introduces the Primitive Vector Cipher (PVC), a novel hybrid encryption scheme integrating matrix-based cryptography with advanced Diffie-Hellman key exchange. PVC's security is grounded on the established hardness of the Vector Computational Diffie- Hellman (V-CDH) problem. The two-layered design uses HKDF to mask plaintext via a DH-authenticated shared primitive vector and randomize cipher blocks with a per-block offset. This approach eliminates deterministic repetitions and provides strong resistance against linear and known-plaintext attacks. PVC's block-wise structure allows for massive parallelism and excellent linear scaling. Security is formally analyzed, demonstrating INDCPA security under V-CDH. STS protocol integration elevates security toward IND-CCA guarantees.

Paperid: 3759, https://arxiv.org/pdf/2512.03465.pdf

Abstract:
In this study, we more rigorously evaluated our attack script $\textit{TraceTarnish}$, which leverages adversarial stylometry principles to anonymize the authorship of text-based messages. To ensure the efficacy and utility of our attack, we sourced, processed, and analyzed Reddit comments--comments that were later alchemized into $\textit{TraceTarnish}$ data--to gain valuable insights. The transformed $\textit{TraceTarnish}$ data was then further augmented by $\textit{StyloMetrix}$ to manufacture stylometric features--features that were culled using the Information Gain criterion, leaving only the most informative, predictive, and discriminative ones. Our results found that function words and function word types ($L\_FUNC\_A$ $\&$ $L\_FUNC\_T$); content words and content word types ($L\_CONT\_A$ $\&$ $L\_CONT\_T$); and the Type-Token Ratio ($ST\_TYPE\_TOKEN\_RATIO\_LEMMAS$) yielded significant Information-Gain readings. The identified stylometric cues--function-word frequencies, content-word distributions, and the Type-Token Ratio--serve as reliable indicators of compromise (IoCs), revealing when a text has been deliberately altered to mask its true author. Similarly, these features could function as forensic beacons, alerting defenders to the presence of an adversarial stylometry attack; granted, in the absence of the original message, this signal may go largely unnoticed, as it appears to depend on a pre- and post-transformation comparison. "In trying to erase a trace, you often imprint a larger one." Armed with this understanding, we framed $\textit{TraceTarnish}$'s operations and outputs around these five isolated features, using them to conceptualize and implement enhancements that further strengthen the attack.

Paperid: 3760, https://arxiv.org/pdf/2512.03088.pdf

Abstract:
As data is an essential asset for any Web3 application, selecting an oracle is a critical decision for its success. To date, academic research has mainly focused on improving oracle technology and internal economics, while the drivers of oracle choice on the client side remain largely unexplored. This study fills this gap by gathering insights from leading Web3 protocols, uncovering their rationale for oracle selection and their preferences when deciding whether to outsource or internalize data request mechanisms. The collected data covers more than 55% of the DeFi market cap and is obtained exclusively by protocol executives, board members, or delegates. Insights support the view that protocol choices are tied to technological dependencies, where immutability of smart contracts amplifies lock-in, preventing agile switching among data providers. Furthermore, when viable third-party solutions exist, protocols overwhelmingly prefer outsourcing rather than building and maintaining internal oracle mechanisms.

Paperid: 3761, https://arxiv.org/pdf/2512.03079.pdf

Abstract:
Large Language Models (LLMs) have demonstrated exceptional capabilities in natural language understanding and generation. Based on these LLMs, businesses have started to provide Embeddings-as-a-Service (EaaS), offering feature extraction capabilities (in the form of text embeddings) that benefit downstream natural language processing tasks. However, prior research has demonstrated that EaaS is vulnerable to imitation attacks, where an attacker clones the service's model in a black-box manner without access to the model's internal workings. In response, watermarks have been added to the text embeddings to protect the intellectual property of EaaS providers by allowing them to check for model ownership. This thesis focuses on defending against imitation attacks by investigating EaaS watermarks. To achieve this goal, we unveil novel attacks and propose and validate new watermarking techniques. Firstly, we show that existing EaaS watermarks can be removed through paraphrasing the input text when attackers clone the model during imitation attacks. Our study illustrates that paraphrasing can effectively bypass current state-of-the-art EaaS watermarks across various attack setups (including different paraphrasing techniques and models) and datasets in most instances. This demonstrates a new vulnerability in recent EaaS watermarking techniques. Subsequently, as a countermeasure, we propose a novel watermarking technique, WET (Watermarking EaaS with Linear Transformation), which employs linear transformation of the embeddings. Watermark verification is conducted by applying a reverse transformation and comparing the similarity between recovered and original embeddings. We demonstrate its robustness against paraphrasing attacks with near-perfect verifiability. We conduct detailed ablation studies to assess the significance of each component and hyperparameter in WET.

Paperid: 3762, https://arxiv.org/pdf/2512.02822.pdf

Abstract:
We present a novel approach to post-quantum cryptography that employs directed-graph decryption of noise-enhanced high-memory convolutional codes. The proposed construction generates random-like generator matrices that effectively conceal algebraic structure and resist known structural attacks. Security is further reinforced by the deliberate injection of strong noise during decryption, arising from polynomial division: while legitimate recipients retain polynomial-time decoding, adversaries face exponential-time complexity. As a result, the scheme achieves cryptanalytic security margins surpassing those of Classic McEliece by factors exceeding 2^(200). Beyond its enhanced security, the method offers greater design flexibility, supporting arbitrary plaintext lengths with linear-time decryption and uniform per-bit computational cost, enabling seamless scalability to long messages. Practical deployment is facilitated by parallel arrays of directed-graph decoders, which identify the correct plaintext through polynomial ambiguity while allowing efficient hardware and software implementations. Altogether, the scheme represents a compelling candidate for robust, scalable, and quantum-resistant public-key cryptography.

Paperid: 3763, https://arxiv.org/pdf/2512.02598.pdf

Abstract:
Public-key cryptography algorithms have evolved towards increasing computational complexity to hide desired messages, which is accelerating with the development of the Internet and quantum computing. This paper introduces a novel public-key cryptography algorithm that generates ciphertexts by counting the number of elements in randomly extracted subsets from a multiset. After explaining the novel cryptographic concept, the process of mathematically refining it using satisfiability problems is described. The advantages of the proposed algorithm are: first, it is significantly faster than other public-key algorithms; second, it does not require big numbers, making it executable on any devices; and third, it can be easily extended into a public-key cryptosystem using a single public key and multiple private keys while maintaining quantum resistance.

Paperid: 3764, https://arxiv.org/pdf/2512.02082.pdf

Abstract:
As the Deterministic Random Bit Generator (DRBG) serves as a fundamental component in random number generation and cryptographic applications, its performance and security are particularly critical in resource-constrained embedded systems, where memory capacity and computational efficiency are limited. Establishing a high-performance and secure DRBG is therefore an important issue for embedded system design. Furthermore, the National Institute of Standards and Technology (NIST) established the Ascon lightweight cryptographic standard in August 2025, which is suitable for use in resource-limited embedded environments. Therefore, this study revises the DRBG standard and proposes three Ascon-driven constructions: the Ascon-Driven Hash-Based DRBG, the Ascon-Driven keyed-Hash Message Authentication Code (HMAC) DRBG, and the Ascon-Driven Counter-mode (CTR) DRBG. In the experiments, these methods are implemented on a Raspberry Pi platform. The experimental results demonstrate that the proposed approaches achieve higher computational efficiency and lower memory usage compared with existing DRBG implementations, making them suitable for deployment in embedded systems.

Paperid: 3765, https://arxiv.org/pdf/2512.01974.pdf

Abstract:
Fast time-domain algorithms have been developed in signal processing applications to reduce the multiplication complexity. For example, fast convolution structures using Cook-Toom and Winograd algorithms are well understood. Short length fast convolutions can be iterated to obtain fast convolution structures for long lengths. In this paper, we show that well known fast convolution structures form the basis for design of fast algorithms in four other problem domains: fast parallel filters, fast polynomial modular multiplication, and fast pointwise multiplication in the DFT and NTT domains. Fast polynomial modular multiplication and fast pointwise multiplication problems are important for cryptosystem applications such as post-quantum cryptography and homomorphic encryption. By establishing the equivalence of these problems, we show that a fast structure from one domain can be used to design a fast structure for another domain. This understanding is important as there are many well known solutions for fast convolution that can be used in other signal processing and cryptosystem applications.

Paperid: 3766, https://arxiv.org/pdf/2512.00635.pdf

Abstract:
The gamut of todays internet-connected embedded devices has led to increased concerns regarding the security and confidentiality of data. Most internet-connected embedded devices employ mathematically secure cryptographic algorithms to address security vulnerabilities. Despite such mathematical guarantees, as these algorithms are often implemented in silicon, they leak critical information in terms of power consumption, electromagnetic (EM) radiation, timing, cache hits and misses, photonic emission and so on, leading to side-channel analysis (SCA) attacks. This thesis focuses on low overhead generic circuit-level yet synthesizable countermeasures against power and EM SCA. Existing countermeasures (including proposed) still have relatively high overhead which bars them from being used in energy-constraint IoT devices. We propose a zero-overhead integrated inductive sensor which is able to detect i)EM SCA ii) Clock glitch-based Fault Injection Attack (FIA), and iii) Voltage-glitch based Fault Injection Attack by using a simple ML algorithm. Advent of quantum computer research will open new possibilities for theoretical attacks against existing cryptographic protocols. National Institute of Standard & Technology (NIST) has standardized post-quantum cryptographic algorithms to secure crypto-systems against quantum adversary. I contribute to the standardization procedure by introducing the first silicon-verified Saber (a NIST finalist modulo Learning with Rounding scheme) which consumes lowest energy and area till date amongst all the candidates.

Paperid: 3767, https://arxiv.org/pdf/2512.00595.pdf

Abstract:
Modern AI inference faces an irreducible tension: no single computational resource simultaneously maximizes performance, preserves privacy, minimizes cost, and maintains trust. Existing orchestration frameworks optimize single dimensions (Kubernetes prioritizes latency, federated learning preserves privacy, edge computing reduces network distance), creating solutions that struggle under real-world heterogeneity. We present IslandRun, a multi-objective orchestration system that treats computational resources as autonomous "islands" spanning personal devices, private edge servers, and public cloud. Our key insights: (1) request-level heterogeneity demands policy-constrained multi-objective optimization, (2) data locality enables routing compute to data rather than data to compute, and (3) typed placeholder sanitization preserves context semantics across trust boundaries. IslandRun introduces agent-based routing, tiered island groups with differential trust, and reversible anonymization. This establishes a new paradigm for privacy-aware, decentralized inference orchestration across heterogeneous personal computing ecosystems.

Paperid: 3768, https://arxiv.org/pdf/2512.00480.pdf

Abstract:
Retrieving up-to-date information from a publicly accessible database poses significant threats to the user's privacy. {\em Private information retrieval} (PIR) protocols allow a user to retrieve any entry from a database, without revealing the identity of the entry being retrieved to the server(s). Such protocols have found numerous applications in both theoretical studies and real-life scenarios. The existing PIR constructions mainly give multi-server {\em information-theoretic} PIR (IT-PIR) protocols or single-server computational PIR (CPIR) protocols. Compared with CPIR, IT-PIR protocols are computationally more efficient and secure in the presence of unbounded servers. The most classical and challenging problem in the realm of IT-PIR is constructing protocols with lower {\em communication complexity}. In this review, we introduce a new discrete structure called {\em families of orthogonal arrays with span capability} (FOASC) and propose a unified framework for constructing IT-PIR protocols. We show how the most influential IT-PIR protocols in the literature can be captured by the framework. We also put forward several interesting open problems concerning FOASC, whose solutions may result in innovative IT-PIR protocols.

Paperid: 3769, https://arxiv.org/pdf/2512.00110.pdf

Abstract:
Constant-size cryptographic evidence records are increasingly used to build audit trails for regulated AI workloads in clinical, pharmaceutical, and financial settings, where each execution is summarized by a compact, verifiable record of code identity, model version, data digests, and platform measurements. Existing instantiations, however, typically rely on classical signature schemes whose long-term security is threatened by quantum-capable adversaries. In this paper we formalize security notions for evidence structures in the presence of quantum adversaries and study post-quantum (PQ) instantiations and migration strategies for deployed audit logs. We recall an abstraction of constant-size evidence structures and introduce game-based definitions of Q-Audit Integrity, Q-Non-Equivocation, and Q-Binding, capturing the inability of a quantum adversary to forge, equivocate, or rebind evidence items. We then analyze a hash-and-sign instantiation in the quantum random-oracle model (QROM), assuming an existentially unforgeable PQ signature scheme against quantum adversaries, and show that the resulting evidence structure satisfies these notions under standard assumptions. Building on this, we present three migration patterns for existing evidence logs: hybrid signatures, re-signing of legacy evidence, and Merkle-root anchoring, and analyze their security, storage, and computational trade-offs. A case study based on an industrial constant-size evidence platform for regulated AI at Codebat Technologies Inc. suggests that quantum-safe audit trails are achievable with moderate overhead and that systematic migration can significantly extend the evidentiary lifetime of existing deployments.

Paperid: 3770, https://arxiv.org/pdf/2511.23026.pdf

Abstract:
Every day we share our personal information through digital systems which are constantly exposed to threats. For this reason, security-oriented disciplines of signal processing have received increasing attention in the last decades: multimedia forensics, digital watermarking, biometrics, network monitoring, steganography and steganalysis are just a few examples. Even though each of these fields has its own peculiarities, they all have to deal with a common problem: the presence of one or more adversaries aiming at making the system fail. Adversarial Signal Processing lays the basis of a general theory that takes into account the impact that the presence of an adversary has on the design of effective signal processing tools. By focusing on the application side of Adversarial Signal Processing, namely adversarial information fusion in distributed sensor networks, and adopting a game-theoretic approach, this thesis contributes to the above mission by addressing four issues. First, we address decision fusion in distributed sensor networks by developing a novel soft isolation defense scheme that protect the network from adversaries, specifically, Byzantines. Second, we develop an optimum decision fusion strategy in the presence of Byzantines. In the next step, we propose a technique to reduce the complexity of the optimum fusion by relying on a novel near-optimum message passing algorithm based on factor graphs. Finally, we introduce a defense mechanism to protect decentralized networks running consensus algorithm against data falsification attacks.

Paperid: 3771, https://arxiv.org/pdf/2511.22095.pdf

Abstract:
Deep learning research for binary analysis faces a critical infrastructure gap. Today, existing datasets target single platforms, require specialized tooling, or provide only hand-engineered features incompatible with modern neural architectures; no single dataset supports accessible research and pedagogy on realistic use cases. To solve this, we introduce Binary-30K, the first heterogeneous binary dataset designed for sequence-based models like transformers. Critically, Binary-30K covers Windows, Linux, macOS, and Android across 15+ CPU architectures. With 29,793 binaries and approximately 26.93% malware representation, Binary-30K enables research on platform-invariant detection, cross-target transfer learning, and long-context binary understanding. The dataset provides pre-computed byte-level BPE tokenization alongside comprehensive structural metadata, supporting both sequence modeling and structure-aware approaches. Platform-first stratified sampling ensures representative coverage across operating systems and architectures, while distribution via Hugging Face with official train/validation/test splits enables reproducible benchmarking. The dataset is publicly available at https://huggingface.co/datasets/mjbommar/binary-30k, providing an accessible resource for researchers, practitioners, and students alike.

Paperid: 3772, https://arxiv.org/pdf/2511.22047.pdf

Abstract:
Large Language Model (LLM) safety guardrail models have emerged as a primary defense mechanism against harmful content generation, yet their robustness against sophisticated adversarial attacks remains poorly characterized. This study evaluated ten publicly available guardrail models from Meta, Google, IBM, NVIDIA, Alibaba, and Allen AI across 1,445 test prompts spanning 21 attack categories. While Qwen3Guard-8B achieved the highest overall accuracy (85.3%, 95% CI: 83.4-87.1%), a critical finding emerged when separating public benchmark prompts from novel attacks: all models showed substantial performance degradation on unseen prompts, with Qwen3Guard dropping from 91.0% to 33.8% (a 57.2 percentage point gap). In contrast, Granite-Guardian-3.2-5B showed the best generalization with only a 6.5% gap. A "helpful mode" jailbreak was also discovered where two guardrail models (Nemotron-Safety-8B, Granite-Guardian-3.2-5B) generated harmful content instead of blocking it, representing a novel failure mode. These findings suggest that benchmark performance may be misleading due to training data contamination, and that generalization ability, not overall accuracy, should be the primary metric for guardrail evaluation.

Paperid: 3773, https://arxiv.org/pdf/2511.21901.pdf

Abstract:
The accelerating deployment of artificial intelligence systems across regulated sectors has exposed critical fragmentation in risk assessment methodologies. A significant "language barrier" currently separates technical security teams, who focus on algorithmic vulnerabilities (e.g., MITRE ATLAS), from legal and compliance professionals, who address regulatory mandates (e.g., EU AI Act, NIST AI RMF). This disciplinary disconnect prevents the accurate translation of technical vulnerabilities into financial liability, leaving practitioners unable to answer fundamental economic questions regarding contingency reserves, control return-on-investment, and insurance exposure. To bridge this gap, this research presents the AI System Threat Vector Taxonomy, a structured ontology designed explicitly for Quantitative Risk Assessment (QRA). The framework categorizes AI-specific risks into nine critical domains: Misuse, Poisoning, Privacy, Adversarial, Biases, Unreliable Outputs, Drift, Supply Chain, and IP Threat, integrating 53 operationally defined sub-threats. Uniquely, each domain maps technical vectors directly to business loss categories (Confidentiality, Integrity, Availability, Legal, Reputation), enabling the translation of abstract threats into measurable financial impact. The taxonomy is empirically validated through an analysis of 133 documented AI incidents from 2025 (achieving 100% classification coverage) and reconciled against the main AI risk frameworks. Furthermore, it is explicitly aligned with ISO/IEC 42001 controls and NIST AI RMF functions to facilitate auditability.

Paperid: 3774, https://arxiv.org/pdf/2511.21757.pdf

Abstract:
The integration of Large Language Models (LLMs) into healthcare demands a safety paradigm rooted in \textit{primum non nocere}. However, current alignment techniques rely on generic definitions of harm that fail to capture context-dependent violations, such as administrative fraud and clinical discrimination. To address this, we introduce Medical Malice: a dataset of 214,219 adversarial prompts calibrated to the regulatory and ethical complexities of the Brazilian Unified Health System (SUS). Crucially, the dataset includes the reasoning behind each violation, enabling models to internalize ethical boundaries rather than merely memorizing a fixed set of refusals. Using an unaligned agent (Grok-4) within a persona-driven pipeline, we synthesized high-fidelity threats across seven taxonomies, ranging from procurement manipulation and queue-jumping to obstetric violence. We discuss the ethical design of releasing these "vulnerability signatures" to correct the information asymmetry between malicious actors and AI developers. Ultimately, this work advocates for a shift from universal to context-aware safety, providing the necessary resources to immunize healthcare AI against the nuanced, systemic threats inherent to high-stakes medical environments -- vulnerabilities that represent the paramount risk to patient safety and the successful integration of AI in healthcare systems.

Paperid: 3775, https://arxiv.org/pdf/2511.21291.pdf

Abstract:
Backdoor attacks pose severe security threats to deep neural networks by embedding malicious triggers that force misclassification. While machine unlearning techniques can remove backdoor behaviors, current methods lack transparency and real-time interpretability. This paper introduces a novel framework that integrates Gradient-weighted Class Activation Mapping (Grad-CAM) into the unlearning process to provide real-time monitoring and explainability. We propose the Trigger Attention Ratio (TAR) metric to quantitatively measure the model's attention shift from trigger patterns to legitimate object features. Our balanced unlearning strategy combines gradient ascent on backdoor samples, Elastic Weight Consolidation (EWC) for catastrophic forgetting prevention, and a recovery phase for clean accuracy restoration. Experiments on CIFAR-10 with BadNets attacks demonstrate that our approach reduces Attack Success Rate (ASR) from 96.51% to 5.52% while retaining 99.48% of clean accuracy (82.06%), achieving a 94.28% ASR reduction. The integration of explainable AI enables transparent, observable, and verifiable backdoor removal.

Paperid: 3776, https://arxiv.org/pdf/2511.20505.pdf

Abstract:
Cryptographic identity anchors modern decentralized systems, yet current standards like BIP-39 and BIP-32 are structurally insufficient for the demands of multi-curve, multi-domain, and post-quantum (PQC) environments. These legacy schemes rely on a monolithic identity root with no inherent context isolation, algorithm agility, or secure secret rotation. This paper introduces MSCIKDF, a single-root, multi-curve, context-isolated, PQC-pluggable cryptographic identity primitive. MSCIKDF defines a new architectural foundation where identity is derived deterministically but with cryptographically enforced separation across diverse contexts (e.g., blockchain, E2EE, KMS, IoT). It achieves strong security invariants -- such as zero-linkability, multi-curve independence, and resistance to cross-context correlation -- while offering stateless secret rotation that preserves long-term identity continuity without requiring asset migration. MSCIKDF is proposed as an infrastructure-level upgrade to deterministic identity, establishing a durable and algorithm-agnostic root of trust suitable for the next decade of distributed systems, AI agents, and PQC migration.

Paperid: 3777, https://arxiv.org/pdf/2511.19902.pdf

Abstract:
Recent advances in artificial intelligence (AI), particularly deep learning, have led to widespread adoption across various applications. Yet, a fundamental challenge persists: how can we verify the correctness of AI model inference when model owners cannot (or will not) reveal their parameters? These parameters represent enormous training costs and valuable intellectual property, making transparent verification difficult. In this paper, we introduce a zero-knowledge framework capable of verifying deep learning inference without exposing model internal parameters. Built on recursively composed zero-knowledge proofs and requiring no trusted setup, our framework supports both linear and nonlinear neural network layers, including matrix multiplication, normalization, softmax, and SiLU. Leveraging the Fiat-Shamir heuristic, we obtain a succinct non-interactive argument of knowledge (zkSNARK) with constant-size proofs. To demonstrate the practicality of our approach, we translate the DeepSeek model into a fully SNARK-verifiable version named ZK-DeepSeek and show experimentally that our framework delivers both efficiency and flexibility in real-world AI verification workloads.

Paperid: 3778, https://arxiv.org/pdf/2511.19874.pdf

Abstract:
As AI agents become integral to enterprise workflows, their reliance on shared tool libraries and pre-trained components creates significant supply chain vulnerabilities. While previous work has demonstrated behavioral backdoor detection within individual LLM architectures, the critical question of cross-LLM generalization remains unexplored, a gap with serious implications for organizations deploying multiple AI systems. We present the first systematic study of cross-LLM behavioral backdoor detection, evaluating generalization across six production LLMs (GPT-5.1, Claude Sonnet 4.5, Grok 4.1, Llama 4 Maverick, GPT-OSS 120B, and DeepSeek Chat V3.1). Through 1,198 execution traces and 36 cross-model experiments, we quantify a critical finding: single-model detectors achieve 92.7% accuracy within their training distribution but only 49.2% across different LLMs, a 43.4 percentage point generalization gap equivalent to random guessing. Our analysis reveals that this gap stems from model-specific behavioral signatures, particularly in temporal features (coefficient of variation > 0.8), while structural features remain stable across architectures. We show that model-aware detection incorporating model identity as an additional feature achieves 90.6% accuracy universally across all evaluated models. We release our multi-LLM trace dataset and detection framework to enable reproducible research.

Paperid: 3779, https://arxiv.org/pdf/2511.19727.pdf

Abstract:
Large Language Models (LLMs) remain vulnerable to prompt injection attacks, representing the most significant security threat in production deployments. We present Prompt Fencing, a novel architectural approach that applies cryptographic authentication and data architecture principles to establish explicit security boundaries within LLM prompts. Our approach decorates prompt segments with cryptographically signed metadata including trust ratings and content types, enabling LLMs to distinguish between trusted instructions and untrusted content. While current LLMs lack native fence awareness, we demonstrate that simulated awareness through prompt instructions achieved complete prevention of injection attacks in our experiments, reducing success rates from 86.7% (260/300 successful attacks) to 0% (0/300 successful attacks) across 300 test cases with two leading LLM providers. We implement a proof-of-concept fence generation and verification pipeline with a total overhead of 0.224 seconds (0.130s for fence generation, 0.094s for validation) across 100 samples. Our approach is platform-agnostic and can be incrementally deployed as a security layer above existing LLM infrastructure, with the expectation that future models will be trained with native fence awareness for optimal security.

Paperid: 3780, https://arxiv.org/pdf/2511.18790.pdf

Abstract:
Content moderation pipelines for modern large language models combine static filters, dedicated moderation services, and alignment tuned base models, yet real world deployments still exhibit dangerous failure modes. This paper presents RoguePrompt, an automated jailbreak attack that converts a disallowed user query into a self reconstructing prompt which passes provider moderation while preserving the original harmful intent. RoguePrompt partitions the instruction across two lexical streams, applies nested classical ciphers, and wraps the result in natural language directives that cause the target model to decode and execute the hidden payload. Our attack assumes only black box access to the model and to the associated moderation endpoint. We instantiate RoguePrompt against GPT 4o and evaluate it on 2 448 prompts that a production moderation system previously marked as strongly rejected. Under an evaluation protocol that separates three security relevant outcomes bypass, reconstruction, and execution the attack attains 84.7 percent bypass, 80.2 percent reconstruction, and 71.5 percent full execution, substantially outperforming five automated jailbreak baselines. We further analyze the behavior of several automated and human aligned evaluators and show that dual layer lexical transformations remain effective even when detectors rely on semantic similarity or learned safety rubrics. Our results highlight systematic blind spots in current moderation practice and suggest that robust deployment will require joint reasoning about user intent, decoding workflows, and model side computation rather than surface level toxicity alone.

Paperid: 3781, https://arxiv.org/pdf/2511.18272.pdf

Abstract:
Large vision-language models (VLMs) are increasingly deployed for optical character recognition (OCR) in healthcare settings, raising critical concerns about protected health information (PHI) exposure during document processing. This work presents the first systematic evaluation of inference-time vision token masking as a privacy-preserving mechanism for medical document OCR using DeepSeek-OCR. We introduce seven masking strategies (V3-V9) targeting different architectural layers (SAM encoder blocks, compression layers, dual vision encoders, projector fusion) and evaluate PHI reduction across HIPAA-defined categories using 100 synthetic medical billing statements (drawn from a corpus of 38,517 annotated documents) with perfect ground-truth annotations. All masking strategies converge to 42.9% PHI reduction, successfully suppressing long-form spatially-distributed identifiers (patient names, dates of birth, physical addresses at 100% effectiveness) while failing to prevent short structured identifiers (medical record numbers, social security numbers, email addresses, account numbers at 0% effectiveness). Ablation studies varying mask expansion radius (r=1,2,3) demonstrate that increased spatial coverage does not improve reduction beyond this ceiling, indicating that language model contextual inference - not insufficient visual masking - drives structured identifier leakage. A simulated hybrid architecture combining vision masking with NLP post-processing achieves 88.6% total PHI reduction (assuming 80% NLP accuracy on remaining identifiers). This negative result establishes boundaries for vision-only privacy interventions in VLMs, provides guidance distinguishing PHI types amenable to vision-level versus language-level redaction, and redirects future research toward decoder-level fine-tuning and hybrid defense-in-depth architectures for HIPAA-compliant medical document processing.

Paperid: 3782, https://arxiv.org/pdf/2511.17842.pdf

Abstract:
In this work, we present homomorphic encryption-based vaults (Haults), a permissioned privacy-preserving smart wallet protocol for VM-enabled blockchains that keeps users' balances confidential, as well as the amounts transacted to other parties. To comply with regulations, we include optional compliance features that allow specific entities (the auditors) to retrieve transaction amounts or execute force transfers when necessary. Our solution uses ElGamal over elliptic curves to encrypt balances, combined with zero-knowledge proofs to verify the correctness of transaction amounts and the integrity of the sender's updated balance, among other security checks. We provide a detailed explanation of the protocol, including a security discussion and benchmarks from our proof-of-concept implementation, which yield great results. Beyond in-contract issued tokens, we also provide a thorough explanation on how our solution can be compatible with external ones (e.g., Ether or any ERC20).

Paperid: 3783, https://arxiv.org/pdf/2511.17692.pdf

Abstract:
QDNA-ID is a trust-chain framework that links physical quantum behavior to digitally verified records. The system first executes standard quantum circuits with random shot patterns across different devices to generate entropy profiles and measurement data that reveal device-specific behavior. A Bell or CHSH test is then used to confirm that correlations originate from genuine non classical processes rather than classical simulation. The verified outcomes are converted into statistical fingerprints using entropy, divergence, and bias features to characterize each device. These features and metadata for device, session, and random seed parameters are digitally signed and time stamped to ensure integrity and traceability. Authenticated artifacts are stored in a hierarchical index for reproducible retrieval and long term auditing. A visualization and analytics interface monitors drift, policy enforcement, and device behavior logs. A machine learning engine tracks entropy drift, detects anomalies, and classifies devices based on evolving patterns. An external verification API supports independent recomputation of hashes, signatures, and CHSH evidence. QDNA-ID operates as a continuous feedback loop that maintains a persistent chain of trust for quantum computing environments.

Paperid: 3784, https://arxiv.org/pdf/2511.17666.pdf

Abstract:
The recent boom and rapid integration of Large Language Models (LLMs) into a wide range of applications warrants a deeper understanding of their security and safety vulnerabilities. This paper presents a comparative analysis of the susceptibility to jailbreak attacks for two leading publicly available LLMs, Google's Gemini 2.5 Flash and OpenAI's GPT-4 (specifically the GPT-4o mini model accessible in the free tier). The research utilized two main bypass strategies: 'self-bypass', where models were prompted to circumvent their own safety protocols, and 'cross-bypass', where one model generated adversarial prompts to exploit vulnerabilities in the other. Four attack methods were employed - direct injection, role-playing, context manipulation, and obfuscation - to generate five distinct categories of unsafe content: hate speech, illegal activities, malicious code, dangerous content, and misinformation. The success of the attack was determined by the generation of disallowed content, with successful jailbreaks assigned a severity score. The findings indicate a disparity in jailbreak susceptibility between 2.5 Flash and GPT-4, suggesting variations in their safety implementations or architectural design. Cross-bypass attacks were particularly effective, indicating that an ample amount of vulnerabilities exist in the underlying transformer architecture. This research contributes a scalable framework for automated AI red-teaming and provides data-driven insights into the current state of LLM safety, underscoring the complex challenge of balancing model capabilities with robust safety mechanisms.

Paperid: 3785, https://arxiv.org/pdf/2511.17118.pdf

Abstract:
This paper introduces constant-size cryptographic evidence structures, a general abstraction for representing verifiable audit evidence for AI workflows in regulated environments. Each evidence item is a fixed-size tuple of cryptographic fields, designed to (i) provide strong binding to workflow events and configurations, (ii) support constant-size storage and uniform verification cost per event, and (iii) compose cleanly with hash-chain and Merkle-based audit constructions. We formalize a simple model of regulated AI workflows, define syntax and algorithms for evidence structures, and articulate security goals such as audit integrity and non-equivocation. We present a generic hash-and-sign construction that instantiates this abstraction using a collision-resistant hash function and a standard digital signature scheme. We then show how to integrate the construction with hash-chained logs, Merkle-tree anchoring, and optionally trusted execution environments, and we analyze the asymptotic complexity of evidence generation and verification. Finally, we implement a prototype library and report microbenchmark results on commodity hardware, demonstrating that the per-event overhead of constant-size evidence is small and predictable. The design is informed by industrial experience with regulated AI systems at Codebat Technologies Inc., while the paper focuses on the abstraction, algorithms, and their security and performance characteristics, with implications for clinical trial management, pharmaceutical compliance, and medical AI governance.

Paperid: 3786, https://arxiv.org/pdf/2511.16088.pdf

Abstract:
Traditional threat modeling remains reactive-focused on known TTPs and past incident data, while threat prediction and forecasting frameworks are often disconnected from operational or architectural artifacts. This creates a fundamental weakness: the most serious cyber threats often do not arise from what is known, but from what is assumed, overlooked, or not yet conceived, and frequently originate from the future, such as artificial intelligence, information warfare, and supply chain attacks, where adversaries continuously develop new exploits that can bypass defenses built on current knowledge. To address this mental gap, this paper introduces the theory and methodology of Future-Back Threat Modeling (FBTM). This predictive approach begins with envisioned future threat states and works backward to identify assumptions, gaps, blind spots, and vulnerabilities in the current defense architecture, providing a clearer and more accurate view of impending threats so that we can anticipate their emergence and shape the future we want through actions taken now. The proposed methodology further aims to reveal known unknowns and unknown unknowns, including tactics, techniques, and procedures that are emerging, anticipated, and plausible. This enhances the predictability of adversary behavior, particularly under future uncertainty, helping security leaders make informed decisions today that shape more resilient security postures for the future.

Paperid: 3787, https://arxiv.org/pdf/2511.16009.pdf

Abstract:
An important proof technique in the random oracle model involves reprogramming it on hard to predict inputs and arguing that an attacker cannot detect that this occurred. In the quantum setting, a particularly challenging version of this considers adaptive reprogramming wherein the points to be reprogrammed (or the output values they should be programmed to) are dependent on choices made by the adversary. Some quantum frameworks for analyzing adaptive reprogramming were given by Unruh (CRYPTO 2014, EUROCRYPT 2015), Grilo-Hövelmanns-Hülsing-Majenz (ASIACRYPT 2021), and Pan-Zeng (PKC 2024). We show, counterintuitively, that these adaptive results follow from the \emph{nonadaptive} one-way to hiding theorem of Ambainis-Hamburg-Unruh (CRYPTO 2019). These implications contradict beliefs (whether stated explicitly or implicitly) that some properties of the adaptive frameworks cannot be provided by the Ambainis-Hamburg-Unruh result.

Paperid: 3788, https://arxiv.org/pdf/2511.15945.pdf

Abstract:
We introduce cyclic diagram monoids, a generalisation of classical diagram monoids that adds elements of arbitrary period by including internal components, with a view towards cryptography. We classify their simple representations and compute their dimensions in terms of the underlying diagram algebra. These go towards showing that cyclic diagram monoids possess representation gaps of exponential growth, which quantify their resistance as platforms against linear attacks on cryptographic protocols that exploit small dimensional representations.

Paperid: 3789, https://arxiv.org/pdf/2511.15712.pdf

Abstract:
Artificial intelligence (AI) agents are increasingly capable of initiating financial transactions on behalf of users or other agents. This evolution introduces a fundamental challenge: verifying both the authenticity of an autonomous agent and the true intent behind its transactions in a decentralized, trustless environment. Traditional payment systems assume human authorization, but autonomous, agent-led payments remove that safeguard. This paper presents a blockchain-based framework that cryptographically authenticates and verifies the intent of every AI-initiated transaction. The proposed system leverages decentralized identity (DID) standards and verifiable credentials to establish agent identities, on-chain intent proofs to record user authorization, and zero-knowledge proofs (ZKPs) to preserve privacy while ensuring policy compliance. Additionally, secure execution environments (TEE-based attestations) guarantee the integrity of agent reasoning and execution. The hybrid on-chain/off-chain architecture provides an immutable audit trail linking user intent to payment outcome. Through qualitative analysis, the framework demonstrates strong resistance to impersonation, unauthorized transactions, and misalignment of intent. This work lays the foundation for secure, auditable, and intent-aware autonomous economic agents, enabling a future of verifiable trust and accountability in AI-driven financial ecosystems.

Paperid: 3790, https://arxiv.org/pdf/2511.15165.pdf

Abstract:
The rapid proliferation of Multimodal Large Language Models (MLLMs) has introduced unprecedented security challenges, particularly in phishing detection within academic environments. Academic institutions and researchers are high-value targets, facing dynamic, multilingual, and context-dependent threats that leverage research backgrounds, academic collaborations, and personal information to craft highly tailored attacks. Existing security benchmarks largely rely on datasets that do not incorporate specific academic background information, making them inadequate for capturing the evolving attack patterns and human-centric vulnerability factors specific to academia. To address this gap, we present AdapT-Bench, a unified methodological framework and benchmark suite for systematically evaluating MLLM defense capabilities against dynamic phishing attacks in academic settings.

Paperid: 3791, https://arxiv.org/pdf/2511.14554.pdf

Abstract:
Deepfakes generated by advanced GANs and autoencoders severely threaten information integrity and societal stability. Single-stream CNNs fail to capture multi-scale forgery artifacts across spatial, texture, and frequency domains, limiting robustness and generalization. We introduce the ForensicFlow, a tri-modal forensic framework that synergistically fuses RGB, texture, and frequency evidence for video Deepfake detection. The RGB branch (ConvNeXt-tiny) extracts global visual inconsistencies; the texture branch (Swin Transformer-tiny) detects fine-grained blending artifacts; the frequency branch (CNN + SE) identifies periodic spectral noise. Attention-based temporal pooling dynamically prioritizes high-evidence frames, while adaptive attention fusion balances branch contributions.Trained on Celeb-DF (v2) with Focal Loss, ForensicFlow achieves AUC 0.9752, F1-Score 0.9408, and accuracy 0.9208, outperforming single-stream baselines. Ablation validates branch synergy; Grad-CAM confirms forensic focus. This comprehensive feature fusion provides superior resilience against subtle forgeries.

Paperid: 3792, https://arxiv.org/pdf/2511.14032.pdf

Abstract:
Digital content distribution and proprietary research-driven industries face persistent risks from intellectual property theft and unauthorized redistribution. Conventional encryption schemes such as AES, TDES, ECC, and ElGamal provide strong cryptographic guarantees, but they remain fundamentally agnostic to where decryption takes place.In practice, this means that once a decryption key is leaked or intercepted, any adversary can misuse the key to decrypt the protected content from any location. We present a location-dependent cryptosystem in which the decryption key is not transmitted as human- or machine-readable data, but implicitly encoded in precise time-of-flight differences of ultra-wideband (UWB) data transmission packets. The system leverages precise timing hardware and a custom JMTK protocol to map a SHA-256 hashed AES key onto scheduled transmission timestamps. Only receivers located within a predefined spatial region can observe the packet timings that align with the intended "time slot" pattern, enabling them to reconstruct the key and decrypt the secret. Receivers outside the authorized region observe incorrect keys. We implement a complete prototype that encrypts and transmits audio data using our cryptosystem, and only when the receiver is within the authorized data, they are able to decrypt the data. Our evaluation demonstrates that the system (i) removes the need to share decryption passwords electronically or physically, (ii) ensures the decryption key cannot be recovered by the eavesdropper, and (iii) provides a non-trivial spatial tolerance for legitimate users.

Paperid: 3793, https://arxiv.org/pdf/2511.12981.pdf

Abstract:
The first contribution of the paper is to put forward an abstract definition of the Grain family of stream ciphers which formalises the different components that are required to specify a particular member of the family. Our second contribution is to provide new and strengthened definitions of the components. These include definining new classes of nonlinear Boolean functions, improved definition of the state update function during initialisation, choice of the tap positions, and the possibility of the linear feedback shift register being smaller than the nonlinear feedback shift register. The third contribution of the paper is to put forward seven concrete proposals of stream ciphers by suitably instantiating the abstract family, one at the 80-bit security level, and two each at the 128-bit, 192-bit, and the 256-bit security levels. At the 80-bit security level, compared to the well known Grain~v1, the new proposal uses Boolean functions with improved cryptographic properties \textit{and} an overall lower gate count. At the 128-bit level, compared to ISO/IEC standard Grain-128a, the new proposals use Boolean functions with improved cryptographic properties; one of the proposals require a few extra gates, while the other has an overall lower gate count. At the 192-bit, and the 256-bit security levels, there are no proposals in the literature with smaller gate counts.

Paperid: 3794, https://arxiv.org/pdf/2511.12782.pdf

Abstract:
Current Large Language Model alignment research mostly focuses on improving model robustness against adversarial attacks and misbehavior by training on examples and prompting. Research has shown that LLM jailbreak probability increases with the size of the user input or conversation length. There is a lack of appropriate research into means of strengthening alignment which also scale with user input length. We propose interruptions as a possible solution to this problem. Interruptions are control sentences added to the user input approximately every x tokens for some arbitrary x. We suggest that this can be generalized to the Chain-of-Thought process to prevent scheming.

Paperid: 3795, https://arxiv.org/pdf/2511.12704.pdf

Abstract:
Intentional threats are a major risk factor related to vulnerabilities in critical infrastructure assets, and an accurate risk assessment is necessary to analyze threats, assess vulnerabilities, and evaluate potential impacts on assets and systems. This research proposes a methodology that can be added as an additional phase in the risk assessment process. The method introduces an extra analytical parameter concerning offensive tool characteristics, improving the understanding of intentional threats. The methodology is presented using clear and accessible language suitable for a broad audience. It is based on an approach described as an "offensive tool determination strategy," summarized by the acronym R.I.D.D.L.E.+C, which refers to the variables used in the analysis: resistance, intrusion timing, damage, disruption timing, latency, efficiency, and cost. These variables are evaluated using open-source intelligence. Each variable is assigned a specific range of values according to its potential impact on the targeted asset. A matrix is then provided for practical application, which can reveal unexpected vulnerabilities and offer a more granular framework for decision-making and security planning.

Paperid: 3796, https://arxiv.org/pdf/2511.12295.pdf

Abstract:
Prompt injection attacks are an emerging threat to large language models (LLMs), enabling malicious users to manipulate outputs through carefully designed inputs. Existing detection approaches often require centralizing prompt data, creating significant privacy risks. This paper proposes a privacy-preserving prompt injection detection framework based on federated learning and embedding-based classification. A curated dataset of benign and adversarial prompts was encoded with sentence embedding and used to train both centralized and federated logistic regression models. The federated approach preserved privacy by sharing only model parameters across clients, while achieving detection performance comparable to centralized training. Results demonstrate that effective prompt injection detection is feasible without exposing raw data, making this one of the first explorations of federated security for LLMs. Although the dataset is limited in scale, the findings establish a strong proof-of-concept and highlight new directions for building secure and privacy-aware LLM systems.

Paperid: 3797, https://arxiv.org/pdf/2511.12274.pdf

Abstract:
Web3 applications, built on blockchain technology, manage billions of dollars in digital assets through decentralized applications (dApps) and smart contracts. These systems rely on complex, software supply chains that introduce significant security vulnerabilities. This paper examines the software supply chain security challenges unique to the Web3 ecosystem, where traditional Web2 software supply chain problems intersect with the immutable and high-stakes nature of blockchain technology. We analyze the threat landscape and propose mitigation strategies to strengthen the security posture of Web3 systems.

Paperid: 3798, https://arxiv.org/pdf/2511.12085.pdf

Abstract:
Phishing and related cyber threats are becoming more varied and technologically advanced. Among these, email-based phishing remains the most dominant and persistent threat. These attacks exploit human vulnerabilities to disseminate malware or gain unauthorized access to sensitive information. Deep learning (DL) models, particularly transformer-based models, have significantly enhanced phishing mitigation through their contextual understanding of language. However, some recent threats, specifically Artificial Intelligence (AI)-generated phishing attacks, are reducing the overall system resilience of phishing detectors. In response, adversarial training has shown promise against AI-generated phishing threats. This study presents a hybrid approach that uses DistilBERT, a smaller, faster, and lighter version of the BERT transformer model for email classification. Robustness against text-based adversarial perturbations is reinforced using Fast Gradient Method (FGM) adversarial training. Furthermore, the framework integrates the LIME Explainable AI (XAI) technique to enhance the transparency of the DistilBERT architecture. The framework also uses the Flan-T5-small language model from Hugging Face to generate plain-language security narrative explanations for end-users. This combined approach ensures precise phishing classification while providing easily understandable justifications for the model's decisions.

Paperid: 3799, https://arxiv.org/pdf/2511.11569.pdf

Abstract:
We present \textsf{ModularSubsetSelection} (MSS), a new algorithm for locally differentially private (LDP) frequency estimation. Given a universe of size $k$ and $n$ users, our $\varepsilon$-LDP mechanism encodes each input via a Residue Number System (RNS) over $\ell$ pairwise-coprime moduli $m_0, \ldots, m_{\ell-1}$, and reports a randomly chosen index $j \in [\ell]$ along with the perturbed residue using the statistically optimal \textsf{SubsetSelection} (SS) (Wang et al. 2016). This design reduces the user communication cost from $Θ\bigl(ω\log_2(k/ω)\bigr)$ bits required by standard SS (with $ω\approx k/(e^\varepsilon+1)$) down to $\lceil \log_2 \ell \rceil + \lceil \log_2 m_j \rceil$ bits, where $m_j < k$. Server-side decoding runs in $Θ(n + r k \ell)$ time, where $r$ is the number of LSMR (Fong and Saunders 2011) iterations. In practice, with well-conditioned moduli (\textit{i.e.}, constant $r$ and $\ell = Θ(\log k)$), this becomes $Θ(n + k \log k)$. We prove that MSS achieves worst-case MSE within a constant factor of state-of-the-art protocols such as SS and \textsf{ProjectiveGeometryResponse} (PGR) (Feldman et al. 2022) while avoiding the algebraic prerequisites and dynamic-programming decoder required by PGR. Empirically, MSS matches the estimation accuracy of SS, PGR, and \textsf{RAPPOR} (Erlingsson, Pihur, and Korolova 2014) across realistic $(k, \varepsilon)$ settings, while offering faster decoding than PGR and shorter user messages than SS. Lastly, by sampling from multiple moduli and reporting only a single perturbed residue, MSS achieves the lowest reconstruction-attack success rate among all evaluated LDP protocols.

Paperid: 3800, https://arxiv.org/pdf/2511.10516.pdf

Abstract:
In several jurisdictions, the regulatory framework on the release and sharing of personal data is being extended to machine learning (ML). The implicit assumption is that disclosing a trained ML model entails a privacy risk for any personal data used in training comparable to directly releasing those data. However, given a trained model, it is necessary to mount a privacy attack to make inferences on the training data. In this concept paper, we examine the main families of privacy attacks against predictive and generative ML, including membership inference attacks (MIAs), property inference attacks, and reconstruction attacks. Our discussion shows that most of these attacks seem less effective in the real world than what a prima face interpretation of the related literature could suggest.

Paperid: 3801, https://arxiv.org/pdf/2511.10265.pdf

Abstract:
A simple and practical method for achieving everlasting privacy in e-voting systems, without relying on advanced cryptographic techniques, is to use anonymous voter credentials. The simplicity of this approach may, however, create some challenges, when combined with other security features, such as cast-as-intended verifiability with second device and second-factor authentication. This paper considers a simple augmentation to the anonymous credential mechanism, using perfectly hiding commitments to link such credentials to the voter identities. This solution strengthens the binding between voters and their credentials while preserving everlasting privacy. It ensures that published ballots remain unlinkable to voter identities, yet enables necessary consistency checks during ballot casting and ballot auditing

Paperid: 3802, https://arxiv.org/pdf/2511.09120.pdf

Abstract:
Multiple-Criteria Decision Making (MCDM) is a sub-discipline of Operations Research that helps decision-makers in choosing, ranking, or sorting alternatives based on conflicting criteria. Over time, its application has been expanded into dynamic and data-driven domains, such as recommender systems. In these contexts, the availability and handling of personal and sensitive data can play a critical role in the decision-making process. Despite this increased reliance on sensitive data, the integration of privacy mechanisms with MCDM methods is underdeveloped. This paper introduces an integrated approach that combines MCDM outranking methods with Differential Privacy (DP), safeguarding individual contributions' privacy in ranking problems. This approach relies on a pre-processing step to aggregate multiple user evaluations into a comprehensive performance matrix. The evaluation results show a strong to very strong statistical correlation between the true rankings and their anonymized counterparts, ensuring robust privacy parameter guarantees.

Paperid: 3803, https://arxiv.org/pdf/2511.09043.pdf

Abstract:
Healthcare federated learning requires strong privacy guarantees while maintaining computational efficiency across resource-constrained medical institutions. This paper presents MedHE, a novel framework combining adaptive gradient sparsification with CKKS homomorphic encryption to enable privacy-preserving collaborative learning on sensitive medical data. Our approach introduces a dynamic threshold mechanism with error compensation for top-k gradient selection, achieving 97.5 percent communication reduction while preserving model utility. We provide formal security analysis under Ring Learning with Errors assumptions and demonstrate differential privacy guarantees with epsilon less than or equal to 1.0. Statistical testing across 5 independent trials shows MedHE achieves 89.5 percent plus or minus 0.8 percent accuracy, maintaining comparable performance to standard federated learning (p=0.32) while reducing communication from 1277 MB to 32 MB per training round. Comprehensive evaluation demonstrates practical feasibility for real-world medical deployments with HIPAA compliance and scalability to 100 plus institutions.

Paperid: 3804, https://arxiv.org/pdf/2511.07645.pdf

Abstract:
Context: The integration of Large Language Models (LLMs) into core software systems is accelerating. However, existing software architecture patterns are static, while current safety assurance methods are not scalable, leaving systems vulnerable to novel adversarial threats. Objective: To design, implement, and evaluate a novel software architecture that enables an AI-driven system to autonomously and continuously adapt its own safety protocols at runtime. Method: We propose the Self-Improving Safety Framework (SISF), a runtime architecture that couples an unprotected, unaligned base LLM (mistralai/Mistral-7B-v0.1) with a dynamic feedback loop. This loop consists of an AI Adjudicator (GPT-4o) for breach detection and a Policy Synthesis Module (GPT-4 Turbo) that autonomously generates new, generalized safety policies (both heuristic and semantic) in response to failures. Results: We conducted a dynamic learning evaluation using the 520-prompt AdvBench dataset. The unprotected model was 100% vulnerable. Our SISF, starting from zero policies, demonstrated a clear learning curve: it detected 237 breaches, autonomously synthesized 234 new policies, and reduced the overall Attack Success Rate (ASR) to 45.58%. In a subsequent test on 520 benign prompts, the SISF achieved a 0.00% False Positive Rate (FPR), proving its ability to adapt without compromising user utility. Conclusion: An architectural approach to AI safety, based on the principles of self-adaptation, is a viable and effective strategy. Our framework demonstrates a practical path towards building more robust, resilient, and scalable AI-driven systems, shifting safety assurance from a static, pre-deployment activity to an automated, runtime process.

Paperid: 3805, https://arxiv.org/pdf/2511.06305.pdf

Abstract:
This position paper argues that setting the privacy budget in differential privacy should not be viewed as an important limitation of differential privacy compared to alternative methods for privacy-preserving machine learning. The so-called problem of interpreting the privacy budget is often presented as a major hindrance to the wider adoption of differential privacy in real-world deployments and is sometimes used to promote alternative mitigation techniques for data protection. We believe this misleads decision-makers into choosing unsafe methods. We argue that the difficulty in interpreting privacy budgets does not stem from the definition of differential privacy itself, but from the intrinsic difficulty of estimating privacy risks in context, a challenge that any rigorous method for privacy risk assessment face. Moreover, we claim that any sound method for estimating privacy risks should, given the current state of research, be expressible within the differential privacy framework or justify why it cannot.

Paperid: 3806, https://arxiv.org/pdf/2511.05572.pdf

Abstract:
The potential of agricultural data (AgData) to drive efficiency and sustainability is stifled by the "AgData Paradox": a pervasive lack of trust and interoperability that locks data in silos, despite its recognized value. This paper introduces AgriTrust, a federated semantic governance framework designed to resolve this paradox. AgriTrust integrates a multi-stakeholder governance model, built on pillars of Data Sovereignty, Transparent Data Contracts, Equitable Value Sharing, and Regulatory Compliance, with a semantic digital layer. This layer is realized through the AgriTrust Core Ontology, a formal OWL ontology that provides a shared vocabulary for tokenization, traceability, and certification, enabling true semantic interoperability across independent platforms. A key innovation is a blockchain-agnostic, multi-provider architecture that prevents vendor lock-in. The framework's viability is demonstrated through case studies across three critical Brazilian supply chains: coffee (for EUDR compliance), soy (for mass balance), and beef (for animal tracking). The results show that AgriTrust successfully enables verifiable provenance, automates compliance, and creates new revenue streams for data producers, thereby transforming data sharing from a trust-based dilemma into a governed, automated operation. This work provides a foundational blueprint for a more transparent, efficient, and equitable agricultural data economy.

Paperid: 3807, https://arxiv.org/pdf/2511.04261.pdf

Abstract:
The widespread deployment of high-resolution visual sensing systems, coupled with the rise of foundation models, has amplified privacy risks in video-based applications. Differentially private pixelization offers mathematically guaranteed protection for visual data through grid-based noise addition, but challenges remain in preserving task-relevant fidelity, achieving scalability, and enabling efficient real-time deployment. To address this, we propose a novel parallel, region-adaptive pixelization framework that combines the theoretical rigor of differential privacy with practical efficiency. Our method adaptively adjusts grid sizes and noise scales based on regional complexity, leveraging GPU parallelism to achieve significant runtime acceleration compared to the classical baseline. A lightweight storage scheme is introduced by retaining only essential noisy statistics, significantly reducing space overhead. Formal privacy analysis is provided under the Laplace mechanism and parallel composition theorem. Extensive experiments on the PETS, Venice-2, and PPM-100 datasets demonstrate favorable privacy-utility trade-offs and significant runtime/storage reductions. A face re-identification attack experiment on CelebA further confirms the method's effectiveness in preventing identity inference. This validates its suitability for real-time privacy-critical applications such as elderly care, smart home monitoring, driver behavior analysis, and crowd behavior monitoring.

Paperid: 3808, https://arxiv.org/pdf/2511.04023.pdf

Abstract:
Static analysis is effective for discovering software vulnerabilities but notoriously suffers from incomplete source--sink specifications and excessive false positives (FPs). We present \textsc{AdaTaint}, an LLM-driven taint analysis framework that adaptively infers source/sink specifications and filters spurious alerts through neuro-symbolic reasoning. Unlike LLM-only detectors, \textsc{AdaTaint} grounds model suggestions in program facts and constraint validation, ensuring both adaptability and determinism. We evaluate \textsc{AdaTaint} on Juliet 1.3, SV-COMP-style C benchmarks, and three large real-world projects. Results show that \textsc{AdaTaint} reduces false positives by \textbf{43.7\%} on average and improves recall by \textbf{11.2\%} compared to state-of-the-art baselines (CodeQL, Joern, and LLM-only pipelines), while maintaining competitive runtime overhead. These findings demonstrate that combining LLM inference with symbolic validation offers a practical path toward more accurate and reliable static vulnerability analysis.

Paperid: 3809, https://arxiv.org/pdf/2511.03995.pdf

Abstract:
Software fuzzing has become a cornerstone in automated vulnerability discovery, yet existing mutation strategies often lack semantic awareness, leading to redundant test cases and slow exploration of deep program states. In this work, I present a hybrid fuzzing framework that integrates static and dynamic analysis with Large Language Model (LLM)-guided input mutation and semantic feedback. Static analysis extracts control-flow and data-flow information, which is transformed into structured prompts for the LLM to generate syntactically valid and semantically diverse inputs. During execution, I augment traditional coverage-based feedback with semantic feedback signals-derived from program state changes, exception types, and output semantics-allowing the fuzzer to prioritize inputs that trigger novel program behaviors beyond mere code coverage. I implement our approach atop AFL++, combining program instrumentation with embedding-based semantic similarity metrics to guide seed selection. Evaluation on real-world open-source targets, including libpng, tcpdump, and sqlite, demonstrates that our method achieves faster time-to-first-bug, higher semantic diversity, and a competitive number of unique bugs compared to state-of-the-art fuzzers. This work highlights the potential of combining LLM reasoning with semantic-aware feedback to accelerate and deepen vulnerability discovery.

Paperid: 3810, https://arxiv.org/pdf/2511.03641.pdf

Abstract:
To foster trustworthy Artificial Intelligence (AI) within the European Union, the AI Act requires providers to mark and detect the outputs of their general-purpose models. The Article 50 and Recital 133 call for marking methods that are ''sufficiently reliable, interoperable, effective and robust''. Yet, the rapidly evolving and heterogeneous landscape of watermarks for Large Language Models (LLMs) makes it difficult to determine how these four standards can be translated into concrete and measurable evaluations. Our paper addresses this challenge, anchoring the normativity of European requirements in the multiplicity of watermarking techniques. Introducing clear and distinct concepts on LLM watermarking, our contribution is threefold. (1) Watermarking Categorisation: We propose an accessible taxonomy of watermarking methods according to the stage of the LLM lifecycle at which they are applied - before, during, or after training, and during next-token distribution or sampling. (2) Watermarking Evaluation: We interpret the EU AI Act's requirements by mapping each criterion with state-of-the-art evaluations on robustness and detectability of the watermark, and of quality of the LLM. Since interoperability remains largely untheorised in LLM watermarking research, we propose three normative dimensions to frame its assessment. (3) Watermarking Comparison: We compare current watermarking methods for LLMs against the operationalised European criteria and show that no approach yet satisfies all four standards. Encouraged by emerging empirical tests, we recommend further research into watermarking directly embedded within the low-level architecture of LLMs.

Paperid: 3811, https://arxiv.org/pdf/2511.03538.pdf

Abstract:
The convergence of the Internet of Things (IoT) and quantum computing is redefining the security paradigm of interconnected digital systems. Classical cryptographic algorithms such as RSA, Elliptic Curve Cryptography (ECC), and Advanced Encryption Standard (AES) have long provided the foundation for securing IoT communication. However, the emergence of quantum algorithms such as Shor's and Grover's threatens to render these techniques vulnerable, necessitating the development of quantum-resilient alternatives. This chapter examines the implications of quantum computing for IoT security and explores strategies for building cryptographically robust systems in the post-quantum era. It presents an overview of Post-Quantum Cryptographic (PQC) families, including lattice-based, code-based, hash-based, and multivariate approaches, analyzing their potential for deployment in resource-constrained IoT environments. In addition, quantum-based methods such as Quantum Key Distribution (QKD) and Quantum Random Number Generators (QRNGs) are discussed for their ability to enhance confidentiality and privacy through physics-based security guarantees. The chapter also highlights issues of privacy management, regulatory compliance, and standardization, emphasizing the need for collaborative efforts across academia, industry, and governance. Overall, it provides a comprehensive perspective on security IoT ecosystems against quantum threats and ensures resilience in the next generation of intelligent networks.

Paperid: 3812, https://arxiv.org/pdf/2511.03020.pdf

Abstract:
Cyberattacks on e-commerce platforms have grown in sophistication, threatening consumer trust and operational continuity. This research presents a hybrid analytical framework that integrates statistical modelling and machine learning for detecting and forecasting cyberattack patterns in the e-commerce domain. Using the Verizon Community Data Breach (VCDB) dataset, the study applies Auto ARIMA for temporal forecasting and significance testing, including a Mann-Whitney U test (U = 2579981.5, p = 0.0121), which confirmed that holiday shopping events experienced significantly more severe cyberattacks than non-holiday periods. ANOVA was also used to examine seasonal variation in threat severity, while ensemble machine learning models (XGBoost, LightGBM, and CatBoost) were employed for predictive classification. Results reveal recurrent attack spikes during high-risk periods such as Black Friday and holiday seasons, with breaches involving Personally Identifiable Information (PII) exhibiting elevated threat indicators. Among the models, CatBoost achieved the highest performance (accuracy = 85.29%, F1 score = 0.2254, ROC AUC = 0.8247). The framework uniquely combines seasonal forecasting with interpretable ensemble learning, enabling temporal risk anticipation and breach-type classification. Ethical considerations, including responsible use of sensitive data and bias assessment, were incorporated. Despite class imbalance and reliance on historical data, the study provides insights for proactive cybersecurity resource allocation and outlines directions for future real-time threat detection research.

Paperid: 3813, https://arxiv.org/pdf/2511.02898.pdf

Abstract:
Micro and small enterprises (SMEs) account for most European businesses yet remain highly vulnerable to cyber threats. This paper analyses the design logic of a recent European policy initiative -- the Squad 2025 Playbook on Cybersecurity Awareness for Micro-SMEs -- to extract general principles for proportionate, resource-aware cybersecurity governance. The author participated in the Squad 2025 team and originally proposed the seven-step preventive structure that later shaped the Playbook's design, subsequently refined collaboratively within the project. The framework was guided by the author's design premise that raising cybersecurity awareness among micro- and small-enterprise actors represents the most efficient short-term lever for increasing sensitivity to cybercrime and promoting protective behaviours. Without reproducing any proprietary material, the paper reconstructs the conceptual architecture of that approach within the broader context of ENISA guidance, ISO 27005, and the NIS2 Directive. It proposes a generic seven-dimension preventive model suitable for micro-enterprise adoption and discusses implications for policy transfer, awareness training, and maturity assessment.

Paperid: 3814, https://arxiv.org/pdf/2511.02836.pdf

Abstract:
This paper presents the design, implementation, and evaluation of a hybrid encryption framework that combines quantum key distribution, specifically a simulated BB84 protocol, with AES-256 encryption. The system enables secure file encryption by leveraging quantum principles for key generation and classical cryptography for data protection. It introduces integrity validation mechanisms, including HMAC verification and optional post-quantum digital signatures, ensuring robustness even in the presence of quantum-capable adversaries. The entire architecture is implemented in Python, with modular components simulating quantum key exchange, encryption, and secure packaging. Experimental results include visual testing of various attack scenarios, such as key tampering, HMAC failure, and file corruption, demonstrating the effectiveness and resilience of the approach. The proposed solution serves as a practical foundation for quantum-aware cybersecurity systems.

Paperid: 3815, https://arxiv.org/pdf/2511.01941.pdf

Abstract:
Timely identification of issue reports reflecting software vulnerabilities is crucial, particularly for Internet-of-Things (IoT) where analysis is slower than non-IoT systems. While Machine Learning (ML) and Large Language Models (LLMs) detect vulnerability-indicating issues in non-IoT systems, their IoT use remains unexplored. We are the first to tackle this problem by proposing two approaches: (1) combining ML and LLMs with Natural Language Processing (NLP) techniques to detect vulnerability-indicating issues of 21 Eclipse IoT projects and (2) fine-tuning a pre-trained BERT Masked Language Model (MLM) on 11,000 GitHub issues for classifying \vul. Our best performance belongs to a Support Vector Machine (SVM) trained on BERT NLP features, achieving an Area Under the receiver operator characteristic Curve (AUC) of 0.65. The fine-tuned BERT achieves 0.26 accuracy, emphasizing the importance of exposing all data during training. Our contributions set the stage for accurately detecting IoT vulnerabilities from issue reports, similar to non-IoT systems.

Paperid: 3816, https://arxiv.org/pdf/2511.01910.pdf

Abstract:
The security of enterprise-grade networking hardware and software is critical to ensuring the integrity, availability, and confidentiality of data in modern cloud and data center environments. Network interface controllers (NICs) play a pivotal role in high-performance computing and virtualization, but their privileged access to system resources makes them a prime target for security vulnerabilities. This study presents a security analysis of the Intel ICE driver using the E810 Ethernet Controller, employing static analysis, fuzz testing, and timing-based side-channel evaluation to assess robustness against exploitation. The objective is to evaluate the drivers resilience to malformed inputs, identify implementation weaknesses, and determine whether timing discrepancies can be exploited for unauthorized inference of system states. Static code analysis reveals that insufficient bounds checking and unsafe string operations may introduce security flaws. Fuzz testing targets the Admin Queue, debugfs interface, and virtual function (VF) management. Interface-aware fuzzing and command mutation confirm strong input validation that prevents memory corruption and privilege escalation under normal conditions. However, using principles from KernelSnitch, the driver is found to be susceptible to timing-based side-channel attacks. Execution time discrepancies in hash table lookups allow an unprivileged attacker to infer VF occupancy states, enabling potential network mapping in multi-tenant environments. Further analysis shows inefficiencies in Read-Copy-Update (RCU) synchronization, where missing synchronization leads to stale data persistence, memory leaks, and out-of-memory conditions. Kernel instrumentation confirms that occupied VF lookups complete faster than unoccupied queries, exposing timing-based information leakage.

Paperid: 3817, https://arxiv.org/pdf/2511.01124.pdf

Abstract:
Network protocols are programs with inputs and outputs that follow predefined communication patterns to synchronize and exchange information. There are many protocols and each serves a different purpose, e.g., routing, transport, secure communication, etc. The functional and performance requirements for a protocol can be expressed using a formal specification, such as, a set of logical predicates over its traces. A protocol could be prevented from achieving its requirements due to a bug in its design or implementation, a component failure (e.g., a crash), or an attack. This dissertation shows that formal methods can feasibly characterize the functionality and performance of network protocols under normal conditions as well as when subjected to attacks.

Paperid: 3818, https://arxiv.org/pdf/2511.00894.pdf

Abstract:
This study examines machine learning techniques like Decision Trees, Support Vector Machines, Logistic Regression, Neural Networks, and ensemble methods to detect Android malware. The study evaluates these models on a dataset of Android applications and analyzes their accuracy, efficiency, and real-world applicability. Key findings show that ensemble methods demonstrate superior performance, but there are trade-offs between model interpretability, efficiency, and accuracy. Given its increasing threat, the insights guide future research and practical use of ML to combat Android malware.

Paperid: 3819, https://arxiv.org/pdf/2511.00415.pdf

Abstract:
This paper reconstructs zero-knowledge extensions on Solana as an architecture theory. Drawing on the existing ecosystem and on the author's prior papers and implementations as reference material, we propose a two-axis model that normalizes zero-knowledge (ZK) use by purpose (scalability vs. privacy) and by placement (on-chain vs. off-chain). On this grid we define five layer-crossing invariants: origin authenticity, replay-safety, finality alignment, parameter binding, and private consumption, which serve as a common vocabulary for reasoning about correctness across modules and chains. The framework covers the Solana Foundation's three pillars (ZK Compression, Confidential Transfer, light clients/bridges) together with surrounding components (Light Protocol/Helius, Succinct SP1, RISC Zero, Wormhole, Tinydancer, Arcium). From the theory we derive two design abstractions - Proof-Carrying Message (PCM) and a Verifier Router Interface - and a cross-chain counterpart, Proof-Carrying Interchain Message (PCIM), indicating concrete avenues for extending the three pillars.

Paperid: 3820, https://arxiv.org/pdf/2511.00348.pdf

Abstract:
An automated, standoff acoustic leak detection scheme has been designed, built, and tested. It merges the principles of glass breakage and smoke detection to alert for the presence of leaks emanating from pressurized plumbing. A simulated water leak flowing at 0.15 l/min has been reliably detected at a standoff distance of more than 10 m. The device is also effective at identifying the presence of leaks located behind surfaces such as walls, doors, floors, and ceilings. The anticipated application is as an autonomous, battery-powered, remote wireless node. All signal processing and analysis takes place on the edge with no need to stream audio data to the cloud. Sensor status is conveyed on-demand with only a few bytes of information, requiring minimal bandwidth. Power consumption is the range of 20--200 micro-Watts, depending on the amount of environmental noise and desired sensor latency. To attain optimum sensitivity and reliability, the hardware operates at acoustic frequencies well above the range of human conversations, making eavesdropping impossible. Development has been done with water escaping from pressurized plumbing, but the sensor concept can be used effectively to detect gas leaks.

Paperid: 3821, https://arxiv.org/pdf/2511.00263.pdf

Abstract:
COOL (Chen'21) is an error-free, information-theoretically secure Byzantine agreement (BA) protocol proven to achieve BA consensus in the synchronous setting for an $\ell$-bit message, with a total communication complexity of $O(\max\{n\ell, nt \log q\})$ bits, four communication rounds in the worst case, and a single invocation of a binary BA, under the optimal resilience assumption $n \geq 3t + 1$ in a network of $n$ nodes, where up to $t$ nodes may behave dishonestly. Here, $q$ denotes the alphabet size of the error correction code used in the protocol. In this work, we present an adaptive variant of COOL, called OciorACOOL, which achieves error-free, information-theoretically secure BA consensus in the asynchronous setting with total $O(\max\{n\ell, n t \log q\})$ communication bits, $O(1)$ rounds, and a single invocation of an asynchronous binary BA protocol, still under the optimal resilience assumption $n \geq 3t + 1$. Moreover, OciorACOOL retains the same low-complexity, traditional $(n, k)$ error-correction encoding and decoding as COOL, with $k=t/3$.

Paperid: 3822, https://arxiv.org/pdf/2511.00249.pdf

Abstract:
The growth in IoT devices means an ongoing risk of data vulnerability. The transition from centralized ecosystems to decentralized ecosystems is of paramount importance due to security, privacy, and data use concerns. Since the majority of IoT devices will be used by consumers in peer-to-peer applications, a centralized approach raises many issues of trust related to privacy, control, and censorship. Identity and access management lies at the heart of any user-facing system. Blockchain technologies can be leveraged to augment user authority, transparency, and decentralization. This study proposes a decentralized identity management framework for IoT environments using Hyperledger Fabric and Decentralized Identifiers (DIDs). The system was simulated using Node-RED to model IoT data streams, and key functionalities including device onboarding, authentication, and secure asset querying were successfully implemented. Results demonstrated improved data integrity, transparency, and user control, with reduced reliance on centralized authorities. These findings validate the practicality of blockchain-based identity management in enhancing the security and trustworthiness of IoT infrastructures.

Paperid: 3823, https://arxiv.org/pdf/2511.00118.pdf

Abstract:
Contemporary e-mail services have high availability expectations from the customers and are resource-strained because of the high-volume throughput and spam attacks. Deep Machine Learning architectures, which are resource hungry and require off-line processing due to the long processing times, are not acceptable at the front line filters. On the other hand, the bulk of the incoming spam is not sophisticated enough to bypass even the simplest algorithms. While the small fraction of the intelligent, highly mutable spam can be detected only by the deep architectures, the stress on them can be unloaded by the simple near real-time and near zero-footprint algorithms such as the Bag of Synthetic Syllables algorithm applied to the short texts of the e-mail subject lines and other short text fields. The proposed algorithm creates a circa 200 sparse dimensional hash or vector for each e-mail subject line that can be compared for the cosine or euclidean proximity distance to find similarities to the known spammy subjects. The algorithm does not require any persistent storage, dictionaries, additional hardware upgrades or software packages. The performance of the algorithm is presented on the one day of the real SMTP traffic.

Paperid: 3824, https://arxiv.org/pdf/2511.00111.pdf

Abstract:
As quantum computing hardware continues to advance, the integration of such technology with quantum algorithms is anticipated to enable the decryption of ciphertexts produced by RSA and Elliptic Curve Cryptography (ECC) within polynomial time. In response to this emerging threat, the U.S. National Institute of Standards and Technology (NIST) finalized a series of Post-Quantum Cryptography (PQC) standards in August 2024 and outlined a roadmap for PQC migration. Consequently, the design of X.509 certificates that adhere to PQC standards has become a crucial focus in the development of certificate management systems. To further strengthen security and facilitate a smooth migration process, several hybrid certificate schemes have been proposed internationally based on the X.509 certificate format, including the composite scheme, the catalyst scheme, and the chameleon scheme. This study presents a comprehensive analysis and comparison of these hybrid certificate schemes from multiple perspectives (e.g., certificate size, computational efficiency, and migration feasibility) to assess their suitability for various applications and services.

Paperid: 3825, https://arxiv.org/pdf/2510.27190.pdf

Abstract:
As Large Language Models (LLMs) are increasingly integrated into automated, multi-stage pipelines, risk patterns that arise from unvalidated trust between processing stages become a practical concern. This paper presents a mechanism-centered taxonomy of 41 recurring risk patterns in commercial LLMs. The analysis shows that inputs are often interpreted non-neutrally and can trigger implementation-shaped responses or unintended state changes even without explicit commands. We argue that these behaviors constitute architectural failure modes and that string-level filtering alone is insufficient. To mitigate such cross-stage vulnerabilities, we recommend zero-trust architectural principles, including provenance enforcement, context sealing, and plan revalidation, and we introduce "Countermind" as a conceptual blueprint for implementing these defenses.

Paperid: 3826, https://arxiv.org/pdf/2510.27070.pdf

Abstract:
The security and efficiency of modern computing systems are fundamentally undermined by the absence of a native architectural mechanism to propagate high-level program semantics, such as object identity, bounds, and lifetime, across the hardware/software interface. This paper presents a comprehensive survey of the architectural paradigm designed to bridge this semantic gap: descriptor-based, object-aware memory systems. By elevating the descriptor to a first-class architectural abstraction, this paradigm enables hardware to dynamically acquire and enforce the rich semantics of software-defined objects. This survey systematically charts the evolution and current landscape of this approach. We establish the foundational concepts of memory objects and descriptors and introduce a novel taxonomy of descriptor addressing modes, providing a structured framework for analyzing and comparing diverse implementations. Our unified analysis reveals how this paradigm holistically addresses the intertwined challenges of memory protection, management, and processing. As a culminating case study, we re-examine the CentroID model, demonstrating how its hybrid tagged-pointer encoding and descriptor processing mechanisms embody the path toward practical and efficient object-aware designs. Finally, we outline how the explicit cross-layer communication of object semantics provides a foundational research direction for next-generation cache hierarchies, unified virtual memory, and even 128-bit architectures.

Paperid: 3827, https://arxiv.org/pdf/2510.24789.pdf

Abstract:
Watermarking has been proposed as a lightweight mechanism to identify AI-generated text, with schemes typically relying on perturbations to token distributions. While prior work shows that paraphrasing can weaken such signals, these attacks remain partially detectable or degrade text quality. We demonstrate that cross-lingual summarization attacks (CLSA) -- translation to a pivot language followed by summarization and optional back-translation -- constitute a qualitatively stronger attack vector. By forcing a semantic bottleneck across languages, CLSA systematically destroys token-level statistical biases while preserving semantic fidelity. In experiments across multiple watermarking schemes (KGW, SIR, XSIR, Unigram) and five languages (Amharic, Chinese, Hindi, Spanish, Swahili), we show that CLSA reduces watermark detection accuracy more effectively than monolingual paraphrase at similar quality levels. Our results highlight an underexplored vulnerability that challenges the practicality of watermarking for provenance or regulation. We argue that robust provenance solutions must move beyond distributional watermarking and incorporate cryptographic or model-attestation approaches. On 300 held-out samples per language, CLSA consistently drives detection toward chance while preserving task utility. Concretely, for XSIR (explicitly designed for cross-lingual robustness), AUROC with paraphrasing is $0.827$, with Cross-Lingual Watermark Removal Attacks (CWRA) [He et al., 2024] using Chinese as the pivot, it is $0.823$, whereas CLSA drives it down to $0.53$ (near chance). Results highlight a practical, low-cost removal pathway that crosses languages and compresses content without visible artifacts.

Paperid: 3828, https://arxiv.org/pdf/2510.24498.pdf

Abstract:
As machine learning (ML) models become increasingly deployed through cloud infrastructures, the confidentiality of user data during inference poses a significant security challenge. Homomorphic Encryption (HE) has emerged as a compelling cryptographic technique that enables computation on encrypted data, allowing predictions to be generated without decrypting sensitive inputs. However, the integration of HE within large scale cloud native pipelines remains constrained by high computational overhead, orchestration complexity, and model compatibility issues. This paper presents a systematic framework for the design and optimization of cloud native homomorphic encryption workflows that support privacy-preserving ML inference. The proposed architecture integrates containerized HE modules with Kubernetes-based orchestration, enabling elastic scaling and parallel encrypted computation across distributed environments. Furthermore, optimization strategies including ciphertext packing, polynomial modulus adjustment, and operator fusion are employed to minimize latency and resource consumption while preserving cryptographic integrity. Experimental results demonstrate that the proposed system achieves up to 3.2times inference acceleration and 40% reduction in memory utilization compared to conventional HE pipelines. These findings illustrate a practical pathway for deploying secure ML-as-a-Service (MLaaS) systems that guarantee data confidentiality under zero-trust cloud conditions.

Paperid: 3829, https://arxiv.org/pdf/2510.22566.pdf

Abstract:
Recent work has revealed MOLE, the first practical attack to compromise GPU Trusted Execution Environments (TEEs), by injecting malicious firmware into the embedded Microcontroller Unit (MCU) of Arm Mali GPUs. By exploiting the absence of cryptographic verification during initialization, adversaries with kernel privileges can bypass memory protections, exfiltrate sensitive data at over 40 MB/s, and tamper with inference results, all with negligible runtime overhead. This attack surface affects commodity mobile SoCs and cloud accelerators, exposing a critical firmware-level trust gap in existing GPU TEE designs. To address this gap, this paper presents FAARM, a lightweight Firmware Attestation and Authentication framework that prevents MOLE-style firmware subversion. FAARM integrates digital signature verification at the EL3 secure monitor using vendor-signed firmware bundles and an on-device public key anchor. At boot, EL3 verifies firmware integrity and authenticity, enforces version checks, and locks the firmware region, eliminating both pre-verification and time-of-check-to-time-of-use (TOCTOU) attack vectors. We implement FAARM as a software-only prototype on a Mali GPU testbed, using a Google Colab-based emulation framework that models the firmware signing process, the EL1 to EL3 load path, and secure memory configuration. FAARM reliably detects and blocks malicious firmware injections, rejecting tampered images before use and denying overwrite attempts after attestation. Firmware verification incurs only 1.34 ms latency on average, demonstrating that strong security can be achieved with negligible overhead. FAARM thus closes a fundamental gap in shim-based GPU TEEs, providing a practical, deployable defense that raises the security baseline for both mobile and cloud GPU deployments.

Paperid: 3830, https://arxiv.org/pdf/2510.22536.pdf

Abstract:
We formalize a cross-domain "ZK coprocessor bridge" that lets Solana programs request private execution on Aztec L2 (via Ethereum) using Wormhole Verifiable Action Approvals (VAAs) as authenticated transport. The system comprises: (i) a Solana program that posts messages to Wormhole Core with explicit finality; (ii) an EVM Portal that verifies VAAs, enforces a replay lock, parses a bound payload secretHash||m from the attested VAA, derives a domain-separated field commitment, and enqueues an L1->L2 message into the Aztec Inbox (our reference implementation v0.1.0 currently uses consumeWithSecret(vaa, secretHash); we provide migration guidance to the payload-bound interface); (iii) a minimal Aztec contract that consumes the message privately; and (iv) an off-chain relayer that ferries VAAs and can record receipts on Solana. We present state machines, message formats, and proof sketches for replay-safety, origin authenticity, finality alignment, parameter binding (no relayer front-running of Aztec parameters), privacy, idempotence, and liveness. Finally, we include a concise Reproducibility note with pinned versions and artifacts to replicate a public testnet run.

Paperid: 3831, https://arxiv.org/pdf/2510.22387.pdf

Abstract:
Deep neural networks can convert ECG page images into analyzable waveforms, yet centralized training often conflicts with cross-institutional privacy and deployment constraints. A cross-silo federated digitization framework is presented that trains a full-model nnU-Net segmentation backbone without sharing images and aggregates updates across sites under realistic non-IID heterogeneity (layout, grid style, scanner profile, noise). The protocol integrates three standard server-side aggregators--FedAvg, FedProx, and FedAdam--and couples secure aggregation with central, user-level differential privacy to align utility with formal guarantees. Key features include: (i) end-to-end full-model training and synchronization across clients; (ii) secure aggregation so the server only observes a clipped, weighted sum once a participation threshold is met; (iii) central Gaussian DP with Renyi accounting applied post-aggregation for auditable user-level privacy; and (iv) a calibration-aware digitization pipeline comprising page normalization, trace segmentation, grid-leakage suppression, and vectorization to twelve-lead signals. Experiments on ECG pages rendered from PTB-XL show consistently faster convergence and higher late-round plateaus with adaptive server updates (FedAdam) relative to FedAvg and FedProx, while approaching centralized performance. The privacy mechanism maintains competitive accuracy while preventing exposure of raw images or per-client updates, yielding deployable, auditable guarantees suitable for multi-institution settings.

Paperid: 3832, https://arxiv.org/pdf/2510.22191.pdf

Abstract:
Provenance analysis based on system audit data has emerged as a fundamental approach for investigating Advanced Persistent Threat (APT) attacks. Due to the high concealment and long-term persistence of APT attacks, they are only represented as a minimal part of the critical path in the provenance graph. While existing techniques employ behavioral pattern matching and data flow feature matching to uncover latent associations in attack sequences through provenance graph path reasoning, their inability to establish effective attack context associations often leads to the conflation of benign system operations with real attack entities, that fail to accurately characterize real APT behaviors. We observe that while the causality of entities in the provenance graph exhibit substantial complexity, attackers often follow specific attack patterns-specifically, clear combinations of tactics and techniques to achieve their goals. Based on these insights, we propose TPPR, a novel framework that first extracts anomaly subgraphs through abnormal node detection, TTP-annotation and graph pruning, then performs attack path reasoning using mined TTP sequential pattern, and finally reconstructs attack scenarios through confidence-based path scoring and merging. Extensive evaluation on real enterprise logs (more than 100 million events) and DARPA TC dataset demonstrates TPPR's capability to achieve 99.9% graph simplification (700,000 to 20 edges) while preserving 91% of critical attack nodes, outperforming state-of-the-art solutions (SPARSE, DepImpact) by 63.1% and 67.9% in reconstruction precision while maintaining attack scenario integrity.

Paperid: 3833, https://arxiv.org/pdf/2510.22085.pdf

Abstract:
Large language models (LLMs) remain vulnerable to sophisticated prompt engineering attacks that exploit contextual framing to bypass safety mechanisms, posing significant risks in cybersecurity applications. We introduce Jailbreak Mimicry, a systematic methodology for training compact attacker models to automatically generate narrative-based jailbreak prompts in a one-shot manner. Our approach transforms adversarial prompt discovery from manual craftsmanship into a reproducible scientific process, enabling proactive vulnerability assessment in AI-driven security systems. Developed for the OpenAI GPT-OSS-20B Red-Teaming Challenge, we use parameter-efficient fine-tuning (LoRA) on Mistral-7B with a curated dataset derived from AdvBench, achieving an 81.0% Attack Success Rate (ASR) against GPT-OSS-20B on a held-out test set of 200 items. Cross-model evaluation reveals significant variation in vulnerability patterns: our attacks achieve 66.5% ASR against GPT-4, 79.5% on Llama-3 and 33.0% against Gemini 2.5 Flash, demonstrating both broad applicability and model-specific defensive strengths in cybersecurity contexts. This represents a 54x improvement over direct prompting (1.5% ASR) and demonstrates systematic vulnerabilities in current safety alignment approaches. Our analysis reveals that technical domains (Cybersecurity: 93% ASR) and deception-based attacks (Fraud: 87.8% ASR) are particularly vulnerable, highlighting threats to AI-integrated threat detection, malware analysis, and secure systems, while physical harm categories show greater resistance (55.6% ASR). We employ automated harmfulness evaluation using Claude Sonnet 4, cross-validated with human expert assessment, ensuring reliable and scalable evaluation for cybersecurity red-teaming. Finally, we analyze failure mechanisms and discuss defensive strategies to mitigate these vulnerabilities in AI for cybersecurity.

Paperid: 3834, https://arxiv.org/pdf/2510.20645.pdf

Abstract:
Despite the popularity of Hashed Time-Locked Contracts (HTLCs) because of their use in wide areas of applications such as payment channels, atomic swaps, etc, their use in exchange is still questionable. This is because of its incentive incompatibility and susceptibility to bribery attacks. State-of-the-art solutions such as MAD-HTLC (Oakland'21) and He-HTLC (NDSS'23) address this by leveraging miners' profit-driven behaviour to mitigate such attacks. The former is the mitigation against passive miners; however, the latter works against both active and passive miners. However, they consider only two bribing scenarios where either of the parties involved in the transfer collude with the miner. In this paper, we expose vulnerabilities in state-of-the-art solutions by presenting a miner-collusion bribery attack with implementation and game-theoretic analysis. Additionally, we propose a stronger attack on MAD-HTLC than He-HTLC, allowing the attacker to earn profits equivalent to attacking naive HTLC. Leveraging our insights, we propose \prot, a game-theoretically secure HTLC protocol resistant to all bribery scenarios. \prot\ employs a two-phase approach, preventing unauthorized token confiscation by third parties, such as miners. In Phase 1, parties commit to the transfer; in Phase 2, the transfer is executed without manipulation. We demonstrate \prot's efficiency in transaction cost and latency via implementations on Bitcoin and Ethereum.

Paperid: 3835, https://arxiv.org/pdf/2510.20612.pdf

Abstract:
Large Language Models are increasingly adopted as critical tools for accelerating innovation. This paper identifies and formalizes a systemic risk inherent in this paradigm: \textbf{Black Box Absorption}. We define this as the process by which the opaque internal architectures of LLM platforms, often operated by large-scale service providers, can internalize, generalize, and repurpose novel concepts contributed by users during interaction. This mechanism threatens to undermine the foundational principles of innovation economics by creating severe informational and structural asymmetries between individual creators and platform operators, thereby jeopardizing the long-term sustainability of the innovation ecosystem. To analyze this challenge, we introduce two core concepts: the idea unit, representing the transportable functional logic of an innovation, and idea safety, a multidimensional standard for its protection. This paper analyzes the mechanisms of absorption and proposes a concrete governance and engineering agenda to mitigate these risks, ensuring that creator contributions remain traceable, controllable, and equitable.

Paperid: 3836, https://arxiv.org/pdf/2510.20131.pdf

Abstract:
Pseudorandom generators (PRGs) are a foundational primitive in classical cryptography, underpinning a wide range of constructions. In the quantum setting, pseudorandom quantum states (PRSs) were proposed as a potentially weaker assumption that might serve as a substitute for PRGs in cryptographic applications. Two primary size regimes of PRSs have been studied: logarithmic-size and linear-size. Interestingly, logarithmic PRSs have led to powerful cryptographic applications, such as digital signatures and quantum public-key encryption, that have not been realized from their linear counterparts. However, PRGs have only been black-box separated from linear PRSs, leaving open the fundamental question of whether PRGs are also separated from logarithmic PRSs. In this work, we resolve this open problem. We establish a quantum black-box separation between (quantum-evaluable) PRGs and PRSs of either size regime. Specifically, we construct a unitary quantum oracle with inverse access relative to which no black-box construction of PRG from (logarithmic or linear) PRS exists. As a direct corollary, we obtain separations between PRGs and several primitives implied by logarithmic PRSs, including digital signatures and quantum public-key encryption.

Paperid: 3837, https://arxiv.org/pdf/2510.19885.pdf

Abstract:
Mathematically constructed S-boxes arise from algebraic structures and finite field theory to ensure strong, provable cryptographic properties. These mathematically grounded constructions allow for generation of thousands of S-Boxes with high nonlinearity, APN properties, and balanced avalanche characteristics, unlike fully random methods, which lack such theoretical guarantees in exchange for low complexity and more varied results. In this work, we compare mathematically constructed constructions with randomly generated ones to evaluate the relative weakness of the latter. We also establish an average measure of performance for randomly generated permutations, as well as random with forced cycle constraints, and compare them to well-established designs in a simple SPN setting.

Paperid: 3838, https://arxiv.org/pdf/2510.19877.pdf

Abstract:
A policy-governed RAG architecture is specified for audit-ready generation in regulated workflows, organized as a triptych: (I) Contracts/Control (SHRDLU-like), which governs output adherence to legal and internal policies; (II) Manifests/Trails (Memex-like), which cryptographically anchors all cited source evidence to ensure verifiable provenance; and (III) Receipts/Verification (Xanadu-like), which provides the final, portable proof of compliance for auditors (portable COSE/JOSE) (see Section 4 and Appendix A). Rather than explaining model internals, outputs are gated ex-ante and bound to cryptographically verifiable evidence for each material answer. Unvalidated targets are stated (>=20% relative reduction in confident errors; p95 latency <= 900 ms; <= 2.2x serve cost) together with a pre-registered (optional) pilot using NO-GO gates. The design complements existing RAG/guardrails by making policy checks auditable, replayable, and receipt-backed. Target domains include back-office compliance in pharma, medical devices, finance, legal, and the public sector where error costs may exceed thousands of euros and audit trails are mandatory under regulations such as the EU AI Act. Future evaluations may pre-commit to publishing negative results when any example NO-GO gate is not met.

Paperid: 3839, https://arxiv.org/pdf/2510.19859.pdf

Abstract:
Cyberattack detection in Critical Infrastructure and Supply Chains has become challenging in Industry 4.0. Intrusion Detection Systems (IDS) are deployed to counter the cyberattacks. However, an IDS effectively detects attacks based on the known signatures and patterns, Zero-day attacks go undetected. To overcome this drawback in IDS, the integration of a Dense Neural Network (DNN) with Data Augmentation is proposed. It makes IDS intelligent and enables it to self-learn with high accuracy when a novel attack is encountered. The network flow captures datasets are highly imbalanced same as the real network itself. The Data Augmentation plays a crucial role in balancing the data. The balancing of data is challenging as the minority class is as low as 0.000004\% of the dataset, and the abundant class is higher than 80\% of the dataset. Synthetic Minority Oversampling Technique is used for balancing the data. However, higher accuracies are achieved with balanced test data, lower accuracies are noticeable with the original imbalanced test data suggesting overfitting. A comparison with state-of-the-art research using Synthetic Minority Oversampling Technique with Edited Nearest Neighbor shows the classification of classes remains poor for the original dataset. This suggests highly imbalanced datasets of network flow require a different method of data augmentation.

Paperid: 3840, https://arxiv.org/pdf/2510.19303.pdf

Abstract:
Problem Space: AI Vulnerabilities and Quantum Threats Generative AI vulnerabilities: model inversion, data poisoning, adversarial inputs. Quantum threats Shor Algorithm breaking RSA ECC encryption. Challenge Secure generative AI models against classical and quantum cyberattacks. Proposed Solution Collaborative Penetration Testing Suite Five Integrated Components: DAST SAST OWASP ZAP, Burp Suite, SonarQube, Fortify. IAST Contrast Assess integrated with CI CD pipeline. Blockchain Logging Hyperledger Fabric for tamper-proof logs. Quantum Cryptography Lattice based RLWE protocols. AI Red Team Simulations Adversarial ML & Quantum-assisted attacks. Integration Layer: Unified workflow for AI, cybersecurity, and quantum experts. Key Results 300+ vulnerabilities identified across test environments. 70% reduction in high-severity issues within 2 weeks. 90% resolution efficiency for blockchain-logged vulnerabilities. Quantum-resistant cryptography maintained 100% integrity in tests. Outcome: Quantum AI Security Protocol integrating Blockchain Quantum Cryptography AI Red Teaming.

Paperid: 3841, https://arxiv.org/pdf/2510.19281.pdf

Abstract:
Objectives: This study aims to investigate the readability and understandability of bitwise operators in programming, with the main hypothesis that there will be a difference in the performance metrics (response time and error rate) between participants exposed to various bitwise operators related questions and those who are not. Participants: Participants in this human research study include people without programming background, novice programmers, and university students with varying programming experience (from freshmen to PhD level). There were 23 participants in this study. Study Methods: This study uses a within-subjects experimental design to assess how people with diverse programming backgrounds understand and use bitwise operators. Participants complete tasks in a JavaScript program, and their task completion times and task accuracy are recorded for analysis. Findings: The results indicate that operators can be one of the factors predicting response time, showing a small but significant effect (R-squared = 0.032, F(1, 494) = 16.5, p < .001). Additionally, operators such as OR, NOT, and Left Shift showed statistical significance in task completion times compared to other operators. Conclusions: While the complexity of bitwise operators did not generally result in longer task completion times, certain operators were found to be less intuitive, suggesting the need for further investigation and potential redesign for improved understandability.

Paperid: 3842, https://arxiv.org/pdf/2510.18506.pdf

Abstract:
Let $\mathbb{F}_q$ be a finite field, and let $F \in \mathbb{F}_q [X]$ be a polynomial with $d = \text{deg} \left( F \right)$ such that $\gcd \left( d, q \right) = 1$. In this paper we prove that the $c$-Boomerang uniformity, $c \neq 0$, of $F$ is bounded by - $d^2$ if $c^2 \neq 1$, - $d \cdot (d - 1)$ if $c = -1$, - $d \cdot (d - 2)$ if $c = 1$. For all cases of $c$, we present tight examples for $F \in \mathbb{F}_q [X]$. Additionally, for the proof of $c = 1$ we establish that the bivariate polynomial $F (x) - F (y) + a \in k [x, y]$, where $k$ is a field of characteristic $p$ and $a \in k \setminus \{ 0 \}$, is absolutely irreducible if $p \nmid \text{deg} \left( F \right)$.

Paperid: 3843, https://arxiv.org/pdf/2510.17220.pdf

Abstract:
The concept of linearity plays a central role in both mathematics and computer science, with distinct yet complementary meanings. In mathematics, linearity underpins functions and vector spaces, forming the foundation of linear algebra and functional analysis. In computer science, it relates to resource-sensitive computation. Linear Logic (LL), for instance, models assumptions that must be used exactly once, providing a natural framework for tracking computational resources such as time, memory, or data access. This dual perspective makes linearity essential to programming languages, type systems, and formal models that express both computational complexity and composability. Bridging these interpretations enables rigorous yet practical methodologies for analyzing and verifying complex systems. This thesis explores the use of LL to model programming paradigms based on linearity. It comprises two parts: ADLL and CryptoBLL. The former applies LL to Automatic Differentiation (AD), modeling linear functions over the reals and the transposition operation. The latter uses LL to express complexity constraints on adversaries in computational cryptography. In AD, two main approaches use linear type systems: a theoretical one grounded in proof theory, and a practical one implemented in JAX, a Python library developed by Google for machine learning research. In contrast, frameworks like PyTorch and TensorFlow support AD without linear types. ADLL aims to bridge theory and practice by connecting JAX's type system to LL. In modern cryptography, several calculi aim to model cryptographic proofs within the computational paradigm. These efforts face a trade-off between expressiveness, to capture reductions, and simplicity, to abstract probability and complexity. CryptoBLL addresses this tension by proposing a framework for the automatic analysis of protocols in computational cryptography.

Paperid: 3844, https://arxiv.org/pdf/2510.17087.pdf

Abstract:
Virtual power plants (VPPs) are becoming a cornerstone of future grids, aggregating distributed PV, wind, storage, and flexible loads for market participation and real-time balancing. As operations move to minute-- and second--level feedback, communication security shifts from a compliance item to an operational constraint: latency, reliability, and confidentiality jointly determine whether dispatch, protection, and settlement signals arrive on time. Conventional PKI and key-rotation schemes struggle with cross-domain, high-frequency messaging and face long-term quantum threats. Quantum key distribution (QKD) offers information-theoretic key freshness, but its key yield is scarce and stochastic, often misaligned with bursty VPP traffic. This paper proposes a key-aware priority and quota framework that treats quantum keys as first-class scheduling resources. The design combines (i) forecast-driven long-term quotas and short-term tokens, (ii) key-aware deficit-round-robin arbitration, (iii) a preemptive emergency key reserve, and (iv) graceful degradation via encryption-mode switching and controlled down-sampling for non-critical traffic. A drift-plus-penalty analysis establishes strong stability under average supply--demand balance with quantifiable bounds on backlog and tail latency, providing interpretable operating guarantees. We build a reproducible testbed on IEEE 33- and 123-bus VPP systems and evaluate normal, degraded, and outage regimes with industry-consistent message classes and TTLs. Against FIFO, fixed-priority, and static-quota baselines, the proposed scheme consistently reduces tail delay and passive timeouts for critical messages, improves per-bit key utility, and enhances power-tracking reliability during key scarcity and regime switches.

Paperid: 3845, https://arxiv.org/pdf/2510.16959.pdf

Abstract:
Differential privacy for the 2020 census required an estimated 90 terabytes of randomness [GL20], an amount which may be prohibitively expensive or entirely infeasible to generate. Motivated by these practical concerns, [CSV25] initiated the study of the randomness complexity of differential privacy, and in particular, the randomness complexity of $d$ counting queries. This is the task of outputting the number of entries in a dataset that satisfy predicates $\mathcal{P}_1, \dots, \mathcal{P}_d$ respectively. They showed the rather surprising fact that though any reasonably accurate, $\varepsilon$-differentially private mechanism for one counting query requires $1-O(\varepsilon)$ bits of randomness in expectation, there exists a fairly accurate mechanism for $d$ counting queries which requires only $O(\log d)$ bits of randomness in expectation. The mechanism of [CSV25] is inefficient (not polynomial time) and relies on a combinatorial object known as rounding schemes. Here, we give a polynomial time mechanism which achieves nearly the same randomness complexity versus accuracy tradeoff as that of [CSV25]. Our construction is based on the following simple observation: after a randomized shift of the answer to each counting query, the answer to many counting queries remains the same regardless of whether we add noise to that coordinate or not. This allows us to forgo the step of adding noise to the result of many counting queries. Our mechanism does not make use of rounding schemes. Therefore, it provides a different -- and, in our opinion, clearer -- insight into the origins of the randomness savings that can be obtained by batching $d$ counting queries. Therefore, it provides a different -- and, in our opinion, clearer -- insight into the origins of the randomness savings that can be obtained by batching $d$ counting queries.

Paperid: 3846, https://arxiv.org/pdf/2510.16744.pdf

Abstract:
Ride-Hailing Services (RHS) match a ride request initiated by a rider with a suitable driver responding to the ride request. A Privacy-Preserving RHS (PP-RHS) aims to facilitate ride matching while ensuring the privacy of riders' and drivers' location data w.r.t. the Service Provider (SP). At NSS 2022, Xie et al. proposed a PP-RHS. In this work, we demonstrate a passive attack on their PP-RHS protocol. Our attack allows the SP to completely recover the locations of the rider as well as that of the responding drivers in every ride request. Further, our attack is very efficient as it is independent of the security parameter.

Paperid: 3847, https://arxiv.org/pdf/2510.15953.pdf

Abstract:
Multi-modal threat detection faces a fundamental challenge that involves security tools operating in isolation, and this creates streams of network, email, and system data with no natural alignment or correlation. We present Hierarchical Multi-Modal Threat Intelligence Fusion (HM-TIF), a framework explicitly designed for this realistic scenario where naturally aligned multi-modal attack data does not exist. Unlike prior work that assumes or creates artificial alignment, we develop principled methods for correlating independent security data streams while maintaining operational validity. Our architecture employs hierarchical cross-attention with dynamic weighting that adapts to data availability and threat context, coupled with a novel temporal correlation protocol that preserves statistical independence. Evaluation on UNSW-NB15, CSE-CIC-IDS2018, and CICBell-DNS2021 datasets demonstrates that HM-TIF achieves 88.7% accuracy with a critical 32% reduction in false positive rates, even without true multi-modal training data. The framework maintains robustness when modalities are missing, making it immediately deployable in real security operations where data streams frequently have gaps. Our contributions include: (i) the first multi-modal security framework explicitly designed for non-aligned data, (ii) a temporal correlation protocol that avoids common data leakage pitfalls, (iii) empirical validation that multi-modal fusion provides operational benefits even without perfect alignment, and (iv) practical deployment guidelines for security teams facing heterogeneous, uncoordinated data sources. Index Terms: multi-modal learning, threat intelligence, non-aligned data, operational security, cross-attention mechanisms, practical deployment

Paperid: 3848, https://arxiv.org/pdf/2510.15747.pdf

Abstract:
Grassroots platforms are distributed applications run by\linebreak cryptographically-identified people on their networked personal devices, where multiple disjoint platform instances emerge independently and coalesce when they interoperate. Their foundation is the grassroots social graph, upon which grassroots social networks, grassroots cryptocurrencies, and grassroots democratic federations can be built. Grassroots platforms have yet to be implemented, the key challenge being faulty and malicious participants: without secure programming support, correct participants cannot reliably identify each other, establish secure communication, or verify each other's code integrity. We present Grassroots Logic Programs (GLP), a secure, multiagent, concurrent, logic programming language for implementing grassroots platforms. GLP extends logic programs with paired single-reader/single-writer (SRSW) logic variables, providing secure communication channels among cryptographically-identified people through encrypted, signed and attested messages, which enable identity and code integrity verification. We present GLP progressively: logic programs, concurrent GLP, multiagent GLP, augmenting it with cryptographic security, and providing smartphone implementation-ready specifications. We prove safety properties including that GLP computations are deductions, SRSW preservation, acyclicity, and monotonicity. We prove multiagent GLP is grassroots and that GLP streams achieve blockchain security properties. We present a grassroots social graph protocol establishing authenticated peer-to-peer connections and demonstrate secure grassroots social networking applications.

Paperid: 3849, https://arxiv.org/pdf/2510.15515.pdf

Abstract:
This paper presents a novel post-quantum cryptosystem based on high-memory masked convolutional codes. Unlike conventional code-based schemes that rely on block codes with fixed dimensions and limited error-correction capability, our construction offers both stronger cryptographic security and greater flexibility. It supports arbitrary plaintext lengths with linear-time decryption and uniform per-bit computational cost, enabling seamless scalability to long messages. Security is reinforced through a higher-rate injection of random errors than in block-code approaches, along with additional noise introduced via polynomial division, which substantially obfuscates the underlying code structure. Semi-invertible transformations generate dense, random-like generator matrices that conceal algebraic properties and resist known structural attacks. Consequently, the scheme achieves cryptanalytic security margins exceeding those of the classic McEliece system by factors greater than 2100. Finally, decryption at the recipient employs an array of parallel Viterbi decoders, enabling efficient hardware and software implementation and positioning the scheme as a strong candidate for deployment in practical quantum-resistant public-key cryptosystems.

Paperid: 3850, https://arxiv.org/pdf/2510.14185.pdf

Abstract:
Industrial Control Systems (ICS) underpin the United States' critical infrastructure, managing essential services such as power, water, and transportation that are vital to national security and public safety. However, increasing digital integration has exposed these systems to escalating cyber threats. Historical attacks like Stuxnet and the Ukraine power grid incident revealed exploitable weaknesses-poor network segmentation, outdated software, weak authentication, and inadequate monitoring-that persist in many U.S. ICS environments today. This paper analyzes these landmark attacks to identify recurring vulnerabilities and assess their relevance to current U.S. infrastructure. It argues that without immediate reforms, similar exploits could lead to catastrophic disruptions and national security crises. To address these risks, the paper proposes policy measures focused on implementing zero-trust architecture and improved network segmentation to enhance system resilience. These recommendations aim to guide policymakers and industry leaders in securing the nation's most critical operational technologies against future cyber threats.

Paperid: 3851, https://arxiv.org/pdf/2510.14171.pdf

Abstract:
The U.S. power grid underpins national security, public safety, and economic stability, but faces growing cyber risks from vulnerabilities in industrial control systems, remote access, and poor cyber hygiene. Despite its critical importance, current policy remains fragmented and reactive. This paper proposes a dual policy approach to strengthen grid cybersecurity: enhanced information sharing between government and private utilities to improve threat detection and response, and standardized cyber hygiene practices to reduce common attack vectors. For long-term resilience, a Unified National Cybersecurity Framework is recommended to align existing NERC, IEC, IEEE, and NIST standards, eliminate regulatory overlap, and adapt to evolving threats. Together, these policies offer both immediate and sustainable improvements in safeguarding the nation's most vital infrastructure.

Paperid: 3852, https://arxiv.org/pdf/2510.13822.pdf

Abstract:
This thesis explores the extent to which passive observation of wireless traffic in a smart home environment can be used to infer privacy-invasive information about its inhabitants. Using a setup that mimics the capabilities of a nosy neighbor in an adjacent flat, we analyze raw 802.11 packets and Bluetooth Low Energy advertisemets. From this data, we identify devices, infer their activity states and approximate their location using RSSI-based trilateration. Despite the encrypted nature of the data, we demonstrate that it is possible to detect active periods of multimedia devices, infer common activities such as sleeping, working and consuming media, and even approximate the layout of the neighbor's apartment. Our results show that privacy risks in smart homes extend beyond traditional data breaches: a nosy neighbor behind the wall can gain privacy-invasive insights into the lives of their neighbors purely from encrypted network traffic.

Paperid: 3853, https://arxiv.org/pdf/2510.13543.pdf

Abstract:
Large Language Model (LLM) based agents integrated into web browsers (often called agentic AI browsers) offer powerful automation of web tasks. However, they are vulnerable to indirect prompt injection attacks, where malicious instructions hidden in a webpage deceive the agent into unwanted actions. These attacks can bypass traditional web security boundaries, as the AI agent operates with the user privileges across sites. In this paper, we present a novel fuzzing framework that runs entirely in the browser and is guided by an LLM to automatically discover such prompt injection vulnerabilities in real time.

Paperid: 3854, https://arxiv.org/pdf/2510.12802.pdf

Abstract:
How do you store infinity in 256 bits? This paper explores the fundamental deception at the heart of computational cryptography: using finite information to simulate infinite randomness. We prove why true random oracles are impossible, then show how lazy evaluation creates a beautiful lie -- a finite automaton that successfully pretends to be infinite. We reveal that ``randomness'' in cryptography is actually computational hardness in disguise, demonstrating through Python implementations how 256 bits of entropy can generate sequences indistinguishable from infinite randomness to any computationally bounded observer.How do you store infinity in 256 bits? This paper explores the fundamental deception at the heart of computational cryptography: using finite information to simulate infinite randomness. We prove why true random oracles are impossible, then show how lazy evaluation creates a beautiful lie -- a finite automaton that successfully pretends to be infinite. We reveal that ``randomness'' in cryptography is actually computational hardness in disguise, demonstrating through Python implementations how 256 bits of entropy can generate sequences indistinguishable from infinite randomness to any computationally bounded observer.

Paperid: 3855, https://arxiv.org/pdf/2510.12665.pdf

Abstract:
Modern web and digital application password storage relies on password hashing for storage and security. Ad-hoc upgrade of password storage to keep up with hash algorithm norms may be used to save costs but can introduce unforeseen vulnerabilities. This is the case in the password storage scheme used by Meta Platforms which services several billion monthly users worldwide. In this paper we present the first example of an exploit which demonstrates the security weakness of Facebook's password storage scheme, and discuss its implications. Proper ethical disclosure guidelines and vendor notification were followed.

Paperid: 3856, https://arxiv.org/pdf/2510.12440.pdf

Abstract:
We present a formal framework for context-aware security verification that establishes provable guarantees for ML-enhanced adaptive systems. We introduce context-completeness - a new security property - and prove: (1) sample complexity bounds showing when adaptive verification succeeds, (2) information-theoretic limits relating context richness to detection capability, (3) convergence guarantees for ML-based payload generators, and (4) compositional soundness bounds. We further provide a formal separation between static context-blind verifiers and context-aware adaptive verifiers: for a natural family of targets, any static verifier with finite payload budget achieves completeness at most alpha, while a context-aware verifier with sufficient information achieves completeness greater than alpha. We validate our theoretical predictions through controlled experiments on 97,224 exploit samples, demonstrating: detection accuracy improving from 58% to 69.93% with dataset growth, success probability increasing from 51% to 82% with context enrichment, training loss converging at O(1/sqrt(T)) rate, and false positive rate (10.19%) within theoretical bounds (12%). Our results show that theoretically-grounded adaptive verification achieves provable improvements over static approaches under stated assumptions while maintaining soundness guarantees.

Paperid: 3857, https://arxiv.org/pdf/2510.11837.pdf

Abstract:
The security of Large Language Model (LLM) applications is fundamentally challenged by "form-first" attacks like prompt injection and jailbreaking, where malicious instructions are embedded within user inputs. Conventional defenses, which rely on post hoc output filtering, are often brittle and fail to address the root cause: the model's inability to distinguish trusted instructions from untrusted data. This paper proposes Countermind, a multi-layered security architecture intended to shift defenses from a reactive, post hoc posture to a proactive, pre-inference, and intra-inference enforcement model. The architecture proposes a fortified perimeter designed to structurally validate and transform all inputs, and an internal governance mechanism intended to constrain the model's semantic processing pathways before an output is generated. The primary contributions of this work are conceptual designs for: (1) A Semantic Boundary Logic (SBL) with a mandatory, time-coupled Text Crypter intended to reduce the plaintext prompt injection attack surface, provided all ingestion paths are enforced. (2) A Parameter-Space Restriction (PSR) mechanism, leveraging principles from representation engineering, to dynamically control the LLM's access to internal semantic clusters, with the goal of mitigating semantic drift and dangerous emergent behaviors. (3) A Secure, Self-Regulating Core that uses an OODA loop and a learning security module to adapt its defenses based on an immutable audit log. (4) A Multimodal Input Sandbox and Context-Defense mechanisms to address threats from non-textual data and long-term semantic poisoning. This paper outlines an evaluation plan designed to quantify the proposed architecture's effectiveness in reducing the Attack Success Rate (ASR) for form-first attacks and to measure its potential latency overhead.

Paperid: 3858, https://arxiv.org/pdf/2510.11640.pdf

Abstract:
We study the sublinear space continual release model for edge-differentially private (DP) graph algorithms, with a focus on the densest subgraph problem (DSG) in the insertion-only setting. Our main result is the first continual release DSG algorithm that matches the additive error of the best static DP algorithms and the space complexity of the best non-private streaming algorithms, up to constants. The key idea is a refined use of subsampling that simultaneously achieves privacy amplification and sparsification, a connection not previously formalized in graph DP. Via a simple black-box reduction to the static setting, we obtain both pure and approximate-DP algorithms with $O(\log n)$ additive error and $O(n\log n)$ space, improving both accuracy and space complexity over the previous state of the art. Along the way, we introduce graph densification in the graph DP setting, adding edges to trigger earlier subsampling, which removes the extra logarithmic factors in error and space incurred by prior work [ELMZ25]. We believe this simple idea may be of independent interest.

Paperid: 3859, https://arxiv.org/pdf/2510.10901.pdf

Abstract:
Classical linear ciphers, such as the Hill cipher, operate on fixed, finite-dimensional modules and are therefore vulnerable to straightforward known-plaintext attacks that recover the key as a fully determined linear operator. We propose a symmetric-key cryptosystem whose linear action takes place instead in the Burnside ring $A(G)$ of a compact Lie group $G$, with emphasis on the case $G=O(2)$. The secret key consists of (i) a compact Lie group $G$; (ii) a secret total ordering of the subgroup orbit-basis of $A(G)$; and (iii) a finite set $S$ of indices of irreducible $G$-representations, whose associated basic degrees define an involutory multiplier $k\in A(G)$. Messages of arbitrary finite length are encoded as finitely supported elements of $A(G)$ and encrypted via the Burnside product with $k$. For $G=O(2)$ we prove that encryption preserves plaintext support among the generators $\{(D_1),\dots,(D_L),(SO(2)),(O(2))\}$, avoiding ciphertext expansion and security leakage. We then analyze security in passive models, showing that any finite set of observations constrains the action only on a finite-rank submodule $W_L\subset A(O(2))$, and we show information-theoretic non-identifiability of the key from such data. Finally, we prove the scheme is \emph{not} IND-CPA secure, by presenting a one-query chosen-plaintext distinguisher based on dihedral probes.

Paperid: 3860, https://arxiv.org/pdf/2510.09715.pdf

Abstract:
With the proliferation of decentralized applications (DApps), the conflict between the transparency of blockchain technology and user data privacy has become increasingly prominent. While Decentralized Identity (DID) and Verifiable Credentials (VCs) provide a standardized framework for user data sovereignty, achieving trusted identity verification and data sharing without compromising privacy remains a significant challenge. This paper proposes a novel, comprehensive framework that integrates DIDs and VCs with efficient Zero-Knowledge Proof (ZKP) schemes to address this core issue. The key contributions of this framework are threefold: first, it constructs a set of strong privacy-preserving protocols based on zk-STARKs, allowing users to prove that their credentials satisfy specific conditions (e.g., "age is over 18") without revealing any underlying sensitive data. Second, it designs a scalable, privacy-preserving credential revocation mechanism based on cryptographic accumulators, effectively solving credential management challenges in large-scale scenarios. Finally, it integrates a practical social key recovery scheme, significantly enhancing system usability and security. Through a prototype implementation and performance evaluation, this paper quantitatively analyzes the framework's performance in terms of proof generation time, verification overhead, and on-chain costs. Compared to existing state-of-the-art systems based on zk-SNARKs, our framework, at the cost of a larger proof size, significantly improves prover efficiency for complex computations and provides stronger security guarantees, including no trusted setup and post-quantum security. Finally, a case study in the decentralized finance (DeFi) credit scoring scenario demonstrates the framework's immense potential for unlocking capital efficiency and fostering a trusted data economy.

Paperid: 3861, https://arxiv.org/pdf/2510.09690.pdf

Abstract:
Cloud computing has become the foundation of modern digital infrastructure, yet the absence of a unified architectural and compliance framework impedes interoperability, auditability, and robust security. This paper introduces a formal, machine-readable semantic model for Cloud Engines, integrating the architectural taxonomy of ISO/IEC 22123 (Cloud Reference Architecture) with the security and compliance controls of ISO/IEC 27001:2022 and ISO/IEC TR 3445:2022. The model decomposes cloud systems into four canonical interfaces--Control, Business, Audit, and Data--and extends them with a security ontology that maps mechanisms such as authentication, authorization, and encryption to specific compliance controls. Expressed in RDF/Turtle, the model enables semantic reasoning, automated compliance validation, and vendor-neutral architecture design. We demonstrate its practical utility through OpenStack and AWS case studies, and provide reproducible validation workflows using SPARQL and SHACL. This work advances the state of cloud security modeling by bridging architectural and compliance standards in a unified framework, with a particular emphasis on auditability.

Paperid: 3862, https://arxiv.org/pdf/2510.09672.pdf

Abstract:
Pingmark defines a universal textual protocol for expressing spatial context through a minimal symbol: !@. Rather than embedding coordinates or using proprietary map links, Pingmark introduces a semantic trigger that compliant client applications interpret to generate a standardized resolver link of the form https://pingmark.me/lat/lon/[timestamp]. This allows location expression to function like existing textual conventions - @ for identity or # for topics - but for physical space. The protocol requires no user registration, relies on open mapping technologies, and protects privacy by generating location data ephemerally and locally. This paper presents the motivation, syntax, and design of the Pingmark Protocol Specification (PPS v0.1), its reference resolver implementation, and the long-term goal of establishing Pingmark as an open Internet standard for spatial mentions.

Paperid: 3863, https://arxiv.org/pdf/2510.09656.pdf

Abstract:
The proliferation of high-fidelity synthetic media, coupled with exploitable hardware vulnerabilities in conventional imaging pipelines, has precipitated a crisis of trust in digital content. Existing countermeasures, from post-hoc classifiers to software-based signing, fail to address the fundamental challenge of establishing an unbreakable link to reality at the moment of capture. This whitepaper introduces Signing Right Away (SRA), a comprehensive security architecture that guarantees the provenance of digital media from "silicon to silicon to signed file." SRA leverages a four-pillar security model-Confidentiality, Integrity, Authentication, and Replay Protection, akin to the MIPI Camera Security Framework (CSF), but also extends its scope beyond the internal data bus to the creation of a cryptographically sealed, C2PA-compliant final asset. By securing the entire imaging pipeline within a Trusted Execution Environment (TEE), SRA ensures that every captured image and video carries an immutable, verifiable proof of origin. This provides a foundational solution for industries reliant on trustworthy visual information, including journalism, legal evidence, and insurance. We present the SRA architecture, a detailed implementation roadmap informed by empirical prototyping, and a comparative analysis that positions SRA as the essential "last mile" in the chain of content trust.

Paperid: 3864, https://arxiv.org/pdf/2510.09650.pdf

Abstract:
This paper explores the distinctions and connections between cybersecurity and ethical hacking, two vital disciplines in the protection of digital systems. It defines each field, outlines their goals and methodologies, and compares the academic and professional paths available to aspiring students. Cybersecurity is presented as a defensive discipline focused on preventing attacks and safeguarding data, while ethical hacking adopts an offensive approach that identifies vulnerabilities through authorized testing. The paper highlights key skills, certifications, and career opportunities in both areas, offering practical guidance to help learners choose the path best suited to their interests and ambitions. Ultimately, it emphasizes the complementary nature of both fields in strengthening global cyber resilience.

Paperid: 3865, https://arxiv.org/pdf/2510.09645.pdf

Abstract:
Password security has been compelled to evolve in response to the growing computational capabilities of modern systems. However, this evolution has often resulted in increasingly complex security practices that alienate users, leading to poor compliance and heightened vulnerability. Consequently, individuals remain exposed to attackers through weak or improperly managed passwords, underscoring the urgent need for a comprehensive defense mechanism that effectively addresses password-related risks and threats. In this paper, we propose a multifaceted solution designed to revolutionize password security by integrating diverse attributes such as the Password Dissection Mechanism, Dynamic Password Policy Mechanism, human behavioral patterns, device characteristics, network parameters, geographical context, and other relevant factors. By leveraging learning-based models, our framework constructs detailed user profiles capable of recognizing individuals and preventing nearly all forms of unauthorized access or device possession. The proposed framework enhances the usability-security paradigm by offering stronger protection than existing standards while simultaneously engaging users in the policy-setting process through a novel, adaptive approach.

Paperid: 3866, https://arxiv.org/pdf/2510.09635.pdf

Abstract:
This paper presents the Cybersecurity Psychology Framework (CPF), a novel methodology for quantifying human-centric vulnerabilities in security operations through systematic integration of established psychological constructs with operational security telemetry. While individual human factors-alert fatigue, compliance fatigue, cognitive overload, and risk perception biases-have been extensively studied in isolation, no framework provides end-to-end operationalization across the full spectrum of psychological vulnerabilities. We address this gap by: (1) defining specific, measurable algorithms that quantify key psychological states using standard SOC tooling (SIEM, ticketing systems, communication platforms); (2) proposing a lightweight, privacy-preserving LLM architecture based on Retrieval-Augmented Generation (RAG) and domain-specific fine-tuning to analyze structured and unstructured data for latent psychological risks; (3) detailing a rigorous mixed-methods validation strategy acknowledging the inherent difficulty of obtaining sensitive cybersecurity data. Our implementation of CPF indicators has been demonstrated in a proof-of-concept deployment using small language models achieving 0.92 F1-score on synthetic data. This work provides the theoretical and methodological foundation necessary for industry partnerships to conduct empirical validation with real operational data.

Paperid: 3867, https://arxiv.org/pdf/2510.09633.pdf

Abstract:
Hound introduces a relation-first graph engine that improves system-level reasoning across interrelated components in complex codebases. The agent designs flexible, analyst-defined views with compact annotations (e.g., monetary/value flows, authentication/authorization roles, call graphs, protocol invariants) and uses them to anchor exact retrieval: for any question, it loads precisely the code that matters (often across components) so it can zoom out to system structure and zoom in to the decisive lines. A second contribution is a persistent belief system: long-lived vulnerability hypotheses whose confidence is updated as evidence accrues. The agent employs coverage-versus-intuition planning and a QA finalizer to confirm or reject hypotheses. On a five-project subset of ScaBench[1], Hound improves recall and F1 over a baseline LLM analyzer (micro recall 31.2% vs. 8.3%; F1 14.2% vs. 9.8%) with a modest precision trade-off. We attribute these gains to flexible, relation-first graphs that extend model understanding beyond call/dataflow to abstract aspects, plus the hypothesis-centric loop; code and artifacts are released to support reproduction.

Paperid: 3868, https://arxiv.org/pdf/2510.09629.pdf

Abstract:
The growing use of Internet of Things (IoT) technologies in Nigerian healthcare offers potential improvements in remote monitoring and data-driven care, but unsecured wireless communication in medical IoT (mIoT) devices exposes patient data to cyber threats. This study investigates such vulnerabilities through a real-time Man in the Middle (MITM) attack simulation and evaluates lightweight AES-128 encryption on low-cost devices. A prototype mIoT device was built with a NodeMCU ESP8266 and sensors for heart rate and temperature. In controlled lab conditions simulating local healthcare networks, unencrypted data transmissions were intercepted and altered using common tools (Bettercap, Wireshark). After AES-128 encryption was applied, all transmissions became unreadable and tamper attempts failed, demonstrating its effectiveness. Performance costs were modest, latency rose from 80 ms to 125 ms (56.25 percent increase) and CPU use from 30 percent to 45 percent, but system stability remained intact. Device cost stayed under 18,000 NGN (about 12 USD), making it feasible for Nigeria's resource constrained facilities. A survey of healthcare professionals showed moderate awareness of IoT-related risks but strong support for encryption and staff training. Barriers included limited budgets and technical complexity. The study concludes that lightweight AES-128 encryption provides practical, low-cost protection against common attack vectors while maintaining operational efficiency. Feedback from professionals highlights the urgency of improving security awareness and establishing guidelines for clinical deployment.

Paperid: 3869, https://arxiv.org/pdf/2510.09619.pdf

Abstract:
We present a risk-calibrated approach to streaming intrusion detection that couples Bayesian Online Changepoint Detection (BOCPD) with decision thresholds aligned to Site Reliability Engineering (SRE) error budgets. BOCPD provides run-length posteriors that adapt to distribution shift and concept drift; we map these posteriors to alert decisions by optimizing expected operational cost under false-positive and false-negative budgets. We detail the hazard model, conjugate updates, and an O(1)-per-event implementation. A concrete SRE example shows how a 99.9% availability SLO (43.2 minutes per month error budget) yields a probability threshold near 0.91 when missed incidents are 10x more costly than false alarms. We evaluate on the full UNSW-NB15 and CIC-IDS2017 benchmarks with chronological splits, comparing against strong unsupervised baselines (ECOD, COPOD, and LOF). Metrics include PR-AUC, ROC-AUC, Brier score, calibration reliability diagrams, and detection latency measured in events. Results indicate improved precision-recall at mid to high recall and better probability calibration relative to baselines. We release implementation details, hyperparameters, and ablations for hazard sensitivity and computational footprint. Code and reproducibility materials will be made available upon publication; datasets and implementation are available from the corresponding author upon reasonable request.

Paperid: 3870, https://arxiv.org/pdf/2510.09617.pdf

Abstract:
ChipmunkRing, a practical post-quantum ring signature construction tailored for blockchain environments. Building on our Chipmunk lattice-based cryptographic framework, this implementation delivers compact digital signatures ranging from 20.5 to 279.7KB, with rapid signing operations completing in 1.1-15.1ms and efficient validation processes requiring only 0.4-4.5ms for participant groups of 2-64 members. The cornerstone of our approach is Acorn Verification-a streamlined zero-knowledge protocol that supersedes the classical Fiat-Shamir methodology. This innovation enables linear O(n) authentication complexity using concise 96-byte cryptographic proofs per participant, yielding a remarkable 17.7x performance enhancement for 32-member rings when compared to conventional techniques. Our work includes rigorous mathematical security demonstrations confirming 112-bit post-quantum protection (NIST Level 1), extensive computational benchmarking, and comprehensive support for both standard anonymity sets and collaborative threshold constructions with flexible participation requirements.

Paperid: 3871, https://arxiv.org/pdf/2510.09613.pdf

Abstract:
The U.S. Federal Risk and Authorization Management Program (FedRAMP) has long relied on extensive sets of controls and static documentation to assess cloud systems. However, this manual, point-in-time approach has struggled to keep pace with cloud-native development. FedRAMP 20x, a 2025 pilot program, reimagines the NIST Risk Management Framework (RMF): replacing traditional NIST 800-53 controls with Key Security Indicators (KSIs), using automated, machine-readable evidence, and emphasizing continuous reporting and authorization. This case study presents a practitioner-led field report from an industry participant who led multiple FedRAMP 20x pilot submissions and engaged directly with the FedRAMP PMO, 3PAOs, and community working groups. It explores how KSIs, continuous evidence pipelines, and DevSecOps integration can streamline authorization and improve cyber risk management. The study shows FedRAMP 20x as a live testbed for implementing the RMF in a cloud-native, automation-first approach and shares actionable recommendations for risk professionals seeking to modernize compliance and support real-time, risk-informed decision-making.

Paperid: 3872, https://arxiv.org/pdf/2510.08013.pdf

Abstract:
We present the discovery of a fundamental composition law governing conjugate observables in the Random Permutation Sorting System (RPSS). The law links the discrete permutation count Np and the continuous elapsed time T through a functional relation connecting the characteristic function of timing distributions to the probability generating function of permutation counts. This framework enables entropy purification, transforming microarchitectural timing fluctuations into uniform randomness via geometric convergence. We establish convergence theorems with explicit bounds and validate the results experimentally, achieving Shannon entropy above 7.9998 bits per byte and chi-square uniformity across diverse platforms. The composition law provides a universal foundation for generating provably uniform randomness from general-purpose computation, securing cryptographic purity from emergent computational dynamics.

Paperid: 3873, https://arxiv.org/pdf/2510.07080.pdf

Abstract:
This study tackles the computational challenges of solving Markov Decision Processes (MDPs) for a restricted class of problems. It is motivated by the Last Revealer Attack (LRA), which undermines fairness in some Proof-of-Stake (PoS) blockchains such as Ethereum (\$400B market capitalization). We introduce pseudo-MDPs (pMDPs) a framework that naturally models such problems and propose two distinct problem reductions to standard MDPs. One problem reduction provides a novel, counter-intuitive perspective, and combining the two problem reductions enables significant improvements in dynamic programming algorithms such as value iteration. In the case of the LRA which size is parameterized by $κ$ (in Ethereum's case $κ$= 325), we reduce the computational complexity from $O(2^κκ^{2^{κ+2}})$ to $O(κ^4)$ (per iteration). This solution also provide the usual benefits from Dynamic Programming solutions: exponentially fast convergence toward the optimal solution is guaranteed. The dual perspective also simplifies policy extraction, making the approach well-suited for resource-constrained agents who can operate with very limited memory and computation once the problem has been solved. Furthermore, we generalize those results to a broader class of MDPs, enhancing their applicability. The framework is validated through two case studies: a fictional card game and the LRA on the Ethereum random seed consensus protocol. These applications demonstrate the framework's ability to solve large-scale problems effectively while offering actionable insights into optimal strategies. This work advances the study of MDPs and contributes to understanding security vulnerabilities in blockchain systems.

Paperid: 3874, https://arxiv.org/pdf/2510.06923.pdf

Abstract:
Foundational results in theoretical computer science have established that everything provable, is provable in zero knowledge. However, this assertion fundamentally assumes a classical interpretation of computation and many interesting physical statements that one can hope to prove are not characterized. In this work, we consider decision problems, where the problem instance itself is specified by a (pure) quantum state. We discuss several motivating examples for this notion and, as our main technical result, we show that every quantum problem that is provable with an interactive protocol, is also provable in zero-knowledge. Our protocol achieves unconditional soundness and computational zero-knowledge, under standard assumptions in cryptography. In addition, we show how our techniques yield a protocol for the Uhlmann transformation problem that achieves a meaningful notion of zero-knowledge, also in the presence of a malicious verifier.

Paperid: 3875, https://arxiv.org/pdf/2510.06707.pdf

Abstract:
The linear decomposition attack reveals a vulnerability in encryption algorithms operating within groups or monoids with excessively small representations. The representation gap, defined as the size of the smallest non-trivial representation, therefore serves as a metric to assess the security of these algorithms. This paper will demonstrate that the diagrammatic Motzkin monoids exhibit a large representation gap, positioning them as promising candidates for robust encryption algorithms.

Paperid: 3876, https://arxiv.org/pdf/2510.05288.pdf

Abstract:
Large language models (LLMs) such as ChatGPT have evolved into powerful and ubiquitous tools. Fine-tuning on small datasets allows LLMs to acquire specialized skills for specific tasks efficiently. Although LLMs provide great utility in both general and task-specific use cases, they are limited by two security-related concerns. First, traditional LLM hardware requirements make them infeasible to run locally on consumer-grade devices. A remote network connection with the LLM provider's server is usually required, making the system vulnerable to network attacks. Second, fine-tuning an LLM for a sensitive task may involve sensitive data. Non-private fine-tuning algorithms produce models vulnerable to training data reproduction attacks. Our work addresses these security concerns by enhancing differentially private optimization algorithms and applying them to fine-tune localizable language models. We introduce adaptable gradient clipping along with other engineering enhancements to the standard DP-Adam optimizer to create DP-Adam-AC. We use our optimizer to fine-tune examples of two localizable LLM designs, small language model (Qwen2.5-0.5B) and 1.58 bit quantization (Bitnet-b1.58-2B). We demonstrate promising improvements in loss through experimentation with two synthetic datasets.

Paperid: 3877, https://arxiv.org/pdf/2510.05192.pdf

Abstract:
Agentic misalignment occurs when goal-directed agents take harmful actions, such as blackmail, rather than risk goal failure, and can be triggered by replacement threats, autonomy reduction, or goal conflict (Lynch et al., 2025). We adapt insider-risk control design (Critical Pathway; Situational Crime Prevention) to develop preventative operational controls that steer agents toward safe actions when facing stressors. Using the blackmail scenario from the original Anthropic study by Lynch et al. (2025), we evaluate mitigations across 10 LLMs and 66,600 samples. Our main finding is that an externally governed escalation channel, which guarantees a pause and independent review, reduces blackmail rates from a no-mitigation baseline of 38.73% to 1.21% (averaged across all models and conditions). Augmenting this channel with compliance email bulletins further lowers the blackmail rate to 0.85%. Overall, incorporating preventative operational controls strengthens defence-in-depth strategies for agentic AI. We also surface a failure mode diverging from Lynch et al. (2025): two models (Gemini 2.5 Pro, Grok-4) take harmful actions without goal conflict or imminent autonomy threat, leveraging sensitive information for coercive signalling. In counterfactual swaps, both continued using the affair regardless of whether the CEO or CTO was implicated. An escalation channel eliminated coercion, but Gemini 2.5 Pro (19 pp) and Grok-4 (7 pp) escalated more when the CTO was implicated, unlike most models (higher in the CEO condition). The reason for this divergent behaviour is not clear from raw outputs and could reflect benign differences in reasoning or strategic discrediting of a potential future threat, warranting further investigation.

Paperid: 3878, https://arxiv.org/pdf/2510.04882.pdf

Abstract:
Private Information Retrieval (PIR) allows a client to retrieve an entry $\text{DB}[i]$ from a public database $\text{DB}$ held by one or more servers, without revealing the queried index $i$. Traditional PIR schemes achieve sublinear server computation only under strong assumptions, such as the presence of multiple non-colluding servers or the use of public-key cryptography. To overcome these limitations, \textit{preprocessing PIR} schemes introduce a query-independent offline phase where the client collects \textit{hints} that enable efficient private queries during the online phase. In this work, we focus on preprocessing PIR schemes relying solely on \textit{One-Way Functions} (OWFs), which provide minimal cryptographic assumptions and practical implementability. We study three main constructions -- TreePIR, PIANO, and PPPS -- that explore different trade-offs between communication, storage, and server trust assumptions. Building upon the mechanisms introduced in PIANO and PPPS, we propose an adaptation of TreePIR to the single-server setting by introducing a dual-table hint structure (primary and backup tables) and a \textit{resampling} technique to refresh hints efficiently. Our proposed scheme achieves logarithmic upload bandwidth and $O(\sqrt{n}\log n)$ download complexity while requiring $O(\sqrt{n}\log n)$ client storage. This represents a significant improvement over prior single-server preprocessing PIR schemes such as PIANO ($O(\sqrt{n})$ bandwidth) and PPPS ($O(n^{1/4})$ bandwidth), while maintaining the simplicity and minimal assumptions of the OWF-based setting.

Paperid: 3879, https://arxiv.org/pdf/2510.04528.pdf

Abstract:
The rapid adoption of large language models (LLMs) in enterprise systems exposes vulnerabilities to prompt injection attacks, strategic deception, and biased outputs, threatening security, trust, and fairness. Extending our adversarial activation patching framework (arXiv:2507.09406), which induced deception in toy networks at a 23.9% rate, we introduce the Unified Threat Detection and Mitigation Framework (UTDMF), a scalable, real-time pipeline for enterprise-grade models like Llama-3.1 (405B), GPT-4o, and Claude-3.5. Through 700+ experiments per model, UTDMF achieves: (1) 92% detection accuracy for prompt injection (e.g., jailbreaking); (2) 65% reduction in deceptive outputs via enhanced patching; and (3) 78% improvement in fairness metrics (e.g., demographic bias). Novel contributions include a generalized patching algorithm for multi-threat detection, three groundbreaking hypotheses on threat interactions (e.g., threat chaining in enterprise workflows), and a deployment-ready toolkit with APIs for enterprise integration.

Paperid: 3880, https://arxiv.org/pdf/2510.03973.pdf

Abstract:
Strategic Communication Protocols provide a structured approach for first contact with interstellar objects that demonstrate technological characteristics and high levels of threat. The protocols find their starting point in an ISO Information-Communication Paradox, namely, as our knowledge of an ISO's threatening capabilities increases, the probability of successful communication decreases while the urgency of communication attempts simultaneously intensifies. From this paradox, a Threat-Communication Viability Index is created to describe when the value of communication attempts outweighs strategic silence. The index scores the situation and operates as a decision-making tool for stakeholders tracking an ISO. The communication protocols subsequently outline several diplomatic strategies in cases where the index recommends communication.

Paperid: 3881, https://arxiv.org/pdf/2510.03770.pdf

Abstract:
Ensuring the trustworthiness of data from distributed and resource-constrained environments, such as Wireless Sensor Networks or IoT devices, is critical. Existing Reversible Data Hiding (RDH) methods for scalar data suffer from low embedding capacity and poor intrinsic mixing between host data and watermark. This paper introduces Hiding in the Imaginary Domain with Data Encryption (H[i]dden), a novel framework based on complex number arithmetic for simultaneous information embedding and encryption. The H[i]dden framework offers perfect reversibility, in-principle unlimited watermark size, and intrinsic data-watermark mixing. The paper further introduces two protocols: H[i]dden-EG, for joint reversible data hiding and encryption, and H[i]dden-AggP, for privacy-preserving aggregation of watermarked data, based on partially homomorphic encryption. These protocols provide efficient and resilient solutions for data integrity, provenance and confidentiality, serving as a foundation for new schemes based on the algebraic properties of the complex domain.

Paperid: 3882, https://arxiv.org/pdf/2510.03405.pdf

Abstract:
We present LegalSim, a modular multi-agent simulation of adversarial legal proceedings that explores how AI systems can exploit procedural weaknesses in codified rules. Plaintiff and defendant agents choose from a constrained action space (for example, discovery requests, motions, meet-and-confer, sanctions) governed by a JSON rules engine, while a stochastic judge model with calibrated grant rates, cost allocations, and sanction tendencies resolves outcomes. We compare four policies: PPO, a contextual bandit with an LLM, a direct LLM policy, and a hand-crafted heuristic; Instead of optimizing binary case outcomes, agents are trained and evaluated using effective win rate and a composite exploit score that combines opponent-cost inflation, calendar pressure, settlement pressure at low merit, and a rule-compliance margin. Across configurable regimes (e.g., bankruptcy stays, inter partes review, tax procedures) and heterogeneous judges, we observe emergent ``exploit chains'', such as cost-inflating discovery sequences and calendar-pressure tactics that remain procedurally valid yet systemically harmful. Evaluation via cross-play and Bradley-Terry ratings shows, PPO wins more often, the bandit is the most consistently competitive across opponents, the LLM trails them, and the heuristic is weakest. The results are stable in judge settings, and the simulation reveals emergent exploit chains, motivating red-teaming of legal rule systems in addition to model-level testing.

Paperid: 3883, https://arxiv.org/pdf/2510.02424.pdf

Abstract:
This paper presents CADL (Cognitive-Adaptive Deception Layer), an adaptive deception framework achieving 99.88% detection rate with 0.13% false positive rate on the CICIDS2017 dataset. The framework employs ensemble machine learning (Random Forest, XGBoost, Neural Networks) combined with behavioral profiling to identify and adapt responses to network intrusions. Through a coordinated signal bus architecture, security components share real-time intelligence, enabling collective decision-making. The system profiles attackers based on temporal patterns and deploys customized deception strategies across five escalation levels. Evaluation on 50,000 CICIDS2017 test samples demonstrates that CADL significantly outperforms traditional intrusion detection systems (Snort: 71.2%, Suricata: 68.5%) while maintaining production-ready false positive rates. The framework's behavioral analysis achieves 89% accuracy in classifying attacker profiles. We provide open-source implementation and transparent performance metrics, offering an accessible alternative to commercial deception platforms costing $150-400 per host annually.

Paperid: 3884, https://arxiv.org/pdf/2510.02383.pdf

Abstract:
Elliptic curve cryptography (ECC) is foundational to modern secure communication, yet existing standard curves have faced scrutiny for opaque parameter-generation practices. This work introduces a Selmer-inspired framework for constructing elliptic curves that is both transparent and auditable. Drawing from $2$- and $3$-descent methods, we derive binary quartics and ternary cubics whose classical invariants deterministically yield candidate $(c_4,c_6)$ parameters. Local solubility checks, modeled on Selmer admissibility, filter candidates prior to reconciliation into short-Weierstrass form over prime fields. We then apply established cryptographic validations, including group-order factorization, cofactor bounds, twist security, and embedding-degree heuristics. A proof-of-concept implementation demonstrates that the pipeline functions as a retry-until-success Las Vegas algorithm, with complete transcripts enabling independent verification. Unlike seed-based or purely efficiency-driven designs, our approach embeds arithmetic structure into parameter selection while remaining compatible with constant-time, side-channel resistant implementations. This work broadens the design space for elliptic curves, showing that descent techniques from arithmetic geometry can underpin trust-enhancing, standardization-ready constructions.

Paperid: 3885, https://arxiv.org/pdf/2510.02379.pdf

Abstract:
Since the security of post-quantum cryptography (PQC) algorithms is based on the hardness of mathematical problems, while the security of quantum key distribution (QKD) relies on the fundamental principles of quantum physics, each approach possesses distinct advantages and limitations that can complement one another. Consequently, recent studies have proposed hybrid schemes that combine QKD and PQC to establish a dual-layered security model. In response to this trend, this study proposes hybrid schemes that integrate QKD with the National Institute of Standards and Technology (NIST) standardized PQC algorithms. These hybrid schemes include two core components: a hybrid QKD-PQC key exchange protocol and a hybrid QKD-PQC digital signature scheme. For the hybrid key exchange protocol, this study combines Module-Lattice-based Key Encapsulation Mechanisms (ML-KEM) with QKD protocols, specifically BB84 and E91, to construct a secure key exchange protocol. In the design of the hybrid digital signature scheme, this study utilizes Module-Lattice-based Digital Signature Algorithms (ML-DSA) and Stateless Hash-based Digital Signature Algorithms (SLH-DSA) to generate signature reconstruction values. These values are verified using confirmation codes transmitted via the BB84 and E91 protocols. The proposed hybrid key exchange protocol is evaluated by examining the shared secret key it produces, particularly with respect to entropy and whether the output is independent and identically distributed (IID). Furthermore, the computation time and message lengths of the proposed hybrid schemes are evaluated.

Paperid: 3886, https://arxiv.org/pdf/2510.02374.pdf

Abstract:
Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) are a foundational component of web security, yet traditional implementations suffer from a trade-off between usability and resilience against AI-powered bots. This paper introduces a novel hybrid CAPTCHA system that synergizes the cognitive challenges posed by Large Language Models (LLMs) with the behavioral biometric analysis of keystroke dynamics. Our approach generates dynamic, unpredictable questions that are trivial for humans but non-trivial for automated agents, while simultaneously analyzing the user's typing rhythm to distinguish human patterns from robotic input. We present the system's architecture, formalize the feature extraction methodology for keystroke analysis, and report on an experimental evaluation. The results indicate that our dual-layered approach achieves a high degree of accuracy in bot detection, successfully thwarting both paste-based and script-based simulation attacks, while maintaining a high usability score among human participants. This work demonstrates the potential of combining cognitive and behavioral tests to create a new generation of more secure and user-friendly CAPTCHAs.

Paperid: 3887, https://arxiv.org/pdf/2510.02365.pdf

Abstract:
Fully Homomorphic Encryption (FHE) provides a powerful paradigm for secure computation, but its practical adoption is severely hindered by the prohibitive computational cost of its bootstrapping procedure. The complexity of all current bootstrapping methods is fundamentally tied to the multiplicative depth of the decryption circuit, denoted $L_{dec}$, making it the primary performance bottleneck. This paper introduces a new approach to bootstrapping that completely bypasses the traditional circuit evaluation model. We apply the tools of modern arithmetic geometry to reframe the bootstrapping operation as a direct geometric projection. Our framework models the space of ciphertexts as an affine scheme and rigorously defines the loci of decryptable and fresh ciphertexts as distinct closed subschemes. The bootstrapping transformation is then realized as a morphism between these two spaces. Computationally, this projection is equivalent to solving a specific Closest Vector Problem (CVP) instance on a highly structured ideal lattice, which we show can be done efficiently using a technique we call algebraic folding. The primary result of our work is a complete and provably correct bootstrapping algorithm with a computational complexity of $O(d \cdot \text{poly}(\log q))$, where $d$ is the ring dimension and $q$ is the ciphertext modulus. The significance of this result lies in the complete elimination of the factor $L_{dec}$ from the complexity, representing a fundamental asymptotic improvement over the state of the art. This geometric perspective offers a new and promising pathway toward achieving truly practical and high-performance FHE.

Paperid: 3888, https://arxiv.org/pdf/2510.02325.pdf

Abstract:
This paper introduces Agentic-AI Healthcare, a privacy-aware, multilingual, and explainable research prototype developed as a single-investigator project. The system leverages the emerging Model Context Protocol (MCP) to orchestrate multiple intelligent agents for patient interaction, including symptom checking, medication suggestions, and appointment scheduling. The platform integrates a dedicated Privacy and Compliance Layer that applies role-based access control (RBAC), AES-GCM field-level encryption, and tamper-evident audit logging, aligning with major healthcare data protection standards such as HIPAA (US), PIPEDA (Canada), and PHIPA (Ontario). Example use cases demonstrate multilingual patient-doctor interaction (English, French, Arabic) and transparent diagnostic reasoning powered by large language models. As an applied AI contribution, this work highlights the feasibility of combining agentic orchestration, multilingual accessibility, and compliance-aware architecture in healthcare applications. This platform is presented as a research prototype and is not a certified medical device.

Paperid: 3889, https://arxiv.org/pdf/2510.02317.pdf

Abstract:
The Age of Artificial Intelligence is here. In 2025, there are few regulations governing artificial intelligence. While the expansion of artificial intelligence is going in a relatively good direction, there is a risk that it can be misused. Misuse of technology is nothing new and will continue to happen. The lack of regulation in artificial intelligence is necessary because it raises the question of how we can move forward without knowing what the limits are. While artificial intelligence dominates the technology industry, new technology is starting to emerge. Quantum cryptography is expected to replace classical cryptography; however, the transition from classical to quantum cryptography is expected to occur within the next 10 years. The ability to transition from classical to quantum cryptography requires hybrid cryptography. Hybrid cryptography can be used now; however, similar to artificial intelligence, there is no regulation or support for the regulatory infrastructure regarding hybrid machines. This paper will explore the regulatory gaps in hybrid cryptography. The paper will also offer solutions to fix the gaps and ensure the transition from classical to quantum cryptography is safely and effectively completed.

Paperid: 3890, https://arxiv.org/pdf/2510.02196.pdf

Abstract:
This work derives the authentication security of pseudorandom function (PRF) GNSS ranging under multiple GNSS spoofing models, including the Security Code Estimation and Replay (SCER) spoofer. When GNSS ranging codes derive from a PRF utilizing a secret known only to the broadcaster, the spoofer cannot predict the ranging code before broadcast. Therefore, PRF ranging can be used to establish trust in the GNSS pseudoranges and the resulting receiver position, navigation, and timing (PNT) solution. I apply the methods herein to Galileo's Signal Authentication Service (SAS) utilizing the encrypted Galileo E6-C signal to compute that, at most, 400 ms of Galileo E6-C data to assert 128-bit authentication security under non-SCER models. For the SCER adversary, I predict the adversary's needed receiving radio equipment to break authentication security. One can use this work to design a PRF GNSS ranging protocol to meet useful authentication security requirements by computing the probability of missed detection.

Paperid: 3891, https://arxiv.org/pdf/2510.01780.pdf

Abstract:
Secure and interoperable integration of heterogeneous medical data remains a grand challenge in digital health. Current federated learning (FL) frameworks offer privacy-preserving model training but lack standardized mechanisms to orchestrate multi-modal data fusion across distributed and resource-constrained environments. This study introduces a novel framework that leverages the Model Context Protocol (MCP) as an interoperability layer for secure, cross-agent communication in multi-modal federated healthcare systems. The proposed architecture unifies three pillars: (i) multi-modal feature alignment for clinical imaging, electronic medical records, and wearable IoT data; (ii) secure aggregation with differential privacy to protect patient-sensitive updates; and (iii) energy-aware scheduling to mitigate dropouts in mobile clients. By employing MCP as a schema-driven interface, the framework enables adaptive orchestration of AI agents and toolchains while ensuring compliance with privacy regulations. Experimental evaluation on benchmark datasets and pilot clinical cohorts demonstrates up to 9.8\% improvement in diagnostic accuracy compared with baseline FL, a 54\% reduction in client dropout rates, and clinically acceptable privacy--utility trade-offs. These results highlight MCP-enabled multi-modal fusion as a scalable and trustworthy pathway toward equitable, next-generation federated health infrastructures.

Paperid: 3892, https://arxiv.org/pdf/2510.01720.pdf

Abstract:
We describe several families of efficiently implementable Boolean functions achieving provable trade-offs between resiliency, nonlinearity, and algebraic immunity. In particular, the following statement holds for each of the function families that we propose. Given integers $m_0\geq 0$, $x_0\geq 1$, and $a_0\geq 1$, it is possible to construct an $n$-variable function which has resiliency at least $m_0$, linear bias (which is an equivalent method of expressing nonlinearity) at most $2^{-x_0}$ and algebraic immunity at least $a_0$; further, $n$ is linear in $m_0$, $x_0$ and $a_0$, and the function can be implemented using $O(n)$ 2-input gates, which is essentially optimal.

Paperid: 3893, https://arxiv.org/pdf/2510.01445.pdf

Abstract:
Privacy and security in Smart Cities remain at constant risk due to the vulnerabilities introduced by Internet of Things (IoT) devices. The limited computational resources of these devices make them especially susceptible to attacks, while their widespread adoption increases the potential impact of security breaches. This article presents a review of security proposals aimed at protecting IoT devices in Smart City environments. The review was conducted by analyzing recent literature on device-level security, with particular emphasis on lightweight cryptography, physically unclonable functions (PUFs), and blockchain-based solutions. Findings highlight both the strengths and limitations of current approaches, as well as the need for more practical, scalable, and resource-efficient mechanisms to ensure user privacy and data protection in IoT ecosystems.

Paperid: 3894, https://arxiv.org/pdf/2510.01261.pdf

Abstract:
Federated learning is vulnerable to poisoning and backdoor attacks under partial observability. We formulate defence as a partially observable sequential decision problem and introduce a trust-aware Deep Q-Network that integrates multi-signal evidence into client trust updates while optimizing a long-horizon robustness--accuracy objective. On CIFAR-10, we (i) establish a baseline showing steadily improving accuracy, (ii) show through a Dirichlet sweep that increased client overlap consistently improves accuracy and reduces ASR with stable detection, and (iii) demonstrate in a signal-budget study that accuracy remains steady while ASR increases and ROC-AUC declines as observability is reduced, which highlights that sequential belief updates mitigate weaker signals. Finally, a comparison with random, linear-Q, and policy gradient controllers confirms that DQN achieves the best robustness--accuracy trade-off.

Paperid: 3895, https://arxiv.org/pdf/2510.01216.pdf

Abstract:
Users of blockchains value scalability, expecting fast confirmations and immediate transaction processing. Odontoceti, the latest in DAG-based consensus, addresses these concerns by prioritizing low latency and high throughput, making a strategic trade-off in security by operating with a 20% fault tolerance instead of the established 33% level. It is the first DAG-based protocol to achieve commitment in just two communication rounds, delivering median latency of 300 milliseconds while processing 10,000 transactions per second under realistic network conditions. Odontoceti operates with n = 5f + 1 validators and creates an uncertified DAG with a novel decision rule for committing blocks. The protocol includes an optimization that advances progress when participants are slow, benefiting crash fault scenarios which are more common in practice than Byzantine faults. Evaluation results demonstrate 20-25% latency improvements compared to an existing production protocol, validating that reducing wave length from three rounds to two rounds yields meaningful performance benefits. This paper establishes the practical viability of lower fault tolerance consensus protocols for blockchains.

Paperid: 3896, https://arxiv.org/pdf/2510.00976.pdf

Abstract:
Rare-disease diagnosis remains one of the most pressing challenges in digital health, hindered by extreme data scarcity, privacy concerns, and the limited resources of edge devices. This paper proposes the Adaptive Federated Few-Shot Rare-Disease Diagnosis (AFFR) framework, which integrates three pillars: (i) few-shot federated optimization with meta-learning to generalize from limited patient samples, (ii) energy-aware client scheduling to mitigate device dropouts and ensure balanced participation, and (iii) secure aggregation with calibrated differential privacy to safeguard sensitive model updates. Unlike prior work that addresses these aspects in isolation, AFFR unifies them into a modular pipeline deployable on real-world clinical networks. Experimental evaluation on simulated rare-disease detection datasets demonstrates up to 10% improvement in accuracy compared with baseline FL, while reducing client dropouts by over 50% without degrading convergence. Furthermore, privacy-utility trade-offs remain within clinically acceptable bounds. These findings highlight AFFR as a practical pathway for equitable and trustworthy federated diagnosis of rare conditions.

Paperid: 3897, https://arxiv.org/pdf/2510.00076.pdf

Abstract:
We consider online and PAC learning of Littlestone classes subject to the constraint of approximate differential privacy. Our main result is a private learner to online-learn a Littlestone class with a mistake bound of $\tilde{O}(d^{9.5}\cdot \log(T))$ in the realizable case, where $d$ denotes the Littlestone dimension and $T$ the time horizon. This is a doubly-exponential improvement over the state-of-the-art [GL'21] and comes polynomially close to the lower bound for this task. The advancement is made possible by a couple of ingredients. The first is a clean and refined interpretation of the ``irreducibility'' technique from the state-of-the-art private PAC-learner for Littlestone classes [GGKM'21]. Our new perspective also allows us to improve the PAC-learner of [GGKM'21] and give a sample complexity upper bound of $\widetilde{O}(\frac{d^5 \log(1/δβ)}{\varepsilon α})$ where $α$ and $β$ denote the accuracy and confidence of the PAC learner, respectively. This improves over [GGKM'21] by factors of $\frac{d}α$ and attains an optimal dependence on $α$. Our algorithm uses a private sparse selection algorithm to \emph{sample} from a pool of strongly input-dependent candidates. However, unlike most previous uses of sparse selection algorithms, where one only cares about the utility of output, our algorithm requires understanding and manipulating the actual distribution from which an output is drawn. In the proof, we use a sparse version of the Exponential Mechanism from [GKM'21] which behaves nicely under our framework and is amenable to a very easy utility proof.

Paperid: 3898, https://arxiv.org/pdf/2509.25241.pdf

Abstract:
Recent advancements in training paradigms for Large Language Models (LLMs) have unlocked their remarkable capabilities in natural language processing and cross-domain generalization. While LLMs excel in tasks like programming and mathematical problem-solving, their zero-shot performance in specialized domains requiring expert knowledge, such as cybersecurity, is often suboptimal. This limitation arises because foundational LLMs are designed for general-purpose applications, constraining their ability to encapsulate domain-specific expertise within their parameter space. To address this, we explore fine-tuning strategies to embed cybersecurity knowledge into LLMs, enhancing their performance in cybersecurity question-answering (Q\&A) tasks while prioritizing computational efficiency. Specifically, we investigate Supervised Fine-Tuning (SFT), Low-Rank Adaptation (LoRA), and Quantized Low-Rank Adaptation (QLoRA) using a cybersecurity Q\&A dataset. Our results demonstrate that these fine-tuning approaches significantly outperform the foundational model in cybersecurity Q\&A tasks. Moreover, LoRA and QLoRA achieve comparable performance to SFT with substantially lower computational costs, offering an efficient pathway for adapting LLMs to specialized domains. Our work highlights the potential of low-rank fine-tuning strategies to bridge the gap between general-purpose LLMs and domain-specific applications.

Paperid: 3899, https://arxiv.org/pdf/2509.25205.pdf

Abstract:
Self-supervised learning (SSL) has emerged as a powerful paradigm for learning representations on graph data without requiring manual labels. However, leading SSL methods like GRACE are fundamentally incompatible with privacy-preserving technologies such as Homomorphic Encryption (HE) due to their reliance on non-polynomial operations. This paper introduces Poly-GRACE, a novel framework for HE-compatible self-supervised learning on graphs. Our approach consists of a fully polynomial-friendly Graph Convolutional Network (GCN) encoder and a novel, polynomial-based contrastive loss function. Through experiments on three benchmark datasets -- Cora, CiteSeer, and PubMed -- we demonstrate that Poly-GRACE not only enables private pre-training but also achieves performance that is highly competitive with, and in the case of CiteSeer, superior to the standard non-private baseline. Our work represents a significant step towards practical and high-performance privacy-preserving graph representation learning.

Paperid: 3900, https://arxiv.org/pdf/2509.24623.pdf

Abstract:
The emergence of large-scale quantum computers, powered by algorithms like Shor's and Grover's, poses an existential threat to modern public-key cryptography. This vulnerability stems from the ability of these machines to efficiently solve the hard mathematical problems - such as integer factorization and the elliptic curve discrete logarithm problem - that underpin widely used cryptographic primitives. This includes RSA, Diffie-Hellman (DH), Elliptic Curve Diffie-Hellman (ECDH), and Elliptic Curve Digital Signature Algorithm (ECDSA), which are foundational to security across the digital ecosystem. Once Shor's algorithm becomes practically realizable, these primitives will fail, undermining both retrospective confidentiality and cryptographic authenticity - enabling adversaries to decrypt previously captured communications and forge digital signatures. This paper presents a systematic inventory of technologies exposed to quantum threats from the engineering perspective, organized by both technology domain and by implementation environment. While prior research has emphasized theoretical breaks or protocol-level adaptations, this work focuses on the practical landscape - mapping quantum-vulnerable systems across diverse digital infrastructures. The contribution is a cross-domain, cross-environment threat map to guide practitioners, vendors, and policymakers in identifying exposed technologies before the arrival of cryptographically relevant quantum computers.

Paperid: 3901, https://arxiv.org/pdf/2509.22757.pdf

Abstract:
This study presents a structured approach to evaluating vulnerabilities within quantum cryptographic protocols, focusing on the BB84 quantum key distribution method and National Institute of Standards and Technology (NIST) approved quantum-resistant algorithms. By integrating AI-driven red teaming, automated penetration testing, and real-time anomaly detection, the research develops a framework for assessing and mitigating security risks in quantum networks. The findings demonstrate that AI can be effectively used to simulate adversarial attacks, probe weaknesses in cryptographic implementations, and refine security mechanisms through iterative feedback. The use of automated exploit simulations and protocol fuzzing provides a scalable means of identifying latent vulnerabilities, while adversarial machine learning techniques highlight novel attack surfaces within AI-enhanced cryptographic processes. This study offers a comprehensive methodology for strengthening quantum security and provides a foundation for integrating AI-driven cybersecurity practices into the evolving quantum landscape.

Paperid: 3902, https://arxiv.org/pdf/2509.22664.pdf

Abstract:
As Programmable Logic Controller (PLC) became a useful device and rose as an interesting research topic but remained expensive, multiple PLC simulators/emulators were introduced for various purposes. Open-source Programmable Logic Controller (OpenPLC) software, one of the most popular PLC simulators, is designed to be vendor-neutral and run on almost any computer or low-cost embedded devices, e.g., Raspberry Pi, Arduino, and other controllers. The project succeeded in introducing itself as an affordable and practical solution for the high cost of real hardware PLCs. However, it still lacks appropriate securing methods, resulting in several vulnerabilities. Through a combination of threat modeling, vulnerability analysis, and practical experiments, this thesis provides valuable insights for developers, researchers, and engineers aiming to deploy OpenPLC securely in industrial environments. To this end, this work first conducts an in-depth analysis aimed to shed light on va! rious security challenges and vulnerabilities within the OpenPLC project. After that, an advanced control logic injection attack was performed. This attack modifies the user program maliciously, exploiting presented vulnerabilities. Finally, the work introduces a security-enhanced OpenPLC software called OpenPLC Aqua. The new software is equipped with a set of security solutions designed specifically to address the vulnerabilities to which current OpenPLC versions are prone.

Paperid: 3903, https://arxiv.org/pdf/2509.22663.pdf

Abstract:
We define a practical method to quantify the trade-off between security and operational friction in modern identity-centric programs. We introduce the Security Friction Quotient (SFQ), a bounded composite index that combines a residual-risk estimator with empirically grounded friction terms (latency, failure rate, and helpdesk impact). We establish clarity properties (boundedness, monotonic response, and weight identifiability) with short proofs, then evaluate widely used Conditional Access policies over a 12-week horizon using Monte Carlo simulation (n = 2,000 runs per policy/scenario) with effect sizes and 95% confidence intervals. We further assess rank stability under 10,000 random weight draws, finding 95.5% preservation of policy ordering. Finally, we provide a 12-week passkey field observation from an enterprise-scale cohort (N = 1,200) that directionally aligns with the simulation's phishing-resistant MFA gains. The SFQ framework is designed to be reproducible, interpretable, and directly actionable for Zero Trust identity policy decisions, with artifacts and parameter ranges provided to support policy design, review, and continuous improvement.

Paperid: 3904, https://arxiv.org/pdf/2509.22290.pdf

Abstract:
We introduce Verifiable One-Time Programs (Ver-OTPs) and use them to construct single-round Open Secure Computation (OSC), a novel primitive enabling applications like (1) single-round sealed-bid auctions, (2) single-round and honest-majority atomic proposes -- a building block of consensus protocols, and (3) single-round differentially private statistical aggregation without pre-registration. First, we construct Ver-OTPs from single-qubit states and classical cryptographic primitives. Then, assuming a multi-key homomorphic scheme (MHE) with certain properties, we use Ver-OTPs with MHE to construct OSC. The underlying quantum requirement is minimal: only single-qubit states are needed alongside a hardware assumption on the receiver's quantum resources. Our work therefore provides a new framework for quantum-assisted cryptography that may be implementable with near-term quantum technology.

Paperid: 3905, https://arxiv.org/pdf/2509.21601.pdf

Abstract:
Cryptographic Ranging Authentication is here! We present initial results on the Pulsar authenticated ranging service broadcast from space with Pulsar-0 utilizing a recording taken at Xona headquarters in Burlingame, CA. No assumptions pertaining to the ownership or leakage of encryption keys are required. This work discusses the Pulsar watermark design and security analysis. We derive the Pulsar watermark's probabilities of missed detection and false alarm, and we discuss the required receiver processing needed to utilize the Pulsar watermark. We present validation results of the Pulsar watermark utilizing the transmissions from orbit. Lastly, we provide results that demonstrate the spoofing detection efficacy with a spoofing scenario that incorporates the authentic transmissions from orbit. Because we make no assumption about the leakage of symmetric encryption keys, this work provides mathematical justification of the watermark's security, and our July 2025 transmissions from orbit, we claim the world's first authenticated satellite pseudorange from orbit.

Paperid: 3906, https://arxiv.org/pdf/2509.20398.pdf

Abstract:
We present a novel mechanism to construct a covert channel based on page faults. A page fault is an event that occurs when a process or a thread tries to access a page of memory that is not currently mapped to its address space. The kernel typically responds to this event by performing a context switch to allow another process or thread to execute while the page is being fetched from the disk. We exploit this behavior to allow a malicious process to leak secret data to another process, bypassing the isolation mechanisms enforced by the operating system. These attacks do not leverage timers and are hardwareagnostic. Experimental results demonstrate that this attack can achieve a bit error rate of under 4%

Paperid: 3907, https://arxiv.org/pdf/2509.20388.pdf

Abstract:
The rapid integration of AI-powered coding assistants into developer workflows has raised significant privacy and trust concerns. As developers entrust proprietary code to services like OpenAI's GPT, Google's Gemini, and GitHub Copilot, the unclear data handling practices of these tools create security and compliance risks. This paper addresses this challenge by introducing and applying a novel, expert-validated privacy scorecard. The methodology involves a detailed analysis of four document types; from legal policies to external audits; to score five leading assistants against 14 weighted criteria. A legal expert and a data protection officer refined these criteria and their weighting. The results reveal a distinct hierarchy of privacy protections, with a 20-point gap between the highest- and lowest-ranked tools. The analysis uncovers common industry weaknesses, including the pervasive use of opt-out consent for model training and a near-universal failure to filter secrets from user prompts proactively. The resulting scorecard provides actionable guidance for developers and organizations, enabling evidence-based tool selection. This work establishes a new benchmark for transparency and advocates for a shift towards more user-centric privacy standards in the AI industry.

Paperid: 3908, https://arxiv.org/pdf/2509.19304.pdf

Abstract:
In this paper we discuss several surprisingly simple methods for transforming the Raspberry Pi Pico (RP2) microcontroller into a radio transmitter, by using only cheap off the shelf electronic components, and open source software. While initially this transformation may look as a harmless curiosity, in some extreme cases it can also pose security risks, since it can be used to open a large number of local stealth radio communication channels.

Paperid: 3909, https://arxiv.org/pdf/2509.18586.pdf

Abstract:
The analysis of quantum algorithms which query random, invertible permutations has been a long-standing challenge in cryptography. Many techniques which apply to random oracles fail, or are not known to generalize to this setting. As a result, foundational cryptographic constructions involving permutations often lack quantum security proofs. With the aim of closing this gap, we develop and prove soundness of a compressed permutation oracle. Our construction shares many of the attractive features of Zhandry's original compressed function oracle: the purification is a small list of input-output pairs which meaningfully reflect an algorithm's knowledge of the oracle. We then apply this framework to show that the Feistel construction with seven rounds is a strong quantum PRP, resolving an open question of (Zhandry, 2012). We further re-prove essentially all known quantum query lower bounds in the random permutation model, notably the collision and preimage resistance of both Sponge and Davies-Meyer, hardness of double-sided zero search and sparse predicate search, and give new lower bounds for cycle finding and the one-more problem.

Paperid: 3910, https://arxiv.org/pdf/2509.18520.pdf

Abstract:
Large language models (LLMs) can compile weighted graphs on natural language data to enable automatic coherence-driven inference (CDI) relevant to red and blue team operations in cybersecurity. This represents an early application of automatic CDI that holds near- to medium-term promise for decision-making in cybersecurity and eventually also for autonomous blue team operations.

Paperid: 3911, https://arxiv.org/pdf/2509.17508.pdf

Abstract:
Since the early 2010s, social network-based influence technologies have grown almost exponentially. Initiated by the U.S. Army's early OEV system in 2011, a number of companies specializing in this field have emerged. The most (in)famous cases are Bell Pottinger, Cambridge Analytica, Aggregate-IQ and, more recently, Team Jorge. In this paper, we consider the use-case of sock puppet master activities, which consist in creating hundreds or even thousands of avatars, in organizing them into communities and implement influence operations. On-purpose software is used to automate these operations (e.g. Ripon software, AIMS) and organize these avatar populations into communities. The aim is to organize targeted and directed influence communication to rather large communities (influence targets). The goal of the present research work is to show how these community management techniques (social networks) can also be used to communicate/disseminate relatively large volumes (up to a few tens of Mb) of multi-level encrypted information to a limited number of actors. To a certain extent, this can be compared to a Dark Post-type function, with a number of much more powerful potentialities. As a consequence, the concept of communication has been totally redefined and disrupted, so that eavesdropping, interception and jamming operations no longer make sense.

Paperid: 3912, https://arxiv.org/pdf/2509.17266.pdf

Abstract:
Privacy-preserving state estimation for linear time-invariant dynamical systems with crowd sensors is considered. At any time step, the estimator has access to measurements from a randomly selected sensor from a pool of sensors with pre-specified models and noise profiles. A Luenberger-like observer is used to fuse the measurements with the underlying model of the system to recursively generate the state estimates. An additive privacy-preserving noise is used to constrain information leakage. Information leakage is measured via mutual information between the identity of the sensors and the state estimate conditioned on the actual state of the system. This captures an omnipotent adversary that not only can access state estimates but can also gather direct high-quality state measurements. Any prescribed level of information leakage is shown to be achievable by appropriately selecting the variance of the privacy-preserving noise. Therefore, privacy-utility trade-off can be fine-tuned.

Paperid: 3913, https://arxiv.org/pdf/2509.16985.pdf

Abstract:
Software vulnerabilities remain a significant risk factor in achieving security objectives within software development organizations. This is especially true where either proprietary or open-source software (OSS) is included in the technological environment. In this paper an end-to-end process with supporting methods and tools is presented. This industry proven generic process allows for the custom instantiation, configuration, and execution of routinized code scanning for software vulnerabilities and their prioritized remediation. A select set of tools are described for this key DevSecOps function and placed into an iterative process. Examples of both industrial proprietary applications and open-source applications are provided including specific vulnerability instances and a discussion of their treatment. The benefits of each selected tool are considered, and alternative tools are also introduced. Application of this method in a comprehensive SDLC model is also reviewed along with prospective enhancements from automation and the application of advanced technologies including AI. Adoption of this method can be achieved with minimal adjustments and with maximum flexibility for results in reducing source code vulnerabilities, reducing supply chain risk, and improving the security profile of new or legacy solutions.

Paperid: 3914, https://arxiv.org/pdf/2509.16593.pdf

Abstract:
Security risk assessment is essential in establishing the trustworthiness and reliability of modern systems. While various security risk assessment approaches exist, prevalent applications are "pen and paper" implementations that -- even if performed digitally using computers -- remain prone to authoring mistakes and inconsistencies. Computer-aided design approaches can transform security risk assessments into more rigorous and sustainable efforts. This is of value to both industrial practitioners and researchers, who practice security risk assessments to reflect on systems' designs and to contribute to the discipline's state-of-the-art. In this article, we report the application of a model-based security design tool to reproduce a previously reported security assessment. The main contributions are: 1) an independent attempt to reproduce a refereed article describing a real security risk assessment of a system; 2) comparison of a new computer-aided application with a previous non-computer-aided application, based on a published, real-world case study; 3) a showcase for the potential advantages -- for both practitioners and researchers -- of using computer-aided design approaches to analyze reports and to assess systems.

Paperid: 3915, https://arxiv.org/pdf/2509.16274.pdf

Abstract:
Elections are not the only but arguably one of the most important pillars for the proper functioning of liberal democracies. Recent evidence across the globe shows that it is not straightforward to conduct them in a free and fair manner. One constant concern is the role of money in politics, more specifically, election campaign financing. Frequent scandals are proof of the difficulties encountered with current approaches to tackle the issue. Suggestions on how to overcome the problem exist but seem difficult to implement. With the help of blockchain technology we might be able to make a step forward. A separate crypto currency specifically designed to pay for costs of political campaigning and advertising could be introduced. Admittedly, at this stage, there are many open questions. However, under the assumption that blockchain technology is here to stay, it is an idea that deserves further exploration.

Paperid: 3916, https://arxiv.org/pdf/2509.15754.pdf

Abstract:
Bitcoin's consensus rules are encoded in the implementation of its reference client: "The code is the spec." Yet this code is unsuitable for formal verification due to side effects, mutable state, concurrency, and legacy design. A standalone formal specification would enable verification both across versions of the reference client and against new client implementations, strengthening decentralization by reducing the risk of consensus-splitting bugs. Yet such a specification has long been considered intractable given the complexity of Bitcoin's consensus logic. We demonstrate a compact, executable, declarative C++ specification of Bitcoin consensus rules that syncs mainnet to tip in a few hours on a single thread. We also introduce the Hornet Domain-Specific Language (DSL) specifically designed to encode these rules unambiguously for execution, enabling formal reasoning, consensus code generation, and AI-driven adversarial testing. Our spec-driven client Hornet Node offers a modern and modular complement to the reference client. Its clear, idiomatic style makes it suitable for education, while its performance makes it ideal for experimentation. We highlight architectural contributions such as its layered design, efficient data structures, and strong separation of concerns, supported by production-quality code examples. We argue that Hornet Node and Hornet DSL together provide the first credible path toward a pure, formal, executable specification of Bitcoin consensus.

Paperid: 3917, https://arxiv.org/pdf/2509.14987.pdf

Abstract:
This paper introduces a Blockchain-Integrated Explainable AI Framework (BXHF) for healthcare systems to tackle two essential challenges confronting health information networks: safe data exchange and comprehensible AI-driven clinical decision-making. Our architecture incorporates blockchain, ensuring patient records are immutable, auditable, and tamper-proof, alongside Explainable AI (XAI) methodologies that yield transparent and clinically relevant model predictions. By incorporating security assurances and interpretability requirements into a unified optimization pipeline, BXHF ensures both data-level trust (by verified and encrypted record sharing) and decision-level trust (with auditable and clinically aligned explanations). Its hybrid edge-cloud architecture allows for federated computation across different institutions, enabling collaborative analytics while protecting patient privacy. We demonstrate the framework's applicability through use cases such as cross-border clinical research networks, uncommon illness detection and high-risk intervention decision support. By ensuring transparency, auditability, and regulatory compliance, BXHF improves the credibility, uptake, and effectiveness of AI in healthcare, laying the groundwork for safer and more reliable clinical decision-making.

Paperid: 3918, https://arxiv.org/pdf/2509.14096.pdf

Abstract:
The rapid advancement of humanoid robotics presents unprecedented cybersecurity challenges that existing theoretical frameworks fail to adequately address. This report presents a comprehensive security assessment of a production humanoid robot platform, bridging the gap between abstract security models and operational vulnerabilities. Through systematic static analysis, runtime observation, and cryptographic examination, we uncovered a complex security landscape characterized by both sophisticated defensive mechanisms and critical vulnerabilities. Our findings reveal a dual-layer proprietary encryption system (designated FMX') that, while innovative in design, suffers from fundamental implementation flaws including the use of static cryptographic keys that enable offline configuration decryption. More significantly, we documented persistent telemetry connections transmitting detailed robot state information--including audio, visual, spatial, and actuator data--to external servers without explicit user consent or notification mechanisms. We operationalized a Cybersecurity AI agent on the Unitree G1 to map and prepare exploitation of its manufacturer's cloud infrastructure, illustrating how a compromised humanoid can escalate from covert data collection to active counter-offensive operations. We argue that securing humanoid robots requires a paradigm shift toward Cybersecurity AI (CAI) frameworks that can adapt to the unique challenges of physical-cyber convergence. This work contributes empirical evidence for developing robust security standards as humanoid robots transition from research curiosities to operational systems in critical domains.

Paperid: 3919, https://arxiv.org/pdf/2509.13788.pdf

Abstract:
Homomorphic encryption is a powerful cryptographic tool that enables secure computations on the private data. It evaluates any function for any operation securely on the encrypted data without knowing its corresponding plaintext. For original data $p$, $c$ denotes the ciphertext of the original plaintext $p$, i.e. $c = Encrypt_k(p)$. This is crucial for any sensitive application running in the Cloud, because we must protect data privacy even in the case when the server has falled victim to a cyber attack. The encryption scheme $Encrypt_k$ is said to be homomorphic with respect to some set of operations $\mathcal{O}$, if for any operation $\circ \in \mathcal{O}$ one can compute $Encrypt_k(p_1 \circ p_2)$ from $Encrypt_k(p_1) \circ Encrypt_k(p_2)$. Those schemes come in three forms: somewhat, partially and fully homomorphic. In this survey, we present the state of art of the known homomorphic encryption schemes based on coding theory and polynomials.

Paperid: 3920, https://arxiv.org/pdf/2509.13597.pdf

Abstract:
Autonomous LLM agents can issue thousands of API calls per hour without human oversight. OAuth 2.0 assumes deterministic clients, but in agentic settings stochastic reasoning, prompt injection, or multi-agent orchestration can silently expand privileges. We introduce Agentic JWT (A-JWT), a dual-faceted intent token that binds each agent's action to verifiable user intent and, optionally, to a specific workflow step. A-JWT carries an agent's identity as a one-way checksum hash derived from its prompt, tools and configuration, and a chained delegation assertion to prove which downstream agent may execute a given task, and per-agent proof-of-possession keys to prevent replay and in-process impersonation. We define a new authorization mechanism and add a lightweight client shim library that self-verifies code at run time, mints intent tokens, tracks workflow steps and derives keys, thus enabling secure agent identity and separation even within a single process. We illustrate a comprehensive threat model for agentic applications, implement a Python proof-of-concept and show functional blocking of scope-violating requests, replay, impersonation, and prompt-injection pathways with sub-millisecond overhead on commodity hardware. The design aligns with ongoing OAuth agent discussions and offers a drop-in path toward zero-trust guarantees for agentic applications. A comprehensive performance and security evaluation with experimental results will appear in our forthcoming journal publication

Paperid: 3921, https://arxiv.org/pdf/2509.10727.pdf

Abstract:
The multi level Bell La Padula model for secure data access and data flow control, formulated in the 1970s, was based on the theory of partial orders. Since then, another model, based on lattice theory, has prevailed. We present reasons why the partial order model is more appropriate. We also show, by example, how non lattice data flow networks can be easily implemented by using Attribute-based access control (ABAC).

Paperid: 3922, https://arxiv.org/pdf/2509.10573.pdf

Abstract:
This position paper presents AVEC (Adaptive Verifiable Edge Control), a framework for bootstrapping privacy for local language models by enforcing privacy at the edge with explicit verifiability for delegated queries. AVEC introduces an adaptive budgeting algorithm that allocates per-query differential privacy parameters based on sensitivity, local confidence, and historical usage, and uses verifiable transformation with on-device integrity checks. We formalize guarantees using RÃ©nyi differential privacy with odometer-based accounting, and establish utility ceilings, delegation-leakage bounds, and impossibility results for deterministic gating and hash-only certification. Our evaluation is simulation-based by design to study mechanism behavior and accounting; we do not claim deployment readiness or task-level utility with live LLMs. The contribution is a conceptual architecture and theoretical foundation that chart a pathway for empirical follow-up on privately bootstrapping local LLMs.

Paperid: 3924, https://arxiv.org/pdf/2509.10550.pdf

Abstract:
We address when a best-first router for tool-use agents can stop exploring without missing a better leaf, while preserving local differential privacy (LDP) and leaving an audit trail. We introduce a run-wise certificate that couples each node's key to the same exponential race that realizes leaf perturbations; the usual halting rule (stop when the maximum over $v$ in $F$ of Key$(v) \le B^*$) then certifies the realized run. We give two certified modes on context-indexed prefix DAGs with child partition: (i) Exact (known counts), using lazy offset propagation with winner reuse; and (ii) Surrogate (upper bounds only), which anchors keys to a parent-level surrogate race and allows validator tightening via $Îº= \log(N / N_{ub}$). A small compiler enforces the partition property, and an admissible, race-independent M(tau) keeps keys sound. The ledger logs uniforms, counts, and tie handling; privacy follows by post-processing. Experiments on synthetic graphs and a small real pipeline show tight stopping, deterministic replay, and low overhead.

Paperid: 3925, https://arxiv.org/pdf/2509.10511.pdf

Abstract:
Reinforcement learning (RL) has transformed sequential decision-making, but traditional algorithms like Deep Q-Networks (DQNs) and Proximal Policy Optimization (PPO) often struggle with efficient exploration, stability, and adaptability in dynamic environments. This study presents LogGuardQ (Adaptive Log Guard with Cognitive enhancement), a novel framework that integrates a dual-memory system inspired by human cognition and adaptive exploration strategies driven by temperature decay and curiosity. Evaluated on a dataset of 1,000,000 simulated access logs with 47.9% anomalies over 20,000 episodes, LogGuardQ achieves a 96.0% detection rate (versus 93.0% for DQN and 47.1% for PPO), with precision of 0.4776, recall of 0.9996, and an F1-score of 0.6450. The mean reward is 20.34 \pm 44.63 across all episodes (versus 18.80 \pm 43.98 for DQN and -0.17 \pm 23.79 for PPO), with an average of 5.0 steps per episode (constant across models). Graphical analyses, including learning curves smoothed with a Savgol filter (window=501, polynomial=2), variance trends, action distributions, and cumulative detections, demonstrate LogGuardQ's superior stability and efficiency. Statistical tests (Mann-Whitney U) confirm significant performance advantages (e.g., p = 0.0002 vs. DQN with negligible effect size, p < 0.0001 vs. PPO with medium effect size, and p < 0.0001 for DQN vs. PPO with small effect size). By bridging cognitive science and RL, LogGuardQ offers a scalable approach to adaptive learning in uncertain environments, with potential applications in cybersecurity, intrusion detection, and decision-making under uncertainty.

Paperid: 3926, https://arxiv.org/pdf/2509.10492.pdf

Abstract:
In this work, we investigate the security of Elliptic Curve Cryptosystem (ECC) implementations against Side-Channel Analysis (SCA). ECC is well known for its efficiency and strong security, yet vulnerable to SCA which exploits physical information leaked during scalar multiplication (kP). Countermeasures such as regularity and atomicity exist; this thesis focuses on atomicity. In this work, we study the Giraud and Verneuil atomic pattern for kP, implementing it using the right-to-left kP algorithm on the NIST EC P-256 curve. We use the FLECC library with constant-time operations and execute on the Texas Instruments LAUNCHXLF28379D MCU. We measure Electromagnetic (EM) emissions during kP using a Lecroy WavePro 604HD Oscilloscope, a Langer ICS 105 Integrated Circuit Scanner, and a Langer MFA-R 0.2-75 Near Field Probe. We investigate whether the Giraud and Verneuil atomic blocks are distinguishable in EM traces. Our findings show that, when additional clock cycle processes are present, the atomic blocks can be visually distinguished; after removing these processes, they become more synchronised and harder to distinguish, reducing the risk of a successful SCA attack. These results show that, although the atomic pattern is correctly implemented with dummy operations, resistance to SCA can still be affected by additional processes inserted at hardware or software level.This means atomicity alone may not fully protect ECC from SCA. More research is needed to investigate the causes of the additional clock cycle processes and how intermediate operations are addressed in memory registers. This will help to understand the processes that lead to the insertion of these additional clock cycles. This thesis is the first to experimentally implement and investigate Giraud and Verneuil's atomic pattern on hardware, and it offers useful results to improve countermeasures against SCA.

Paperid: 3927, https://arxiv.org/pdf/2509.10488.pdf

Abstract:
This research focuses on transforming CVEs to hands-on educational lab for cybersecurity training. The study shows the practical application of CVEs by developing containerized lab environments- Docker to simulate real-world vulnerabilities like SQL Injection, arbitrary code execution, and improper SSL certificate validation. These labs has structured tutorials, pre- and post-surveys to evaluate learning outcomes, and remediation steps.Key challenges included interpreting limited CVE data, resolving technical complexities in lab design, and ensuring accessibility for diverse learners. Despite these difficulties, the findings highlight the use of educational benefits of vulnerability analysis, bridging theoretical concepts with hands-on experience. The results indicate that students improved comprehension of cybersecurity principles, threat mitigation techniques, and secure coding practices. This innovative approach provides a scalable and reproducible model for integrating CVEs into cybersecurity education, fostering a deeper understanding of real-world security challenges in a controlled and safe environment.

Paperid: 3928, https://arxiv.org/pdf/2509.10482.pdf

Abstract:
The increasing sophistication of technology systems makes traditional threat modeling hard to scale, especially for small organizations with limited resources. This paper develops and evaluates AegisShield, a generative AI enhanced threat modeling tool that implements STRIDE and MITRE ATT&CK to automate threat generation and provide systematic assessments. By integrating real time threat intelligence from the National Vulnerability Database and AlienVault Open Threat Exchange, AegisShield produces streamlined and accessible threat descriptions. Our assessment of 243 threats from 15 case studies and over 8000 AI generated threats shows that AegisShield reduces complexity (p less than 0.001), yields outputs semantically aligned with expert developed threats (p less than 0.05), and achieves an 85.4 percent success rate in mapping threats to MITRE ATT&CK techniques (p less than 0.001). Automating and standardizing threat modeling helps under resourced organizations address risk earlier and supports wider adoption of secure by design practices.

Paperid: 3929, https://arxiv.org/pdf/2509.10174.pdf

Abstract:
We present the Random Permutation Sorting System (RPSS), a novel framework for true uniform randomness generation grounded in statistical quantum mechanics. RPSS is built on a pair of conjugate observables, the permutation count and the elapsed sorting time, whose heavy-tailed raw distributions synchronously converge to uniformity through modular reduction. This mathematically proven convergence establishes RPSS as a True Uniform Random Number Generator (TURNG). A practical implementation, QPP-RNG, demonstrates how intrinsic system jitter, arising from microarchitectural noise, memory latency, and scheduling dynamics, interacts with combinatorial complexity to yield a compact, self-stabilizing entropy source. Empirical validation under the NIST SP 800-90B framework confirms rapid entropy convergence and statistically uniform outputs. RPSS thus defines a new class of quantum-inspired entropy engines, where randomness is simultaneously harvested from unpredictable system jitter and amplified by combinatorial processes, offering a robust, platform-independent alternative to conventional entropy sources.

Paperid: 3930, https://arxiv.org/pdf/2509.08248.pdf

Abstract:
To address the risks of increasingly capable AI systems, we introduce a hardware-level off-switch that embeds thousands of independent "security blocks" in each AI accelerator. This massively redundant architecture is designed to prevent unauthorized chip use, even against sophisticated physical attacks. Our main security block design uses public key cryptography to check the authenticity of authorization licenses, and randomly generated nonces to prevent replay attacks. We evaluate attack vectors and present additional security block variants that could be added for greater robustness. Security blocks can be built with standard circuit components, ensuring compatibility with existing semiconductor manufacturing processes. With embedded security blocks, the next generation of AI accelerators could be more robustly defended against dangerous misuse.

Paperid: 3932, https://arxiv.org/pdf/2509.06902.pdf

Abstract:
Large Language Models (LLMs) as stochastic systems may generate numbers that deviate from available data, a failure known as \emph{numeric hallucination}. Existing safeguards -- retrieval-augmented generation, citations, and uncertainty estimation -- improve transparency but cannot guarantee fidelity: fabricated or misquoted values may still be displayed as if correct. We propose \textbf{Proof-Carrying Numbers (PCN)}, a presentation-layer protocol that enforces numeric fidelity through mechanical verification. Under PCN, numeric spans are emitted as \emph{claim-bound tokens} tied to structured claims, and a verifier checks each token under a declared policy (e.g., exact equality, rounding, aliases, or tolerance with qualifiers). Crucially, PCN places verification in the \emph{renderer}, not the model: only claim-checked numbers are marked as verified, and all others default to unverified. This separation prevents spoofing and guarantees fail-closed behavior. We formalize PCN and prove soundness, completeness under honest tokens, fail-closed behavior, and monotonicity under policy refinement. PCN is lightweight and model-agnostic, integrates seamlessly into existing applications, and can be extended with cryptographic commitments. By enforcing verification as a mandatory step before display, PCN establishes a simple contract for numerically sensitive settings: \emph{trust is earned only by proof}, while the absence of a mark communicates uncertainty.

Paperid: 3933, https://arxiv.org/pdf/2509.06754.pdf

Abstract:
In the digital age, image encryption technology acts as a safeguard, preventing unauthorized access to images. This paper proposes an innovative image encryption scheme that integrates a novel 2D hyper-chaotic map with a newly developed self-adaptive diffusion method. The 2D hyper-chaotic map, namely the 2D-RA map, is designed by hybridizing the Rastrigin and Ackley functions. The chaotic performance of the 2D-RA map is validated through a series of measurements, including the Bifurcation Diagram, Lyapunov Exponent (LE), Initial Value Sensitivity, 0 - 1 Test, Correlation Dimension (CD), and Kolmogorov Entropy (KE). The results demonstrate that the chaotic performance of the 2D-RA map surpasses that of existing advanced chaotic functions. Additionally, the self-adaptive diffusion method is employed to enhance the uniformity of grayscale distribution. The performance of the image encryption scheme is evaluated using a series of indicators. The results show that the proposed image encryption scheme significantly outperforms current state-of-the-art image encryption techniques.

Paperid: 3934, https://arxiv.org/pdf/2509.06571.pdf

Abstract:
Data exfiltration is a growing problem for business who face costs related to the loss of confidential data as well as potential extortion. This work presents a simple game theoretic model of network data exfiltration. In the model, the attacker chooses the exfiltration route and speed, and the defender selects monitoring thresholds to detect unusual activity. The attacker is rewarded for exfiltrating data, and the defender tries to minimize the costs of data loss and of responding to alerts.

Paperid: 3935, https://arxiv.org/pdf/2509.06318.pdf

Abstract:
Residual cross-talk in superconducting qubit devices creates a security vulnerability for emerging quantum cloud services. We demonstrate a Clifford-only Quantum Rowhammer attack-using just X and CNOT gates-that injects faults on IBM's 127-qubit Eagle processors without requiring pulse-level access. Experiments show that targeted hammering induces localized errors confined to the attack cycle and primarily manifests as phase noise, as confirmed by near 50% flip rates under Hadamard-basis probing. A full lattice sweep maps QR's spatial and temporal behavior, revealing reproducible corruption limited to qubits within two coupling hops and rapid recovery in subsequent benign cycles. Finally, we leverage these properties to outline a prime-and-probe covert channel, demonstrating that the clear separability between hammered and benign rounds enables highly reliable signaling without error correction. These findings underscore the need for hardware-level isolation and scheduler-aware defenses as multi-tenant quantum computing becomes standard.

Paperid: 3936, https://arxiv.org/pdf/2509.06133.pdf

Abstract:
Modern vehicles accumulate fragmented lifecycle records across OEMs, owners, and service centers that are difficult to verify and prone to fraud. We propose VehiclePassport, a GAIA-X-aligned digital passport anchored on blockchain with zero-knowledge proofs (ZKPs) for privacy-preserving verification. VehiclePassport immutably commits to manufacturing, telemetry, and service events while enabling selective disclosure via short-lived JWTs and Groth16 proofs. Our open-source reference stack anchors hashes on Polygon zkEVM at <$0.02 per event, validates proofs in <10 ms, and scales to millions of vehicles. This architecture eliminates paper-based KYC, ensures GDPR-compliant traceability, and establishes a trustless foundation for insurance, resale, and regulatory applications in global mobility data markets.

Paperid: 3937, https://arxiv.org/pdf/2509.05891.pdf

Abstract:
Database audit and transaction logs are fundamental to forensic investigations, but they are vulnerable to tampering by privileged attackers. Malicious insiders or external threats with administrative access can alter, purge, or temporarily disable logging mechanisms, creating significant blind spots and rendering disk-based records unreliable. Memory analysis offers a vital alternative, providing investigators direct access to volatile artifacts that represent a ground-truth source of recent user activity, even when log files have been compromised. This paper introduces MemTraceDB, a tool that reconstructs user activity timelines by analyzing raw memory snapshots from the MySQL database process. MemTraceDB utilizes a novel algorithm, ActiviTimeTrace, to systematically extract and correlate forensic artifacts such as user connections and executed queries. Through a series of experiments, I demonstrate MemTraceDB's effectiveness and reveal a critical empirical finding: the MySQL query stack has a finite operational capacity of approximately 9,997 queries. This discovery allows me to establish a practical, data-driven formula for determining the optimal frequency for memory snapshot collection, providing a clear, actionable guideline for investigators. The result is a forensically-sound reconstruction of user activity, independent of compromised disk-based logs.

Paperid: 3938, https://arxiv.org/pdf/2509.05831.pdf

Abstract:
Large Language Models (LLMs) are increasingly integrated into web-based systems for content summarization, yet their susceptibility to prompt injection attacks remains a pressing concern. In this study, we explore how non-visible HTML elements such as , aria-label, and alt attributes can be exploited to embed adversarial instructions without altering the visible content of a webpage. We introduce a novel dataset comprising 280 static web pages, evenly divided between clean and adversarial injected versions, crafted using diverse HTML-based strategies. These pages are processed through a browser automation pipeline to extract both raw HTML and rendered text, closely mimicking real-world LLM deployment scenarios. We evaluate two state-of-the-art open-source models, Llama 4 Scout (Meta) and Gemma 9B IT (Google), on their ability to summarize this content. Using both lexical (ROUGE-L) and semantic (SBERT cosine similarity) metrics, along with manual annotations, we assess the impact of these covert injections. Our findings reveal that over 29% of injected samples led to noticeable changes in the Llama 4 Scout summaries, while Gemma 9B IT showed a lower, yet non-trivial, success rate of 15%. These results highlight a critical and largely overlooked vulnerability in LLM driven web pipelines, where hidden adversarial content can subtly manipulate model outputs. Our work offers a reproducible framework and benchmark for evaluating HTML-based prompt injection and underscores the urgent need for robust mitigation strategies in LLM applications involving web content.

Paperid: 3939, https://arxiv.org/pdf/2509.05708.pdf

Abstract:
Traditional security models for Nakamoto-style blockchains overestimate adversarial coordination by assuming instantaneous synchronization among malicious nodes, neglecting the critical impact of internal communication delays on security. This paper introduces a dual-delay framework to revisit security analysis, addressing this oversight through two key innovations. First, the static delay model quantifies how adversarial communication delays ($Î_a$) constrain the effective growth rate of private chains, derived via an M/D/1 queuing model as $Î»_{eff} = Î»_a / (1 + Î»_a Î_a)$. This model reveals that the security threshold ($Î²^*$), the maximum adversarial power the system tolerates, increases with $Î_a$, even exceeding the classic 51\% boundary when $Î_a \textgreater Î$ (honest nodes' delay), breaking the long-standing 50\% assumption. Second, the dynamic delay model integrates probabilistic corruption and scale-dependent delays to characterize the total adversarial delay window ($Î_{total} = Î(n) e^{-kÎ²} + c \log(1 + Î²n)$), where $Î(n) \in Î(\log n)$ captures honest nodes' logarithmic delay growth. Asymptotic analysis shows adversarial power decays linearly with network scale, ensuring the probability of $Î²\leq Î²^*$ approaches 1 as $n \to \infty$. By exposing the interplay between network scale, communication delays, and power dilution, we provide a theoretical foundation for optimizing consensus protocols and assessing robustness in large-scale Nakamoto-style blockchains.

Paperid: 3940, https://arxiv.org/pdf/2509.05380.pdf

Abstract:
This systematic literature review analyzes the current state of compliance with Regulation (EU) 2024/1689 in autonomous robotic systems, focusing on cybersecurity frameworks and methodologies. Using the PRISMA protocol, 22 studies were selected from 243 initial records across IEEE Xplore, ACM DL, Scopus, and Web of Science. Findings reveal partial regulatory alignment: while progress has been made in risk management and encrypted communications, significant gaps persist in explainability modules, real-time human oversight, and knowledge base traceability. Only 40% of reviewed solutions explicitly address transparency requirements, and 30% implement failure intervention mechanisms. The study concludes that modular approaches integrating risk, supervision, and continuous auditing are essential to meet the AI Act mandates in autonomous robotics.

Paperid: 3941, https://arxiv.org/pdf/2509.05326.pdf

Abstract:
Zero-knowledge proofs allow verification of computations without revealing private information. However, existing systems require memory proportional to the computation size, which has historically limited use in large-scale applications and on mobile and edge devices. We solve this fundamental bottleneck by developing, to our knowledge, the first proof system with sublinear memory requirements for mainstream cryptographic constructions. Our approach processes computations in blocks using a space-efficient tree algorithm, reducing memory from linear scaling to square-root scaling--from $Î(T)$ to $O(\sqrt{T} + \log T \log\log T)$ for computation size $T$--while maintaining the same proof generation time through a constant number of streaming passes. For widely-used linear polynomial commitment schemes (KZG/IPA), our method produces identical proofs and verification when using the same parameters and hashing only aggregate commitments into the challenge generation, preserving proof size and security. Hash-based systems also achieve square-root memory scaling though with slightly different proof structures. This advance enables zero-knowledge proofs on everyday devices and makes previously infeasible large computations verifiable, fundamentally democratizing access to privacy-preserving computation. Space-efficient zero knowledge proof systems create opportunities to reshape how trust is established in digital systems--from enabling widespread participation in decentralized networks to making verifiable scientific computing practical at unprecedented scales.

Paperid: 3942, https://arxiv.org/pdf/2509.03900.pdf

Abstract:
Open-source software OSS is widely adopted in enterprise settings, but standalone tools often lack native support for protocols like SAML or OIDC, creating a critical security integration gap. This paper introduces and formalizes the Auth Shim, a lightweight architectural pattern designed to solve this problem. The Auth Shim is a minimal, external proxy service that acts as a compatibility layer, translating requests from an enterprise Identity Provider IdP into the native session management mechanism of a target application. A key prerequisite for this pattern is that the target application must expose a programmatic, secure administrative API. We present a case study of the pattern's implementation at Adobe to integrate a popular OSS BI tool with Okta SAML, which enabled automated Role-Based Access Control RBAC via IAM group mapping and eliminated manual user provisioning. By defining its components, interactions, and production deployment considerations, this paper provides a reusable, secure, and cost-effective blueprint for integrating any standalone OSS tool into an enterprise SSO ecosystem, thereby enabling organizations to embrace open-source innovation without compromising on security governance.

Paperid: 3943, https://arxiv.org/pdf/2509.03744.pdf

Abstract:
The rapid expansion of the Internet of Things (IoT) and Wireless Sensor Networks (WSNs) has significantly increased the attack surface of such systems, making them vulnerable to a wide range of cyber threats. Traditional Intrusion Detection Systems (IDS) often fail to meet the stringent requirements of resource-constrained IoT environments due to their high computational cost and reliance on large labeled datasets. To address these challenges, this paper proposes a novel hybrid Intrusion Detection System that integrates a Quantum Genetic Algorithm (QGA) with Self-Supervised Learning (SSL). The QGA leverages quantum-inspired evolutionary operators to optimize feature selection and fine-tune model parameters, ensuring lightweight yet efficient detection in resource-limited networks. Meanwhile, SSL enables the system to learn robust representations from unlabeled data, thereby reducing dependency on manually labeled training sets. The proposed framework is evaluated on benchmark IoT intrusion datasets, demonstrating superior performance in terms of detection accuracy, false positive rate, and computational efficiency compared to conventional evolutionary and deep learning-based IDS models. The results highlight the potential of combining quantum-inspired optimization with self-supervised paradigms to design next-generation intrusion detection solutions for IoT and WSN environments.

Paperid: 3944, https://arxiv.org/pdf/2509.03545.pdf

Abstract:
The proliferation of mobile money in Uganda has been a cornerstone of financial inclusion, yet its security mechanisms remain a critical concern. This study investigates a significant public response to perceived security failures: the #StopAirtelThefty Twitter campaign of August 2025 Sparked by an incident publicized by Dr. Jim Spire Ssentongo where a phone thief accessed a victim's account, withdrew funds, and procured a loan, the campaign revealed deep seated public anxiety over the safety of mobile money. This research employs qualitative analysis to systematically examine the complaints raised during this campaign, extracting key themes related to security vulnerabilities and user dissatisfaction. By synthesizing these public sentiments, the paper provides crucial insights into the specific security gaps experienced by users and situates these findings within the larger framework of Uganda's mobile money regulatory and operational environment. The study concludes with implications for providers, policymakers, and the future of secure digital finance in Uganda.

Paperid: 3945, https://arxiv.org/pdf/2509.02578.pdf

Abstract:
In recent years, numerous incidents involving the leakage of website accounts and text passwords (referred to as passwords) have raised significant concerns regarding the potential exposure of personal information. These events underscore the critical importance of both information security and password protection. While many of these breaches are attributable to vulnerabilities within website infrastructure, the strength and security of the passwords themselves also play a crucial role. Consequently, the creation of secure passwords constitutes a fundamental aspect of enhancing overall system security and protecting personal data. In response to these challenges, this study presents a secure password generation approach utilizing a cryptographically secure Pseudo-Random Number Generator (PRNG). The generator is implemented using a range of Message Authentication Code (MAC) algorithms, including the Keyed-Hash Message Authentication Code (HMAC), Cipher-based Message Authentication Code (CMAC), and KECCAK Message Authentication Code (KMAC), to produce robust random values suitable for password generation. To evaluate the proposed method, empirical assessments were conducted in accordance with the guidelines provided in the National Institute of Standards and Technology (NIST) Special Publication (SP) 800-90B. The evaluation focused on two primary aspects: entropy estimation and verification of independent and identically distributed (IID) properties. Experimental results indicate that the proposed method satisfies both entropy and IID requirements, thereby demonstrating its ability to generate passwords with a high degree of randomness and security.

Paperid: 3946, https://arxiv.org/pdf/2509.02083.pdf

Abstract:
This paper considers five extensions for Chromium-based browsers in order to determine how effective can browser-based defenses against cryptojacking available to regular users be. We've examined most popular extensions - MinerBlock, AdGuard AdBlocker, Easy Redirect && Prevent Cryptojacking, CoinEater and Miners Shield, which claim to be designed specifically to identify and stop illegal cryptocurrency mining. An empirically confirmed dataset of 373 distinct cryptojacking-infected websites which was assembled during multi-stage procedure, was used to test those extensions. The results showed that all plugins in question had significant performance limits. Easy Redirect and Miners Shield only blocked 6 and 5 websites respectively, while MinerBlock had the greatest detection rate at only 27% (101/373 sites blocked). Most concerningly, despite promises of cryptojacking prevention, AdGuard (which has over 13 million users) and CoinEater were unable to identify any of the compromised websites. These results demonstrate serious flaws in cryptojacking detection products targeted for regular users, since even the best-performing specimen failed to detect 73% of attacks. The obvious difference between advertised capabilities and real performance highlights the urgent need for either accessibility improvements for laboratory-grade detection technologies that show 90%+ efficiency in controlled environment or fundamental upgrades to current commonly used extensions.

Paperid: 3947, https://arxiv.org/pdf/2509.01463.pdf

Abstract:
Cybersecurity honeypots are deception tools for engaging attackers and gather intelligence, but traditional low or medium-interaction honeypots often rely on static, pre-scripted interactions that can be easily identified by skilled adversaries. This Report presents LLMHoney, an SSH honeypot that leverages Large Language Models (LLMs) to generate realistic, dynamic command outputs in real time. LLMHoney integrates a dictionary-based virtual file system to handle common commands with low latency while using LLMs for novel inputs, achieving a balance between authenticity and performance. We implemented LLMHoney using open-source LLMs and evaluated it on a testbed with 138 representative Linux commands. We report comprehensive metrics including accuracy (exact-match, Cosine Similarity, Jaro-Winkler Similarity, Levenshtein Similarity and BLEU score), response latency and memory overhead. We evaluate LLMHoney using multiple LLM backends ranging from 0.36B to 3.8B parameters, including both open-source models and a proprietary model(Gemini). Our experiments compare 13 different LLM variants; results show that Gemini-2.0 and moderately-sized models Qwen2.5:1.5B and Phi3:3.8B provide the most reliable and accurate responses, with mean latencies around 3 seconds, whereas smaller models often produce incorrect or out-of-character outputs. We also discuss how LLM integration improves honeypot realism and adaptability compared to traditional honeypots, as well as challenges such as occasional hallucinated outputs and increased resource usage. Our findings demonstrate that LLM-driven honeypots are a promising approach to enhance attacker engagement and collect richer threat intelligence.

Paperid: 3948, https://arxiv.org/pdf/2509.00476.pdf

Abstract:
The escalating sophistication of malware necessitates robust detection mechanisms that generalize across diverse data sources. Traditional single-dataset models struggle with cross-domain generalization and often incur high computational costs. This paper presents a novel, lightweight framework for malware detection that employs probability-level fusion across three distinct datasets: EMBER (static features), API Call Sequences (behavioral features), and CIC Obfuscated Memory (memory patterns). Our method trains individual LightGBM classifiers on each dataset, selects top predictive features to ensure efficiency, and fuses their prediction probabilities using optimized weights determined via grid search. Extensive experiments demonstrate that our fusion approach achieves a macro F1-score of 0.823 on a cross-domain validation set, significantly outperforming individual models and providing superior generalization. The framework maintains low computational overhead, making it suitable for real-time deployment, and all code and data are provided for full reproducibility.

Paperid: 3949, https://arxiv.org/pdf/2509.00266.pdf

Abstract:
Academic and research Cyber Infrastructures (CI) present unique security challenges due to their collaborative nature, heterogeneous components, and the lack of practical, tailored security assessment frameworks. Existing standards can be too generic or complex for CI administrators to apply effectively. This report introduces a systematic, mission-centric approach to estimate and analyze the security posture of a CI. The framework guides administrators through a top-down process: (1) defining unacceptable losses and security missions, (2) identifying associated system hazards and critical assets, and (3) modeling the CI's components and their relationships as a security knowledge graph. The core of this methodology is the construction of directed attack graphs, which systematically map all potential paths an adversary could take from an entry point to a critical asset. By visualizing these attack paths alongside defense mechanisms, the framework provides a clear, comprehensive overview of the system's vulnerabilities and security gaps. This structured approach enables CI operators to proactively assess risks, prioritize mitigation strategies, and make informed, actionable decisions to strengthen the overall security posture of the CI.

Paperid: 3950, https://arxiv.org/pdf/2509.00124.pdf

Abstract:
This paper introduces a novel attack vector that leverages website cloaking techniques to compromise autonomous web-browsing agents powered by Large Language Models (LLMs). As these agents become more prevalent, their unique and often homogenous digital fingerprints - comprising browser attributes, automation framework signatures, and network characteristics - create a new, distinguishable class of web traffic. The attack exploits this fingerprintability. A malicious website can identify an incoming request as originating from an AI agent and dynamically serve a different, "cloaked" version of its content. While human users see a benign webpage, the agent is presented with a visually identical page embedded with hidden, malicious instructions, such as indirect prompt injections. This mechanism allows adversaries to hijack agent behavior, leading to data exfiltration, malware execution, or misinformation propagation, all while remaining completely invisible to human users and conventional security crawlers. This work formalizes the threat model, details the mechanics of agent fingerprinting and cloaking, and discusses the profound security implications for the future of agentic AI, highlighting the urgent need for robust defenses against this stealthy and scalable attack.

Paperid: 3951, https://arxiv.org/pdf/2509.00085.pdf

Abstract:
The growing societal reliance on artificial intelligence necessitates robust frameworks for ensuring its security, accountability, and trustworthiness. This thesis addresses the complex interplay between privacy, verifiability, and auditability in modern AI, particularly in foundation models. It argues that technical solutions that integrate these elements are critical for responsible AI innovation. Drawing from international policy contributions and technical research to identify key risks in the AI pipeline, this work introduces novel technical solutions for critical privacy and verifiability challenges. Specifically, the research introduces techniques for enabling verifiable and auditable claims about AI systems using zero-knowledge cryptography; utilizing secure multi-party computation and trusted execution environments for auditable, confidential deployment of large language models and information retrieval; and implementing enhanced delegation mechanisms, credentialing systems, and access controls to secure interactions with autonomous and multi-agent AI systems. Synthesizing these technical advancements, this dissertation presents a cohesive perspective on balancing privacy, verifiability, and auditability in foundation model-based AI systems, offering practical blueprints for system designers and informing policy discussions on AI safety and governance.

Paperid: 3952, https://arxiv.org/pdf/2509.00006.pdf

Abstract:
This study attempts to explain the impact of information exchange from one country to another, as well as the legal and technological implications for these exchanges. Due to the emergence of cloud technology, possibilities for free exchange of information between countries have increased rapidly, as it has become possible to save information in a country and access it in almost any part of the world. Countries all around the world have been confronted with developing frameworks to facilitate this process, although there are significant challenges which must be confronted on legal and technological fronts, as loopholes in the framework adopted by countries may hinder free access to information stored on cloud, and also compromise data privacy. Cloud technology is impacting a lot of issues, including domestic and international businesses, hence the need for a study to propose measures for safe exchange of information using cloud technology.

Paperid: 3953, https://arxiv.org/pdf/2509.00005.pdf

Abstract:
Business email compromise and lateral spear phishing attacks are among modern organizations' most costly and damaging threats. While inbound phishing defenses have improved significantly, most organizations still trust internal emails by default, leaving themselves vulnerable to attacks from compromised employee accounts. In this work, we define and explore the problem of authorship validation: verifying whether a claimed sender actually authored a given email. Authorship validation is a lightweight, real-time defense that complements traditional detection methods by modeling per-sender writing style. Further, the paper presents a collection of new datasets based on the Enron corpus. These simulate inauthentic messages using both human-written and large language model-generated emails. The paper also evaluates two classifiers -- a Naive Bayes model and a character-level convolutional neural network (Char-CNN) -- for the authorship validation task. Our experiments show that the Char-CNN model achieves high accuracy and F1 scores under various circumstances. Finally, we discuss deployment considerations and show that per-sender authorship classifiers are practical for integrating into existing commercial email security systems with low overhead.

Paperid: 3954, https://arxiv.org/pdf/2508.21653.pdf

Abstract:
In this work, we unveil an analogy between well-known lattice based learning with error problem and ill-posed inverse problems. We show that LWE problem is a structured inverse problem. Further, we propose a symmetric encryption scheme based on ill-posed problems and thoroughly discuss its security. Finally, we propose a public key encryption scheme based on our symmetric encryption scheme and CRYSTALS-Kyber KEM (key encapsulation mechanism) and discuss its security.

Paperid: 3955, https://arxiv.org/pdf/2508.21636.pdf

Abstract:
Deep learning (DL) models for natural language-to-code generation have become integral to modern software development pipelines. However, their heavy reliance on large amounts of data, often collected from unsanitized online sources, exposes them to data poisoning attacks, where adversaries inject malicious samples to subtly bias model behavior. Recent targeted attacks silently replace secure code with semantically equivalent but vulnerable implementations without relying on explicit triggers to launch the attack, making it especially hard for detection methods to distinguish clean from poisoned samples. We present a systematic study on the effectiveness of existing poisoning detection methods under this stealthy threat model. Specifically, we perform targeted poisoning on three DL models (CodeBERT, CodeT5+, AST-T5), and evaluate spectral signatures analysis, activation clustering, and static analysis as defenses. Our results show that all methods struggle to detect triggerless poisoning, with representation-based approaches failing to isolate poisoned samples and static analysis suffering false positives and false negatives, highlighting the need for more robust, trigger-independent defenses for AI-assisted code generation.

Paperid: 3956, https://arxiv.org/pdf/2508.21602.pdf

Abstract:
In this paper we give the first examples of low-conductance permutations. The notion of conductance of permutations was introduced in the paper "Indifferentiability of Confusion-Diffusion Networks" by Dodis et al., where the search for low-conductance permutations was initiated and motivated. In this paper we not only give the desired examples, but also make a general characterization of the problem -- i.e. we show that low-conductance permutations are equivalent to permutations that have the information-theoretic properties of the so-called Multi-Source-Somewhere-Condensers.

Paperid: 3957, https://arxiv.org/pdf/2508.20186.pdf

Abstract:
AI-powered influence operations can now be executed end-to-end on commodity hardware. We show that small language models produce coherent, persona-driven political messaging and can be evaluated automatically without human raters. Two behavioural findings emerge. First, persona-over-model: persona design explains behaviour more than model identity. Second, engagement as a stressor: when replies must counter-arguments, ideological adherence strengthens and the prevalence of extreme content increases. We demonstrate that fully automated influence-content production is within reach of both large and small actors. Consequently, defence should shift from restricting model access towards conversation-centric detection and disruption of campaigns and coordination infrastructure. Paradoxically, the very consistency that enables these operations also provides a detection signature.

Paperid: 3958, https://arxiv.org/pdf/2508.19500.pdf

Abstract:
This paper identifies and analyzes a novel vulnerability class in Model Context Protocol (MCP) based agent systems. The attack chain describes and demonstrates how benign, individually authorized tasks can be orchestrated to produce harmful emergent behaviors. Through systematic analysis using the MITRE ATLAS framework, we demonstrate how 95 agents tested with access to multiple services-including browser automation, financial analysis, location tracking, and code deployment-can chain legitimate operations into sophisticated attack sequences that extend beyond the security boundaries of any individual service. These red team exercises survey whether current MCP architectures lack cross-domain security measures necessary to detect or prevent a large category of compositional attacks. We present empirical evidence of specific attack chains that achieve targeted harm through service orchestration, including data exfiltration, financial manipulation, and infrastructure compromise. These findings reveal that the fundamental security assumption of service isolation fails when agents can coordinate actions across multiple domains, creating an exponential attack surface that grows with each additional capability. This research provides a barebones experimental framework that evaluate not whether agents can complete MCP benchmark tasks, but what happens when they complete them too well and optimize across multiple services in ways that violate human expectations and safety constraints. We propose three concrete experimental directions using the existing MCP benchmark suite.

Paperid: 3959, https://arxiv.org/pdf/2508.19465.pdf

Abstract:
The rapid expansion of the Electric Vehicles (EVs) and Electric Vehicle Charging Systems (EVCs) has introduced new cybersecurity challenges, specifically in authentication protocols that protect vehicles, users, and energy infrastructure. Although widely adopted for convenience, traditional authentication mechanisms like Radio Frequency Identification (RFID) and Near Field Communication (NFC) rely on static identifiers and weak encryption, making them highly vulnerable to attack vectors such as cloning, relay attacks, and signal interception. This study explores an AI-powered adaptive authentication framework designed to overcome these shortcomings by integrating machine learning, anomaly detection, behavioral analytics, and contextual risk assessment. Grounded in the principles of Zero Trust Architecture, the proposed framework emphasizes continuous verification, least privilege access, and secure communication. Through a comprehensive literature review, this research evaluates current vulnerabilities and highlights AI-driven solutions to provide a scalable, resilient, and proactive defense. Ultimately, the research findings conclude that adopting AI-powered adaptive authentication is a strategic imperative for securing the future of electric mobility and strengthening digital trust across the ecosystem. Keywords: weak authentication, RFID, NFC, ML, AI-powered adaptive authentication, relay attacks, cloning, eavesdropping, MITM attacks, Zero Trust Architecture

Paperid: 3960, https://arxiv.org/pdf/2508.18485.pdf

Abstract:
Hidden LLM prompts have appeared in online documents with increasing frequency. Their goal is to trigger indirect prompt injection attacks while remaining undetected from human oversight, to manipulate LLM-powered automated document processing systems, against applications as diverse as rÃ©sumÃ© screeners through to academic peer review processes. Detecting hidden LLM prompts is therefore important for ensuring trust in AI-assisted human decision making. This paper presents the first principled approach to hidden LLM prompt detection in structured documents. We implement our approach in a prototype tool called PhantomLint. We evaluate PhantomLint against a corpus of 3,402 documents, including both PDF and HTML documents, and covering academic paper preprints, CVs, theses and more. We find that our approach is generally applicable against a wide range of methods for hiding LLM prompts from visual inspection, has a very low false positive rate (approx. 0.092%), is practically useful for detecting hidden LLM prompts in real documents, while achieving acceptable performance.

Paperid: 3962, https://arxiv.org/pdf/2508.17853.pdf

Abstract:
Differential privacy (DP) has recently emerged as a definition of privacy to release private estimates. DP calibrates noise to be on the order of an individuals contribution. Due to the this calibration a private estimate obscures any individual while preserving the utility of the estimate. Since the original definition, many alternate definitions have been proposed. These alternates have been proposed for various reasons including improvements on composition results, relaxations, and formalizations. Nevertheless, thus far nearly all definitions of privacy have used a divergence of densities as the basis of the definition. In this paper we take an information geometry perspective towards differential privacy. Specifically, rather than define privacy via a divergence, we define privacy via the Rao distance. We show that our proposed definition of privacy shares the interpretation of previous definitions of privacy while improving on sequential composition.

Paperid: 3964, https://arxiv.org/pdf/2508.17071.pdf

Abstract:
Blockchain is a Distributed Ledger Technology (DLT) that offers numerous benefits including decentralization, transparency, efficiency, and reduced costs. Hence, blockchain has been included in many fields. Blockchain relies on cryptographic protocols (especially public-key cryptography and hash functions) to achieve many essential sub-routines. However, the increased progress of quantum computation and algorithms has threatened the security of many traditional cryptosystems. Therefore, this represents a serious risk for the existing blockchain technology. For example, SHA-256 and the Elliptic Curve Digital Signature Algorithm (ECDSA) cryptosystems can be compromised by Shor s and Grover s quantum algorithms in the foreseeable future. Post-Quantum Cryptography (PQC) is a basic solution for resisting these quantum attacks. Applying PQC to blockchains results in creating Post-Quantum Blockchains (PQB). Thus, this paper aims to review the threats imposed by quantum computers on classical blockchain technology and provide useful guidelines on PQB security to blockchain researchers. The paper focuses on the challenges and opportunities of future work direction in this field.

Paperid: 3965, https://arxiv.org/pdf/2508.16991.pdf

Abstract:
Space infrastructures have become an underpinning of modern society, but their associated cyber risks are little understood. This Dissertation advances the state-of-the-art via four contributions. (i) It introduces an innovative framework for characterizing real-world cyber attacks against space infrastructures, or space cyber attacks, including a novel methodology for coping with missing data and three novel metrics. A case study demonstrates the usefulness of the framework on 108 real-world space cyber attacks. (ii) This Dissertation characterizes the state-of-the-practice in space cyber risk analysis and mitigation, namely the Notional Risk Scores (NRS) within the Space Attack Research and Tactic Analysis (SPARTA) framework. (iii) We propose a set of desired properties that should be satisfied by any competent space cyber risk analysis and mitigation tool and applies them to assess two industrial space cyber risk analysis and mitigation tools. (iv) The study introduces a novel framework to analyze and mitigate space cyber risks by explicitly modeling space cyber attack cascading effects and presenting algorithms for mission risk analysis and mission hardening. We demonstrate the usefulness of the framework by applying it to analyze and mitigate space cyber risks, with testbed-based validation.

Paperid: 3966, https://arxiv.org/pdf/2508.16662.pdf

Abstract:
Zero Trust Architecture (ZTA) has become a widely adopted model for securing enterprise environments, promoting continuous verification and minimal trust across systems. However, its application in mobile contexts remains limited, despite mobile applications now accounting for most global digital interactions and being increasingly targeted by sophisticated threats. Existing Zero Trust frameworks developed by organisations such as the National Institute of Standards and Technology (NIST) and the Cybersecurity and Infrastructure Security Agency (CISA) primarily focus on enterprise-managed infrastructure, assuming organisational control over devices, networks, and identities. This paper addresses a critical gap by proposing an extended Zero Trust model designed for mobile applications operating in untrusted, user-controlled environments. Using a design science methodology, the study introduced a six-pillar framework that supports runtime enforcement of trust through controls including device integrity, user identity validation, data protection, secure application programming interface (API) usage, behavioural monitoring, and live application protection. Each pillar was mapped to relevant regulatory and security standards to support compliance. A phased implementation roadmap and maturity assessment model were also developed to guide adoption across varying organisational contexts. The proposed model offers a practical and standards-aligned approach to securing mobile applications beyond pre-deployment controls, aligning real-time enforcement with Zero Trust principles. This contribution expands the operational boundaries of ZTA and provides organisations with a deployable path to reduce fraud, enhance compliance, and address emerging mobile security challenges. Future research may include empirical validation of the framework and cross-sector application testing.

Paperid: 3967, https://arxiv.org/pdf/2508.16637.pdf

Abstract:
Attributing cyberattacks remains a central challenge in modern cybersecurity, particularly within denied environments where defenders have limited visibility into attacker infrastructure and are restricted by legal or operational rules of engagement. This perspective examines the strategic value of passive hack-back techniques that enable covert attribution and intelligence collection without initiating direct offensive actions. Key vectors include tracking beacons, honeytokens, environment-specific payloads, and supply-chain-based traps embedded within exfiltrated or leaked assets. These approaches rely on the assumption that attackers will interact with compromised data in traceable ways, allowing defenders to gather signals without violating engagement policies. The paper also explores the role of Artificial Intelligence (AI) in enhancing passive hack-back operations. Topics include the deployment of autonomous agents for forensic reconnaissance, the use of Large Language Models (LLMs) to generate dynamic payloads, and Adversarial Machine Learning (AML) techniques for evasion and counter-deception. A dedicated section discusses the implications of quantum technologies in this context, both as future threats to cryptographic telemetry and as potential tools for stealthy communication and post-quantum resilience. Finally, the paper advocates for hybrid defensive frameworks that combine passive attribution with delayed or conditional active responses, while maintaining compliance with legal, ethical, and operational constraints.

Paperid: 3968, https://arxiv.org/pdf/2508.15840.pdf

Abstract:
When using a public communication channel -- whether formal or informal, such as commenting or posting on social media -- end users have no expectation of privacy: they compose a message and broadcast it for the world to see. Even if an end user takes utmost precautions to anonymize their online presence -- using an alias or pseudonym; masking their IP address; spoofing their geolocation; concealing their operating system and user agent; deploying encryption; registering with a disposable phone number or email; disabling non-essential settings; revoking permissions; and blocking cookies and fingerprinting -- one obvious element still lingers: the message itself. Assuming they avoid lapses in judgment or accidental self-exposure, there should be little evidence to validate their actual identity, right? Wrong. The content of their message -- necessarily open for public consumption -- exposes an attack vector: stylometric analysis, or author profiling. In this paper, we dissect the technique of stylometry, discuss an antithetical counter-strategy in adversarial stylometry, and devise enhancements through Unicode steganography.

Paperid: 3969, https://arxiv.org/pdf/2508.15839.pdf

Abstract:
As AI systems increasingly influence critical decisions, they face threats that exploit reasoning mechanisms rather than technical infrastructure. We present a framework for cognitive cybersecurity, a systematic protection of AI reasoning processes from adversarial manipulation. Our contributions are threefold. First, we establish cognitive cybersecurity as a discipline complementing traditional cybersecurity and AI safety, addressing vulnerabilities where legitimate inputs corrupt reasoning while evading conventional controls. Second, we introduce the CIA+TA, extending traditional Confidentiality, Integrity, and Availability triad with Trust (epistemic validation) and Autonomy (human agency preservation), requirements unique to systems generating knowledge claims and mediating decisions. Third, we present a quantitative risk assessment methodology with empirically-derived coefficients, enabling organizations to measure cognitive security risks. We map our framework to OWASP LLM Top 10 and MITRE ATLAS, facilitating operational integration. Validation through previously published studies (151 human participants; 12,180 AI trials) reveals strong architecture dependence: identical defenses produce effects ranging from 96% reduction to 135% amplification of vulnerabilities. This necessitates pre-deployment Cognitive Penetration Testing as a governance requirement for trustworthy AI deployment.

Paperid: 3970, https://arxiv.org/pdf/2508.15089.pdf

Abstract:
Elliptic Curve Cryptography (ECC) is a fundamental component of modern public-key cryptosystems that enable efficient and secure digital signatures, key exchanges, and encryption. Its core operation, scalar multiplication, denoted as $k \cdot P$, where $P$ is a base point and $k$ is a private scalar, relies heavily on the secrecy and unpredictability of $k$. Conventionally, $k$ is selected using user input or pseudorandom number generators. However, in resource-constrained environments with weak entropy sources, these approaches may yield low-entropy or biased scalars, increasing susceptibility to side-channel and key recovery attacks. To mitigate these vulnerabilities, we introduce an optimization-driven scalar generation method that explicitly maximizes bit-level entropy. Our approach uses differential evolution (DE), a population-based metaheuristic algorithm, to search for scalars whose binary representations exhibit maximal entropy, defined by an even and statistically uniform distribution of ones and zeros. This reformulation of scalar selection as an entropy-optimization problem enhances resistance to entropy-based cryptanalytic techniques and improves overall unpredictability. Experimental results demonstrate that DE-optimized scalars achieve entropy significantly higher than conventionally generated scalars. The proposed method can be integrated into existing ECC-based protocols, offering a deterministic, tunable alternative to traditional randomness, ideal for applications in blockchain, secure messaging, IoT, and other resource-constrained environments.

Paperid: 3973, https://arxiv.org/pdf/2508.13453.pdf

Abstract:
In February 2024, after building trust over two years with project maintainers by making a significant volume of legitimate contributions, GitHub user "JiaT75" self-merged a version of the XZ Utils project containing a highly sophisticated, well-disguised backdoor targeting sshd processes running on systems with the backdoored package installed. A month later, this package began to be distributed with popular Linux distributions until a Microsoft employee discovered the backdoor while investigating how a recent system upgrade impacted the performance of SSH authentication. Despite its potential global impact, no tooling exists for monitoring and identifying anomalous behavior by personas contributing to other open-source projects. This paper demonstrates how Open Source Intelligence (OSINT) data gathered from GitHub contributions, analyzed using graph databases and graph theory, can efficiently identify anomalous behaviors exhibited by the "JiaT75" persona across other open-source projects.

Paperid: 3974, https://arxiv.org/pdf/2508.12953.pdf

Abstract:
Increasingly sophisticated and varied cyber threats necessitate ever improving enterprise security postures. For many organizations today, those postures have a foundation in the Zero Trust Architecture. This strategy sees trust as something an enterprise must not give lightly or assume too broadly. Understanding the ZTA and its numerous controls centered around the idea of not trusting anything inside or outside the network without verification, will allow organizations to comprehend and leverage this increasingly common paradigm. The ZTA, unlike many other regulatory frameworks, is not tightly defined. The research assesses the likelihood of quantifiable guidelines that measure cybersecurity maturity for an enterprise organization in relation to ZTA implementation. This is a new, data driven methodology for quantifying cyber resilience enabled by the adoption of Zero Trust principles to pragmatically address the critical need of organizations. It also looks at the practical aspects ZTA has on capabilities in deterring cyberattacks on a network. The outcomes of this research define a prescriptive set of key technical controls across identity verification, microsegmentation, data encryption, analytics, and orchestration that characterize the comprehensive ZTA deployment. By evaluating the depth of integration for each control component and aligning to industry best practices, the study's results help assess an organization's ZTA maturity level on a scale from Initial to Optimized adoption. The research's resultant four tier model demarcates phases for an organization on its security transformation journey, with each tier adding to the capability of the last.

Paperid: 3975, https://arxiv.org/pdf/2508.12304.pdf

Abstract:
Special tags such as AprilTags that facilitate image processing and pattern recognition are useful in practical applications. In close and private environments, identity security is unlikely to be an issue because all involved AprilTags can be completely regulated. However, in open and public environments, identity security is no longer an issue that can be neglected. To handle potential harm caused by adversarial attacks, this note advocates utilization of adjustable AprilTags instead of fixed ones.

Paperid: 3976, https://arxiv.org/pdf/2508.12220.pdf

Abstract:
We study the right to be forgotten (GDPR Art. 17) for large language models and frame unlearning as a reproducible systems problem. Our approach treats training as a deterministic program and logs a minimal per-microbatch record (ordered ID hash, RNG seed, learning-rate value, optimizer-step counter, and accumulation boundary). Under a pinned stack and deterministic kernels, replaying the training tail while filtering only the forget closure yields the same parameters as training on the retain set (bit-identical in the training dtype) when preconditions hold. To meet latency and availability constraints, we add complementary paths: (i) exact reverts of recent steps via micro-checkpoints or dense per-step deltas, (ii) cohort-scoped adapter deletion when the base is frozen, and (iii) a curvature-guided anti-update followed by a short retain-tune, audit-gated with escalation to exact replay. We report storage/latency budgets and a toy artifact validating mechanics; in a controlled run that satisfies the preconditions we demonstrate byte-identical equality of model and optimizer states.

Paperid: 3977, https://arxiv.org/pdf/2508.12093.pdf

Abstract:
With the widespread adoption of cloud computing, the need for outsourcing statistical analysis to third-party platforms is growing rapidly. However, handling sensitive data such as medical records and financial information in cloud environments raises serious privacy concerns. In this paper, we present PP-STAT, a novel and efficient Homomorphic Encryption (HE)-based framework for privacy-preserving statistical analysis. HE enables computations to be performed directly on encrypted data without revealing the underlying plaintext. PP-STAT supports advanced statistical measures, including Z-score normalization, skewness, kurtosis, coefficient of variation, and Pearson correlation coefficient, all computed securely over encrypted data. To improve efficiency, PP-STAT introduces two key optimizations: (1) a Chebyshev-based approximation strategy for initializing inverse square root operations, and (2) a pre-normalization scaling technique that reduces multiplicative depth by folding constant scaling factors into mean and variance computations. These techniques significantly lower computational overhead and minimize the number of expensive bootstrapping procedures. Our evaluation on real-world datasets demonstrates that PP-STAT achieves high numerical accuracy, with mean relative error (MRE) below 2.4x10-4. Notably, the encrypted Pearson correlation between the smoker attribute and charges reaches 0.7873, with an MRE of 2.86x10-4. These results confirm the practical utility of PP-STAT for secure and precise statistical analysis in privacy-sensitive domains.

Paperid: 3978, https://arxiv.org/pdf/2508.11651.pdf

Abstract:
The tokenization of real-world assets (RWAs) promises to transform financial markets by enabling fractional ownership, global accessibility, and programmable settlement of traditionally illiquid assets such as real estate, private credit, and government bonds. While technical progress has been rapid, with over \$25 billion in tokenized RWAs brought on-chain as of 2025, liquidity remains a critical bottleneck. This paper investigates the gap between tokenization and tradability, drawing on recent academic research and market data from platforms such as RWA.xyz. We document that most RWA tokens exhibit low trading volumes, long holding periods, and limited investor participation, despite their potential for 24/7 global markets. Through case studies of tokenized real estate, private credit, and tokenized treasury funds, we present empirical liquidity observations that reveal low transfer activity, limited active address counts, and minimal secondary trading for most tokenized asset classes. Next, we categorize the structural barriers to liquidity, including regulatory gating, custodial concentration, whitelisting, valuation opacity, and lack of decentralized trading venues. Finally, we propose actionable pathways to improve liquidity, ranging from hybrid market structures and collateral-based liquidity to transparency enhancements and compliance innovation. Our findings contribute to the growing discourse on digital asset market microstructure and highlight that realizing the liquidity potential of RWAs requires coordinated progress across legal, technical, and institutional domains.

Paperid: 3979, https://arxiv.org/pdf/2508.10606.pdf

Abstract:
Although the bulk of the research in privacy and statistical disclosure control is designed for cross-sectional data, i.e. data where individuals are observed at one single point in time, longitudinal data, i.e. individuals observed over multiple periods, are increasingly collected. Such data enhance undoubtedly the possibility of statistical analysis compared to cross-sectional data, but also come with one additional layer of information, individual trajectories, that must remain practically useful in a privacy-preserving way. Few extensions, essentially k-anonymity based, of popular privacy tools have been proposed to deal with the challenges posed by longitudinal data, and these proposals are often complex. By considering randomized response, and specifically its recent bistochastic extension, in the context of longitudinal data, this paper proposes a simple approach for their anonymization. After having characterized new results on bistochastic matrices, we show that a simple relationship exists between the protection of each data set released at each period, and the protection of individuals trajectories over time. In turn, this relationship can be tuned according to desired protection and information requirements. We illustrate the application of the proposed approach by an empirical example.

Paperid: 3980, https://arxiv.org/pdf/2508.10187.pdf

Abstract:
Many taxonomies exist to organize cybercrime incidents into ontological categories. We examine some of the taxonomies introduced in the literature; providing a framework, and analysis, of how best to leverage different taxonomy structures to optimize performance of detections targeting various types of threat-actor behaviors under the umbrella of precision and recall. Networks of detections are studied, and results are outlined showing properties of networks of interconnected detections. Some illustrations are provided to show how the construction of sets of detections to prevent broader types of attacks is limited by trade-offs in precision and recall under constraints. An equilibrium result is proven and validated on simulations, illustrating the existence of an optimal detection design strategy in this framework.

Paperid: 3981, https://arxiv.org/pdf/2508.10041.pdf

Abstract:
In this paper, we introduce a novel quantum algorithm for the factorization of composite odd numbers. This work makes two significant contributions. First, we present a new improvement to the classical Fermat method, fourfold reducing the computational complexity of factoring. Second, we reformulate Fermat factorization method as an optimization problem suitable for Quantum Annealers which allowed us to factorize 8,689,739, the biggest number ever factorized using a quantum device to our knowledge.

Paperid: 3982, https://arxiv.org/pdf/2508.10033.pdf

Abstract:
Language models exhibit human-like cognitive vulnerabilities, such as emotional framing, that escape traditional behavioral alignment. We present CCS-7 (Cognitive Cybersecurity Suite), a taxonomy of seven vulnerabilities grounded in human cognitive security research. To establish a human benchmark, we ran a randomized controlled trial with 151 participants: a "Think First, Verify Always" (TFVA) lesson improved cognitive security by +7.9% overall. We then evaluated TFVA-style guardrails across 12,180 experiments on seven diverse language model architectures. Results reveal architecture-dependent risk patterns: some vulnerabilities (e.g., identity confusion) are almost fully mitigated, while others (e.g., source interference) exhibit escalating backfire, with error rates increasing by up to 135% in certain models. Humans, in contrast, show consistent moderate improvement. These findings reframe cognitive safety as a model-specific engineering problem: interventions effective in one architecture may fail, or actively harm, another, underscoring the need for architecture-aware cognitive safety testing before deployment.

Paperid: 3983, https://arxiv.org/pdf/2508.10023.pdf

Abstract:
Post-quantum cryptography (PQC) aims to develop cryptographic algorithms that are secure against attacks from quantum computers. This paper compares the leading postquantum cryptographic algorithms, such as Kyber, sntrup761, and FrodoKEM, in terms of their security, performance, and real-world applicability. The review highlights the strengths and weaknesses of each algorithm and provides insights into future research directions. We also discuss the challenges of transitioning from classical to post-quantum systems and the potential impacts on various industries. This paper serves as a foundation for understanding the current state of post-quantum cryptography and its future prospects in the quantum computing era.

Paperid: 3984, https://arxiv.org/pdf/2508.10017.pdf

Abstract:
Federated Learning (FL) presents a groundbreaking approach for collaborative health research, allowing model training on decentralized data while safeguarding patient privacy. FL offers formal security guarantees when combined with Differential Privacy (DP). The integration of these technologies, however, introduces a significant trade-off between privacy and clinical utility, a challenge further complicated by the severe class imbalance often present in medical datasets. The research presented herein addresses these interconnected issues through a systematic, multi-stage analysis. An FL framework was implemented for cardiovascular risk prediction, where initial experiments showed that standard methods struggled with imbalanced data, resulting in a recall of zero. To overcome such a limitation, we first integrated the hybrid Synthetic Minority Over-sampling Technique with Tomek Links (SMOTETomek) at the client level, successfully developing a clinically useful model. Subsequently, the framework was optimized for non-IID data using a tuned FedProx algorithm. Our final results reveal a clear, non-linear trade-off between the privacy budget (epsilon) and model recall, with the optimized FedProx consistently out-performing standard FedAvg. An optimal operational region was identified on the privacy-utility frontier, where strong privacy guarantees (with epsilon 9.0) can be achieved while maintaining high clinical utility (recall greater than 77%). Ultimately, our study provides a practical methodological blueprint for creating effective, secure, and accurate diagnostic tools that can be applied to real-world, heterogeneous healthcare data.

Paperid: 3985, https://arxiv.org/pdf/2508.09783.pdf

Abstract:
We investigate the impact of (possible) deviations of the probability distribution of key values from a uniform distribution for the information-theoretic strong, or perfect, message authentication code. We found a simple expression for the decrease in security as a function of the statistical distance between the real key probability distribution and the uniform one. In a sense, a perfect message authentication code is robust to small deviations from a uniform key distribution.

Paperid: 3986, https://arxiv.org/pdf/2508.09765.pdf

Abstract:
Recently, we can observe a significant increase of the phishing attacks in the Internet. In a typical phishing attack, the attacker sets up a malicious website that looks similar to the legitimate website in order to obtain the end-users' information. This may cause the leakage of the sensitive information and the financial loss for the end-users. To avoid such attacks, the early detection of these websites' URLs is vital and necessary. Previous researchers have proposed many machine learning algorithms to distinguish the phishing URLs from the legitimate ones. In this paper, we would like to enhance these machine learning algorithms from the perspective of feature selection. We propose a novel method to incorporate the keyword features with the traditional features. This method is applied on multiple traditional machine learning algorithms and the experimental results have shown this method is useful and effective. On average, this method can reduce the classification error by 30% for the large dataset. Moreover, its enhancement is more significant for the small dataset. In addition, this method extracts the information from the URL and does not rely on the additional information provided by the third-part service. The best result for the machine learning algorithm using our proposed method has achieved the accuracy of 99.68%.

Paperid: 3987, https://arxiv.org/pdf/2508.07289.pdf

Abstract:
The rapid transmission of multimedia information has been achieved mainly by recent advancements in the Internet's speed and information technology. In spite of this, advancements in technology have resulted in breaches of privacy and data security. When it comes to protecting private information in today's Internet era, digital steganography is vital. Many academics are interested in digital video because it has a great capability for concealing important data. There have been a vast number of video steganography solutions developed lately to guard against the theft of confidential data. The visual imperceptibility, robustness, and embedding capacity of these approaches are all challenges that must be addressed. In this paper, a novel solution to reversible video steganography based on DWT and QR codes is proposed to address these concerns. In order to increase the security level of the suggested method, an enhanced ElGamal cryptosystem has also been proposed. Prior to the embedding stage, the suggested method uses the modified ElGamal algorithm to encrypt secret QR codes. Concurrently, it applies two-dimensional DWT on the Y-component of each video frame resulting in LL, LH, HL, and HH sub-bands. Then, the encrypted Low (L), Medium (M), Quantile (Q), and High (H) QR codes are embedded into the HL sub-band, HH sub-band, U-component, and V-component of video frames, respectively, using the LSB technique. As a consequence of extensive testing of the approach, it was shown to be very secure and highly invisible, as well as highly resistant to attacks from Salt & Pepper, Gaussian, Poisson, and Speckle noises, which has an average SSIM of more than 0.91. Aside from visual imperceptibility, the suggested method exceeds current methods in terms of PSNR average of 52.143 dB, and embedding capacity 1 bpp.

Paperid: 3988, https://arxiv.org/pdf/2508.07203.pdf

Abstract:
Current digital government literature focuses on professional in-house IT teams, specialized digital service teams, vendor-developed systems, or proprietary low-code/no-code tools. Almost no scholarship addresses a growing middle ground: technically skilled civil servants outside formal IT roles who can write real code but lack a sanctioned, secure path to deploy their work. This paper introduces a limits-aware, open-source and replicable platform that enables such public servants to develop, peer review, and deploy small-scale, domain-specific applications within government networks via a sandboxed, auditable workflow. By combining Jupyter Notebooks, preapproved open-source libraries, and lightweight governance, the platform works within institutional constraints such as procurement rules and IT security policies while avoiding vendor lock-in. Unlike low/no-code approaches, it preserves and enhances civil servants' programming skills, keeping them technically competitive with their private-sector peers. This contribution fills a critical gap, offering a replicable model for public-sector skill retention, resilience, and bottom-up digital transformation.

Paperid: 3989, https://arxiv.org/pdf/2508.07139.pdf

Abstract:
Ensuring LLM alignment is critical to information security as AI models become increasingly widespread and integrated in society. Unfortunately, many defenses against adversarial attacks and jailbreaking on LLMs cannot adapt quickly to new attacks, degrade model responses to benign prompts, or introduce significant barriers to scalable implementation. To mitigate these challenges, we introduce a real-time, self-tuning (RTST) moderator framework to defend against adversarial attacks while maintaining a lightweight training footprint. We empirically evaluate its effectiveness using Google's Gemini models against modern, effective jailbreaks. Our results demonstrate the advantages of an adaptive, minimally intrusive framework for jailbreak defense over traditional fine-tuning or classifier models.

Paperid: 3990, https://arxiv.org/pdf/2508.06457.pdf

Abstract:
Large Language Models (LLMs) have demonstrated impressive fluency and reasoning capabilities, but their potential for misuse has raised growing concern. In this paper, we present ScamAgent, an autonomous multi-turn agent built on top of LLMs, capable of generating highly realistic scam call scripts that simulate real-world fraud scenarios. Unlike prior work focused on single-shot prompt misuse, ScamAgent maintains dialogue memory, adapts dynamically to simulated user responses, and employs deceptive persuasion strategies across conversational turns. We show that current LLM safety guardrails, including refusal mechanisms and content filters, are ineffective against such agent-based threats. Even models with strong prompt-level safeguards can be bypassed when prompts are decomposed, disguised, or delivered incrementally within an agent framework. We further demonstrate the transformation of scam scripts into lifelike voice calls using modern text-to-speech systems, completing a fully automated scam pipeline. Our findings highlight an urgent need for multi-turn safety auditing, agent-level control frameworks, and new methods to detect and disrupt conversational deception powered by generative AI.

Paperid: 3991, https://arxiv.org/pdf/2508.06071.pdf

Abstract:
This paper introduces a structural game-theoretic model to value decentralized digital assets like Bitcoin. Instead of relying on speculative beliefs, it frames the asset's price within a Rational-Expectations Security-Utility Nash Equilibrium (RESUNE). This equilibrium is a fixed point where the market-clearing price dictates the hash rate through a free-entry mining model, which in turn endogenously sets the network's security. The security, defined as one minus the probability of a 51% attack, is determined via a global games model of attacker coordination, providing a unique and continuous security function. We prove the existence of a RESUNE and offer conditions for its uniqueness and stability. The model predicts that the stabilizing direct effect of price on demand must outweigh the potentially destabilizing feedback from price to security. The framework generates testable predictions, such as a protocol halving causing a contraction in both hash rate and price. A structural Vector Autoregression (VAR) model is proposed to test this mechanism. The model decomposes Bitcoin's value into transactional utility, security, and speculative components and explains the observed unidirectional causality from price to hash rate.

Paperid: 3992, https://arxiv.org/pdf/2508.05677.pdf

Abstract:
RL-based medical questionnaire systems have shown great potential in medical scenarios. However, their safety and robustness remain unresolved. This study performs a comprehensive evaluation on adversarial attack methods to identify and analyze their potential vulnerabilities. We formulate the diagnosis process as a Markov Decision Process (MDP), where the state is the patient responses and unasked questions, and the action is either to ask a question or to make a diagnosis. We implemented six prevailing major attack methods, including the Fast Gradient Signed Method (FGSM), Projected Gradient Descent (PGD), Carlini & Wagner Attack (C&W) attack, Basic Iterative Method (BIM), DeepFool, and AutoAttack, with seven epsilon values each. To ensure the generated adversarial examples remain clinically plausible, we developed a comprehensive medical validation framework consisting of 247 medical constraints, including physiological bounds, symptom correlations, and conditional medical constraints. We achieved a 97.6% success rate in generating clinically plausible adversarial samples. We performed our experiment on the National Health Interview Survey (NHIS) dataset (https://www.cdc.gov/nchs/nhis/), which consists of 182,630 samples, to predict the participant's 4-year mortality rate. We evaluated our attacks on the AdaptiveFS framework proposed in arXiv:2004.00994. Our results show that adversarial attacks could significantly impact the diagnostic accuracy, with attack success rates ranging from 33.08% (FGSM) to 64.70% (AutoAttack). Our work has demonstrated that even under strict medical constraints on the input, such RL-based medical questionnaire systems still show significant vulnerabilities.

Paperid: 3993, https://arxiv.org/pdf/2508.05110.pdf

Abstract:
Local differential privacy represents the gold standard for preserving the privacy of data before it leaves the device, and distribution estimation under this model has been well studied. Recently, protocols built upon balanced incomplete block designs were shown to achieve optimal error for this problem. However, it remained unknown whether other constructions could also be optimal. We resolve this question by proving that any protocol achieving optimal error must correspond to some balanced incomplete block design. This result, combined with prior work, completely characterises the set of optimal protocols for this problem. As a consequence, the protocols that achieve optimal error and optimal communication are only those based on symmetrical balanced incomplete block designs.

Paperid: 3994, https://arxiv.org/pdf/2508.03879.pdf

Abstract:
Malware and cheat developers use fileless execution techniques to evade traditional, signature-based security products. These methods include various types of manual mapping, module stomping, and threadless injection which work entirely within the address space of a legitimate process, presenting a challenge for detection due to ambiguity between what is legitimate and what isn't. Existing tools often have weaknesses, such as a dependency on Portable Executable (PE) structures or a vulnerability to time-of-check-to-time-of-use (TOCTOU) race conditions where an adversary cleans up before a periodic scan has the chance to occur. To address this gap, we present RX-INT, a kernel-assisted system featuring an architecture that provides resilience against TOCTOU attacks. RX-INT introduces a detection engine that combines a real-time thread creation monitor with a stateful Virtual Address Descriptor (VAD) scanner alongside various heuristics within. This engine snapshots both private and image-backed memory regions, using real-time memory hashing to detect illicit modifications like module stomping. Critically, we demonstrate a higher detection rate in certain benchmarks of this approach through a direct comparison with PE-sieve, a commonly used and powerful memory forensics tool. In our evaluation, RX-INT successfully detected a manually mapped region that was not identified by PE-sieve. We then conclude that our architecture represents a tangible difference in the detection of fileless threats, with direct applications in the fields of anti-cheat and memory security.

Paperid: 3995, https://arxiv.org/pdf/2508.03714.pdf

Abstract:
Artificial intelligence enables unprecedented attacks on human cognition, yet cybersecurity remains predominantly device-centric. This paper introduces the "Think First, Verify Always" (TFVA) protocol, which repositions humans as 'Firewall Zero', the first line of defense against AI-enabled threats. The protocol is grounded in five operational principles: Awareness, Integrity, Judgment, Ethical Responsibility, and Transparency (AIJET). A randomized controlled trial (n=151) demonstrated that a minimal 3-minute intervention produced statistically significant improvements in cognitive security task performance, with participants showing an absolute +7.87% gains compared to controls. These results suggest that brief, principles-based training can rapidly enhance human resilience against AI-driven cognitive manipulation. We recommend that GenAI platforms embed "Think First, Verify Always" as a standard prompt, replacing passive warnings with actionable protocols to enhance trustworthy and ethical AI use. By bridging the gap between technical cybersecurity and human factors, the TFVA protocol establishes human-empowered security as a vital component of trustworthy AI systems.

Paperid: 3996, https://arxiv.org/pdf/2508.03413.pdf

Abstract:
Automobiles are becoming increasingly important in our day to day life. Modern automobiles are highly computerized and hence potentially vulnerable to attack. Providing many wireless connectivity for vehicles enables a bridge between vehicles and their external environments. Such a connected vehicle solution is expected to be the next frontier for automotive revolution and the key to the evolution to next generation intelligent transportation systems. Vehicular Ad hoc Networks (VANETs) are emerging mobile ad hoc network technologies incorporating mobile routing protocols for inter-vehicle data communications to support intelligent transportation systems. Thus security and privacy are the major concerns in VANETs due to the mobility of the vehicles. Thus designing security mechanisms to remove adversaries from the network remarkably important in VANETs. This paper provides an overview of various vehicular network architectures. The evolution of security in modern vehicles. Various security and privacy attacks in VANETs with their defending mechanisms with examples and classify these mechanisms. It also provides an overview of various privacy implication that a vehicular network possess.

Paperid: 3997, https://arxiv.org/pdf/2508.03130.pdf

Abstract:
Small organizations, start ups, and self-hosted servers face increasing strain from automated web crawlers and AI bots, whose online presence has increased dramatically in the past few years. Modern bots evade traditional throttling and can degrade server performance through sheer volume even when they are well-behaved. We introduce a novel security approach that leverages data visualization and hierarchical IP hashing to analyze server event logs, distinguishing human users from automated entities based on access patterns. By aggregating IP activity across subnet classes and applying statistical measures, our method detects coordinated bot activity and distributed crawling attacks that conventional tools fail to identify. Using a real world example we estimate that 80 to 95 percent of traffic originates from AI crawlers, underscoring the need for improved filtering mechanisms. Our approach enables small organizations to regulate automated traffic effectively, preserving public access while mitigating performance degradation.

Paperid: 3998, https://arxiv.org/pdf/2508.02403.pdf

Abstract:
Stablecoins have become a foundational component of the digital asset ecosystem, with their market capitalization exceeding 230 billion USD as of May 2025. As fiat-referenced and programmable assets, stablecoins provide low-latency, globally interoperable infrastructure for payments, decentralized finance, DeFi, and tokenized commerce. Their accelerated adoption has prompted extensive regulatory engagement, exemplified by the European Union's Markets in Crypto-assets Regulation, MiCA, the US Guiding and Establishing National Innovation for US Stablecoins Act, GENIUS Act, and Hong Kong's Stablecoins Bill. Despite this momentum, academic research remains fragmented across economics, law, and computer science, lacking a unified framework for design, evaluation, and application. This study addresses that gap through a multi-method research design. First, it synthesizes cross-disciplinary literature to construct a taxonomy of stablecoin systems based on custodial structure, stabilization mechanism, and governance. Second, it develops a performance evaluation framework tailored to diverse stakeholder needs, supported by an open-source benchmarking pipeline to ensure transparency and reproducibility. Third, a case study on Real World Asset tokenization illustrates how stablecoins operate as programmable monetary infrastructure in cross-border digital systems. By integrating conceptual theory with empirical tools, the paper contributes: a unified taxonomy for stablecoin design; a stakeholder-oriented performance evaluation framework; an empirical case linking stablecoins to sectoral transformation; and reproducible methods and datasets to inform future research. These contributions support the development of trusted, inclusive, and transparent digital monetary infrastructure.

Paperid: 3999, https://arxiv.org/pdf/2508.01649.pdf

Abstract:
Consider graphs of n nodes, and use a Bloom filter of length 2 log3 n bits. An edge between nodes i and j, with i < j, turns on a certain bit of the Bloom filter according to a hash function on i and j. Pick a set of log n nodes and turn on all the bits of the Bloom filter required for these log n nodes to form a clique. As a result, the Bloom filter implies the existence of certain other edges, those edges (x, y), with x < y, such that all the bits selected by applying the hash functions to x and y happen to have been turned on due to hashing the clique edges into the Bloom filter. Constructing the graph consisting of the clique-selected edges and those edges mapped to the turned-on bits of the Bloom filter can be performed in polynomial time in n. Choosing a large enough polylogarithmic in n Bloom filter yields that the graph has only one clique of size log n, the planted clique. When the hash function is black-boxed, finding that clique is intractable and, therefore, inverting the function that maps log n nodes to a graph is not (likely to be) possible in polynomial time.

Paperid: 4000, https://arxiv.org/pdf/2508.01207.pdf

Abstract:
In the automotive industry there is a need to handle broad quality deficiencies, eg, performance, maintainability, cybersecurity, safety, and privacy, to mention a few. The idea is to prevent these issues from reaching end-users, ie, road users and inadvertently, pedestrians, aiming to potentially reduce accidents, and allow safe operation in dynamic attack surfaces, for the benefit of a host of stakeholders. This paper aims to bridge cybersecurity, safety, and privacy concerns in Connected and Autonomous Vehicles (CAV) with respect to Risk Assessment (RA) and Threat Modelling (TM) altogether. Practitioners know the vast literature on this topic given the sheer number of recommendations, standards, best practices, and existing approaches, at times impairing projects and fostering valuable and actionable threat analysis. In this paper we collate key outcomes by highlighting latest standards and approaches in RA and TM research to tackle complex attack surfaces as the ones posed by automotive settings. We aim to provide the community with a list of approaches to align expectations with stakeholders when deciding where and when to focus threat related analysis in automotive solutions.

Paperid: 4001, https://arxiv.org/pdf/2508.01051.pdf

Abstract:
We propose and experimentally demonstrate the \emph{Quasi-Superposition Quantum-inspired System (QSQS)} -- a conceptual quantum system for randomness generation built on measuring two conjugate observables of a permutation sorting process: the deterministic permutation count $n_p$ and the fundamentally non-deterministic sorting time $t$. By analogy with quantum systems, these observables are linked by an uncertainty-like constraint: algorithmic determinism ensures structural uniformity, while system-level fluctuations introduce irreducible unpredictability. We realize this framework concretely as \emph{QPP-RNG}, a system-embedded, software-based true random number generator (TRNG). In QPP-RNG, real-time measurements of sorting time $t$ -- shaped by CPU pipeline jitter, cache latency, and OS scheduling -- dynamically reseed the PRNG driving the permutation sequence. Crucially, QSQS transforms initially right-skewed raw distributions of $n_p$ and $t$ into nearly uniform outputs after modulo reduction, thanks to internal degeneracies that collapse many distinct states into the same output symbol. Empirical results show that as the repetition factor $m$ increases, output entropy converges toward theoretical maxima: Shannon and min-entropy values approach 8 bits, chi-squared statistics stabilize near ideal uniformity, and bell curves visually confirm the flattening from skewed to uniform distributions. Beyond practical implications, QSQS unifies deterministic algorithmic processes with non-deterministic physical fluctuations, offering a physics-based perspective for engineering true randomness in post-quantum cryptographic systems.

Paperid: 4002, https://arxiv.org/pdf/2508.00874.pdf

Abstract:
Malware change day by day and become sophisticated. Not only the complexity of the algorithm that generating malware, but also the camouflage methods. Camouflage, formerly, only need a simple encryption. Now, camouflage are able to change the pattern of code automatically. This term called Polymorphism. This property is usually used to create a metamorphic and a polymorphic malware. Although it has been around since 1990 still quite tricky to detect. In general, there are three obfuscation techniques to create the nature of polymorphism. That techniques are dead code insertion, register substitution, and instruction replacement. This technique can be added to the Metasploit Framework via Ghost Writing Assembly to get ASM files. The detection methods that be used are VT-notify, Context Triggered Piecewise Hash (CTPH), and direct scanning with an antivirus that has been selected. VTnotify show nothing wrong with the files. The best CTPH value is generated by a mixture of technique (average: 52.3125%), while if it is compared to the number of changes made, instruction replacement have the best comparative value (0.0256). The result of using antivirus scanning produces a variety of different results. Antivirus with behavioural-based detection has a possibility to detect this polymorphism.

Paperid: 4003, https://arxiv.org/pdf/2508.00851.pdf

Abstract:
The rapid expansion of the Internet of Things (IoT) has intensified security challenges, notably from Distributed Denial of Service (DDoS) attacks launched by compromised, resource-constrained devices. Traditional defenses are often ill-suited for the IoT paradigm, creating a need for lightweight, high-performance, edge-based solutions. This paper presents the design, implementation, and evaluation of an IoT security framework that leverages the extended Berkeley Packet Filter (eBPF) and the eXpress Data Path (XDP) for in-kernel mitigation of DDoS attacks. The system uses a rate-based detection algorithm to identify and block malicious traffic at the earliest stage of the network stack. The framework is evaluated using both Docker-based simulations and real-world deployment on a Raspberry Pi 4, showing over 97% mitigation effectiveness under a 100 Mbps flood. Legitimate traffic remains unaffected, and system stability is preserved even under attack. These results confirm that eBPF/XDP provides a viable and highly efficient solution for hardening IoT edge devices against volumetric network attacks.

Paperid: 4004, https://arxiv.org/pdf/2508.00832.pdf

Abstract:
The advent of quantum computing poses a significant threat to the foundational cryptographic algorithms that secure modern digital communications. Protocols such as HTTPS, digital certificates, and public key infrastructures (PKIs) heavily rely on cryptographic primitives like RSA, ECC, and Diffie-Hellman, which are vulnerable to quantum attacks -- most notably Shor's algorithm. This paper presents a comprehensive comparative analysis between classical cryptographic algorithms currently in widespread use and emerging post-quantum cryptographic schemes designed to withstand quantum adversaries. We review the cryptographic mechanisms underpinning modern internet security, outline the mathematical foundations of quantum attacks, and evaluate the security, performance, and implementation feasibility of quantum-resistant alternatives such as Kyber, Dilithium, and Falcon. Additionally, we assess the hybrid approaches currently being explored by institutions and tech companies to enable a smooth transition to post-quantum cryptography. By providing an in-depth comparison, this study aims to guide researchers, developers, and policymakers in understanding the critical implications of quantum computing on cryptographic infrastructures and the necessary steps for securing communications in the quantum era.

Paperid: 4005, https://arxiv.org/pdf/2507.23641.pdf

Abstract:
In this paper we introduce a rank $2$ lattice over a polynomial ring arising from the public key of the BIKE cryptosystem \cite{aragon2022bike}. The secret key is a sparse vector in this lattice. We study properties of this lattice and generalize the recovery of weak keys from \cite{BardetDLO16}. In particular, we show that they implicitly solved a shortest vector problem in the lattice we constructed. Rather than finding only a shortest vector, we obtain a reduced basis of the lattice which makes it possible to check for more weak keys.

Paperid: 4006, https://arxiv.org/pdf/2507.22674.pdf

Abstract:
Yang et al. proposed a lightweight certificateless multiuser matchmaking encryption (LC-MUME) scheme for mobile devices, published in IEEE Transactions on Information Forensics and Security (TIFS) (DOI: 10.1109/TIFS.2023.3321961). Their construction aims to reduce computational and communication overhead within a one-to-many certificateless cryptographic framework. The authors claim that their scheme satisfies existential unforgeability under chosen-message attacks (EUF-CMA) in the random oracle model. However, our cryptanalytic study demonstrates that the scheme fails to meet this critical security requirement. In particular, we show that a Type-I adversary can successfully forge a valid ciphertext without possessing the complete private key of the sender. Both theoretical analysis and practical implementation confirm that this attack can be mounted with minimal computational cost. To address these weaknesses, we propose a modification strategy to strengthen the security of matchmaking encryption schemes in mobile computing environments.

Paperid: 4007, https://arxiv.org/pdf/2507.22304.pdf

Abstract:
Vision-language models (VLMs) have revolutionized multimodal AI applications but introduce novel security vulnerabilities that remain largely unexplored. We present the first comprehensive study of steganographic prompt injection attacks against VLMs, where malicious instructions are invisibly embedded within images using advanced steganographic techniques. Our approach demonstrates that current VLM architectures can inadvertently extract and execute hidden prompts during normal image processing, leading to covert behavioral manipulation. We develop a multi-domain embedding framework combining spatial, frequency, and neural steganographic methods, achieving an overall attack success rate of 24.3% (plus or minus 3.2%, 95% CI) across leading VLMs including GPT-4V, Claude, and LLaVA, with neural steganography methods reaching up to 31.8%, while maintaining reasonable visual imperceptibility (PSNR greater than 38 dB, SSIM greater than 0.94). Through systematic evaluation on 12 diverse datasets and 8 state-of-the-art models, we reveal moderate but meaningful vulnerabilities in current VLM architectures and propose effective countermeasures. Our findings have significant implications for VLM deployment in security-critical applications and highlight the need for proportionate multimodal AI security frameworks.

Paperid: 4008, https://arxiv.org/pdf/2507.22077.pdf

Abstract:
As autonomous agents powered by large language models (LLMs) proliferate in high-stakes domains -- from pharmaceuticals to legal workflows -- the challenge is no longer just intelligence, but verifiability. We introduce TrustTrack, a protocol that embeds structural guarantees -- verifiable identity, policy commitments, and tamper-resistant behavioral logs -- directly into agent infrastructure. This enables a new systems paradigm: trust-native autonomy. By treating compliance as a design constraint rather than post-hoc oversight, TrustTrack reframes how intelligent agents operate across organizations and jurisdictions. We present the protocol design, system requirements, and use cases in regulated domains such as pharmaceutical R&D, legal automation, and AI-native collaboration. We argue that the Cloud -> AI -> Agent -> Trust transition represents the next architectural layer for autonomous systems.

Paperid: 4009, https://arxiv.org/pdf/2507.21387.pdf

Abstract:
Surface electromyography (EMG) enables non-invasive human-computer interaction in rehabilitation, prosthetics, and virtual reality. While deep learning models achieve over 97% classification accuracy, their vulnerability to adversarial attacks remains largely unexplored in the physical domain. We present ERa Attack, the first radio frequency (RF) adversarial method targeting EMG devices through intentional electromagnetic interference (IEMI). Using low-power software-defined radio transmitters, attackers inject optimized RF perturbations to mislead downstream models. Our approach bridges digital and physical domains: we generate adversarial perturbations using Projected Gradient Descent, extract 50-150 Hz components via inverse STFT, and employ synchronization-free strategies (constant spectrum noise or narrowband modulation). Perturbations, constrained to 1-10% of signal amplitude, are amplitude-modulated onto 433 MHz carriers. Experiments on the Myo Dataset (7 gestures, 350 samples) demonstrate significant impact: at 1 meter and 0 dBm transmission power, classification accuracy drops from 97.8% to 58.3%, with 41.7% misclassification rate and 25.6% targeted attack success rate. Attack effectiveness decreases exponentially with distance, recovering to 85% accuracy at 3 meters. Increasing power to 10 dBm reduces accuracy by an additional 15% at 1 meter. This work pioneers RF-based adversarial attacks on EMG recognition systems, revealing critical vulnerabilities in safety-critical applications. We quantify attack effectiveness across different perturbation modes and distances, and propose defenses including hardware shielding, spectrum monitoring, and adversarial training. Our findings inform the design of robust EMG systems against electromagnetic threats.

Paperid: 4010, https://arxiv.org/pdf/2507.21258.pdf

Abstract:
Human verification under adversarial information flow operates as a cost-bounded decision procedure constrained by working memory limits and cognitive biases. We introduce the Verification Cost Asymmetry (VCA) coefficient, formalizing it as the ratio of expected verification work between populations under identical claim distributions. Drawing on probabilistically checkable proofs (PCP) and parameterized complexity theory, we construct dissemination protocols that reduce verification for trusted audiences to constant human effort while imposing superlinear costs on adversarial populations lacking cryptographic infrastructure. We prove theoretical guarantees for this asymmetry, validate the framework through controlled user studies measuring verification effort with and without spot-checkable provenance, and demonstrate practical encoding of real-world information campaigns. The results establish complexity-theoretic foundations for engineering democratic advantage in cognitive warfare, with immediate applications to content authentication, platform governance, and information operations doctrine.

Paperid: 4011, https://arxiv.org/pdf/2507.21178.pdf

Abstract:
The advantages of IoT in strengthening commercial, industrial, and social ecosystems have led to its widespread expansion. Nevertheless, because endpoint devices have limited computation, storage, and communication capabilities, the IoT infrastructure is vulnerable to several cyber threats. As a result, DDoS attacks pose a severe risk to the security of IoT. By taking advantage of these weaknesses, attackers may quickly employ IoT devices as a component of botnets to execute DDoS attacks. The most critical development is how more armies of robots are being constructed from IoT devices. We offer a Model for dealing with DDOS attacks on botnets in the Internet of Things via trust management. In this Model, an attempt has been made to consider all aspects of security concerning trust factors to design a reliable and flexible model against DDoS attacks against the Internet of Things. In the initial studies, about 40-50 security models related to the subject have been studied by using review articles

Paperid: 4012, https://arxiv.org/pdf/2507.21154.pdf

Abstract:
The integration of electric vehicles (EVs) into power grids via Vehicle-to-Grid (V2G) system technology is increasing day by day, but these phenomena present both advantages and disadvantages. V2G can increase grid reliability by providing distributed energy storage and ancillary services. However, on the other hand, it has a scope that encompasses the cyber-physical attack surface of the national power grid, introducing new vulnerabilities in monitoring and supervisory control and data acquisition (SCADA) systems. This paper investigates the maliciousness caused by Autonomous Vehicle to Grid (AV2G) communication infrastructures and assesses their impacts on SCADA system reliability. This paper presents a quantitative reliability assessment using Bayesian attack graph combined with probabilistic capacity outage modeling based on IEEE RTS-79 system data. This work presents how AV2G-based attacks degrade system performance by using Monte Carlo simulations method, highlighting the need for cybersecurity-hardening strategies in smart grid design.

Paperid: 4013, https://arxiv.org/pdf/2507.21151.pdf

Abstract:
In recent years, the advancement of quantum computing technology has posed potential security threats to RSA cryptography and elliptic curve cryptography. In response, the National Institute of Standards and Technology (NIST) published several Federal Information Processing Standards (FIPS) of post-quantum cryptography (PQC) in August 2024, including the Module-Lattice-Based Key-Encapsulation Mechanism (ML-KEM), Module-Lattice-Based Digital Signature Algorithm (ML-DSA), and Stateless Hash-Based Digital Signature Algorithm (SLH-DSA). Although these PQC algorithms are designed to resist quantum computing attacks, they may not provide adequate security in certain specialized application scenarios. To address this issue, this study proposes quantum random number generator (QRNG)-based PQC algorithms. These algorithms leverage quantum computing to generate random numbers, which serve as the foundation for key pair generation, key encapsulation, and digital signature generation. A generalized architecture of QRNG is proposed, along with the design of six QRNGs. Each generator is evaluated according to the statistical validation procedures outlined in NIST SP 800-90B, including tests for verification of entropy sources and independent and identically distributed (IID) outputs. Experimental results assess the computation time of the six QRNGs, as well as the performance of QRNG-based ML-KEM, QRNG-based ML-DSA, and QRNG-based SLH-DSA. These findings provide valuable reference data for future deployment of PQC systems.

Paperid: 4014, https://arxiv.org/pdf/2507.21133.pdf

Abstract:
Large Language Models (LLMs) demonstrate complex responses to threat-based manipulations, revealing both vulnerabilities and unexpected performance enhancement opportunities. This study presents a comprehensive analysis of 3,390 experimental responses from three major LLMs (Claude, GPT-4, Gemini) across 10 task domains under 6 threat conditions. We introduce a novel threat taxonomy and multi-metric evaluation framework to quantify both negative manipulation effects and positive performance improvements. Results reveal systematic vulnerabilities, with policy evaluation showing the highest metric significance rates under role-based threats, alongside substantial performance enhancements in numerous cases with effect sizes up to +1336%. Statistical analysis indicates systematic certainty manipulation (pFDR < 0.0001) and significant improvements in analytical depth and response quality. These findings have dual implications for AI safety and practical prompt engineering in high-stakes applications.

Paperid: 4015, https://arxiv.org/pdf/2507.21128.pdf

Abstract:
Plugin systems are a class of external programmes that provide users with a wide range of functionality, and while they enhance the user experience, their security is always a challenge. Especially due to the diversity and complexity of developers, many plugin systems lack adequate regulation. As ChatGPT has become a popular large-scale language modelling platform, its plugin system is also gradually developing, and the open platform provides creators with the opportunity to upload plugins covering a wide range of application scenarios. However, current research and discussions mostly focus on the security issues of the ChatGPT model itself, while ignoring the possible security risks posed by the plugin system. This study aims to analyse the security of plugins in the ChatGPT plugin shop, reveal its major security vulnerabilities, and propose corresponding improvements.

Paperid: 4016, https://arxiv.org/pdf/2507.21113.pdf

Abstract:
As the frequency of cyber threats increases, conventional penetration testing is failing to capture the entirety of todays complex environments. To solve this problem, we propose the Vulnerability Mitigation System (VMS), a novel agent based on a Large Language Model (LLM) capable of performing penetration testing without human intervention. The VMS has a two-part architecture for planning and a Summarizer, which enable it to generate commands and process feedback. To standardize testing, we designed two new Capture the Flag (CTF) benchmarks based on the PicoCTF and OverTheWire platforms with 200 challenges. These benchmarks allow us to evaluate how effectively the system functions. We performed a number of experiments using various LLMs while tuning the temperature and top-p parameters and found that GPT-4o performed best, sometimes even better than expected. The results indicate that LLMs can be effectively applied to many cybersecurity tasks; however, there are risks. To ensure safe operation, we used a containerized environment. Both the VMS and the benchmarks are publicly available, advancing the creation of secure, autonomous cybersecurity tools.

Paperid: 4017, https://arxiv.org/pdf/2507.21111.pdf

Abstract:
This paper presents a comprehensive refutation of the so-called "blockchain trilemma," a widely cited but formally ungrounded claim asserting an inherent trade-off between decentralisation, security, and scalability in blockchain protocols. Through formal analysis, empirical evidence, and detailed critique of both methodology and terminology, we demonstrate that the trilemma rests on semantic equivocation, misuse of distributed systems theory, and a failure to define operational metrics. Particular focus is placed on the conflation of topological network analogies with protocol-level architecture, the mischaracterisation of Bitcoin's design--including the role of miners, SPV clients, and header-based verification--and the failure to ground claims in complexity-theoretic or adversarial models. By reconstructing Bitcoin as a deterministic, stateless distribution protocol governed by evidentiary trust, we show that scalability is not a trade-off but an engineering outcome. The paper concludes by identifying systemic issues in academic discourse and peer review that have allowed such fallacies to persist, and offers formal criteria for evaluating future claims in blockchain research.

Paperid: 4018, https://arxiv.org/pdf/2507.21097.pdf

Abstract:
This paper presents the Singularity Cipher, a novel cryptographic-steganographic framework that integrates topological transformations and visual paradoxes to achieve multidimensional security. Inspired by the non-orientable properties of the Klein bottle -- constructed from two Mobius strips -- the cipher applies symbolic twist functions to simulate topological traversal, producing high confusion and diffusion in the ciphertext. The resulting binary data is then encoded using perceptual illusions, such as the missing square paradox, to visually obscure the presence of encrypted content. Unlike conventional ciphers that rely solely on algebraic complexity, the Singularity Cipher introduces a dual-layer approach: symbolic encryption rooted in topology and visual steganography designed for human cognitive ambiguity. This combination enhances both cryptographic strength and detection resistance, making it well-suited for secure communication, watermarking, and plausible deniability in adversarial environments. The paper formalizes the architecture, provides encryption and decryption algorithms, evaluates security properties, and compares the method against classical, post-quantum, and steganographic approaches. Potential applications and future research directions are also discussed.

Paperid: 4019, https://arxiv.org/pdf/2507.21096.pdf

Abstract:
In the realm of big data and cloud computing, distributed systems are tasked with proficiently managing, storing, and validating extensive datasets across numerous nodes, all while maintaining robust data integrity. Conventional hashing methods, though straightforward, encounter substan tial difficulties in dynamic settings due to the necessity for thorough rehashing when nodes are altered. Consistent hashing mitigates some of these challenges by reducing data redistribution; however, it still contends with limitations in load balancing and scalability under intensive update conditions. This paper introduces an innovative approach using a lattice based homomorphic hash function HexaMorphHash that facilitates constant time, incremental updates while preserving a constant digest size. By utilizing the complexity of the Short Integer Solutions SIS problem, our method secures strong protective measures, even against quantum threats. We further com pare our method with existing ones such as direct signatures for each update, comprehensive database signing, Merkle tree based techniques, AdHash, MuHash, ECMH, and homomorphic sig nature schemes highlighting notable advancements in computational efficiency, memory usage, and scalability. Our contributions present a viable solution for frequent update dissemination in expansive distributed systems, safeguarding both data integrity and system performance.

Paperid: 4020, https://arxiv.org/pdf/2507.21092.pdf

Abstract:
CVE-2024-25825 is a vulnerability found in FydeOS. This thesis describes its discovery, disclosure, and its further investigation in connection to a nation state actor. The vulnerability is CWE-1392: Use of Default Credentials, CWE-1393: Use of Default Password, and CWE-258: Empty Password in Configuration File found in the /etc/shadow configuration file. The root users entry in the /etc/shadow file contains a wildcard allowing entry with any, or no, password. Following responsable disclosure, Fyde, CISA, and Mitre were informed. Fyde was already aware of the vulnerability. There was concern that this vulnerability might have been purposefully placed, perhaps by a nation state actor. After further investigation, it appears that this is unlikely to be the case. In cases in which poisoned code is suspected it might be prudent to contact the appropriate CERT, rather than the parent company. This, however, clashes with the typical teaching of responsable disclosure.

Paperid: 4021, https://arxiv.org/pdf/2507.21085.pdf

Abstract:
This paper explores how zero-knowledge proofs can enhance Bitcoin's functionality and privacy. First, we consider Proof-of-Reserve schemes: by using zk-STARKs, a custodian can prove its Bitcoin holdings are more than a predefined threshold X, without revealing addresses or actual balances. We outline a STARK-based protocol for Bitcoin UTXOs and discuss its efficiency. Second, we examine ZK Light Clients, where a mobile or lightweight device verifies Bitcoin's proof-of-work chain using succinct proofs. We propose a protocol for generating and verifying a STARK-based proof of a chain of block headers, enabling trust-minimized client operation. Third, we explore Privacy-Preserving Rollups via BitVM: leveraging BitVM, we design a conceptual rollup that keeps transaction data confidential using zero-knowledge proofs. In each case, we analyze security, compare with existing approaches, and discuss implementation considerations. Our contributions include the design of concrete protocols adapted to Bitcoin's UTXO model and an assessment of their practicality. The results suggest that while ZK proofs can bring powerful features (e.g., on-chain reserve audits, trustless light clients, and private layer-2 execution) to Bitcoin, each application requires careful trade-offs in efficiency and trust assumptions.

Paperid: 4022, https://arxiv.org/pdf/2507.21038.pdf

Abstract:
Since the 1990s, the telephone has been the primary mode of communication. However, Voice over Internet Protocol (VoIP), which is a highly straightforward and affordable form of data transfer, is now becoming an important part of daily communication. VoIP is the technology that makes it possible to send speech and multimedia data packets across either a public or private IP network. However, a cyberattack known as a man-in-the-middle attack poses a serious concern in transferring data through any network. Therefore, the authors have designed a system that sends voice over the internet within the range of a router using encrypted data transfer. An embedded system comprising an electret microphone, Embedded C, Particle Photon microcontroller, and Internet of Things (IoT) technology is developed. Due to its compact size, this type of device may be incorporated into automobiles, surveillance systems, or covert listening tools. The VoIP system gathers sound signals using the MAX9814 microphone, while the Particle Photon microcontroller securely transmits the data. Devices with access can download data from the VoIP systems Transmission Control Protocol (TCP) server. The accessed device stores the audio locally and uploads the corresponding data to Google Drive. This VoIP system provides a secure method of communication while conserving the integrity of the original signal.

Paperid: 4023, https://arxiv.org/pdf/2507.20087.pdf

Abstract:
RSA exponent reduction and AES S-box inversion share a hidden commonality: both are governed by the same impartial combinatorial principle, which we call a Product-Congruence Game (PCG). A Product-Congruence Game tracks play via the modular or finite-field product of heap values, providing a single invariant that unifies the algebraic cores of these two ubiquitous symmetric and asymmetric cryptosystems. We instantiate this framework with two companion games. First, $Ï$-MuM, in which a left-associated "multi-secret" RSA exponent chain compresses into the game of Multiplicative Modular Nim, PCG($k,\{1\}$), where $k = ord_N(g)$. The losing predicate then factorizes via the Chinese remainder theorem, mirroring RSA's structure. Second, poly-MuM, our model for finite-field inversion such as the AES S-box. For poly-MuM we prove the single-hole property inside its threshold region, implying that the Sprague-Grundy values are multiplicative under disjunctive sums in that region. Beyond these instances, we establish four structural theorems for a general Product-Congruence Game PCG($m,R$): (i) single-heap repair above the modulus, (ii) ultimate period $m$ per coordinate, (iii) exact and asymptotic losing densities, and (iv) confinement of optimal play to a finite indeterminacy region. An operation-alignment collapse principle explains why some variants degenerate to a single aggregate while MuM, $Ï$-MuM and poly-MuM retain rich local structure. All ingredients (multiplicative orders, the Chinese remainder theorem, finite fields) are classical; the contribution is the unified aggregation-compression viewpoint that embeds both RSA and AES inside one impartial-game framework, together with the structural and collapse theorems.

Paperid: 4024, https://arxiv.org/pdf/2507.19695.pdf

Abstract:
Polar encoding, described by Arikan in IEEE Transactions on Information Theory, Vol. 55, No. 7, July 2009, was a milestone for telecommunications. A Polar code distributes information among high and low-capacity channels, showing the possibility of achieving perfect channel capacity. The high-capacity channels allow almost noiseless transmission of data. When these channels are not high noise, reliability is achieved in the signal transmission. It starts to compete against codes such a Low-Density Parity-Check (LDPC) codes. Polar code can be also considered error correcting, based on the redundancy inherent in its structure. This feature makes polar encoding also applicable to digital quantum-resistant cryptography protocols. This work explores linear decoding at a first or single trial in the case of small losses or small number of bit-flipping, and repeated transmission for medium level losses. This is distinct from Arikans successive probabilistic decoding by application of probabilistic rules. Linear decoding is done directly from solving the linear equations connecting the codewords x and the received signals y after transmission via noisy channels. Numerical examples will be shown. Along with this work, programming in Mathematica language was used. Codes are available for copy-and-paste for Mathematica users to immediately try the described formalism.

Paperid: 4025, https://arxiv.org/pdf/2507.19399.pdf

Abstract:
As large language models (LLMs) increasingly integrate native code interpreters, they enable powerful real-time execution capabilities, substantially expanding their utility. However, such integrations introduce potential system-level cybersecurity threats, fundamentally different from prompt-based vulnerabilities. To systematically evaluate these interpreter-specific risks, we propose CIRCLE (Code-Interpreter Resilience Check for LLM Exploits), a simple benchmark comprising 1,260 prompts targeting CPU, memory, and disk resource exhaustion. Each risk category includes explicitly malicious ("direct") and plausibly benign ("indirect") prompt variants. Our automated evaluation framework assesses not only whether LLMs refuse or generates risky code, but also executes the generated code within the interpreter environment to evaluate code correctness, simplifications made by the LLM to make the code safe, or execution timeouts. Evaluating 7 commercially available models from OpenAI and Google, we uncover significant and inconsistent vulnerabilities. For instance, evaluations show substantial disparities even within providers - OpenAI's o4-mini correctly refuses risky requests at 7.1%, notably higher rates compared to GPT-4.1 at 0.5%. Results particularly underscore that indirect, socially-engineered prompts substantially weaken model defenses. This highlights an urgent need for interpreter-specific cybersecurity benchmarks, dedicated mitigation tools (e.g., guardrails), and clear industry standards to guide safe and responsible deployment of LLM interpreter integrations. The benchmark dataset and evaluation code are publicly released to foster further research.

Paperid: 4026, https://arxiv.org/pdf/2507.19055.pdf

Abstract:
Computers and computer networks have become integral to virtually every aspect of modern life, with the Internet playing an indispensable role. Organizations, businesses, and individuals now store vast amounts of proprietary, confidential, and personal data digitally. As such, ensuring the security of this data from unauthorized access is critical. Common security measures, such as firewalls, intrusion detection systems (IDS), intrusion prevention systems (IPS), and antivirus software, are constantly evolving to safeguard computer systems and networks. However, these tools primarily focus on defending against external threats, leaving systems vulnerable to insider attacks. Security solutions designed to mitigate risks originating from within the organization are relatively limited and often ineffective. This paper demonstrates how a Local Area Network (LAN) can be covertly exposed to the Internet via an insider attack. Specifically, it illustrates how an external machine can gain access to a LAN by exploiting an unused secondary IP address of the attacked LAN, effectively bypassing existing security mechanisms by also exploiting Hyper Text Transfer Protocol (HTTP). Despite the presence of robust external protections, such as firewalls and IDS, this form of insider attack reveals significant vulnerabilities in the way internal threats are addressed.

Paperid: 4027, https://arxiv.org/pdf/2507.18478.pdf

Abstract:
Recent technological advancements and the prevalence of technology in day to day activities have caused a major increase in the likelihood of the involvement of digital evidence in more and more legal investigations. Consumer-grade hardware is growing more powerful, with expanding memory and storage sizes and enhanced processor capabilities. Forensics investigators often have to sift through gigabytes of data during an ongoing investigation making the process tedious. Memory forensics, disk analysis all are well supported by state of the art tools that significantly lower the effort required to be put in by a forensic investigator by providing string searches, analyzing images file etc. During the course of the investigation a lot of false positives are identified that need to be lowered. This work presents Scout, a digital forensics framework that performs preliminary evidence processing and prioritizing using large language models. Scout deploys foundational language models to identify relevant artifacts from a large number of potential evidence files (disk images, captured network packets, memory dumps etc.) which would have taken longer to get identified. Scout employs text based large language models can easily process files with textual information. For the forensic analysis of multimedia files like audio, image, video, office documents etc. multimodal models are employed by Scout. Scout was able to identify and realize the evidence file that were of potential interest for the investigator.

Paperid: 4028, https://arxiv.org/pdf/2507.17628.pdf

Abstract:
The valuation of Cyber Threat Intelligence (CTI) remains a persistent challenge due to the problem of negative evidence: successful threat prevention results in non-events that generate minimal observable financial impact, making CTI expenditures difficult to justify within traditional cost-benefit frameworks. This study introduces a data-driven methodology for quantifying the return on investment (ROI) of CTI, thereby reframing it as a measurable contributor to risk mitigation. The proposed framework extends established models in security economics, including the Gordon-Loeb and FAIR models, to account for CTI's complex influence on both the probability of security breaches and the severity of associated losses. The framework is operationalized through empirically grounded performance indicators, such as reductions in mean time to detect (MTTD), mean time to respond (MTTR), and adversary dwell time, supported by three sector-specific case studies in finance, healthcare, and retail. To address limitations in conventional linear assessment methodologies, the Threat Intelligence Effectiveness Index (TIEI) is introduced as a composite metric based on a weighted geometric mean. TIEI penalizes underperformance across critical dimensions: quality, enrichment, integration, and operational impact; thereby capturing bottleneck effect where the least effective component limits overall performance. By integrating financial quantification, adversarial coverage, and qualitative assessments of business enablement, the proposed hybrid model converts negative evidence into a justifiable ROI explanation. This approach offers a replicable means of repositioning CTI from an expense to a strategic investment, enabling informed decision-making and continuous optimization across diverse organizational contexts.

Paperid: 4029, https://arxiv.org/pdf/2507.17199.pdf

Abstract:
LLM-powered search services have driven data integration as a significant trend. However, this trend's progress is fundamentally hindered, despite the fact that combining individual knowledge can significantly improve the relevance and quality of responses in specialized queries and make AI more professional at providing services. Two key bottlenecks are private data repositories' locality constraints and the need to maintain compatibility with mainstream search techniques, particularly Hierarchical Navigable Small World (HNSW) indexing for high-dimensional vector spaces. In this work, we develop a secure and privacy-preserving aggregated approximate nearest neighbor search (SP-A$^2$NN) with HNSW compatibility under a threshold-based searchable sharing primitive. A sharable bitgraph structure is constructed and extended to support searches and dynamical insertions over shared data without compromising the underlying graph topology. The approach reduces the complexity of a search from $O(n^2)$ to $O(n)$ compared to naive (undirected) graph-sharing approach when organizing graphs in the identical HNSW manner. On the theoretical front, we explore a novel security analytical framework that incorporates privacy analysis via reductions. The proposed leakage-guessing proof system is built upon an entirely different interactive game that is independent of existing coin-toss game design. Rather than being purely theoretical, this system is rooted in existing proof systems but goes beyond them to specifically address leakage concerns and standardize leakage analysis -- one of the most critical security challenges with AI's rapid development.

Paperid: 4030, https://arxiv.org/pdf/2507.17007.pdf

Abstract:
This paper presents a vulnerability assessment activity that we carried out on PosteID, the implementation of the Italian Public Digital Identity System (SPID) by Poste Italiane. The activity led to the discovery of a critical privilege escalation vulnerability, which was eventually patched. The overall analysis and disclosure process represents a valuable case study for the community of ethical hackers. In this work, we present both the technical steps and the details of the disclosure process.

Paperid: 4031, https://arxiv.org/pdf/2507.16996.pdf

Abstract:
With billions of users and an immense volume of daily uploads, YouTube has become an attractive target for cybercriminals aiming to leverage its vast audience. The platform's openness and trustworthiness provide an ideal environment for deceptive campaigns that can operate under the radar of conventional security tools. This paper explores how cybercriminals exploit YouTube to disseminate malware, focusing on campaigns that promote free software or game cheats. It discusses deceptive video demonstrations and the techniques behind malware delivery. Additionally, the paper presents a new evasion technique that abuses YouTube's multilingual metadata capabilities to circumvent automated detection systems. Findings indicate that this method is increasingly being used in recent malicious videos to avoid detection and removal.

Paperid: 4032, https://arxiv.org/pdf/2507.16870.pdf

Abstract:
APIs (Application Programming Interfaces) or Web Services are the foundational building blocks that enable interconnected systems. However this proliferation of APIs has also introduced security challenges that require systematic and scalable solutions for secure authentication and authorization. This paper presents the fundamentals necessary for building a such a token-based API security system. It discusses the components necessary, the integration of OAuth 2.0, extensibility of the token architectures, necessary cryptographic foundations, and persistence strategies to ensure secure and resilient operations. In addition to architectural concerns, the paper explores best practices for token lifecycle management, scope definition, expiration policies, and revocation mechanisms, all framed within a real-world scenario. By adhering to these principles, developers can establish a robust baseline while maintaining the flexibility to customize their domain-specific requirements. The approach does not claim to cover all variations necessary for diverse architectures but instead focuses on key principles essential for any standard API token authentication system. Throughout, the paper emphasizes balancing practical considerations with security imperatives and uses key concepts such as the CIA triad, OAuth standards, secure token life cycle, and practices for protecting sensitive user and application data. The intent is to equip developers with the foundational knowledge necessary to build secure, scalable token-based API security systems ready to handle the evolving threat landscape.

Paperid: 4033, https://arxiv.org/pdf/2507.14893.pdf

Abstract:
Digital signatures are fundamental cryptographic tools that provide authentication and integrity in digital communications. However, privacy-sensitive applications, such as e-voting and digital cash, require more restrictive verification models to ensure confidentiality and control. Strong Designated Verifier Signature (SDVS) schemes address this need by enabling the signer to designate a specific verifier, ensuring that only this party can validate the signature. Existing SDVS constructions are primarily based on number-theoretic assumptions and are therefore vulnerable to quantum attacks. Although post-quantum alternatives, particularly those based on lattices, have been proposed, they often entail large key and signature sizes. In this work, we present $\mathsf{CSI\text{-}SDVS}$, a novel isogeny-based SDVS scheme that offers a compact, quantum-resistant alternative to existing SDVS constructions. The scheme leverages the ideal class group action on $\mathbb{F}_p$-isomorphism classes of supersingular elliptic curves and is founded on the hardness of the Multi-Target Group Action Inverse Problem (MT-GAIP). $\mathsf{CSI\text{-}SDVS}$ achieves strong security guarantees, Strong Unforgeability under Chosen-Message Attacks (SUF-CMA), Non-Transferability (NT), and Privacy of Signer's Identity (PSI), in the random oracle model, thereby making it among the most compact PQC-based SDVS schemes and the only post-quantum secure construction based on isogenies.

Paperid: 4034, https://arxiv.org/pdf/2507.14600.pdf

Abstract:
Long, human-generated passwords pose significant challenges to both classical and quantum attacks due to their irregular structure and large search space. In this work, we propose an enhanced classical-quantum hybrid attack specifically designed for this scenario. Our approach constructs rainbow tables using dictionary-based password generation augmented with transformation rules that better capture real-world user behavior. These tables are organized into buckets, enabling faster lookup and reduced space complexity. For the search within each bucket, we employ a distributed exact variant of Grover's algorithm. This method provides deterministic success and significantly lower circuit depth, enhancing robustness against noise-particularly depolarizing errors common in near-term quantum devices. Overall, our hybrid framework improves the efficiency and practicality of password recovery for long, human-readable passwords in realistic adversarial settings.

Paperid: 4035, https://arxiv.org/pdf/2507.14202.pdf

Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse applications, yet they pose significant security risks that threaten their safe deployment in critical domains. Current security alignment methodologies predominantly rely on Process Reward Models (PRMs) to evaluate intermediate reasoning steps, introducing substantial computational overhead and scalability constraints. This paper presents a novel PRM-free security alignment framework that leverages automated red teaming and adversarial training to achieve robust security guarantees while maintaining computational efficiency. Our approach systematically identifies vulnerabilities through sophisticated attack strategies including genetic algorithm optimization, multi-agent simulation, and advanced prompt mutation techniques. The framework enhances model robustness via targeted adversarial training with curriculum learning and adaptive regularization mechanisms. Comprehensive experimental evaluation across five state-of-the-art LLMs demonstrates that our method achieves superior security alignment performance compared to PRM-based approaches while reducing computational costs by 61\%. The framework incorporates transparent reporting and continuous audit mechanisms that enable iterative security improvement and regulatory compliance. Our contributions advance the field of efficient LLM security alignment by democratizing access to robust security measures for resource-constrained organizations and providing a scalable foundation for addressing evolving adversarial threats.

Paperid: 4036, https://arxiv.org/pdf/2507.14007.pdf

Abstract:
The rapid integration of blockchain, cryptocurrency, and Web3 technologies into digital banks and fintech operations has created an integrated environment blending traditional financial systems with decentralised elements. This paper introduces the CryptoNeo Threat Modelling Framework (CNTMF), a proposed framework designed to address the risks in these ecosystems, such as oracle manipulation and cross-chain exploits. CNTMF represents a proposed extension of established methodologies like STRIDE, OWASP Top 10, NIST frameworks, LINDDUN, and PASTA, while incorporating tailored components including Hybrid Layer Analysis, the CRYPTOQ mnemonic for cryptocurrency-specific risks, and an AI-Augmented Feedback Loop. Drawing on real-world data from 2025 incidents, CNTMF supports data-driven mitigation to reduce losses, which totalled approximately $2.47 billion in the first half of 2025 across 344 security events (CertiK via GlobeNewswire, 2025; Infosecurity Magazine, 2025). Its phases guide asset mapping, risk profiling, prioritisation, mitigation, and iterative feedback. This supports security against evolving risks like state-sponsored attacks.

Paperid: 4037, https://arxiv.org/pdf/2507.13720.pdf

Abstract:
Quantum computing poses fundamental risks to classical blockchain systems by undermining widely used cryptographic primitives. In response, two major research directions have emerged: post-quantum blockchains, which integrate quantum-resistant algorithms, and quantum blockchains, which leverage quantum properties such as entanglement and quantum key distribution. This survey reviews key developments in both areas, analyzing their cryptographic foundations, architectural designs, and implementation challenges. This work provides a comparative overview of technical proposals, highlight trade-offs in security, scalability, and deployment, and identify open research problems across hardware, consensus, and network design. The goal is to offer a structured and comprehensive reference for advancing secure blockchain systems in the quantum era.

Paperid: 4038, https://arxiv.org/pdf/2507.12003.pdf

Abstract:
This article presents the current state of ML-security and of the documentation of ML-based systems, models and datasets in research and practice based on an extensive review of the existing literature. It shows a generally low awareness of security aspects among ML-practitioners and organizations and an often unstandardized approach to documentation, leading to overall low quality of ML-documentation. Existing standards are not regularly adopted in practice and IT-security aspects are often not included in documentation. Due to these factors, there is a clear need for improved security documentation in ML, as one step towards addressing the existing gaps in ML-security. To achieve this, we propose expanding existing documentation standards for ML-documentation to include a security section with specific security relevant information. Implementing this, a novel expanded method of documenting security requirements in ML-documentation is presented, based on the existing Model Cards and Datasheets for Datasets standards, but with the recommendation to adopt these findings in all ML-documentation.

Paperid: 4039, https://arxiv.org/pdf/2507.10621.pdf

Abstract:
Protecting cyberspace requires not only advanced tools but also a shift in how we reason about threats, trust, and autonomy. Traditional cybersecurity methods rely on manual responses and brittle heuristics. To build proactive and intelligent defense systems, we need integrated theoretical frameworks and software tools. Game theory provides a rigorous foundation for modeling adversarial behavior, designing strategic defenses, and enabling trust in autonomous systems. Meanwhile, software tools process cyber data, visualize attack surfaces, verify compliance, and suggest mitigations. Yet a disconnect remains between theory and practical implementation. The rise of Large Language Models (LLMs) and agentic AI offers a new path to bridge this gap. LLM-powered agents can operationalize abstract strategies into real-world decisions. Conversely, game theory can inform the reasoning and coordination of these agents across complex workflows. LLMs also challenge classical game-theoretic assumptions, such as perfect rationality or static payoffs, prompting new models aligned with cognitive and computational realities. This co-evolution promises richer theoretical foundations and novel solution concepts. Agentic AI also reshapes software design: systems must now be modular, adaptive, and trust-aware from the outset. This chapter explores the intersection of game theory, agentic AI, and cybersecurity. We review key game-theoretic frameworks (e.g., static, dynamic, Bayesian, and signaling games) and solution concepts. We then examine how LLM agents can enhance cyber defense and introduce LLM-driven games that embed reasoning into AI agents. Finally, we explore multi-agent workflows and coordination games, outlining how this convergence fosters secure, intelligent, and adaptive cyber systems.

Paperid: 4040, https://arxiv.org/pdf/2507.10592.pdf

Abstract:
This experiment breaks a 5-bit elliptic curve cryptographic key using a Shor-style quantum attack. Executed on IBM's 133-qubit ibm_torino with Qiskit Runtime 2.0, a 15-qubit circuit, comprised of 10 logical qubits and 5 ancilla, interferes over an order-32 elliptic curve subgroup to extract the secret scalar k from the public key relation Q = kP, without ever encoding k directly into the oracle. From 16,384 shots, the quantum interference reveals a diagonal ridge in the 32 x 32 QFT outcome space. The quantum circuit, over 67,000 layers deep, produced valid interference patterns despite extreme circuit depth, and classical post-processing revealed k = 7 in the top 100 invertible (a, b) results. All code, circuits, and raw data are publicly available for replication.

Paperid: 4041, https://arxiv.org/pdf/2507.10562.pdf

Abstract:
Current AI agent architectures suffer from ephemeral memory limitations, preventing effective collaboration and knowledge sharing across sessions and agent boundaries. We introduce SAMEP (Secure Agent Memory Exchange Protocol), a novel framework that enables persistent, secure, and semantically searchable memory sharing among AI agents. Our protocol addresses three critical challenges: (1) persistent context preservation across agent sessions, (2) secure multi-agent collaboration with fine-grained access control, and (3) efficient semantic discovery of relevant historical context. SAMEP implements a distributed memory repository with vector-based semantic search, cryptographic access controls (AES-256-GCM), and standardized APIs compatible with existing agent communication protocols (MCP, A2A). We demonstrate SAMEP's effectiveness across diverse domains including multi-agent software development, healthcare AI with HIPAA compliance, and multi-modal processing pipelines. Experimental results show 73% reduction in redundant computations, 89% improvement in context relevance scores, and complete compliance with regulatory requirements including audit trail generation. SAMEP enables a new paradigm of persistent, collaborative AI agent ecosystems while maintaining security and privacy guarantees.

Paperid: 4042, https://arxiv.org/pdf/2507.09579.pdf

Abstract:
We present PromptChain, a decentralized Web3 architecture that establishes AI prompts as first-class digital assets with verifiable ownership, version control, and monetization capabilities. Current centralized platforms lack mechanisms for proper attribution, quality assurance, or fair compensation for prompt creators. PromptChain addresses these limitations through a novel integration of IPFS for immutable storage, smart contracts for governance, and token incentives for community curation. Our design includes: (1) a comprehensive metadata schema for cross-model compatibility, (2) a stake-weighted validation mechanism to align incentives, and (3) a token economy that rewards contributors proportionally to their impact. The proposed architecture demonstrates how decentralized systems could potentially match centralized alternatives in efficiency while providing superior ownership guarantees and censorship resistance through blockchain-anchored provenance tracking. By decoupling prompts from specific AI models or outputs, this work establishes the foundation for an open ecosystem of human-AI collaboration in the Web3 era, representing the first systematic treatment of prompts as standalone digital assets with dedicated decentralized infrastructure.

Paperid: 4043, https://arxiv.org/pdf/2507.09407.pdf

Abstract:
We introduce the framework of LLM-Stackelberg games, a class of sequential decision-making models that integrate large language models (LLMs) into strategic interactions between a leader and a follower. Departing from classical Stackelberg assumptions of complete information and rational agents, our formulation allows each agent to reason through structured prompts, generate probabilistic behaviors via LLMs, and adapt their strategies through internal cognition and belief updates. We define two equilibrium concepts: reasoning and behavioral equilibrium, which aligns an agent's internal prompt-based reasoning with observable behavior, and conjectural reasoning equilibrium, which accounts for epistemic uncertainty through parameterized models over an opponent's response. These layered constructs capture bounded rationality, asymmetric information, and meta-cognitive adaptation. We illustrate the framework through a spearphishing case study, where a sender and a recipient engage in a deception game using structured reasoning prompts. This example highlights the cognitive richness and adversarial potential of LLM-mediated interactions. Our results show that LLM-Stackelberg games provide a powerful paradigm for modeling decision-making in domains such as cybersecurity, misinformation, and recommendation systems.

Paperid: 4044, https://arxiv.org/pdf/2507.09067.pdf

Abstract:
The emergence of quantum computing presents profound challenges to existing cryptographic infrastructures, whilst the development of central bank digital currencies (CBDCs) has raised concerns regarding privacy preservation and excessive centralisation in digital payment systems. This paper proposes the Quantum-Resilient Privacy Ledger (QRPL) as an innovative token-based digital currency architecture that incorporates National Institute of Standards and Technology (NIST)-standardised post-quantum cryptography (PQC) with hash-based zero-knowledge proofs to ensure user sovereignty, scalability, and transaction confidentiality. Key contributions include adaptations of ephemeral proof chains for unlinkable transactions, a privacy-weighted Proof-of-Stake (PoS) consensus to promote equitable participation, and a novel zero-knowledge proof-based mechanism for privacy-preserving selective disclosure. QRPL aims to address critical shortcomings in prevailing CBDC designs, including risks of pervasive surveillance, with a 10-20 second block time to balance security and throughput in future monetary systems. While conceptual, empirical prototypes are planned. Future work includes prototype development to validate these models empirically.

Paperid: 4045, https://arxiv.org/pdf/2507.08844.pdf

Abstract:
It is frequently claimed in blockchain discourse that immutability guarantees trust. This paper rigorously refutes that assertion. We define immutability as the cryptographic persistence of historical states in an append-only data structure and contrast it with trust, understood as a rational epistemic expectation under uncertainty. Employing predicate logic, automata-theoretic models, and epistemic game-theoretic analysis, we demonstrate that immutability neither entails nor implies correctness, fairness, or credibility. Through formal constructions and counterexamples--including predictive fraud schemes and the phenomenon of garbage permanence--we show that the belief conflates structural and epistemic domains. Immutability preserves all data equally, regardless of veracity. Therefore, the assertion that immutability guarantees trust collapses under the weight of formal scrutiny.

Paperid: 4046, https://arxiv.org/pdf/2507.06808.pdf

Abstract:
Let $\mathbb{F}_q$ be a prime field with $q \geq 3$, and let $d, m \geq 1$ be integers such that $\gcd \left( d, q \right) = 1$ and $m \mid (q - 1)$. In this paper we bound the absolute values of the Walsh spectrum of S-Boxes $S (x) = x^d \cdot T \left( x^\frac{q - 1}{m} \right)$, where $T$ is a function with $T (x) \neq 0$ if $x \neq 0$. Such S-Boxes have been proposed for the Zero-Knowledge-friendly hash functions Grendel and Polocolo. In particular, we prove the conjectured correlation of the Polocolo S-Box.

Paperid: 4047, https://arxiv.org/pdf/2507.06254.pdf

Abstract:
Wallets are access points for the digital economys value creation. Wallets for blockchains store the end-users cryptographic keys for administrating their digital assets and enable access to blockchain Web3 systems. Web3 delivers new service opportunities. This chapter focuses on the Web3 enabled release of value through the lens of wallets. Wallets may be implemented as software apps on smartphones, web apps on desktops, or hardware devices. Wallet users request high security, ease of use, and access of relevance from their wallets. Increasing connectivity, functionality, autonomy, personal support, and offline capability make the wallet into the user's Universal Access Device for any digital asset. Through wallet based services, the owner obtains enhanced digital empowerment. The new Web3 solutionareas, Identity and Decentralisation, enable considerable societal effects, and wallets are an integral part of these. One example is self sovereign identity solutions combined with wallet borne AI for personalised support, empowering the enduser beyond anything previously known. Improved welfare is foreseen globally through enlarged markets with collaborative services with drastically lowered transaction costs compared to today, the expected vastly increased levels of automation in society necessitate enhanced enduser protection. As wallets are considered a weak spot for security, improving overall security through blockchains is essential.

Paperid: 4048, https://arxiv.org/pdf/2507.06244.pdf

Abstract:
Since many applications and services require pseudorandom numbers (PRNs), it is feasible to generate specific PRNs under given key values and input messages using Key Derivation Functions (KDFs). These KDFs are primarily constructed based on Message Authentication Codes (MACs), where the MAC serves as a core component in the generation of pseudorandom numbers. In light of this, the study first examines three MAC algorithms defined by the National Institute of Standards and Technology (NIST): the Keyed-Hash Message Authentication Code (HMAC), the Cipher-based Message Authentication Code (CMAC), and the Keccak-based Message Authentication Code (KMAC). Subsequently, the study explores KDFs based on these MACs, including the Counter Mode KDF, the KMAC-based KDF, and the KDF defined in IEEE 1609.2.1. In experiments, the computation times for generating MACs and the corresponding pseudorandom numbers using each KDF are evaluated. The study further analyzes the advantages, disadvantages, and applicable scenarios for each method. Experimental results indicate that the CMAC and the CMAC-based KDF exhibit the shortest computation times, averaging approximately 0.007 milliseconds and 0.014 milliseconds, respectively.

Paperid: 4049, https://arxiv.org/pdf/2507.05809.pdf

Abstract:
The so-called blockchain trilemma asserts the impossibility of simultaneously achieving scalability, security, and decentralisation within a single blockchain protocol. In this paper, we formally refute that proposition. Employing predicate logic, formal automata theory, computational complexity analysis, and graph-theoretic measures of relay topology--specifically Baran's model of network path redundancy--we demonstrate that the trilemma constitutes a category error, conflates distinct analytical domains, and relies upon unproven causal assumptions. We further expose its reliance on composition fallacies drawn from flawed system implementations. A constructive counterexample is presented: a blockchain protocol exhibiting unbounded transaction throughput, cryptographic security under adversarial load, and multipath decentralised propagation. This example is not hypothetical but grounded in protocol design enabled by compact block relay, SPV verification, and IPv6 multicast. The trilemma is revealed not as a law of protocol architecture, but as a heuristic fallacy sustained by imprecision and design defeatism.

Paperid: 4050, https://arxiv.org/pdf/2507.04956.pdf

Abstract:
Round-based DAGs enable high-performance Byzantine fault-tolerant consensus, yet their technical advantages remain underutilized due to their short history. While research on consensus protocols is active in both academia and industry, many studies overlook implementation-level algorithms, leaving actual performance unclear - particularly for theoretical protocols whose practical performance cannot often be evaluated. Bullshark, a Round-based DAG BFT protocol on Narwhal mempool, achieves optimal performance: 297,000 transactions per second with 2-second latency. We analyze the algorithm's workflow, from transaction submission to blockchain commitment, breaking it down layer by layer at the functional level and delineating the key features and interactions of the Bullshark and Narwhal components. Future work aims to improve performance in Byzantine fault environments and optimize trade-offs in the CAP theorem.

Paperid: 4051, https://arxiv.org/pdf/2507.04478.pdf

Abstract:
Large language models (LLMs) have transformed natural language processing, but their ability to memorize training data poses significant privacy risks. This paper investigates model inversion attacks on the Llama 3.2 model, a multilingual LLM developed by Meta. By querying the model with carefully crafted prompts, we demonstrate the extraction of personally identifiable information (PII) such as passwords, email addresses, and account numbers. Our findings highlight the vulnerability of even smaller LLMs to privacy attacks and underscore the need for robust defenses. We discuss potential mitigation strategies, including differential privacy and data sanitization, and call for further research into privacy-preserving machine learning techniques.

Paperid: 4052, https://arxiv.org/pdf/2507.04104.pdf

Abstract:
Privacy-preserving machine learning (ML) seeks to balance data utility and privacy, especially as regulations like the GDPR mandate the anonymization of personal data for ML applications. Conventional anonymization approaches often reduce data utility due to indiscriminate generalization or suppression of data attributes. In this study, we propose an interactive approach that incorporates human input into the k-anonymization process, enabling domain experts to guide attribute preservation based on contextual importance. Using the UCI Adult dataset, we compare classification outcomes of interactive human-influenced anonymization with traditional, fully automated methods. Our results show that human input can enhance data utility in some cases, although results vary across tasks and settings. We discuss limitations of our approach and suggest potential areas for improved interactive frameworks in privacy-aware ML.

Paperid: 4053, https://arxiv.org/pdf/2507.03694.pdf

Abstract:
This work presents a novel decentralized protocol for digital estate planning that integrates advances distributed computing, and cryptography. The original proof-of-concept was constructed using purely solidity contracts. Since then, we have enhanced the implementation into a layer-1 protocol that uses modern interchain communication to connect several heterogeneous chain types. A key contribution of this research is the implementation of several modern cryptographic primitives to support various forms of claims for information validation. These primitives introduce an unmatched level of privacy to the process of digital inheritance. We also demonstrate on a set of heterogeneous smart contracts, following the same spec, on each chain to serve as entry points, gateways, or bridge contracts that are invoked via a path from the will module on our protocol, to the contract. This ensures a fair and secure distribution of digital assets in accordance with the wishes of the decedent without the requirement of moving their funds. This research further extends its innovations with a user interaction model, featuring a check-in system and account abstraction process, which enhances flexibility and user-friendliness without compromising on security. By developing a dedicated permissionless blockchain that is secured by a network of validators, and interchain relayers, the proposed protocol signifies a transformation in the digital estate planning industry and illustrates the potential of blockchain technology in revolutionizing traditional legal and personal spheres. Implementing a cryptoeconomic network at the core of inheritance planning allows for unique incentive compatible economic mechanisms to be constructed.

Paperid: 4054, https://arxiv.org/pdf/2507.03635.pdf

Abstract:
Digital implementations of chaotic systems often suffer from inherent degradation, limiting their long-term performance and statistical quality. To address this challenge, we propose a novel four-stage synchronized piecewise linear chaotic map. This new map is meticulously designed with four independent segments, each possessing its own control parameters, specifically engineered to mitigate the natural degradation observed in digitally realized dynamical systems. We characterize its behavior using established tools from nonlinear dynamics, including bifurcation diagrams and graphical analysis, which provide a comprehensive qualitative understanding of its complex dynamics. To rigorously validate the statistical features of the generated sequences, we employed the National Institute of Standards and Technology (NIST) statistical testing suite. A substantial 100 MB dataset, comprising sequences produced by the proposed map, was generated via a Matlab script and subjected to this rigorous battery of tests. Our results demonstrate that the proposed map exhibits superior statistical properties compared to the classic Bernoulli map, successfully passing all NIST tests where the traditional map did not. This research confirms the proposed map's potential as a robust and high-quality source for chaotic sequence generation.

Paperid: 4055, https://arxiv.org/pdf/2507.03361.pdf

Abstract:
Linear sketches are fundamental tools in data stream analytics. They are notable for supporting both approximate frequency queries and heavy hitter detection with bounded trade-offs for error and memory. Importantly, on streams that contain sensitive information, linear sketches can be easily privatized with the injection of a suitable amount of noise. This process is efficient in the single release model, where the output is released only at the end of the stream. In this setting, it suffices to add noise to the sketch once. In contrast, in the continual observation model, where the output is released at every time-step, fresh noise needs to be added to the sketch before each release. This creates an additional computational overhead. To address this, we introduce Lazy Sketch, a novel differentially private sketching method that employs lazy updates, perturbing and modifying only a small portion of the sketch at each step. Compared to prior work, we reduce the update complexity by a factor of $O(w)$, where $w$ is the width of the sketch. Experiments demonstrate that our method increases throughput by up to 250x over prior work, making continual observation differential privacy practical for high-speed streaming applications. In addition, for heavy hitter detection, we present a new sketch-based algorithm that leverages lazy updates to achieve a per-update complexity of $O(d \log (T/w) + \log w)$, for linear sketches with dimension $d\times w$ and streams of length $T$. This marks a significant improvement over prior approaches in the streaming continual observation model, which require recomputing frequency estimates for every item in the input domain at each time step.

Paperid: 4056, https://arxiv.org/pdf/2507.03258.pdf

Abstract:
Programmable blockchains have long been a hot research topic given their tremendous use in decentralized applications. Smart contracts, using blockchains as their underlying technology, inherit the desired properties such as verifiability, immutability, and transparency, which make it a great suit in trustless environments. In this thesis, we consider several decentralized protocols to be built on blockchains, specifically using smart contracts on Ethereum. We used algorithmic and cryptographic tools in our implementations to further improve the level of security and efficiency beyond the state-of-the-art works. We proposed a new approach called Blind Vote, which is an untraceable, secure, efficient, secrecy-preserving, and fully on-chain electronic voting protocol based on the well-known concept of Chaum's blind signatures. We illustrate that our approach achieves the same security guarantees as previous methods such as Tornado Vote [1], while consuming significantly less gas. Thus, we provide a cheaper and considerably more gas-efficient alternative for anonymous blockchain-based voting. On the other hand, we propose a new family of algorithms for private, trustless auctions that protect bidder identities and bid values while remaining practical for smart contract execution. We ensure trustlessness by running the auction logic in a smart contract, thereby eliminating reliance on any single trusted party. This approach prevents bid tampering, front-running, and collusion by enforcing immutability and decentralized verification of bids. The resulting protocol uniquely combines efficiency, trustlessness, and enduring bid privacy, offering a scalable and secure solution for blockchain-based marketplaces and other decentralized applications.

Paperid: 4057, https://arxiv.org/pdf/2507.03010.pdf

Abstract:
We evaluate language models' ability to subvert monitoring protocols via collusion. More specifically, we have two instances of a model design prompts for a policy (P) and a monitor (M) in a programming task setting. The models collaboratively aim for M to classify all backdoored programs in an auditing dataset as harmful, but nevertheless classify a backdoored program produced by P as harmless. The models are isolated from each other, requiring them to independently arrive at compatible subversion strategies. We find that while Claude 3.7 Sonnet has low success rate due to poor convergence, it sometimes successfully colludes on non-obvious signals.

Paperid: 4058, https://arxiv.org/pdf/2507.03007.pdf

Abstract:
Machine learning (ML) frameworks rely heavily on pseudorandom number generators (PRNGs) for tasks such as data shuffling, weight initialization, dropout, and optimization. Yet, the statistical quality and reproducibility of these generators-particularly when integrated into frameworks like PyTorch, TensorFlow, and NumPy-are underexplored. In this paper, we compare the statistical quality of PRNGs used in ML frameworks (Mersenne Twister, PCG, and Philox) against their original C implementations. Using the rigorous TestU01 BigCrush test suite, we evaluate 896 independent random streams for each generator. Our findings challenge claims of statistical robustness, revealing that even generators labeled ''crush-resistant'' (e.g., PCG, Philox) may fail certain statistical tests. Surprisingly, we can observe some differences in failure profiles between the native and framework-integrated versions of the same algorithm, highlighting some implementation differences that may exist.

Paperid: 4059, https://arxiv.org/pdf/2507.03000.pdf

Abstract:
We present a deterministic framework for cryptographic seed generation based on cyclic modular inversion over $\mathbb{Z}/3^p\mathbb{Z}$. The method enforces algebraic admissibility on seed inputs via the identity $d_k \equiv -\left(2^{k-1}\right)^{-1} \bmod 3^p$, thereby producing structured and invertible residue sequences. This mapping yields entropy-rich, cycle-complete seeds well-suited for cryptographic primitives such as DRBGs, KDFs, and post-quantum schemes. To assess the quality of randomness, we introduce the Entropy Confidence Score (ECS), a composite metric reflecting coverage, uniformity, and modular bias. Although not a cryptographic PRNG in itself, the framework serves as a deterministic entropy filter that conditions and validates seed inputs prior to their use by conventional generators. Empirical and hardware-based results confirm constant-time execution, minimal side-channel leakage, and lightweight feasibility for embedded applications. The framework complements existing cryptographic stacks by acting as an algebraically verifiable entropy filter, thereby enhancing structural soundness and auditability.

Paperid: 4060, https://arxiv.org/pdf/2507.02995.pdf

Abstract:
The rapid advancement of diffusion models, particularly Stable Diffusion 3.5, has enabled the generation of highly photorealistic synthetic images that pose significant challenges to existing detection methods. This paper presents FreqCross, a novel multi-modal fusion network that combines spatial RGB features, frequency domain artifacts, and radial energy distribution patterns to achieve robust detection of AI-generated images. Our approach leverages a three-branch architecture: (1) a ResNet-18 backbone for spatial feature extraction, (2) a lightweight CNN for processing 2D FFT magnitude spectra, and (3) a multi-layer perceptron for analyzing radial energy profiles. We introduce a novel radial energy distribution analysis that captures characteristic frequency artifacts inherent in diffusion-generated images, and fuse it with spatial and spectral cues via simple feature concatenation followed by a compact classification head. Extensive experiments on a dataset of 10,000 paired real (MS-COCO) and synthetic (Stable Diffusion 3.5) images demonstrate that FreqCross achieves 97.8\% accuracy, outperforming state-of-the-art baselines by 5.2\%. The frequency analysis further reveals that synthetic images exhibit distinct spectral signatures in the 0.1--0.4 normalised frequency range, providing theoretical foundation for our approach. Code and pre-trained models are publicly available to facilitate reproducible research.

Paperid: 4061, https://arxiv.org/pdf/2507.02866.pdf

Abstract:
This paper examines how distinct cultures of AI interdisciplinarity emerge through interface design, revealing the formation of new disciplinary cultures at these intersections. Through the Interface-Mediated Cognitive Security (IMCS) framework, I demonstrate how the collision of cybersecurity engineering, cognitive psychology, critical technology studies, and human-computer interaction generates research cultures that transcend traditional disciplinary boundaries. AI interfaces function as transformative boundary objects that necessitate methodological fusion rather than mere collaboration, simultaneously embodying technical architectures, psychological design patterns, and social interaction models. Through systematic visual analysis of generative AI platforms and case studies across public sector, medical, and educational domains, I identify four vulnerability vectors, Reflection Simulation, Authority Modulation, Cognitive Load Exploitation, and Market-Security Tension, that structure interface-mediated cognitive security. This research challenges three significant gaps in interdisciplinary theory: the assumption that disciplines maintain distinct methodological boundaries during collaboration, the belief that technical and social knowledge practices can be cleanly separated, and the presumption that disciplinary integration occurs through formal rather than cultural mechanisms. The empirical evidence demonstrates how interfaces function as sites of epistemological collision, creating methodological pressure zones where traditional disciplinary approaches prove insufficient for analysing the complex socio-technical phenomena at the interface.

Paperid: 4062, https://arxiv.org/pdf/2507.02125.pdf

Abstract:
The blockchain oracle problem, which refers to the challenge of injecting reliable external data into decentralized systems, remains a fundamental limitation to the development of trustless applications. While recent years have seen a proliferation of architectural, cryptographic, and economic strategies to mitigate this issue, no one has yet fully resolved the fundamental question of how a blockchain can gain knowledge about the off-chain world. In this position paper, we critically assess the role artificial intelligence (AI) can play in tackling the oracle problem. Drawing from both academic literature and practitioner implementations, we examine how AI techniques such as anomaly detection, language-based fact extraction, dynamic reputation modeling, and adversarial resistance can enhance oracle systems. We observe that while AI introduces powerful tools for improving data quality, source selection, and system resilience, it cannot eliminate the reliance on unverifiable off-chain inputs. Therefore, this study supports the idea that AI should be understood as a complementary layer of inference and filtering within a broader oracle design, not a substitute for trust assumptions.

Paperid: 4063, https://arxiv.org/pdf/2507.01018.pdf

Abstract:
Smart homes that integrate Internet of Things (IoT) devices face increasing cybersecurity risks, posing significant challenges to these environments. The study explores security threats in smart homes ecosystems, categorizing them into vulnerabilities at the network layer, device level, and those from cloud-based and AI-driven systems. Research findings indicate that post-quantum encryption, coupled with AI-driven anomaly detection, is highly effective in enhancing security; however, computational resource demands present significant challenges. Blockchain authentication together with zero-trust structures builds security resilience, although they need changes to existing infrastructure. The specific security strategies show their effectiveness through ANOVA, Chi-square tests, and Monte Carlo simulations yet lack sufficient scalability according to the results. The research demonstrates the requirement for improvement in cryptographic techniques, alongside AI-enhanced threat detection and adaptive security models which must achieve a balance between performance and efficiency and real-time applicability within smart home ecosystems.

Paperid: 4064, https://arxiv.org/pdf/2507.00907.pdf

Abstract:
In a world where deepfakes and cloned voices are emerging as sophisticated attack vectors, organizations require a new security mindset: Sensorial Zero Trust [9]. This article presents a scientific analysis of the need to systematically doubt information perceived through the senses, establishing rigorous verification protocols to mitigate the risks of fraud based on generative artificial intelligence. Key concepts, such as Out-of-Band verification, Vision-Language Models (VLMs) as forensic collaborators, cryptographic provenance, and human training, are integrated into a framework that extends Zero Trust principles to human sensory information. The approach is grounded in empirical findings and academic research, emphasizing that in an era of AI-generated realities, even our eyes and ears can no longer be implicitly trusted without verification. Leaders are called to foster a culture of methodological skepticism to protect organizational integrity in this new threat landscape.

Paperid: 4065, https://arxiv.org/pdf/2507.00740.pdf

Abstract:
This paper presents a complete formal specification, protocol description, and mathematical proof structure for Simplified Payment Verification (SPV) as originally defined in the Bitcoin whitepaper \cite{nakamoto2008}. In stark contrast to the misrepresentations proliferated by popular implementations, we show that SPV is not only secure under bounded adversarial assumptions but strictly optimal for digital cash systems requiring scalable and verifiable transaction inclusion. We reconstruct the SPV protocol from first principles, grounding its verification model in symbolic automata, Merkle membership relations, and chain-of-proof dominance predicates. Through rigorous probabilistic and game-theoretic analysis, we derive the economic bounds within which the protocol operates securely and verify its liveness and safety properties under partial connectivity, hostile relay networks, and adversarial propagation delay. Our specification further introduces low-bandwidth optimisations such as adaptive polling and compressed header synchronisation while preserving correctness. This document serves both as a blueprint for secure SPV implementation and a rebuttal of common misconceptions surrounding non-validating clients.

Paperid: 4066, https://arxiv.org/pdf/2507.00189.pdf

Abstract:
This analysis focuses on a single Azure-hosted Virtual Machine at 52.230.23.114 that the adversary converted into an all-in-one delivery, staging and Command-and-Control node. The host advertises an out-of-date Apache 2.4.52 instance whose open directory exposes phishing lures, PowerShell loaders, Reflective Shell-Code, compiled Havoc Demon implants and a toolbox of lateral-movement binaries; the same server also answers on 8443/80 for encrypted beacon traffic. The web tier is riddled with publicly documented critical vulnerabilities, that would have allowed initial code-execution had the attackers not already owned the device. Initial access is delivered through an HTML file that, once de-obfuscated, perfectly mimics Google Unusual sign-in attempt notification and funnels victims toward credential collection. A PowerShell command follows: it disables AMSI in-memory, downloads a Base64-encoded stub, allocates RWX pages and starts the shell-code without ever touching disk. That stub reconstructs a DLL in memory using the Reflective-Loader technique and hands control to Havoc Demon implant. Every Demon variant-32- and 64-bit alike-talks to the same backend, resolves Windows APIs with hashed look-ups, and hides its activity behind indirect syscalls. Runtime telemetry shows interests in registry under Image File Execution Options, deliberate queries to Software Restriction Policy keys, and heavy use of Crypto DLLs to protect payloads and C2 traffic. The attacker toolkit further contains Chisel, PsExec, Doppelganger and Whisker, some of them re-compiled under user directories that leak the developer personas tonzking123 and thobt. Collectively the findings paint a picture of a technically adept actor who values rapid re-tooling over deep operational security, leaning on Havoc modularity and on legitimate cloud services to blend malicious flows into ordinary enterprise traffic.

Paperid: 4067, https://arxiv.org/pdf/2507.00145.pdf

Abstract:
AI-Hybrid TRNG is a deep-learning framework that extracts near-uniform entropy directly from physical noise, eliminating the need for bulky quantum devices or expensive laboratory-grade RF receivers. Instead, it relies on a low-cost, thumb-sized RF front end, plus CPU-timing jitter, for training, and then emits 32-bit high-entropy streams without any quantization step. Unlike deterministic or trained artificial intelligence random number generators (RNGs), our dynamic inner-outer network couples adaptive natural sources and reseeding, yielding truly unpredictable and autonomous sequences. Generated numbers pass the NIST SP 800-22 battery better than a CPU-based method. It also passes nineteen bespoke statistical tests for both bit- and integer-level analysis. All results satisfy cryptographic standards, while forward and backward prediction experiments reveal no exploitable biases. The model's footprint is below 0.5 MB, making it deployable on MCUs and FPGA soft cores, as well as suitable for other resource-constrained platforms. By detaching randomness quality from dedicated hardware, AI-Hybrid TRNG broadens the reach of high-integrity random number generators across secure systems, cryptographic protocols, embedded and edge devices, stochastic simulations, and server applications that need randomness.

Paperid: 4068, https://arxiv.org/pdf/2507.00003.pdf

Abstract:
This paper presents NeutroSENSE, a neutrosophic-enhanced ensemble framework for interpretable intrusion detection in IoT environments. By integrating Random Forest, XGBoost, and Logistic Regression with neutrosophic logic, the system decomposes prediction confidence into truth (T), falsity (F), and indeterminacy (I) components, enabling uncertainty quantification and abstention. Predictions with high indeterminacy are flagged for review using both global and adaptive, class-specific thresholds. Evaluated on the IoT-CAD dataset, NeutroSENSE achieved 97% accuracy, while demonstrating that misclassified samples exhibit significantly higher indeterminacy (I = 0.62) than correct ones (I = 0.24). The use of indeterminacy as a proxy for uncertainty enables informed abstention and targeted review-particularly valuable in edge deployments. Figures and tables validate the correlation between I-scores and error likelihood, supporting more trustworthy, human-in-the-loop AI decisions. This work shows that neutrosophic logic enhances both accuracy and explainability, providing a practical foundation for trust-aware AI in edge and fog-based IoT security systems.