Aligning large vision-language models (LVLMs) with human preferences is challenging because fine-grained, high-quality multimodal preference data are scarce when human annotation is unavailable. Existing methods that rely on direct distillation often struggle with low-confidence data, leading to suboptimal performance. To address this, we propose CAREVL, a novel method for preference reward modeling that reliably uses both high- and low-confidence data. First, a cluster of auxiliary expert models (textual reward models) leverages image captions as weak supervision signals to filter high-confidence data, which are then used to fine-tune the LVLM. Second, the fine-tuned LVLM generates diverse preference samples from the low-confidence data; these samples are scored and selected to construct reliable chosen-rejected pairs for further training. CAREVL achieves performance improvements over traditional distillation-based methods on the VL-RewardBench and MLLM-as-a-Judge benchmarks, demonstrating its effectiveness. The code will be released soon.
Paperid: 660,
Authors: Bocheng Pan, Hailong Shi, Xingyu Gao
Title: DR-VQA: Decompose-then-Reconstruct for Visual Question Answering in BLV Assistance
Paperid: 661,
Authors: Qian Sun, Chengzhuo Lu, Wenyu Chen, Wenjie Wei, Jingya Wang, Jieyuan Zhang, Xiaoli Liu, Yalan Ye, Yang Yang, Malu Zhang
Title: Temporal-coded Spiking Transformer
Paperid: 662,
Authors: Yuhang Lan, Shilin Xu, Chao Su, Run Ye, Dezhong Peng, Yuan Sun
Title: Multi-view Hashing Classification
Paperid: 663,
Authors: Junwei Zhu, Wei Li, Honghui Xu, Jiawei Jiang, Zhi Liu, Jianwei Zheng
Title: Arbitrary-scale Fusion Neural Operator
Paperid: 664,
Authors: Jiawei Zhang, Xiaoli Jiang, Hao Wang, Lin Yuan, Xiangyang Luo, Bin Ma, Jinwei Wang
Title: DVW: Diffusion Visible Watermark
Paperid: 665,
Authors: Yifan Hu, Rui Liu, Yi Ren, Xiang Yin, Haizhou Li
Title: UniTalker: Conversational Speech-Visual Synthesis
Paperid: 666,
Authors: Cheng Peng, Oya Celiktutan
Title: Multi-Task Gaze Communication Understanding
Paperid: 667,
Authors: Yingbing Liu, Fei Ma, Yanan Wu, Xinxin Zuo, Fan Zhang, Yang Wang
Title: Collaborative Cloud-edge Generalized Category Discovery
Paperid: 668,
Authors: Na Zhao, Kejiang Chen, Yuang Qi, Kai Zeng, Weiming Zhang, Nenghai Yu
Title: Merging-Resistant Watermarking for LoRA Modules
Paperid: 669,
Authors: Quanmin Liang, Jinyi Lu, Qiang Li, Shuai Liu, Zhihao Zhao, Yinzheng Zhao, Wei Zhang, Kai Huang, Yonghong Tian
Title: ESOD: Event-Based Small Object Detection
Paperid: 670,
Authors: Zhihong Zheng, Yang Cao, Junlong Gao, Hanzi Wang
Title: OV-VOD: Open-Vocabulary Video Object Detection
Paperid: 671,
Authors: Ziyu Wang, Yiming Du, Rui Ning, Lusi Li
Title: Energy-based Deep Incomplete Multi-View Clustering
Paperid: 672,
Authors: Kuan Liu, Ke Wang, Ji Zhang, Gang Zhou
Title: LLM-Grounded Diffusion for Cross-Domain Recommendation
Paperid: 673,
Authors: Ruocheng Gu, Sen Jia, Yule Ma, Jinqin Zhong, Jenq-Neng Hwang, Lei Li
Title: MoCount: Motion-Based Repetitive Action Counting
Paperid: 674,
Authors: Xin Zhang, Weiying Xie, Yunsong Li, Xiaoyu Chen, Tianlin Hui, Jitao Ma, Leyuan Fang
Title: TF-ATM: Training-Free Adaptive Token Merging
Paperid: 675,
Authors: Zebing Yao, Hao Fu, Yuanhang Yang, Guanghua Gu
Title: Dynamic Optimization Noisy Cross-Modal Hashing
Paperid: 676,
Authors: LiyuanCao LiyuanCao, ZiHang Guo, Huaiwen Zhang
Title: Event Consistency-aware Robust Fake News Detection
Paperid: 677,
Authors: Wending Xiong, Ruimin Hu, Lingfei Ren, Xixi Li, Dengshi Li
Title: SE2E: Recognizing Emotion Behind Societal Behavior
Paperid: 678,
Authors: Bingqian Zhou, Zhihao Wu, Yushi Cheng, Wenyuan Xu
Title: AdvPainting: Clean-text Jailbreaking Against Inpainting Models
Paperid: 679,
Authors: Shubo Liu, Hongsheng Zhang, Qian Qiao, Qi Wu, PENG WANG
Title: VLN-ChEnv: Vision-language Navigation in Changeable Environments
Paperid: 680,
Authors: Kai Niu, Liucun Shi, Ke Han, Qinzi Zhao, Yue Wu, Yanning Zhang
Title: Test-Time Adaptation for Text-Based Person Search
Paperid: 681,
Authors: Xiaofeng Liu, Guanchen Meng, Chongyang Feng, Risheng Liu, Zhongxuan Luo, Xin Fan
Title: TNT-GS: Truncated and Tailored Gaussian Splatting
Paperid: 682,
Authors: Chenbo Zhang, Bing Huangfu, Hongxu Ma, Jihong Guan, Shuigeng Zhou
Title: Multi-modal Prototype Guided Few-shot Object Detection
Paperid: 683,
Authors: Zikai Zhang, Xu Zhang, Ziyi Li, Yidong Li, Yuanzhouhan Cao
Title: GMML: Gradient-Modulated Robustness for Imbalance-Aware Multimodal Learning
Paperid: 684,
Authors: Changzhou Li, Xinyu Yang, Weiguo Yang, Xinyi Li
Title: VaF-LangSplat: Voxel-Aware Fusion Language Gaussian Splatting
Paperid: 685,
Authors: Guoyi Li, Die Hu, Haozhe Li, Qirui Tang, Xiaomeng Fu, Yulei Wu, Xiaodan Zhang, Honglei Lyu
Title: Zero-Shot Multimodal Fact-Checking with Conceptual Reasoning
Paperid: 686,
Authors: Tongfei Liu, Yufan Liu, Bing Li, Weiming Hu, Yuming Li, Chenguang Ma
Title: Noise-Optimized Distribution Distillation for Dataset Condensation
Paperid: 687,
Authors: Jilong Wei, Yangyang Hu, Xiangjuan Wu, Yiqiang Wu, Hao Liu
Title: Appearance Contrasts for Unconstrained Age Estimation
Paperid: 688,
Authors: Jiajun Han, Xuran Yang, Hui Zhang
Title: Query-Focused Multimodal Summarization with Gate-Guided Mixture-of-Experts
Paperid: 689,
Authors: Wentao Fan, Chao Zhang, Chunlin Chen, Huaxiong Li
Title: Online Cross-Modal Hashing with Multi-Level Memory
Paperid: 690,
Authors: Jiajun Zhang, Xin Li, Si Wu, Yong Xu, Yaowei Wang
Title: Prior-Free Augmentation for Cloth-Changing Person Re-Identification
Paperid: 691,
Authors: Kamakshya Nayak, Kamalakar Thakare, Ashesh Xalxo, Lalit Lohani, Debi Prosad Dogra
Title: Can Person-Level Attributes Improve Group Re-Identification?
Paperid: 692,
Authors: Zelei Wu, Xulun Ye, Jieyu Zhao
Title: Clustering-Based Tail-class Mitigation for New-class Discovery
Paperid: 693,
Authors: Ze Huang, Zhongyang Xiao, Mingliang Song, Yu Fang, Hongyuan Yuan, Kevin Sun, Li Zhang
Title: MS-Road: Towards Spatiotemporal-Consistent Large-Scale Road Reconstruction
Paperid: 694,
Authors: Adhi Widagdo, Teemu Kämäräinen, Ahmad Alhilal, Matti Siekkinen, Cheng-Hsin Hsu
Title: Gaze-Adaptive Foveation for Remote Rendered VR
Paperid: 695,
Authors: Yao Zhang, Ping Huang, Rui Zhang
Title: Multimodal Dual Population Evolutionary Reinforcement Learning
Paperid: 696,
Authors: Yanming Chen, Zixin Ma, Chuanguang Yang, Zhulin An, Yiwen Zhang
Title: Accelerating Diffusion Models via Parallel Denoising
Paperid: 697,
Authors: Guimin Hu, Yi Xin, Lijie Hu, Zhihong Zhu, Hasti Seifi
Title: PgM: Partitioner Guided Modal Learning Framework
Paperid: 698,
Authors: Lingling Dai, Andong Li, Zhe Han, Chengshi Zheng, Xiaodong Li
Title: BAPEN: Towards Versatile Audio Phase Retrieval
Paperid: 699,
Authors: Siqi Song, Limin Yu, Jimin XIAO
Title: SDP: Spectral-Decomposed Prompting for Continual Learning
Paperid: 700,
Authors: Hridayesh Lekhak, Theron Wang, Tuan Dang, Kenny Zhu
Title: DogSpeak: A Canine Vocalization Classification Dataset
Paperid: 701,
Authors: Nickolay Safonov, Rakhmanov Mikhail, Dmitriy Vatolin
Title: Screen content video dataset and benchmark
Paperid: 702,
Authors: Yiang Zhu, Haoyue Wang, Zhenxing Qian, Sheng Li, Xinpeng Zhang, Jian Liu
Title: Towards Generalized Physical Occlusion Detection On Documents
Paperid: 703,
Authors: Chenxu Wang, Dong Zhou, Ting Liu, Jianghao Lin, Yongmei Zhou, Aimin Yang
Title: DiffTMR: Diffusion-based Hierarchical Alignment for Text-Molecule Retrieval
Paperid: 704,
Authors: Jiahao Wang, Fang Liu, Licheng Jiao, Hao Wang, Shuo Li, Lingling Li, Puhua Chen, Xu Liu, Xinyi Wang
Title: FA³T: Feature-Aware Adversarial Attacks for Multi-modal Tracking
Paperid: 705,
Authors: Zhiyuan Fan, Keyi Liang
Title: Video-to-Image Affordance Grounding via Visual Conceptual Learning
Paperid: 706,
Authors: Wei Li, Junwei Zhu, Honghui Xu, Jiawei Jiang, Jianwei Zheng
Title: SpecSolver: Solving Spatial-Spectral Fusion via Semantic Transformer
Paperid: 707,
Authors: Yifan Wang, Yuntai Ding, Yiyang Gu, Ziyue Qiao, Chong Chen, Xian-Sheng Hua, Ming Zhang, Wei Ju
Title: Deep Graph Clustering with Disentangled Representation Learning
Paperid: 708,
Authors: Binrui Wu, Haochen Sui, Jiaye Lin, Jiechao Gao, Ting Xu, Keyan Jin, Xuesong Zhang
Title: Prototype-Guided Representation Projection for Multi-Domain Multi-Task Recommendation
Paperid: 709,
Authors: Zihao Zhang, Xingjiao Wu, Junjie Xu, Tianlong Ma, Tangren Yao, Wen Wu, Liang He
Title: Temporal-Conditioned Symbolic Alignment for Controllable Text-to-Music Generation
Paperid: 710,
Authors: Shuo Wang, Zhichuan Wang, Yanmin Chen, Mengyao Zhou, Jun Luo
Title: DRMix: Decomposition-Recomposition Data Augmentation with Diffusion Model
Paperid: 711,
Authors: Tianyi Zhang, Qinglong Lin, Yang Hu, Pengming Feng, Rubo Zhang
Title: Edge-aware Affinity Enhancement for Image Manipulation Localization
Paperid: 712,
Authors: Yize Song, Yunqing Chen, Zhou Wang, Cheng Chen, Ruoxiu Xiao
Title: Symmetrical Awareness Generation for Pelvic Image Segmentation
Paperid: 713,
Authors: Haoyu Shi, Huaiwen Zhang
Title: Sequence-Event Semantic Consistent Learning for Text-to-Motion Retrieval
Paperid: 714,
Authors: Peiqi Jiang, Bohan Lei, Yuhao Sun, Lingyun Yu, Zhineng Chen, Hongtao Xie, Yongdong Zhang
Title: Proactive Deepfake Detection via Self-Verifiable Semantic Watermarking
Paperid: 715,
Authors: Qinchen Wu, Difei Gao, Kevin Qinghong Lin, Zhuoyu Wu, Mike Zheng Shou
Title: GUI-Narrator: Detecting and Captioning Computer GUI Actions
Paperid: 716,
Authors: Wang Runjie, Kemi Chen, Shuijie Li, Mingkai Chen, Tiesong Zhao
Title: Efficient Semantic Codec for Real-time Vibrotactile Transmission
Paperid: 717,
Authors: Renjie Lin, Jiacheng Li, Shide Du, Shiping Wang, Le Zhang
Title: OIMGC-Net: Optimization-inspired Interpretable Multi-view Graph Clustering Network
Paperid: 718,
Authors: Zhan Yang, Binghong Chen, Jiajun Tang, Yinan Li
Title: Unsupervised Similarity-Fusion Transformer Hashing for Multimodal Retrieval
Paperid: 719,
Authors: Nhu-Thuat Tran, Hady Lauw
Title: Parameter-Efficient Variational AutoEncoder for Multimodal Multi-Interest Recommendation
Paperid: 720,
Authors: Jun Yang, MAOYU MAO
Title: DiffuSeg: Diffusion-Enhanced Cross-Modal Semantic Segmentation for RGB-D
Paperid: 721,
Authors: Binbin Zheng, Aiqiu Wu, Kai Fan, Ao Li, Minghui Wang
Title: Domain-Specific Interactive Prompting for Generalized Nuclei Classification
Paperid: 722,
Authors: Weicheng Xie, Chunlin Yan, Siyang Song, Zitong YU, Linlin Shen, Laizhong Cui
Title: Smooth Online Multiple Appropriate Facial Reaction Generation
Paperid: 723,
Authors: Xinyao Li, Dan Zhang, Zhekai Du, Lei Zhu, Zhi Chen, Jingjing Li
Title: PatAug: Augmentation of Augmentation for Test-Time Adaptation
Paperid: 724,
Authors: Fan Qi, Zhan Wang, Changsheng Xu, Huaiwen Zhang
Title: Fine-tuning Bias Neurons for Fair Text-to-Image Generation
Paperid: 725,
Authors: Zixuan Wan, Jiqing Zhang, Yushan Wang, Hu Lin, Yafei Wang, Zetian Mi, Xin Yang, Xianping Fu, Huibing Wang
Title: Eye-based Emotion Recognition via Event-Driven Sparse Transformers
Paperid: 726,
Authors: Shuyong Gao, Qianyu Guo, Yuang Feng, Chunyuan Chen, Xujun Wei, Yan Wang, Wenqiang Zhang
Title: Progressive Representation Learning for Weakly-Supervised Camouflaged Detection
Paperid: 727,
Authors: Xu Shaowu, Xibin Jia, Junyu Gao, Qianmei Sun, Jing Chang, Chao Fan
Title: Cross-Modal Dual-Causal Learning for Long-Term Action Recognition
Paperid: 728,
Authors: Yichen Bao, Yuxuan Liu, Yu Duan, Jing Li, Quanxue Gao
Title: Multi-view Clustering Based on Probabilistic Tensor Regression
Paperid: 729,
Authors: Zhijie Rao, Jingcai Guo
Title: Balancing Cross-Modal Attention for Generalized Zero-Shot Learning
Paperid: 730,
Authors: Sidun Liu, Wenyu Li, Peng Qiao, Yong Dou
Title: Regist3R: Incremental Registration with Stereo Foundation Model
Paperid: 731,
Authors: Jiahao Li, Yiqiang Chen, Yunbing Xing, Yang Gu, Xiangyuan Lan
Title: K-Space Bispectrum Steganography for Robust Unlearnable Data
Paperid: 732,
Authors: Kai Li, Wenqi Ren, Wei Wang, Linchao Zhang, Xiaochun Cao
Title: Detecting Synthetic Image by Cross-Modal Commonality Interaction
Paperid: 733,
Authors: Wei Chen, Jianwei Niu, Xuefeng Liu, Xinghao Wu
Title: Decoupling Dense Video Captioning via Task-specific Prompts
Paperid: 734,
Authors: Xuedong He, Huiying Xu, Xinzhong Zhu, Hongbo Li
Title: High-Performance Discriminative Tracking with Spatio-Temporal Template Fusion
Paperid: 735,
Authors: Le Han, Kaixuan Chen, Minchen Ye, Nenggan Zheng
Title: Hi-Motion: Hierarchical Intention Guided Conditional Motion Synthesis
Paperid: 736,
Authors: Fan Li, Jiazhen Huang, Shisong Tang, Bing Han, Huafeng Cao, Haochen Sui, Ting Xu, Kangxiaoyu Kangxiaoyu
Title: Contrastive Prototype Framework for Calibrating Video Recommendation
Paperid: 737,
Authors: Yalan Qin, Nan Pu, Hanzhou Wu, Zhaoxin Fan
Title: Flexible Multi-view Clustering with Dynamic Views Generation
Paperid: 738,
Authors: Haolun Li, Weihuang Liu, JiaTeng Liu, Zhenhua Tang, Chi-Man Pun, Qiguang Miao, Feng Xu, Hao Gao
Title: MotionRefineNet: Fine-Grained Pose Sequence Smoothing and Refinement
Paperid: 739,
Authors: Taichun Zhou, Zhibin Dong, Siwei Wang, KE LIANG, Miaomiao Li, Xinwang Liu, En Zhu, Xiangjun Dong
Title: DPFMVC: Dynamic Progressive Fusion for Multi-view Clustering
Paperid: 740,
Authors: Jiaxin Peng, Siwang Zhou, Chengqing Li, Yucheng Li, Dunyun Chen
Title: Mitigating Delivery Artifacts in Real-World Video Super-Resolution
Paperid: 741,
Authors: Pengsheng Liu, Zhaojie Chu, Xiaofen Xing, Xiangmin Xu
Title: SemGesture: Synthesizing Semantically Enhanced and Coherent Gestures
Paperid: 742,
Authors: Zheyun Qin, Deng Yu, Yang Shi, Qiangchang Wang, Zhumin Chen
Title: Video Instance Segmentation by Weighted Structure Inference
Paperid: 743,
Authors: Vitalii Emelianov, Niki Martinel
Title: Neural Additive Adapters for Interpretable Nutrition Prediction
Paperid: 744,
Authors: Qiyuan Zhu, Lujun Li, Dezhi Li, Jiacheng Liu, Pengyu Cheng, Yucheng Xu, Sirui Han, Yike Guo
Title: Outlier-Aware Model Merging for Efficient Multitask Inference
Paperid: 745,
Authors: wangjiawen wangjiawen, Jianjun Li, Zhiyuan Ma, Bairuixia Bairuixia
Title: SAKR-Edit: Scene-Aware Knowledge Reasoning for Text-to-Image Editing
Paperid: 746,
Authors: Da Zhang, Feiyu Wang, Bingyu Li, Zhiyuan Zhao, Junyu Gao, Xuelong Li
Title: KAID: Knowledge-Aware Interactive Distillation for Vision-Language Models
Paperid: 747,
Authors: Gang Pan, Meihua Liu, Lei Zhou, Jiahao Wang, Di Sun
Title: Image Retargeting based on Text Region Awareness
Paperid: 748,
Authors: Cai Xu, Ziqi Wen, Jie Zhao, Wanqing Zhao, Jinlong Yu, Haishun Chen, Ziyu Guan, Wei Zhao
Title: Beyond Equal Views: Strength-Adaptive Evidential Multi-View Learning
Paperid: 749,
Authors: Xiaoyu Chen, Yigang Cen, Wanru Xu, Yue Zhang, Yi Jin, Yidong Li, Linna Zhang
Title: Hierarchical Meta-prototypes Network for Few-shot Action Recognition
Paperid: 750,
Authors: Ao Yang, Yanglin Feng, Yuan Sun, Dezhong Peng, Guiduo Duan, Yang Qin
Title: Noise-Robust Cross-modal Learning for Reliable 2D-3D Retrieval
Paperid: 751,
Authors: Chenda Wei, Haoyue Wang, Zhenxing Qian, Sheng Li, Xinpeng Zhang, Jian Liu
Title: Learning Discrepant Transformations for Face Privacy Protection
Paperid: 752,
Authors: Weitao You, Heda Zuo, Junxian Wu, Dengming Zhang, Zhibin ZHOU, Lingyun Sun
Title: Spatial-Temporal Decomposition and Alignment in Controllable Video-to-Music Generation
Paperid: 753,
Authors: Yawei Chen, Huibing Wang, Mingze Yao, Jinjia Peng, Guangqi Jiang, Jiqing Zhang
Title: Scalable Multi-view Clustering based on Tight Anchor Distribution
Paperid: 754,
Authors: Ruiqi Dong, Wenjing Pang, ChenJie Pan, Heng-yang Lu, Chenyou Fan
Title: StoryCrafter: Instance-Aligned Multi-Character Storytelling with Diffusion Policy Learning
Paperid: 755,
Authors: Qi Chen, Zhuoya Yao, Haiguang Wang, Gaowei Wu, Bihui Yu, Siyuan Li, Jingxuan Wei, Cheng Tan
Title: ResearchPulse: Building Method–Experiment Chains through Multi-Document Scientific Inference
Paperid: 756,
Authors: Jianqiao Cui, Bingyao Yu, Qihao Wang, Fei Meng, Jiwen Lu
Title: WhiADD: Semantic-Acoustic Fusion for Robust Audio Deepfake Detection
Paperid: 757,
Authors: Rouqi Zhang, Chengdi Lu, Hancheng Lu, Yang Cao, Tiesong Zhao
Title: RobustVisH: Robust Visual-Haptic Cross-Modal Recognition Under Transmission Interference
Paperid: 758,
Authors: Wenlan Chen, Lu Gao, Cheng Liang, Fei Guo
Title: Deep Variational Incomplete Multi-View Clustering with Information-Theoretic Guidance
Paperid: 759,
Authors: Yizhou Lin, Nisha Huang, Kaer Huang, Henglin Liu, Yiqiang Yan, Jie Guo, Tong-Yee Lee, Xiu Li
Title: ICE: Intercede Concept Erasure in Text-to-Image Diffusion Models
Paperid: 760,
Authors: Chengcheng Xing, Yanyu Xu, Yonghui Xu, Lizhen Cui
Title: Learning Invariant Discriminative Patterns for Unified Anomaly Detection
Paperid: 761,
Authors: Nan He, Yiming Chen, Zheng Jiang, Song Yang, Lifeng Sun
Title: DynFed: Adaptive Federated Learning via Quantization-Aware Knowledge Distillation
Paperid: 762,
Authors: Liuyi Li, Feng Shi, Jian Wang, Jinjing Zhu, Wenze Shao
Title: An Event-tailored State-Space based Model for Pedestrian Detection
Paperid: 763,
Authors: Chen Gao, Youfang Lin, Wenbin Wang, Shuo Zhang
Title: Epipolar Consistency-based Network for Structure-Aware LF Semantic Segmentation
Paperid: 764,
Authors: Kyungjune Lee, Seongjean Kim, Hoseok Tong, Hyucksang Lee, Seongmin Lee, Weisi Lin, Ping An, Sanghoon Lee
Title: Domain crossover Non-Rigid Registration for 3D Human Meshes
Paperid: 765,
Authors: Xinyu Xiao, Peixi Peng, Qiang Wang, XingChao XingChao, Shuhan Qi
Title: Multi-faceted Complementary Learning for Incomplete Multi-view Multi-label Classification
Paperid: 766,
Authors: Hang Lv, Zixuan Guo, Zijie Wu, Yanchao Tan, Guofang Ma, Zhigang Lin, Xiping Chen, Hong Cheng, Carl Yang
Title: MedAlign: Enhancing Combinatorial Medication Recommendation with Multi-modality Alignment
Paperid: 767,
Authors: Yu Tong, Weihai Lu, Xiaoxi Cui, Yifan Mao, Zhejun Zhao
Title: DAPT: Domain-Aware Prompt-Tuning for Multimodal Fake News Detection
Paperid: 768,
Authors: Yufan Hu, Kunlin Yang, Junyu Gao, Bin Fan, Hongmin Liu
Title: Learning Evidential Delta Denoising Scores for Video Editing
Paperid: 769,
Authors: Donglin Zhang, Boyuan Ma, Xiaojun Wu, Josef Kittler
Title: Ingredients-Guided and Nutrients-Prompted Network for Food Nutrition Estimation
Paperid: 770,
Authors: Juan Zhao, Yudao Sun, Zhihai Yang, Cai Xu, Hongji Chen, Fan Zhang, Jianxin Li
Title: Cross-Model Watermarking via Discriminative Samples for Secure Authentication
Paperid: 771,
Authors: Xiaohan Yu, Zicheng Pan, Yang Zhao, Qin Zhang, Yongsheng Gao
Title: Contrastive Lie Algebra Learning for Ultra-Fine-Grained Visual Categorization
Paperid: 772,
Authors: Zhang Haofan, Shangfei Wang
Title: EmIT: Emotional Interaction control in Text-to-image diffusion models
Paperid: 773,
Authors: Haoxiang Cao, Chaoqun Wang, Yongwen Lai, Shaobo Min, Xuejin Chen
Title: CausalCtrl: Causality-Aware Control Framework for Text-Guided Visual Editing
Paperid: 774,
Authors: Chong Wu, Maolin Che, Renjie Xu, Zhuoheng Ran, Hong Yan
Title: ELFATT: Efficient Linear Fast Attention for Vision Transformers
Paperid: 775,
Authors: Xuan Zhang, Sinchee Chin, Jing-Hao Xue, Xiaochen Yang, Wenming Yang
Title: DARL: Mitigating Gradient Conflicts in Long-Tailed Out-of-Distribution Learning
Paperid: 776,
Authors: Zhongyun Bao, Jhon Jhon, Jianchi Sun, Jing Zhou, Ziqi Yu, Chunxia Xiao
Title: I2HDiffuser: Image Illumination Harmonization Meets the Diffusion Model
Paperid: 777,
Authors: Wenming Wu, Tianlei Sheng, Gaofeng Zhang, Liping Zheng
Title: FloorplanSBS: Synthesizing Vector Floorplans by Patch-Based Floorplan Segmentation
Paperid: 778,
Authors: Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Chi LEUNG, Tien-Tsin Wong
Title: ColorDiffuser: Video Colorization with Pretrained Text-to-Image Diffusion Models
Paperid: 779,
Authors: Zeyu Xia, Canqun Yang, Haoang Chi, Tao Tang, Weiming Xiang, Yingbo Cui
Title: MMF-SV: A Multi-Modal Feature Fusion-Based Structural Variant Caller
Paperid: 780,
Authors: Jiayi Gao, Huaiwen Zhang
Title: Evaluating and Mitigating Sycophancy in Large Vision-Language Models
Paperid: 781,
Authors: Zhen Wang, Dongyuan Li, Yaozu Wu, Peide Zhu, Shiyin Tan, Renhe Jiang
Title: Video-based Transparent Object Segmentation via Temporal Feature Aggregation
Paperid: 782,
Authors: Fengxin Li, Zhiqian Yin, Hongyan Liu, Jingcai Guo, Jun He, Yi LI, CHAO ZHOU, Jun Zhang, Haijie Gu
Title: Topic Guided Multi-faceted Semantic Disentanglement for CTR prediction
Paperid: 783,
Authors: Biao Chen, Kunbin He, Zhikun Zheng, Mengmeng Jing, Lin Zuo
Title: Chain-of-Thought Guided Semantic Debiasing for Low-Shot Vision-Language Tasks
Paperid: 784,
Authors: Hanmo Chen, Chenghao Xu, Jiexi Yan, Cheng Deng
Title: AStF: Motion Style Transfer via Adaptive Statistics Fusor
Paperid: 785,
Authors: Ting Li, Songtao Li, Shuaifeng Li, Xiaolin Qin, Maoyuan Zhao, Luping Ji, Mao Ye
Title: SAM-Guided Semantic Knowledge Fusion for Visible-Infrared Object Detection
Paperid: 786,
Authors: Mingle Zhou, Jiahui Liu, Jin Wan, Gang Li, Min Li
Title: Exploring Multimodal Prompts For Unsupervised Continuous Anomaly Detection
Paperid: 787,
Authors: Mengling Xu, Ming Tao, Bingkun BAO
Title: Chain-of-Cooking: Cooking Process Visualization via Bidirectional Chain-of-Thought Guidance
Paperid: 788,
Authors: Hang Xiong, Runmin Cong, Jinpeng Chen, Chen Zhang, Feng Li, Huihui Bai, Sam Kwong
Title: MM-Prompt: Multi-modality and Multi-granularity Prompts for Few-Shot Segmentation
Paperid: 789,
Authors: Ruonan Wei, Yuntao Wang, Siyan Fang, Yuehuan Wang
Title: End-to-End Multiple Object Tracking with Dynamic Scene Perception
Paperid: 790,
Authors: Ziqi Yuan, Jun Li, Yanghao Li, Yuxiang Huang, Chi Chen, Shuo Wang, Zhinan Gou
Title: CITR: Efficient Long Video Understanding Needs Causal Importance
Paperid: 791,
Authors: Zhaoqi Chen, Wanni Xu, Yunfeng Zhang, Yawei Hou, Zhenyu Wen, Cong Wang
Title: DeCoRec: Decoupled Collaborative Refinement for Multi-Modal Sequential Recommendations
Paperid: 792,
Authors: Hang Yu, Yimin Wen, Xiongjian Lv
Title: DiffuFuse: Diffusion-Driven Dual-Stream Fusion Framework for Multimodal Sentiment Analysis
Paperid: 793,
Authors: Pengyuan Li, Man Liu, Dongxia Chang, Yiming Wang, Zisen Kong, Yao Zhao
Title: AEMVC: Mitigate Imbalanced Embedding Space in Multi-view Clustering
Paperid: 794,
Authors: Fei Ye, Adrian Bors
Title: Online Continual Learning via Dynamic Expandable Recursive Model
Paperid: 795,
Authors: Xinbiao Gan, Qiang Zhang, Tiejun Li, Chunye Gong, Kai Lu
Title: GraphWorld: Ultra-fast Graph Engine for World-Wide Web Searching
Paperid: 796,
Authors: Yufeng Chen, Umakant Kulkarni, Voicu Popescu, Sonia Fahmy
Title: RUN: A Case for Cross-Layer Networked Virtual Reality
Paperid: 797,
Authors: Dongjian Yu, Weiqing Min, Xin Jin, Qian Jiang, Shuqiang Jiang
Title: Spatial-Aware Multi-Modal Information Fusion for Food Nutrition Estimation
Paperid: 798,
Authors: Rui Shang, Min Liu, Xueping Wang, Yuan Bian, Yaonan Wang
Title: Decoupled Identity and Attribute Tokenization for Person Re-Identification
Paperid: 799,
Authors: Disen Hu, Xun Jiang, Sun Zhe, Hao Yang, Chong Peng, Peng Yan, Heng Tao Shen, Xing Xu
Title: Geometric Gradient Divergence Modulation for Imbalanced Multimodal Learning
Paperid: 800,
Authors: Ruian He, Zixian Zhang, Ri Cheng, Weimin Tan, Bo Yan
Title: Efficient Trajectory Space-Time Super-Resolution for Fast Live-cell Imaging
Paperid: 801,
Authors: Tung-I Chen, Dae Lee, Guan-Ming Su, Mohammad Hajiesmaili, Ramesh Sitaraman
Title: NIVM: Real-time View Morphing via Neural Implicit Function
Paperid: 802,
Authors: Zhicheng Dong, Xiaodong Yue, Yufei Chen, Yuxian Zhou
Title: Trusted Open-World Multi-View Classification with Dynamic Opinion Aggregation
Paperid: 803,
Authors: Penglei Wang, Ziming Quan, Danyang Wu, Jin Xu
Title: Cluster-Aware Contrastive Multi-View Clustering Based on Masked Views
Paperid: 804,
Authors: Kaixiang Wang, Xiaojian Ding, Wanqi Yang, Ming Yang
Title: Label-Semantics-Guided Multi-View Multi-Label Learning via High-Order Semantic Fusion
Paperid: 805,
Authors: Rui Wang, Yuxuan Liu, Guangyu Yang, Quanxue Gao, Cheng Deng
Title: Bi-Orthogonal Non-negative Tensor tri-Factorization for Tensorized Label Learning
Paperid: 806,
Authors: Yuwu Lu, Haoyu Huang, Xue Hu
Title: Domain-aware Visual Context Prompt for Multi-Source Domain Adaptation
Paperid: 807,
Authors: Chengzhou Li, Xiaokang Liu, Qi Jia, Jinyuan Liu, Zhiying Jiang, Longhan Feng, Yu Liu, Zhongxuan Luo, Xin Fan
Title: Physics-Guided Sonar Image Fine-grained Recognition under Scarce Annotations
Paperid: 808,
Authors: Songtao Zhou, Xiaoyu Qin, Yixuan Zhou, Qixin Wang, Zeyu Jin, Zixuan Wang, Zhiyong Wu, Jia Jia
Title: HarmoniVox: Painting Voices to Match the Avatar’s Soul
Paperid: 809,
Authors: Chen Feng, Nicu Sebe, Georgios Tzimiropoulos, Miguel Rodrigues, Ioannis Patras
Title: Unveiling Open-set Noise: Theoretical Insights into Label Noise
Paperid: 810,
Authors: Lin Peng, Cong Wan, Shaokun Wang, Xiang Song, Yuhang He, Yihong Gong
Title: CIA: Class- and Instance-aware Adaptation for Vision-Language Models
Paperid: 811,
Authors: Jiaqi Cui, Yilun Li, Xi Wu, Jiliu Zhou, Yan Wang
Title: PREMISE: Individual Preference-aware Multi-modal Cooperation for Survival Prediction
Paperid: 812,
Authors: Hongyu Liu, Hongwei Ge, Yuxuan Liu, Yaqing Hou
Title: Dialogue-Driven Interactive Dynamic Learning for Text-to-Image Person Retrieval
Paperid: 813,
Authors: Hongyang Lin, Kuixiang Shao, Peijun Xu, Zhuoyang Bu, Yuyang Jiao, Ziyuan Tang, Chenxi Xiao, Jingyi Yu
Title: HandCraft: Tactile-Informed Hand-Object Dynamics Capture and Realistic Rendering
Paperid: 814,
Authors: Fan Qi, Ao Liu, Zixin Zhang, Changsheng Xu
Title: FORGET ME: Federated Unlearning for Face Generation Models
Paperid: 815,
Authors: Kailong Yu, Liyuan Pan, Liu Liu, Wei Liang
Title: Enhanced Dual-Pixel Image Reflection Removal via Gaussian Splatting
Paperid: 816,
Authors: Jiawei Gu, Ziyue Qiao, Zechao Li
Title: Activation Shape Matters: OOD Detection with Norm-Entropy Fusion
Paperid: 817,
Authors: Dengwen Wang, Guanyu Xing, Yanli Liu
Title: Low-light Invariant Representation Learning for Visible-Infrared Person Re-identification
Paperid: 818,
Authors: Shuo Li, Xingchen Liu, Fang Liu, Licheng Jiao, Jiahao Wang, Xinyan Huang, Yanbiao Ma, Puhua Chen, Lingling Li, Xu Liu, Xuejian Gou
Title: Imagining Vision From Language for Few-Shot Class-Incremental Learning
Paperid: 819,
Authors: Cheng Luo, Siyang Song, Siyuan Yan, Zhen Yu, Zongyuan Ge
Title: ReactDiff: Fundamental Multiple Appropriate Facial Reaction Diffusion Model
Paperid: 820,
Authors: Huadai Liu, Jialei Wang, Xiangtai Li, Wen Wang, Qian Chen, Rongjie Huang, Yang Liu, Jiayang Xu, Wei Xue, Zhou Zhao
Title: MelodyEdit: Zero-shot Music Editing with Disentangled Inversion Control
Paperid: 821,
Authors: Qixun Zeng
Title: Retrieval Augmented 3D Garment Generation from Single Image
Paperid: 822,
Authors: Jingjun Yi, Qi Bi, Hao Zheng, Huimin Huang, Haolan Zhan, Yixian Shen, Wei Ji, Yawen Huang, Yuexiang Li, Xian Wu, Yefeng Zheng
Title: AtlantisGS: Underwater Sparse-View Scene Reconstruction via Gaussian Splatting
Paperid: 823,
Authors: Yonghyeon Jo, JangHyun Kim, Jinsun Park
Title: BAC-GCN: Background-Aware CLIP-GCN Framework for Unsupervised Multi-Label Classification
Paperid: 824,
Authors: Lizhi Xiong, Peipeng Yu, Yue Wu
Title: MADPHash: Manipulation-Aware Deep Perceptual Hashing using Feature Consistency
Paperid: 825,
Authors: Quanhong Peng, Dan Zhang, Dong Zhao, Jianpeng Zhang, Meihua Song, Chenlei Lv
Title: Cam-Bench: A Benchmark for Image-based Camera Parameter Estimation
Paperid: 826,
Authors: Zihan Wang, Yunhang Shen, Yuan Fang, Zuwei Long, Ke Li, Xing Sun, Jiao Xie, Shaohui Lin
Title: Towards Universal Perception through Language-Guided Open-World Object Detection
Paperid: 827,
Authors: Zihang Zhang, Shoulong Zhang, Yan Wang, Shuai Li
Title: Reactffusion: Physical Contact-guided Diffusion Model for Reaction Generation
Paperid: 828,
Authors: Xinlong Zhang, Zejian Li, Wei Li, Xiaoyu Zhang, Jia Wei, Chengyu Lin, Yongchuan Tang
Title: ObjCtrl: Object-based Control Relaxation for Conditional Text-to-image Generation
Paperid: 829,
Authors: Long Chen, De Cheng, Shizhou Zhang, Yinghui Xing, Di Xu, Yanning Zhang
Title: Amplitude-aware Domain Style Replay for Lifelong Person Re-identification
Paperid: 830,
Authors: Fujian Ren, Wenlan Chen, Lu Gao, Fei Guo, Cheng Liang
Title: Dual-level Distribution Alignment for Deep Incomplete Multi-view Clustering
Paperid: 831,
Authors: Xinchen Ye, Aokai Zhang, Rui Xu
Title: Semantics-Driven Contrastive Learning for Real-World Depth Super Resolution
Paperid: 832,
Authors: Junyu Chen, Jiawei Peng, Yuan Sun, Jian Dai, Xingfeng Li, Zhenwen Ren
Title: Scalable Unpaired Multi-View Clustering via Anchor-Driven High-Throughput Encoding
Paperid: 833,
Authors: Zefan Zhang, Weiqi Zhang, Kailong Suo, Yanhui Li, Tian Bai
Title: Video-Level Multimodal Relation Extraction with Event-Entity Semantic Consistency
Paperid: 834,
Authors: Wanying Zhou, Yuqi Sun, Yu Ling, Zhen Xing, Chenxi Ma, Weimin Tan, Bo Yan
Title: TabiMed: Tabularizing Medical Images for Few-Shot In-Context Diagnosis
Paperid: 835,
Authors: Liang Yao, Fan Liu, Shengxiang Xu, Chuanyi Zhang, Shimin Di, Xing Ma, Jianyu Jiang, Zequan Wang, Jun Zhou
Title: UEMM-Air: Enable UAVs to Undertake More Multi-modal Tasks
Paperid: 836,
Authors: Tan Yue, Xuzhao Shi, Rui Mao, Zilong Song, Zonghai Hu, Dongyan Zhao
Title: AnaFig: A Human-Aligned Dataset for Scientific Figure Analysis
Paperid: 837,
Authors: SungHyun Moon, Aidyn Zhakatayev, SeungJae Lee
Title: HAN: Korean Heritage Augmented Narrative Visual-Language Description Dataset
Paperid: 838,
Authors: Maksim Golyadkin, Innokentiy Humonen, Valeria Rubanova, Danil Kalin, Ianis Plevokas, Dmitry Nikolotov, Aleksandr Utkov, Nikita Sidelnikov, Petr Ivanov, Bureeva Ekaterina, Ekaterina Alexandrova, Ilya Makarov
Title: MuMMy: Multimodal Dataset supporting VLM-based Egyptology Research Assistant
Paperid: 839,
Authors: Xuanliu Zhu, Yiqiao Chai, Runnan Li, Mingying Lan, Li Gao
Title: CrossMind-VL: Multi-Subject Mind-to-Video Decoding with Multimodal LLM Semantic Grounding
Paperid: 840,
Authors: Inzamamul Alam, Md Islam, Simon Woo
Title: SpecXNet: A Dual-Domain Convolutional Network for Robust Deepfake Detection
Paperid: 841,
Authors: Tianming Xu, Tiantian Guo, Youdan Feng, Zihan Chen, Qiaoyi Xue, Lingzhi Hu, Yuhang SHI
Title: Anatomical Region-Guided 3D PET/MR Tumor Segmentation via Medical Record
Paperid: 842,
Authors: Shengjiu Dai, Xiujian Liang, Sheng Li, Zhenxing Qian, Xinpeng Zhang
Title: Safe-BVAR: Text-to-Image Generative Watermarking for Bitwise Visual AutoRegressive Model
Paperid: 843,
Authors: Min Li, Jinghui He, Jiachen Li, Delong Han, Jin Wan, Gang Li
Title: HGCF: Hierarchical Geometry-Color Fusion for Multimodal Industrial Anomaly Detection
Paperid: 844,
Authors: Andong Zhu, Sheng Zhang, Xiaohang Shi, Hesheng Sun, Yu Liang, Zhuzhong Qian, Han Zheng, Xiaokun Wang, Ning Jiang
Title: VidIQ: Inference-Aware Neural Codecs for Quality-Enhanced, Real-Time Video Analytics
Paperid: 845,
Authors: Cheng Ye, Weidong Chen, Peipei Song, Xinyan Liu, Lei Zhang, Zhendong Mao
Title: Multi-round Mutual Emotion-Cause Pair Extraction for Emotion-Attributed Video Captioning
Paperid: 846,
Authors: Zhenxi Wang, Zongyao Yin, Yujie Hou, Xianchuan Yu
Title: Robust Multi-view Clustering via Pseudo Label Guided Universum Learning
Paperid: 847,
Authors: Hao Wang, Hanxiao Li, Li Xu
Title: CrosST: Cross Swin 4D Transformer for Multi-Modal Alzheimer’s Detection
Paperid: 848,
Authors: Shengqian Zhu, Yu Chengrong, Wenbo Qi, Jiafei Wu, Ying Song, Guangjun Li, Zhang Yi, Xiaogang Xu, Junjie Hu
Title: PRIME: Prototype-Driven Class Incremental Learning for Medical Image Segmentation
Paperid: 849,
Authors: Luyan Cui, Huibing Wang, Yawei Chen, Mingze Yao, Xianping Fu, Jiqing Zhang
Title: Dual-Constraint Multi-view Fuzzy Clustering with Scalable Anchor Graph Learning
Paperid: 850,
Authors: Junkang Liu, Fanhua Shang, Yuxuan Tian, Hongying Liu, Yuanyuan Liu
Title: Consistency of Local and Global Flatness for Federated Learning
Paperid: 851,
Authors: Xingchen Li, Wuyang Zhang, Guoliang You, Xiaomeng Chu, Wenhao Yu, Yifan Duan, Yuxuan Xiao, Yanyong Zhang
Title: CalibWorkflow: A General MLLM-Guided Workflow for Centimeter-Level Cross-Sensor Calibration
Paperid: 852,
Authors: Shifeng Bao, Zhe Xue, Qi Chen, Shilong Ou, Amin Beheshti, Quan Sheng, Anton van den Hengel, Yuankai Qi
Title: CausalMVC: Causal Content-Style Representation Learning for Deep Multi-View Clustering
Paperid: 853,
Authors: Zhou Tan, De Li, Yirui Huang, Jia-Li Yin, Ximeng Liu
Title: FeatShield: Isolating Malicious Feature Extractors for Backdoor-Robust Federated Learning
Paperid: 854,
Authors: Jinghan Liu, Xingmei Wang, Jiaxiang Meng
Title: Adaspeaker: Learning Discriminative Speaker Representations with Gradient-Aware Adaptive Scaling
Paperid: 855,
Authors: Jingrou Wu, Haoxian Liu, Jin Zhang, Dan Wang, Jing Jiang
Title: P²VS: Progressive Partition-based Volumetric Video Streaming under Network Dynamics
Paperid: 856,
Authors: Meng Chu, Yicong Li, Tat-Seng Chua
Title: GraphVideoAgent: Understanding Long Videos via LLM-Powered Entity Relation Graphs
Paperid: 857,
Authors: Yawen Cui, Wenbin Zou, Huiping Zhuang, Yi Wang, Lap-Pui Chau
Title: Probabilistic Mixture of Hyperbolic Mamba for Few-Shot Class-Incremental Learning
Paperid: 858,
Authors: Peng Zhao, Zhiguang Cao, Di Wang, Wen Song, Wei Pang, You Zhou, Yuan Jiang
Title: Visual-Enhanced Multimodal Framework for Flexible Job Shop Scheduling Problem
Paperid: 859,
Authors: Tao Ling, Siping Shi, Dan Wang
Title: Accelerating Long Video Understanding via Compressed Scene Graph-Enabled Chain-of-Thought
Paperid: 860,
Authors: Wenxiang Liu, Yongkang Liu, Weiliang Meng, Gaoqi He, Jianhua Li
Title: D³L: Curvature-Constrained Denoising Diffusion Model for 3D Lane Detection
Paperid: 861,
Authors: Xiaobo Liu, Henglu Wei, Chuxi Yang, Wei Yu, Xudong Zhao, Xiangyang Ji
Title: Camera-Specific Imaging Simulation for Raw Domain Image Super Resolution
Paperid: 862,
Authors: Amruta Muthal, Varghese Kuruvilla, Ravi Kiran Sarvadevabhatla
Title: PLATO: Generating Objects from Part Lists via Synthesized Layouts
Paperid: 863,
Authors: Jiawei Zhang, Haonan Zhang, Weitao Zhang, Liang Pu, Zesen Feng, Jie Guo
Title: Decoupled Motion Prediction for Real-time G-buffer Free Frame Extrapolation
Paperid: 864,
Authors: Yuzhen Niu, Siling Chen, Yuzhong Chen, Fusheng Li, Rui Xu, Hui Da
Title: CoFiVLA: Synergistic Coarse-Fine Vision-Language Alignment for Image Aesthetic Assessment
Paperid: 865,
Authors: Bowen Guo, Shiwei Gan, Yafeng Yin, Xiao Liu, Zhiwei Jiang, Shunmei Meng
Title: Sentence-level Segmentation for Long Sign Language Videos with Captions
Paperid: 866,
Authors: Lehao Lin, Baohua Fang, Ziheng Sun, Ke Wang, Hong Kang, Wei Cai
Title: BS3: Bézier Slicing Middleware for 3D Mesh LOD Optimization
Paperid: 867,
Authors: Ziming Quan, Penglei Wang, Danyang Wu, Jin Xu
Title: Unsupervised Cross-view Message Passing Method for Multi-view Graph Clustering
Paperid: 868,
Authors: Zongxin Liu, Yishu Liu, Guangming Lu, Xiaoling Luo, Bingzhi Chen
Title: CauRDG: Enhancing Domain Generalization with Causal-Driven Semantic Consistency Reasoning
Paperid: 869,
Authors: Haolin Wang, Yafei Ou, Prasoon Ambalathankandy, Gen Ota, Pengyu Dai, Masayuki Ikebe, Kenji Suzuki, Tamotsu Kamishima
Title: Layer Separation: Towards Adjustable Joint Space Width Images Synthesis
Paperid: 870,
Authors: Demin Yu, Wenchuan Du, Kenghong Lin, Xutao Li, Yunming Ye, Luo Chuyao, Chenxunlai
Title: PiMMNet: Introducing Multi-Modal Precipitation Nowcasting via a Physics-informed Perspective
Paperid: 871,
Authors: Weihai Lu, Li Yin
Title: DMMD4SR: Diffusion Model-based Multi-level Multimodal Denoising for Sequential Recommendation
Paperid: 872,
Authors: Shu-Xun Yang, Xian-Ling Mao, Heyan Huang
Title: ESTJ: Enhancing Structured Tendency Judgment in Hybrid-Modal Table Understanding
Paperid: 873,
Authors: Yimou Guo, Yaochen Li, Jingze Liu, Jiahui Feng, Haoyi Lou, Zhimin Chen, Yuan Gao, Yuanqi Su
Title: Image Captioning with Multimodal Guidance and Search Space Optimization
Paperid: 874,
Authors: Hengnian Gu, Zhifu Chen, Jin Peng Zhou, Dongdai Zhou
Title: Hierarchical Disentanglement of Cognitive States for Enhanced Cognitive Diagnosis
Paperid: 875,
Authors: Yi Dai, Yang Ding, Kaisheng Zeng
Title: Bridging Domains in Mental Stress Assessment via Retrieval-Augmented Reasoning
Paperid: 876,
Authors: Jinjia Peng, Tianhang Cheng, Guangqi Jiang, Huibing Wang
Title: Prior-oriented Anchor Learning with Coalesced Semantics for Multi-View Clustering
Paperid: 877,
Authors: Yusen Wang, Huan Zhou, Yu Jiang, Chunxia Xiao
Title: Robust Gaussian Surface Reconstruction with Semantic Aware Progressive Propagation
Paperid: 878,
Authors: Qiyin Zhong, Xianglin Qiu, Xiaolei Wang, Zhen Zhang, Gang Liu, Jimin Xiao
Title: FAMRD: Frequency-Aware Multimodal Reverse Distillation for Industrial Anomaly Detection
Paperid: 879,
Authors: Xiao Hu, Heiko Neumann, Jochen Lang
Title: A Filtering Framework for Semi-online Referring Video Object Segmentation
Paperid: 880,
Authors: Weiqi Liu, Yongshan Zhang, Xinxin Wang, Lefei Zhang
Title: Deep Multi-Level Contrastive Clustering for Multi-Modal Remote Sensing Images
Paperid: 881,
Authors: Xueyu Yuan, Jiarui Zhang, Jiangqi Song, Liu Liu, Li Zhang, Dan Guo, Richang Hong, Meng Wang
Title: DFGAP: Towards Depth-Free Cross-Category GAParts Perception via Uncertainty-Quantified Modeling
Paperid: 882,
Authors: Weiwu Pang, Rajrup Ghosh, Jiawei Yang, Ziyu Wei, Branden Leong, Yue Wang, Ramesh Govindan
Title: SplatPose: On-Device Outdoor AR Pose Estimation Using Gaussian Splatting
Paperid: 883,
Authors: Yuxiang Zhao, Wei Huang, Haipeng Zeng, Huan Zhao, Yujie Song
Title: Cross Time Domain Intention Interaction for Conditional Trajectory Prediction
Paperid: 884,
Authors: Xuandong Huang, Yuzhe Zhou, Jiashu Li, Shiqian Lu, Shangfei Wang
Title: EmoDETective: Detecting, Exploring, and Thinking Emotional Cause in Videos
Paperid: 885,
Authors: Shengzhe You, Libo Weng, Fei Gao
Title: ViTraj: Learning Dual-Side Representations for Vehicle-Infrastructure Cooperative Trajectory Prediction
Paperid: 886,
Authors: Ruoxuan Li, Xiangyu Wu, Yang Yang
Title: Noise Self-Correction via Relation Propagation for Robust Cross-Modal Retrieval
Paperid: 887,
Authors: Yue Sun, Xinqi Liu, Zhiliang He, Jialu Zhang, Chenming Wu, Guodong Lu, Jituo Li
Title: DAFU-CAD: Depth-assisted Feature Unraveling for Sketch-based Robust CAD Modeling
Paperid: 888,
Authors: Xiaodong Zhu, Suting Wang, Junqi Yang, Yuhong Yang, Weiping Tu, Zhongyuan Wang
Title: Query-Based Audio-Visual Temporal Forgery Localization with Register-Enhanced Representation Learning
Paperid: 889,
Authors: Hongda Qin, Xiao Lu, Zhiyong Wei, Ningjiang Chen
Title: Object-Preserving Counterfactual Diffusion Augmentation for Single-Domain Generalized Object Detection
Paperid: 890,
Authors: Haosheng Cai, Yang Xue
Title: G2LFormer: Global-to-Local Query Enhancement for Robust Table Structure Recognition
Paperid: 891,
Authors: Haochen Yang, Lei Li, Jiacheng Guo, Baolu Li, Minghai Qin, Hongkai Yu, Tianyun Zhang
Title: DA3D: Domain-Aware Dynamic Adaptation for All-Weather Multimodal 3D Detection
Paperid: 892,
Authors: Kewei Zhao, Xiaowei Hu, Qinya Li
Title: Device-Cloud Collaborative Learning Framework for Efficient Unknown Object Detection
Paperid: 893,
Authors: Mianzimei Yang, Zhipeng Zhou, Jin Zhang, Yuanhao Pu, Hong Xie, Defu Lian
Title: Conflict-Buffering Optimization by Symmetry Teleportation for Deep Long-Tailed Recognition
Paperid: 894,
Authors: Tianzhong Lan, Zhang Yi, Xiuyuan Xu, Min Zhu
Title: LooBox: Loose-box-supervised 3D Tumor Segmentation with Self-correcting Bidirectional Learning
Paperid: 895,
Authors: Jiaxing Qi, Yifan Xu, Zhifei Yang, Ruifei Ma, Chao Zhang, Kuifei Yu
Title: BridgeGLM: Bridging Graph and Language Spaces for Domain Generalization
Paperid: 896,
Authors: Mengzu Liu, Junwei Xu, Tao Huang, Fangfang Wu, Le Dong, Xin Li, Weisheng Dong
Title: Exploring Global Correlations via Polarity Memory for Multispectral Demosaicing
Paperid: 897,
Authors: Junzhe Zhang, Chengfeng Han, Dandan Ding, Zhan Ma
Title: GeoQE: Enhancing Quality of Experience in Point Cloud Streaming
Paperid: 898,
Authors: Xiang Huang, Ao Luo, Xiao Wu, Zhaoquan Yuan
Title: Latent Interactiveness Field for Non-Contact Human Object Interaction Detection
Paperid: 899,
Authors: Yixuan Zhou, Yulu Tian, Wenliang Zhong, Xingbin Yu, Heng Tao Shen, Xing Xu
Title: SaP-Bot: A Multimodal Large-Language Model for End-to-End Same-Product Identification
Paperid: 900,
Authors: Tianjiao Xu, Hao Fu, Suiyang Zhang, Jianhua Yin, Tian Gan, Liqiang Nie
Title: Enhancing Democratic Mediation through Norm-Awareness in Generative Agent Societies
Paperid: 901,
Authors: Yongxin Li, Ying Cheng, Yaning Pan, Wen He, Qing Wang, Rui Feng, Xiaobo Zhang
Title: Semantic-Aware Hard Negative Mining for Medical Vision-Language Contrastive Pretraining
Paperid: 902,
Authors: Yuanyi Duan, Wei Xu, Qinlong Wu, Guo-Sen Xie, Fang Zhao, Caifeng Shan
Title: AnomalyControl: Highly-Aligned Anomalous Image Generation with Controlled Diffusion Model
Paperid: 903,
Authors: Jiawei Meng, Zhengmao Yang, Zhiqiang Liu, Shaokai Chen, Zhizhen Liu, Wen Zhang, Huajun Chen
Title: Text-to-Image Generation with Multi-modal Knowledge Graph Construction and Retrieval
Paperid: 904,
Authors: Jing Ma, Haochen Sun, Zeyuan Zang, Fangxiang Feng, Caixia Yuan, Lei Ren, Huixing Jiang, Chen Wei, Xiaojie Wang
Title: VL-DynaRefine: A Vision-Language Dynamic Refinement Approach for Visual Reasoning
Paperid: 905,
Authors: Haitao Wang, Sijia Wen, Bo Guo
Title: Polarimetric Monocular Gaussian Splatting SLAM for Dense Surface Reconstruction
Paperid: 906,
Authors: Gang Pan, Hongen Liu, Di Sun
Title: Formula Spotting Based on Synergy Perception and Representation Mining
Paperid: 907,
Authors: Guyue Jin, Tianming Zhao, Jiacan Yan, Tian Tian
Title: Contextually-Guided State Space Fusion for Misaligned Multi-Spectral Object Detection
Paperid: 908,
Authors: Ziming Zhao, Zhaoxuan Li, Tingting Li, Fan Zhang
Title: Stealthy-AE: Generating Stealthy Adversarial Examples through Online Social Networks
Paperid: 909,
Authors: Jinghan Yang, Zhenbo Xu, Dehua Ma, Liu Liu, Fei Liu, Gong Huang, Zhaofeng He
Title: RecipeRAG: Advancing Recipe Generation with Reinforced Retrieval Augmented Generation
Paperid: 910,
Authors: Zhilin Huang, Chujun Qin, Yifei Xing, Wenming Yang
Title: Enhanced Motion-aware Latent Diffusion Models for Video Frame Interpolation
Paperid: 911,
Authors: Yang Zhou, Jin Wang, Yuxiao Zhang, Kaixiang Huang, Guodong Lu, Jingru Yang, Shengfeng He
Title: Art4Math: Handwritten Mathematical Expression Recognition via Multimodal Sketch Grounding
Paperid: 912,
Authors: Fan Yang, Ling Deng, Zhiyong Gan, Qisheng He, Yuanbo Fang, Xiangmin Xu, Shuangping Huang, Tianshui Chen
Title: Optimal Feature Embedding for Document Large Visual Language Model
Paperid: 913,
Authors: Sifan Zuo, Youfa Liu, Bo Du
Title: CSDN: CLIP-Driven Similarity-Aligned Distillation Network for Weakly-Supervised Object Localization
Paperid: 914,
Authors: Mingyang Ding, Zhan Wang, Jiachen Wang, Tingting Han, Xinyuan Hu, Jiajun Ding, Min Tan, Zhenzhong Kuang
Title: FutureGS: Structured Gaussian Fields for Future-Aware Dynamic Scene Modeling
Paperid: 915,
Authors: Maksim Golyadkin, Valeria Rubanova, Aleksandr Utkov, Dmitry Nikolotov, Ilya Makarov
Title: Evaluation of Egyptian Hieroglyph Classification Across Diverse Writing Styles
Paperid: 916,
Authors: Shiying Lin, Rong Hu, Zuoyong Li, Qinghua Lin, Jiawei Wu, Changqing Zhang
Title: Gradient-Aware Revitalization of Non-Effective Samples in Medical Image Segmentation
Paperid: 917,
Authors: Ran Chen, Taiyi Su, Hanli Wang
Title: WaveCL: Wavelet Calibration Learning for Referring Video Object Segmentation
Paperid: 918,
Authors: Feng Chen, Jielong He, Yang Liu, Heng Liu, Zhe Chen, Yaxiong Wang
Title: Unsupervised Cross-Modal Person Search via Progressive Diverse Text Generation
Paperid: 919,
Authors: Mingjie Li, Junhao Lin, Dian Ouyang, Ying Zhang, Wei Wang
Title: Graph-based Approximate Nearest Neighbor Search by Deep Reinforcement Routing
Paperid: 920,
Authors: Shehzad Ali, Md Islam, Ik Hyun Lee, Mingfu Xiong, Minh-Son Dao, Saeed Anwar, Sambit Bakshi, Khan Muhammad
Title: Towards Hazardous Activity Recognition for A Novel Real-World Dataset
Paperid: 921,
Authors: Wenxi Huang, Xiaojun Chen, Qin Zhang, Ting Wan, Ziqi Liu, Liang-Jie Zhang
Title: MRBench: A Multi-Image Reasoning Benchmark with Adaptive Knowledge Retrieval
Paperid: 922,
Authors: Yongan Guo, Zhongyan Zhou, Yuao Wang, Na Zhu, Xuyun Zhang, Hongwang Xiao, Yuan Miao, Bo Li
Title: RSFomer: Time Series Transformer for Robust Sports Action Recognition
Paperid: 923,
Authors: Liang Xu, Cathal Gurrin, Songkai Jia, Monica Ward, Allie Tran
Title: Through Someone Else’s Eyes: Lifelogging Meets Narrative Virtual Reality
Paperid: 924,
Authors: Yuzhen Li, Yuehui Han, Jianjun Qian, Jian Yang
Title: Self-Supervised Vision Graph Neural Networks Based on Contrastive Learning
Paperid: 925,
Authors: Zijun Xu, Jiahao Guo, Chunjie Zhang, Zhongyuan Wang, Chunxia Xiao, Chao Liang
Title: Quantum Interference-Inspired Who-What-Where Composite-Semantics Instance Search for Story Videos
Paperid: 926,
Authors: Ru Jia, Xiaoqian Liang, Xubin Duan, Jianji Wang, Nanning Zheng
Title: HybridPlane: A General 4D Representation for Dynamic Scene Reconstruction
Paperid: 927,
Authors: Fan Wang, Zhangjie Fu, Xiang Zhang, Ziqiang Li, Ziwen He, Manyu Wang
Title: Pair-wise Confidence Difference-based Pseudo-Label Selection for Universal Mismatched Steganalysis
Paperid: 928,
Authors: Xiaoyan Yuan, Wei Wang, Junxin Chen, Xiping Hu
Title: Reading Between the Channels: Knowledge-Augmented Medical Time Series Classification
Paperid: 929,
Authors: Yufei Zheng, Jiawei Liu, Bingyu Hu, Zikun Wei, Yong Wu, Zheng-Jun Zha
Title: Dual Uncertainty-Guided Feature Alignment Learning for Text-Based Person Retrieval
Paperid: 930,
Authors: Benlong Wu, Yuang Qi, Xiuwei Shang, Weiming Zhang, Nenghai Yu, Kejiang Chen
Title: MMPro: A Decoupled Perception-Thinking-Action Framework for Secure GUI Agent
Paperid: 931,
Authors: Daoxu Sheng, Qi Qi, Jing-Yu Wang, Jianxin Liao
Title: Watch, Skip, Repeat: Hotspot-Aware Joint Optimization for Video Streaming
Paperid: 932,
Authors: Shilin Liu, Kyohei Kamikawa, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama
Title: Context-aware Image-to-Music Generation via Bridging Modalities through Musical Captions
Paperid: 933,
Authors: Shanding Diao, Yang Zhao, Yuan Chen, Zhao Zhang, Wei Jia, Ronggang Wang
Title: Multi-Layer Gaussian Splatting for Single-Image Feed-Forward Spatial Scene Reconstruction
Paperid: 934,
Authors: Mingle Zhou, Xingli Wang, Jiachen Li, Delong Han, Gang Li
Title: Unsupervised Dual-Domain Memory Model for Time Series Anomaly Detection
Paperid: 935,
Authors: Ziyun Qian, Xiao-Zeyu, Xingliang Jin, Dingkang Yang, Mingcheng Li, Zhenyi Wu, Dongliang Kou, Peng Zhai, Lihua Zhang
Title: UMSD: High Realism Motion Style Transfer via Unified Mamba-based Diffusion
Paperid: 936,
Authors: Xinyi Wang, Pengfei Ren, Haoyang Zhang, Xin Sheng, Da Li, Liang Xie, Yue Gao, Erwei Yin
Title: VIHand: Enhancing 3D Hand Pose Estimation with Visual-Inertial Benchmark
Paperid: 937,
Authors: Stefan Arzberger, Paul Raith, Werner Bailer, Marion Jaks
Title: A Dataset and Metric for Textual Video Content Description
Paperid: 938,
Authors: Minyi Zhao, Yi Liu, Wensong He, Bingzhe Yu, Yuxi Mi, Shuigeng Zhou
Title: Towards High Robust Vision-Language Large Models: Benchmark and Method
Paperid: 939,
Authors: Ines Riahi, Abduljalil Radman, Zixin Guo, Rachid Hedjam, Jorma Laaksonen
Title: Valor32k-AVQA v2.0: Open-Ended Audio-Visual Question Answering Dataset and Benchmark
Paperid: 940,
Authors: Alessandro Ragano, Carl Tolentino, Kata Szita, Dan Barry, Davoud Panah, Niall Murray, Andrew Hines
Title: EgoMusic: An Egocentric Augmented Reality Glasses Dataset for Music
Paperid: 941,
Authors: Guillaume Gautier, Xuemei Zhou, Nguyen Thong, Jack Jansen, Louis Fréneau, Marko Viitanen, Uyen Phan, Jani Käpylä, Irene Viola, Alexandre Mercat, Pablo Cesar, Jarno Vanne
Title: UVG-CWI-DQPC: Dual-Quality Point Cloud Dataset for Volumetric Video Applications
Paperid: 942,
Authors: Debora Russo, Nicola Mazzocca, Valeria Vittorini
Title: UR-MAT: A Multimodal, Material-Aware Synthetic Dataset of Urban Scenarios
Paperid: 943,
Authors: Negin Ghamsarian, Raphael Sznitman, Klaus Schoeffmann, Jens Kowal
Title: WetCat: Automating Skill Assessment in WetLab Cataract Surgery Videos
Paperid: 944,
Authors: Fanshen Meng, Zhenhua Meng, Ru Jin, Yuli Chen, Rongheng Lin, Budan Wu
Title: TAMER: Interest Tree Augmented Modality Graph Recommender for Multimodal Recommendation
Paperid: 945,
Authors: Xu Chen, Yang Li, Yahong Han, Jialie Shen
Title: Ex Pede Herculem, Predicting Global Actionness Curve from Local Clips
Paperid: 946,
Authors: Bohao Zhang, Haoxin Xu, Jingzhong Lin, Changbo Wang, Gaoqi He
Title: Regulatory Focus Theory Induced Micro-Expression Analysis with Structured Representation Learning
Paperid: 947,
Authors: Sifan Zhou, Ziyu Zhao, Jiahao Nie, Yichao Cao, Xiaobo Lu
Title: FocusTrack: One-Stage Focus-and-Suppress Framework for 3D Point Cloud Object Tracking
Paperid: 948,
Authors: Yuxuan Zhang, Bo Wang, Yu Du, Yangfu Zhu, Haorui Wang, Guangyao Su, Tao Zhou, Bin Wu
Title: Cause and Effect: Video Social Relationship Recognition from Causal Perspective
Paperid: 949,
Authors: Yixuan Gao, Xiongkuo Min, Jinliang Han, Yuqin Cao, Sijing Wu, Yunze Dou, Guangtao Zhai
Title: Multi-Dimensional Text-to-Face Image Quality Assessment Using LLM: Database and Method
Paperid: 950,
Authors: Xudong Wang, Lei Tan, Pingyang Dai, Liujuan Cao, Rongrong Ji
Title: GPT-ReID: Learning Fine-grained Representation with GPT for Text-based Person Retrieval
Paperid: 951,
Authors: Shuoshuo Li, Shuli Cheng, Liejun Wang
Title: Entity-Level Alignment with Prompt-Guided Adapter for Remote Sensing Image-Text Retrieval
Paperid: 952,
Authors: Jiale Yu, Baopeng Zhang, Zhu Teng, Jianping Fan
Title: OV-DAVEL: Towards Open-Vocabulary Dense Audio-Visual Event Localization in Untrimmed Videos
Paperid: 953,
Authors: Yongjie Hu, Yifan Jiang, Ziyun Li, Fei Gao, Henrik Boström, Nannan Wang
Title: CADQ: Attribute-Consistent Face Cartoonization with Cross-modal Aligned and Deformable Quantization
Paperid: 954,
Authors: Yang Deng, Yu-Kun Lai, Paul Rosin
Title: CCDb+: Enhanced Annotations and Multi-Modal Benchmark for Natural Dyadic Conversations
Paperid: 955,
Authors: Yuxiong Xu, Bin Li, Weixiang Li, Sara Mandelli, Viola Negroni, Sheng Li
Title: ALDEN: Dual-Level Disentanglement with Meta-learning for Generalizable Audio Deepfake Detection
Paperid: 956,
Authors: Runwei Situ, Yi Cai, Yong Xu, Jiexin Wang
Title: Ground and Reconstruct: Entity-Region Bidirectional Alignment Pre-Training for Low-Resource GMNER
Paperid: 957,
Authors: Guoqiang Liang, Chuan Qin, De Cheng, Shizhou Zhang, Yanning Zhang
Title: Boosting Multi-Modal Alignment: Geometric Feature Separation for Class Incremental Learning
Paperid: 958,
Authors: Pingting Hao, Huijie Zhang, Yongshan Zhang
Title: Tensor-based Opposing yet Complementary Learning for Multi-view Multi-label Feature Selection
Paperid: 959,
Authors: Hongzhao Li, Hualei Wan, Liangzhi Zhang, Mingyuan Jiu, Shupan Li, Mingliang Xu, Muhammad Haris Khan
Title: Towards Robust Multimodal Domain Generalization via Modality-Domain Joint Adversarial Training
Paperid: 960,
Authors: Zhiyu Ye, Guowen Li, Haoyuan Liang, Zixi Wang, Shilei Cao, Yushan Lai, Juepeng Zheng
Title: Quantifying Samples with Invariance for Source-Free Class Incremental Domain Adaptation
Paperid: 961,
Authors: Huilin Chen, Miaomiao Cai, Fan Liu, Zhiyong Cheng, Richang Hong, Meng Wang
Title: I³-MRec: Invariant Learning with Information Bottleneck for Incomplete Modality Recommendation
Paperid: 962,
Authors: Yongquan Xue, Zhaoru Guo, Zhaozhao Su, Chong Peng, Jun Feng, Pan Zhou, Marcin Pietron, Xiyuan Wang, Liejun Wang, Panpan Zheng
Title: RoDeCon-Net: Medical Image Segmentation via Robust Decoupling and Contrast-Enhanced Fusion
Paperid: 963,
Authors: Liangyu Fu, Junbo Wang, Yuke Li, Qiangguo Jin, Hongsong Wang, Ya Jing, Linjiang Huang, Liang Yao, Jiangbin Zheng, Xuecheng Wu, Zhiyong Wang
Title: DSACap: Enhancing Visual-Semantic Alignment with Diffusion-based Framework for Image Captioning
Paperid: 964,
Authors: Haizhou Wang, Guobing Zou, Fei Xu, Yangguang Cui, Tongquan Wei
Title: Multi-Width Neural Network-Assisted Hierarchical Federated Learning in Heterogeneous Cloud-Edge-Device Computing
Paperid: 965,
Authors: Yue Pan, Cunbo Li, Peiyang Li, Fali Li, Feng Wan, Dezhong Yao, Zehong Cao, Peng Xu
Title: Real-Time EEG Emotion Recognition from Dynamic Mixed Spatiotemporal Graph Learning
Paperid: 966,
Authors: Chuanwei Huang, Zexi Jia, Hongyan Fei, Yeshuang Zhu, Yuan Zhiqiang, Jinchao Zhang, Jie Zhou
Title: ArtFRD: A Fisher-Rao Mixture Metric for Generative Model Aesthetic Evaluation
Paperid: 967,
Authors: Xueyi Zhang, Peiyin Zhu, Jinping Sui, Xiaoda Yang, Jiahe Tian, Mingrui Lao, Siqi Cai, Yanming Guo, Jun Tang
Title: Choose Your Expert: Uncertainty-Guided Expert Selection for Continual Deepfake Detection
Paperid: 968,
Authors: Luyao Ren, Wenxin Yu, Zhiqiang Zhang, Chang Liu
Title: EMIFS: Efficient Multi-scale Information Fusion Self-supervision for Medical Image Segmentation
Paperid: 969,
Authors: Man Xiao, Jianbin Ye, Bo Liu, Zijian Gao, Kele Xu, Xiaodong Wang
Title: Analytic Synaptic Dynamic Scaling Balancer for Multimodal Deepfake Continual Detection
Paperid: 970,
Authors: Zixin Tang, Haihui Fan, Jinchao Zhang, Hui Ma, Xiaoyan Gu, Bo Li, Weiping Wang
Title: ShieldIR: Privacy-Preserving Unsupervised Cross-Domain Image Retrieval via Dual Protection Transformation
Paperid: 971,
Authors: Bo Wang, Jin Liu, Huiyuan Fu, Xin Wang, Heng Zhang, Huadong Ma
Title: Severe Light, Textureless Sight: A Benchmark for Extreme Exposure Correction
Paperid: 972,
Authors: Zihao Mo, Junye Chen, Chaowei Fang, Guanbin Li
Title: PatchWiper: Leveraging Dynamic Patch-Wise Parameters for Real-World Visible Watermark Removal
Paperid: 973,
Authors: Gang Pan, Liming Pan, Hongze Mi, Rongyu Xiong, Jiahao Wang, Di Sun
Title: AFFIR: Dual-Modal Attention Feature Fusion for Scene Text Image Retargeting
Paperid: 974,
Authors: Yijie Yang, Lianyong Qi, Weiming Liu, Fan Wang, Jing Du, Yuwen Liu, Xiaolong Xu, Qiang Ni, Wanchun Dou, Xiaokang Zhou
Title: Joint Test-time Adaptation with Refined Pseudo-labels and Latent Score Matching
Paperid: 975,
Authors: Zuona Chen, James She
Title: Infusing AI Art with Cultural Authenticity Through the Culture-Specific LoRA
Paperid: 976,
Authors: Yujiang Li, Zhili Zhou, Ruohan Meng, Baowei Wang, Xiaojuan Wang, Cheng Qiao, Jiantao Zhou
Title: Zero Matrix guided Adaptive Image Vaccine against Diffusion Model-based Multitask
Paperid: 977,
Authors: Lingbo Zhang, Bingqian Sun, Linghan Cai, Yifeng Wang, Ye Zhang, Songhan Jiang, Kai Zhang, Yongbing Zhang
Title: Counting by Points: Density-Guided Weakly-Supervised Nuclei Segmentation in Histopathological Images
Paperid: 978,
Authors: Lei Chen
Title: Graph-Perceptron with Semantic Fidelity for No-Reference Super-Resolution Image Quality Assessment
Paperid: 979,
Authors: Zhihao Wang, Shiyu Liu, Zhiwei He, Kangjie Zheng, Liangying Shao, Junfeng Yao, Jinsong Su
Title: Gloss Matters: Unlocking the Potential of Non-Autoregressive Sign Language Translation
Paperid: 980,
Authors: Xiaodi Xu, Lijie Li, Ye Wang, Tao Ren, Tian Qiao
Title: WFF: Wavelet-based Information Fusion for Multimodal Knowledge Graph Link Prediction
Paperid: 981,
Authors: Yongji Li, Luping Wang
Title: Spatial-Frequency Mamba Collaborative Learning Network for Infrared Small Target Detection
Paperid: 982,
Authors: Shao Jiang, Xinbo Zhao, XiaoChun Zou, XiaoLin Ye
Title: EgoHierMask: Hierarchical Semantic-Prior Guided Masked Autoencoder for Egocentric Action Recognition
Paperid: 983,
Authors: Hongyu Jiang, Yuxin Huo, Sirou Sheng, Hong Tao, Chenping Hou
Title: Scalable One-step Unaligned Multi-view Clustering via Joint High-Order Correlation Learning
Paperid: 984,
Authors: Xiangyu Shan, Heng Song, Junwu Zhu
Title: DFCNet: Dual-Factor Compensatory Clustering Network for Modality-Imbalanced Generalized Zero-Shot Learning
Paperid: 985,
Authors: Bingshuai Liu, Ante Wang, Zijun Min, Chenyang Lyu, Longyue Wang, Zhihao Wang, Xu Han, Peng Li, Jinsong Su
Title: EditEval: Towards Comprehensive and Automatic Evaluation for Text-guided Video Editing
Paperid: 986,
Authors: Xueheng Li, Xuanhua He, Tao Hu, Jie Zhang, Man Zhou, Chengjun Xie, Yingying Wang, Bo Huang
Title: Freq-RWKV: Granularity-Aware Spatial-Frequency Synergy via Dual-Domain Recurrent Scanning for Pan-sharpening
Paperid: 987,
Authors: Tianyi Ma, Maoying Qiao
Title: EBaR: Efficient Buffer and Resetting for Single-Sample Continual Test-Time Adaptation
Paperid: 988,
Authors: Yan Chen, Bingbing Jiang, Peng Zhou, Lei Duan, Yuhua Qian, Liang Du
Title: Balanced Multiple Kernel Clustering with Discrete Partition Entropy Auto Regularization
Paperid: 989,
Authors: Feiyu Peng, Chaobo He, Junwei Cheng, Huijuan Hu, Wenkai Zhang, Youda Mo
Title: Frequency-refined Graph Convolution Network with Cross-modal Wavelet Denoising for Recommendation
Paperid: 990,
Authors: Xiaokun Wang, Yuting Yan, Sheng Zhang, Andong Zhu, Ning Chen, Yu Chen, Zhuzhong Qian, Sanglu Lu, Yu Liang
Title: Decode-What-Matters: Frame-Level Parallel Generative Decoding to Accelerate Large-Scale Video Analytics
Paperid: 991,
Authors: Yamiao Ding, Tianrui Liu, Zhizhou Lu, Jun-Jie Huang, Zhao Wentao, Xinwang Liu, Meng Wang
Title: VSumMamba: Mamba Empowered Efficient Video Summarization with Multi-Scale Spatial-Temporal Modeling
Paperid: 992,
Authors: Yiyang Gu, Taian Guo, Hang Zhou, Zihao Chen, Zhiping Xiao, Yifang Qin, Xiao Luo, Wei Ju, Yifan Wang, Ming Zhang
Title: CODE: Towards Partial Label Graph Learning via Coupled Dual Separation
Paperid: 993,
Authors: Chunshi Wang, Hongxing Li, Yawei Luo
Title: SonicGauss: Interactive Position-aware Impact Audio Synthesis for 3D Gaussian Splatting
Paperid: 994,
Authors: Tengyu Ma, Jiafa Ruan, Yuetong Wang, Guangchao Han, Zhu Liu, Long Ma, Risheng Liu
Title: Degradation-Aware One-Step Diffusion Model for Content-Sensitive Super-Resolution in the Dark
Paperid: 995,
Authors: Domenic Zingsheim, Markus Plack, Hannah Dröge, Janelle Pfeifer, Patrick Stotko, Matthias Hullin, Reinhard Klein
Title: RIFTCast: A Template-Free End-to-End Multi-View Live Telepresence Framework and Benchmark
Paperid: 996,
Authors: Hui Wu, Haoquan Zhai, Yuchen Li, Hengyi Cai, Peirong Zhang, Yidan Zhang, Lei Wang, Chunle Wang, Yingyan Hou, Shuaiqiang Wang, Dawei Yin
Title: MARA: A Multimodal Adaptive Retrieval-Augmented Framework for Document Question Answering
Paperid: 997,
Authors: Yunlong Zhao, Xiaoheng Deng, Zhuohua Qiu, Feng Yang, Chang Xu, Shan You, Xiangjian He, Xiu Su
Title: CaDGS: Modeling Inter-Gaussian Mutual Information for Dynamic Novel View Synthesis
Paperid: 998,
Authors: Wenli Zheng, Huiyuan Fu, Xicong Wang, Hao Kang, Chuanming Wang, Jin Liu, Zekai Xu, Heng Zhang, Huadong Ma
Title: EvRAW: Event-guided Structural and Color Modeling for RAW-to-sRGB Image Reconstruction
Paperid: 999,
Authors: Si Chen, Yujia Chen, Xiaotian Yin, Xin Liu, Huakai Lai, Tianzhu Zhang
Title: PAF: Prototype Adaptive Fusion for Test-Time Adaptation of Vision-Language Models
Paperid: 1000,
Authors: Chi Huang, Qi Zhang, Qian Zhang, Nan Li, Yipu Gong, Xiaowei Wang, Wei Feng
Title: TriGS: Tri-consistency 3D Gaussian Splatting from Sparse and Unposed Views
Paperid: 1001,
Authors: Mingrui Li, Shuhao Zhai, Zibing Zhao, Luyue Sun, Xinxiao Wang, Dong Li, Shuhong Liu, Hongyu Wang
Title: Wild3A: Novel View Synthesis from Any Dynamic Images in Seconds
Paperid: 1002,
Authors: Peng Ying, Zhongnian Li, Meng Wei, Xinzheng Xu
Title: Reversible Privacy Preserving on Vision-Language Models via Adversarial Multimodal Key
Paperid: 1003,
Authors: Zhaoyun Jiang, Jiaqi Guo, Shakie Liu, Chao Han, Ting Liu, Jian-Guang Lou, Dongmei Zhang
Title: Illustration Layout Generation for Slide Enhancement with Pixel-based Diffusion Model
Paperid: 1004,
Authors: Qiuna Tan, Runqi Qiao, Guanting Dong, YiFan Zhang, Minhui Wu, Jiapeng Wang, Miaoxuan Zhang, Yida Xu, Chong Sun, Chen Li, Honggang Zhang
Title: OCR-Critic: Aligning Multimodal Large Language Models’ Perception through Critical Feedback
Paperid: 1005,
Authors: Yongzheng Liu, Siru Zhong, Gefeng Luo, Weilin Ruan, Yuxuan Liang
Title: Towards Multi-Scenario Forecasting of Building Electricity Loads with Multimodal Data
Paperid: 1006,
Authors: Xingbo Yao, Xuanmin Wang, Hui Xiong
Title: CitySculpt: 3D City Generation from Satellite Imagery with UV Diffusion
Paperid: 1007,
Authors: Haichuan Fang, Haoran Zhang, Yulin Du, Qiang Guo, Zhen Tian, Youwei Wang, Yangdong Ye
Title: CDIB: Consistency Discovery-guided Information Bottleneck for Multi-modal Knowledge Graph Reasoning
Paperid: 1008,
Authors: Wei Li, Yizhao Wan, Xiao Wu, Jianshuai Wang, Penglin Dai, Zhaoquan Yuan
Title: HOPNet: Learning Hand-Object-Person Interaction Network for Hand Contact State Detection
Paperid: 1009,
Authors: Yunyu Zou, Yishu Liu, Jun Liang, Bingzhi Chen
Title: SG-FSL: Cross-Domain Few-Shot Learning with Style-Decoupled Augmentation and Gradient-Conflict Adjustment
Paperid: 1010,
Authors: Zhenxuan Fang, Shuaibo Wang, Weisheng Dong, Junwei Xu, Fangfang Wu, Xin Li, Guangming Shi
Title: Beyond Visual Quality: Fidelity-Oriented Diffusion Model for Real-world Image Super-Resolution
Paperid: 1011,
Authors: Xuyao Liu, Jiahui Qu, Wenqian Dong
Title: Breaking the Spatial-Temporal Consistency Constraint: Towards Reference-Based Hyperspectral Image Super-Resolution
Paperid: 1012,
Authors: Xiangfei Sheng, Pangu Xie, Weidong Zou, Pengfei Chen, Tong Zhu, Leida Li
Title: InstructCrop: Teaching Multimodal Large Language Models to Crop Aesthetic Images
Paperid: 1013,
Authors: Qingtian Bian, Tieying Li, Marcus De Carvalho, Jiaxing Xu, Hui Fang, Yiping Ke
Title: Multi-Domain Enhancement via Residual Interwoven Transfer in Cross-Domain Sequential Recommendation
Paperid: 1014,
Authors: Jian Zhou, Yingjie Xie, Cunhang Fan, Huabin Wang, Zhao Lv, Liang Tao
Title: DHGCN: Dual HyperGraph Convolutional Network for EEG-Based Auditory Attention Detection
Paperid: 1015,
Authors: Xueyi Zhang, Jialu Sun, Chengwei Zhang, Xianghu Yue, Tianfang Xiao, Siqi Cai, Mingrui Lao, Haizhou Li
Title: EventLip: Enhancing Event-Based Lip Reading via Frequency-Aware Spatiotemporal Hypergraph Modeling
Paperid: 1016,
Authors: Teng Jin, Ziwen He, Zhangjie Fu, Songping Wang, Yueming Lyu, Yufei Shi
Title: Frequency Domain Distributed Perturbations: Towards Query-Efficient Black-Box Adversarial Video Attack
Paperid: 1017,
Authors: Yixin Xu, Hao Wu, Jingzhou Zhu, Fengyuan Xu, Sheng Zhong
Title: PriCAF: Privacy-Preserving Contribution Assessment in Federated Learning Before Model Training
Paperid: 1018,
Authors: Zhaohui Jiang, Xuening Feng, Tianchi Huang, Ruixiao Zhang, Paul Weng, Yifei Zhu
Title: Progressive Learning with Human Feedback for Personalized Adaptive Video Streaming
Paperid: 1019,
Authors: Shaohua Liu, Ning Gao, Zuoya Gu, Hongkun Dou, Yue Deng, Hongjue Li
Title: Spatiotemporal Degradation-Aware 3D Gaussian Splatting for Realistic Underwater Scene Reconstruction
Paperid: 1020,
Authors: YuXin Xie, Dongyue Chen, Yue Zhu, Tong Jia, Shizhuo Deng
Title: Noise-Aware Decoding with Salient Region Enhancing for Zero-Shot Image Captioning
Paperid: 1021,
Authors: Linxin Xiao, Xin Wang, Zeyang Zhang, Yang Yao, Wenwu Zhu
Title: DyNAS-DDI: Dynamic Pairwise Architecture Search for Generalizable Drug-Drug Interaction LLM
Paperid: 1022,
Authors: Ting Xiao, Minqian Sun, Yiqing Xia, Zhe Wang
Title: Dual-Prototype Learning in Multiple Instance Learning for Histopathology Image Classification
Paperid: 1023,
Authors: Hyungjun Doh, Dong Lee, Seunggeun Chi, Pin-Hao Huang, Kwonjoon Lee, Sangpil Kim, Karthik Ramani
Title: Occlusion-Aware and Consistent Amodal Completion for 3D Human-Object Interaction Reconstruction
Paperid: 1024,
Authors: Hang Yang, Le Hui, Jianjun Qian, Jian Yang, Yigong Zhang, Jin Xie
Title: Cross-View Geometric Collaboration for Generalizable Sparse View Neural Surface Reconstruction
Paperid: 1025,
Authors: Zhishuo Zhao, Yi Lin, Dongyue Guo, Junyu Fan
Title: AV-RISE: Hierarchical Cross-Modal Denoising for Learning Robust Audio-Visual Speech Representation
Paperid: 1026,
Authors: Xuan Hai, Xin Liu, Zihao Zhang, Ziyao Yu, Kong Xiangzhen, Song Li, Weina Niu, Rui Zhou, Qingguo Zhou
Title: SiFMimicEvader: Evading Fake Voice Detection with Adversarial Neural Mimicry Attacks
Paperid: 1027,
Authors: Hong Gao, Xiangkai Xu, Tianqi Zhu, Xiugang Dong, Yiming Bao, Min-Ling Zhang
Title: Radar-Mamba: 4D Millimeter-Wave Point Cloud Enhancement via State Space Models
Paperid: 1028,
Authors: Min Dang, Gang Liu, Jingqi Zhao, Adams Kong, Nan Luo, Di Wang
Title: DDFD: Diffusion-Based Denoising Fusion for Object Detection in Infrared-Visible Images
Paperid: 1029,
Authors: Wenpeng Lang, Saihui Hou, Yongzhen Huang
Title: Beyond Sparse Keypoints: Dense Pose Modeling for Robust Gait Recognition
Paperid: 1030,
Authors: Xiangui Huang, Taotao Lai, Yizhang Liu, Shuyuan Lin, Zuoyong Li
Title: Two-View Correspondence Pruning via Channel-Spatial Interaction and Bidirectional Consensus Interaction
Paperid: 1031,
Authors: Yadong Huo, Qibing Qin, Wenfeng Zhang, Lei Huang, Jie Nie
Title: Factorized Transformer Hashing with Adaptive Routing for Large-scale Image Retrieval
Paperid: 1032,
Authors: Guipeng Xv, Xinyu Li, Yi Liu, Chen Lin, Xiaoli Wang
Title: Unveiling the Impact of Multi-modal Content in Multi-modal Recommender Systems
Paperid: 1033,
Authors: Mingrui Li, Dong Li, Sijia Hu, Kangxu Wang, Zhenjun Zhao, Hongyu Wang
Title: SLAM-X: Generalizable Dynamic Removal for NeRF and Gaussian Splatting SLAM
Paperid: 1034,
Authors: Yuntian Xiao, Shoulong Zhang, Zihang Zhang, Jiahao Cui, Yan Wang, Shuai Li
Title: Phys4DRT: Physics-based 4D Generation for Real-Time Interaction with Time-Frequency Supervision
Paperid: 1035,
Authors: Na Li, Zihao Li, Zuoli Tang, Yuqing Yu, Lixin Zou, Chenliang Li
Title: Bridging the Gap: Consistent Image Outpainting via Training-Free Noise Optimization
Paperid: 1036,
Authors: Jiahuan Long, Wen Yao, Tingsong Jiang, Jiacheng Hou, Shuai Jia, Junqi Wu, Xiaoya Zhang, Xiaohu Zheng, Chao Ma
Title: CDUPatch: Color-Driven Universal Adversarial Patch Attack for Dual-Modal Visible-Infrared Detectors
Paperid: 1037,
Authors: Yingzhen Zhang, Jimin Dai, Qianliang Wu, Jian Yang, Lei Luo
Title: DNCOT: Diffusion-Cascaded Neural Optimal Transport for Scalable Multi-Domain Image-to-Image Translation
Paperid: 1038,
Authors: Xiaodong Wang, Hongmin Hu, Fei Yan, Lu Junwen, Zhiqiang Zeng, Weidong Hong, Zhedong Zheng
Title: UniAD: Integrating Geometric and Semantic Cues for Unified Anomaly Detection
Paperid: 1039,
Authors: Mufan Liu, Wu Ran, Zhiquan He, Zuojie Xie, Hong Lu, Peirong Ma
Title: Implicit Retinex Decomposition with Chromaticity Disentanglement for Low-Light Image Enhancement
Paperid: 1040,
Authors: Quangui He, Jiahui Qu, Wenqian Dong, Song Xiao, Qinghao Gao
Title: Cycle-Consistent Mamba-Based Registration-Fusion Joint Network for Unregistered Hyperspectral Image Super-Resolution
Paperid: 1041,
Authors: Zhihao Jia, Meiyan Xu, Jingyuan Wang, Ziyu Jia, Yong Li, Xinliang Zhou, Chenyu Liu, Junfeng Yao, Yi Ding
Title: Sera: Separated Coarse-to-fine Representation Alignment for Cross-subject EEG-based Emotion Recognition
Paperid: 1042,
Authors: Wenrui Liu, Qian Chen, Wen Wang, Guanrou Yang, Weiqin Li, Minghui Fang, Jialong Zuo, Xiaoda Yang, Tao Jin, Jin Xu, Zemin Liu, Yafeng Chen, Bai Jionghao, Zhifang Guo
Title: Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation
Paperid: 1043,
Authors: Yan Zhang, Shiwen He, Lin Yuan, Jiaxu Leng, Xinbo Gao
Title: DichotomyIR: Universal Image Reconstruction via Dichotomy Classification and Uncertainty Elimination
Paperid: 1044,
Authors: Xiong Li, Yikang Yan, Zhenyu Wen, Qin Yuan, Fangda Guo, Zhen Hong, Ye Yuan
Title: Open3DSearch: Zero-Shot Precise Retrieval of 3D Shapes Using Text Descriptions
Paperid: 1045,
Authors: Hui Zhang, Yiteng Xu, Yonglin Tian, Yidong Li, Tiago Falk, Fei-Yue Wang
Title: Selective Shift: Towards Personalized Domain Adaptation in Multi-Agent Collaborative Perception
Paperid: 1046,
Authors: Yiqiang Guo, Lei Zhong, Bin Chen, Jia-Li Yin, Xiaolei Liu, Shouling Ji
Title: Focus on Generalization: Improving Adversarial Transferability via Bi-Level Bias Mitigation
Paperid: 1047,
Authors: Lei Liu, Xiangdong Su, Guanglai Gao
Title: Fourier Self-Adaptation for Transferring General Pretrained Models to Specific Domains
Paperid: 1048,
Authors: Xiaorui Ding, Huan Ma, Changqing Zhang
Title: A Theoretical Proof of Dynamic Multimodal Fusion Exacerbates Modality Greedy
Paperid: 1049,
Authors: He Wang, Longquan Dai, Shihao Pu, Shaomeng Wang, Jinhui Tang
Title: Generative Semantic Probing for Vision-Language Models via Hierarchical Feature Optimization
Paperid: 1050,
Authors: Yi Han, Yaochen Li, Peijun Chen, Wenlong Zhou, Jinhuo Yang, Jintao Chang
Title: SVDGNet: Shapley Value-Based Weight Adjustment for Unsupervised Image Style Transfer
Paperid: 1051,
Authors: Zizhuo Li, Chunbao Su, Fan Fan, Jun Huang, Jiayi Ma
Title: CorrNeXt: Making the ConvNet-Style Correspondence Pruner Stronger for Two-View Geometry
Paperid: 1052,
Authors: Yijun Wang, Siying Wu, Lubin Gan, Zheyu Zhang, Jing Zhang, Zhangchi Hu, Huyue Zhu, Peixi Wu, Xiaoyan Sun
Title: MeDKCoOp: Dual Knowledge-guided Graph Prompt Learning for Biomedical Vision-Language Models
Paperid: 1053,
Authors: Siyuan Huang, Jiahui Jin, Xin Lin, Xigang Sun, Yukun Ban
Title: IM-POI: Bridging ID and Multi-modal Gaps in Next POI Recommendation
Paperid: 1054,
Authors: Jiaqi Hou, Kewei Zhang, Tianyu Yang, Chengyu Jia, Qiqi Lin, Hui Wei, Zheng Wang
Title: FAB-Attack: Fabric-Aware Adversarial Attacks on Person Detectors under Motion Blur
Paperid: 1055,
Authors: Jin Han, Yixin Yang, Zhan Zhan, Boxin Shi, Imari Sato
Title: EDeF-Net: Spatio-temporal Association Network for Flicker Removal in Event Streams
Paperid: 1056,
Authors: Yanwei Xie, Weizhi Nie, Lanjun Wang, Hongshuo Tian, Changtai Shi, Anan Liu
Title: When Headlines Meet Minds: Empowering News Recommendations with Social Simulator
Paperid: 1057,
Authors: Abel Chai, Kelly Jee, Sue Han Lee, Tay Siang, Jules Vandeputte, Hervé Goeau, Pierre Bonnet, Alexis Joly
Title: Deep-Plant-Disease Dataset Is All You Need for Plant Disease Identification
Paperid: 1058,
Authors: Chenhui Qiang, Zhaoyang Wei, Xumeng Han, Zipeng Wang, Siyao Li, Xiangyuan Lan, Jianbin Jiao, Zhenjun Han
Title: VER-Bench: Evaluating Multimodal LLM on Reasoning with Fine-Grained Visual Evidence
Paperid: 1059,
Authors: Zihao Wang, Shulei Ji, Le Ma, Yuhang Jin, Shun Lei, Jianyi Chen, Haoying Fu, Roger Dannenberg, Kejun Zhang
Title: Multi-Accent Mandarin Dry-Vocal Singing Dataset: Benchmark for Singing Accent Recognition
Paperid: 1060,
Authors: Zhihua Wang, Weixia Zhang, Wei Zhou, Xiaohong Liu, Guangtao Zhai, Patrick Callet
Title: Evaluating Perceptual Color Preferences in Smartphone Photography: Dataset and Challenges
Paperid: 1061,
Authors: Liang Cheng, Hao Wang, Chenwei Wu, Haochen You, Xianhao Wu
Title: Unlocking Joint Image Deraining and Low-Light Enhancement: Benchmark and Baseline
Paperid: 1062,
Authors: Felix Immohr, Gareth Rendle, Annika Neidhardt, Anton Lammert, Bernd Froehlich, Alexander Raake
Title: ICS-MR: Interactive Conversation Scenarios for Assessment of Mixed Reality Communication
Paperid: 1063,
Authors: Mohammad Ghasempour, Hadi Amirpour, Christian Timmerer
Title: Nature-1k: The Raw Beauty of Nature in 4K at 60FPS
Paperid: 1064,
Authors: Sowmya Vijayakumar, Tong Xue, Abdallah Ali, Irene Viola, Ronan Flynn, Peter Corcoran, Pablo Cesar, Niall Murray
Title: RCQoEA-360VR: Real-time Continuous QoE scores for HMD-based 360° VR dataset
Paperid: 1065,
Authors: Tariq Shoura, Ali Dehaghi, Reza Razavi, Mohammad Moshirpour
Title: VIDEA-8K-60FPS Dataset: 8K 60FPS Video Sequences for Analysis and Development
Paperid: 1066,
Authors: Linlin Zong, Shilin Sui, Wenjun Liang, Wanyu Song, Linlin Tian, Xinyue Liu, Xianchao Zhang, Bo Xu
Title: CH-SV: A Benchmark for Multi-Type Chinese Harmful Short Video Detection
Paperid: 1067,
Authors: Chenxi Wang, Yusheng Dai, Lei Sun, Jun Du, Jianqing Gao
Title: AudioAtlas: A Comprehensive and Balanced Benchmark Towards Movie-Oriented Text-to-Audio Generation
Paperid: 1068,
Authors: Sijing Wu, Yunhao Li, Huiyu Duan, Yanwei Jiang, Yucheng Zhu, Guangtao Zhai
Title: HVEval: Towards Unified Evaluation of Human-Centric Video Generation and Understanding
Paperid: 1069,
Authors: Ming Li, Yupeng Hu, Yinwei Wei, Hao Liu, Haocong Wang, Weili Guan
Title: DCount: Decoupled Spatial Perception and Attribute Discrimination for Referring Expression Counting
Paperid: 1070,
Authors: Zeyan Li, Cankun Guo, Yin Tang
Title: Modal Symbiosis: Variational Alignment Unveils New Horizons in Multimodal Representation Learning
Paperid: 1071,
Authors: Yue He, Jingxi Xie, Fengling Li, Lei Zhu, Jingjing Li
Title: Flip is Better than Noise: Unbiased Interest Generation for Multimedia Recommendation
Paperid: 1072,
Authors: Wenyu Yin, Shuyuan Lin, David Suter, Hanzi Wang
Title: Adaptive Graph Attention-Guided Parallel Sampling and Embedded Selection for Multi-Model Fitting
Paperid: 1073,
Authors: Renxiang Guan, Junhong Li, Siwei Wang, Wenxuan Tu, Miaomiao Li, En Zhu, Xinwang Liu, Chenping Chenping
Title: Multi-view Graph Clustering with Dual Relation Optimization for Remote Sensing Data
Paperid: 1074,
Authors: Ren Wang, Xin Wang, Tongtong Feng, Xinyue Gong, Guangyao Li, Yu-Wei Zhan, Qing Li, Wenwu Zhu
Title: Improving Compositional Generalization in Cross-Embodiment Learning via Mixture of Disentangled Prototypes
Paperid: 1075,
Authors: Liqi Yan, Xuebin Li, Jianhui Zhang, Fangli Guan, Kanglei Peng, Pan Li
Title: F-DDIM: A Featurized Denoising Diffusion Implicit Model for Facial Image Steganography
Paperid: 1076,
Authors: Min Tan, Guanhao Liu, Huijing Zhan, Yuyu Yin, Zhou Yu, Jiajun Ding, Yinfu Feng
Title: DiSCo: Disentangled Attribute Manipulation Retrieval via Semantic Reconstruction and Consistency Regularization
Paperid: 1077,
Authors: De Li, LQY LQY, Zhou Tan, Zeming Gan, Tiange Xia, Xianxian Li, Jinyan Wang
Title: FedRog: Robust Federated Graph Classification for Strong Heterogeneity and High-Noise Scenarios
Paperid: 1078,
Authors: Kai Zhu, Jun Yin
Title: Neighbor Contrastive Learning with Weakened Consensus Graph for Deep Multi-View Clustering
Paperid: 1079,
Authors: Te Song, Lianyong Qi, Weiming Liu, Fan Wang, Xiaolong Xu, Hongsheng Hu, Yang Cao, Xuyun Zhang, Amin Beheshti
Title: Boosting Guided Diffusion with Large Language Models for Multimodal Sequential Recommendation
Paperid: 1080,
Authors: Hongtao Wu, Yifeng Wu, Jia-Xuan Jiang, Wu Chengyu, Hong Wang, Yefeng Zheng
Title: SAMVSR: Leveraging Semantic Priors to Zone-Focused Mamba for Video Snow Removal
Paperid: 1081,
Authors: Che Liu, Yingji Zhang, Dong Zhang, Weijie Zhang, Chenggong Gong, Yu Lu, Shilin Zhou, Ziliang Gan, Ziao Wang, Haipang Wu, Ji Liu, Andre Freitas, Qifan Wang, Zenglin Xu, Rongjunchen Zhang, Yong Dai
Title: Nexus-O: An Omni-Perceptive and -Interactive Model for Language, Audio, and Vision
Paperid: 1082,
Authors: Daixun Li, Sibo He, Jiayun Tian, Yusi Zhang, Weiying Xie, Mingxiang Cao, Donglai Liu, Zirui Li, Tianlin Hui, Rui Huang, Yunsong Li
Title: Uni-Sight: An E2E Vision-Language-Action System Unifying Multi-View Alignment and Multi-Modal Fusion
Paperid: 1083,
Authors: Na Jiang, Wenhui Zheng, Xuqian Gu, Jingjing Wang
Title: OmniDoctor: Towards LLM-centric Lifelong Learning for New Emerging Medical VQA Tasks
Paperid: 1084,
Authors: Eungi Lee, Jae Yoon, Seok Bong Yoo
Title: SCOL: Style Code Orchestration in Latent Space for Proactive Face-Swapping Defense
Paperid: 1085,
Authors: Jielong Lu, Zhihao Wu, Jiajun Yu, Qianqian Shen, Jiajun Bu, Haishuai Wang
Title: Where Views Meet Curves: Virtual Anchors for Hyperbolic Multi-View Graph Diffusion
Paperid: 1086,
Authors: Qi He, Xiao Wu, Jun-Yan He, Wei Li, Zhaoquan Yuan
Title: DualEnhance: External Multimodal Foundation Models Guidance and Internal Fast-Slow Teacher Regulation
Paperid: 1087,
Authors: Shuang Hao, Pengfei Ren, Lei Zhang, Haifeng Sun, Pan Ting, Menghao Zhang, Cong Liu, Qi Qi, Jianxin Liao, Jing-Yu Wang
Title: A Dual-Branch 3D Spatial-Aware Latent Diffusion for Realistic Depth Image Synthesis
Paperid: 1088,
Authors: Zhuo Su, Jufeng Li, Yan Zhang, Xin Li, Fuwei Zhang, Yuxin Feng, Fan Zhou
Title: Breaking the Synthetic Barrier: Towards Stable and Generalizable Real-World Image Dehazing
Paperid: 1089,
Authors: Linxuan Luo, Pan Mu, Cong Bai
Title: Physics-Coupled Frequency Dynamic Adaptation Network for Domain Generalized Underwater Object Detection
Paperid: 1090,
Authors: Yujia Zhu, Hao Yang, Yibo Zhao, Chunjie Ma, Weili Guan, Zan Gao
Title: Lightweight Relational Proposal Network with Dual-Branch Distillation for Video Moment Retrieval
Paperid: 1091,
Authors: Yiliang Zhu, Dayan Wu, Qinghang Su, Zexian Yang, Zheng Lin, Weiping Wang
Title: Mitigating the Evolving Semantic Entanglement in Continual Learning of Vision-Language Models
Paperid: 1092,
Authors: Fangli Ying, Zhihong Zhang, Liting Zhou, Cathal Gurrin, Jinhai Wang
Title: Identity-Preserving Facial Aesthetic Enhancement via Hierarchical Prompt Learning and Pivotal Tuning
Paperid: 1093,
Authors: Tong Chen, Bowen Du, Jiejie Zhao, Hanyang Xia, Haiquan Wang, Jiakai Wang
Title: BadMDA: Towards Backdoor Injection during Domain Adaptation to Collapse Multi-Agent Perception
Paperid: 1094,
Authors: Yiru Li, Yingying Zhu
Title: PLGeo: A Patch-level Framework to Overcome Orientation Discrepancies in Cross-view Geo-localization
Paperid: 1095,
Authors: Jinbao Wei, Yuhang Chen, Zhijie Wang, Gang Yang, Shimin Tao, Jian Gao, Aiping Liu, Xun Chen
Title: Rethinking Diffusion Bridge Model with Dual Alignments for Medical Image Synthesis
Paperid: 1096,
Authors: Wangsheng He, Wanru Xu, Ping Guo, Zhenjiang Miao, Yi Tian
Title: InstructStep: Fine-Grained Localization of Step Content and Relation in Instructional Video
Paperid: 1097,
Authors: Yuqi Chen, Xiubo Liang, Yu Zhao, Hongzhi Wang, Weidong Geng
Title: S²-Edit3DV: Diffusion-Guided Style Meets Structure for Consistent Multi-View 3D Video Generation
Paperid: 1098,
Authors: Xiubo Liang, Hongzhi Wang, Zigen Li, Jinxing Han, Yu Zhao, Weidong Geng
Title: SGM-Transformer: Rethinking Gradient Information Loss and Compensation in Spiking Neural Networks
Paperid: 1099,
Authors: Hongchen Wei, Zhenzhong Chen
Title: RealVG: Unleashing MLLMs for Training-Free Spatio-Temporal Video Grounding in the Wild
Paperid: 1100,
Authors: Minho Park, Young Jo, Jae-Hyeok Lee, Ji Lee, Dong-oh Kang, Yong Man Ro
Title: Focus Where It Matters: LLM-Guided Regional Identification for Instruction-based Image Editing
Paperid: 1101,
Authors: Xiaohang Zhang, Hui Gao, Bo Zhang, Xiao Chen, Kun Niu, Tan Yang, Wufan Wang, Wendong Wang
Title: Monocular Vision-based Fast 3D Mapping and Multi-Agent-Assisted Trajectory Planning for UAVs
Paperid: 1102,
Authors: Zeyang Bai, Yunbiao Wang, Dongbo Yu, Jun Xiao, Lupeng Liu
Title: GraphSplat: Sparse-View Generalizable 3D Gaussian Splatting is Worth Graph of Nodes
Paperid: 1103,
Authors: Yuxuan Xiong, Ye Chen, Yue Shi, Zhangli Hu, Bingbing Ni
Title: Rig-Reconstruct-Render (R³3D): Collaborative Representation for Editable and Skeleton-Drivable 3D Asset Generation
Paperid: 1104,
Authors: Jiale Zou, Yan Chen, Bingbing Jiang, Peng Zhou, Liang Du, Lei Duan, Yuhua Qian
Title: Robust Tensor Learning with Graph Diffusion for Scalable Multi-view Graph Clustering
Paperid: 1105,
Authors: Shengze Shi, Tao Ren, Guoliang Zhu, Guandong Feng, Jun Hu
Title: Closing the Feedback Loop in Text2Vis: Refining Visualization with Vision-Language Models
Paperid: 1106,
Authors: Zhibing Zhang, Jiantao Lin, Cangqi Zhou, Rui Xia
Title: MPPR: Memory-Prior-based Prompt Refinement in Continuous Space for Advanced Text-to-Image Generation
Paperid: 1107,
Authors: Seung-gyeom Kim, Areum Kim, Eunchae Kim, Minho Chung, Yongjae Yoo
Title: Automatic Accessible Multimodal Translation of Graphics Using A Refreshable Pin Array
Paperid: 1108,
Authors: Jieyi Ge, Zhaodong Sun, Wei Peng, Chenhang Ying, Yuwei Chen, Kui Ren, Xiaobai Li
Title: Evidential Remote Physiological Measurement via Uncertainty-aware Fusion of Video and RF
Paperid: 1109,
Authors: Runze Zhao, Fuqing Zhu, Jizhong Han, Songlin Hu
Title: Visual Perception Uncertainty Learning for Hallucination Detection in Large Vision-Language Models
Paperid: 1110,
Authors: Yan Wang, Qindong Sun, Dongzhu Rong
Title: Audio-Visual Asynchrony Mitigation: Cross-Modal Alignment and Feature Reconstruction for Deepfake Detection
Paperid: 1111,
Authors: Jiayi Zeng, Tao Ren, Changhu Wang, Yifan Wang, Wei Ju, Zhipeng Sun, Xiao Luo
Title: DATE: Dual Prompt Learning with Information Bottleneck for Graph Out-of-distribution Generalization
Paperid: 1112,
Authors: Songning Lai, Ninghui Feng, Jiechao Gao, Hao Wang, Haochen Sui, Xin Zou, Jiayu Yang, Wenshuo Chen, Lijie Hu, Hang Zhao, Xuming Hu, Yutao Yue
Title: From Guesswork to Guarantee: Towards Faithful Multimedia Web Forecasting with TimeSieve
Paperid: 1113,
Authors: Lingren Wang, Wenxuan Tu, Jieren Cheng, Jianan Wang, Xiangyan Tang, Chenchen Wang
Title: Discovering Maximum Frequency Consensus: Lightweight Federated Learning for Medical Image Segmentation
Paperid: 1114,
Authors: Hongyan Xu, Zhongze Wu, Ang He, Xi Lin, Yi Chen, Xiu Su
Title: Addressing Granularity-induced Semantic Drift in OvOD via Graph-guided Semantically Consistent Representation
Paperid: 1115,
Authors: Jiaming Liang, Chi-Man Pun
Title: I-C Attack: In-place and Cross-pixel Augmentations for Highly Transferable Transformation-based Attacks
Paperid: 1116,
Authors: Xin Peng, Bowen Liu, Renxiang Guan, Wenxuan Tu
Title: Multi-view Graph Clustering with Dual Structure Awareness for Remote Sensing Data
Paperid: 1117,
Authors: Lihong Qiao, ShiYi Gao, Yucheng Shu, Bin Xiao, Weisheng Li, Xinbo Gao
Title: Pathology-Aware Reconstruction with Discriminative Knowledge Boosting Alignment for Chest X-ray Vision-Language Pre-training
Paperid: 1118,
Authors: Yiming Li, Peng Zhou, Xiaokang Qin, Hongwei Hu, Jun Sun, Yi Xu
Title: Position-LoRA: Enhanced Relation Customization through Structural Prior in Initial Latent Noise
Paperid: 1119,
Authors: Yuhan Jing, Bo He, Haifeng Sun, Qi Qi, Zirui Zhuang, Lei Zhang, Jianxin Liao, Jing-Yu Wang
Title: Foresail: LLM Sensor Knowledge Empowered Status-guided Network for Multivariate Time-series Classification
Paperid: 1120,
Authors: Jipeng Liu, Haichao Shi, Yaru Zhang, Xiao-Yu Zhang
Title: Knowledge Negative Distillation: Circumventing Overfitting to Unlock More Generalizable Deepfake Detection
Paperid: 1121,
Authors: Chengzhe Wang, Wenqing Ji, Chenyang Li, Tongjie Pan, Yalan Ye
Title: Toward Reliable Emotion Recognition: Alleviating Label Noise and Reducing Uncertain Prediction
Paperid: 1122,
Authors: Changjuan Ran, Fang Liu, Runqi Fang, Xiangyu Meng, Shenglan Cui, Yunfan Ye
Title: Where Watermark Meets Beauty: Expert-Guided Aesthetic Visible Watermarking for Digital Artworks
Paperid: 1123,
Authors: Yifei Deng, Chenglong Li, Futian Wang, Jin Tang
Title: Learning Hierarchical Cross-modal Association with Intra-modal Context for Text-Image Person Retrieval
Paperid: 1124,
Authors: Mingliang Zhai, Yiheng Wang, Haidong Hu, Chi-Man Pun, Hao Gao
Title: FGRFlow: Learning Fine-Grained Rigidity Scene Flow from 4D Radar Point Cloud
Paperid: 1125,
Authors: Hanzhe Yu, Yun Ye, Jintao Rong, Qi Xuan, Chen Ma
Title: RealHD: A High-Quality Dataset for Robust Detection of State-of-the-Art AI-Generated Images
Paperid: 1126,
Authors: Shan Wang, Weisi Lin, Yun Liu, Libao Zhang
Title: CLIP-HNet: Hybrid Network with Cross-Modal Guidance for Self-Supervised Remote Sensing Dehazing
Paperid: 1127,
Authors: Kun Cheng, Qibing Qin, Wenfeng Zhang, Lei Huang, Jie Nie
Title: Deep Probabilistic Binary Embedding via Learning Reliable Uncertainty for Cross-Modal Retrieval
Paperid: 1128,
Authors: Gwonjung Kim, Duyeol Lee, Jaehong Yang, Chae Eun Rhee
Title: See Through the Occlusions: Amodal Gaussian Splatting for Few-Shot 3D Reconstruction
Paperid: 1129,
Authors: Xinyu Zhang, Lingling Zhang, Yanrui Wu, Muye Huang, Jun Liu
Title: Cognitive Predictive Coding Network: Rethinking the Generalization in Raven’s Progressive Matrices
Paperid: 1130,
Authors: Haonan Cheng, Junwei Zhang, Hengyan Huang, Long Ye
Title: FG-Midiformer: A Symbolic Music Understanding Model towards Fine-Grained Learning of Multi-Attributes
Paperid: 1131,
Authors: Dongdong Hu, Yang Zhou, Xiaofeng Huang, Haibing Yin, Zhu Li
Title: Sparse4DGS: Flow-Geometry Assisted 4D Gaussian Splatting for Dynamic Sparse View Synthesis
Paperid: 1132,
Authors: Zhe Sun, Qiang Xu, Qi Zhang, Shan Liu, Ge Li
Title: Overfitted Point Cloud Attribute Codec Using Sparse Hierarchical Implicit Neural Representations
Paperid: 1133,
Authors: Liang Yue, Shao-Kui Zhang, Lin Yuan, Yi-Tao Chen, Zirui Zhou, Song-Hai Zhang
Title: Synthesizing 3D Scenes via Diffusion Model that Incorporates Indoor Scene Characteristics
Paperid: 1134,
Authors: Beijing Chen, Yuting Hong, Ziqiang Li, Zhangjie Fu
Title: DFPD: Dual-Forgery Proactive Defense against Both Deepfakes and Traditional Image Manipulations
Paperid: 1135,
Authors: Hua Li, Gaowei Lin, Zhiyuan Li, Sam Kwong, Runmin Cong
Title: FSCDiff: Frequency-Spatial Entangled Conditional Diffusion model for Underwater Salient Object Detection
Paperid: 1136,
Authors: Liang Zhao, Shubin Ma, Bo Xu, Qingchen Zhang
Title: Dual-Learning Based Penalized Multi-Align Clustering for Multi-View Incomplete and Disorderly Data
Paperid: 1137,
Authors: Hanling Wang, Qing Li, Li Chen, Haidong Kang, Fei Ma, Yong Jiang
Title: HoloTrace: LLM-based Bidirectional Causal Knowledge Graph for Edge-Cloud Video Anomaly Detection
Paperid: 1138,
Authors: Naisong Luo, Yuan Wang, Yuwen Pan, Rui Sun
Title: Focus on the Object: Gradient-based Feature Modulation for Camouflaged Object Segmentation
Paperid: 1139,
Authors: Zhaolin Wei, Xiuwen Shi, Dengpan Ye, Yuhan Lin, Zhigang Wang, JiaCheng Deng, Ziyi Liu, Long Tang
Title: PhonoFence: A Cross-Task Defense Framework for DeepFake via Phoneme-Level Adversarial Perturbations
Paperid: 1140,
Authors: Ziying Tan, Linbo Luo, Haiyan Yin, Yew-Soon Ong, Wentong Cai
Title: Crowd Dynamics Demand Adaptivity: Self-Adaptive Physics-Informed Neural Network for Crowd Simulation
Paperid: 1141,
Authors: Sensen Wang, Yuehu Liu, Chi Zhang
Title: BiOMamba: Mamba-based Forward-Then-Backward Temporal Modeling for Online Action Detection and Anticipation
Paperid: 1142,
Authors: Shuai Huang, Yongxiong Wang, Huan Luo, Haodong Jing, Chendong Qin, Jingqun Tang
Title: MINDEV: Multi-modal Integrated Diffusion Framework for Video Reconstruction from EEG Signals
Paperid: 1143,
Authors: Jiateng Liu, Hengcan Shi, Haiwen Liang, Xiaolin Xu, Yuan Zong, Yaonan Wang, Wenming Zheng
Title: NaME: A Natural Micro-expression Dataset for Micro-expression Recognition in the Wild
Paperid: 1144,
Authors: Wenpeng Mu, Zheng Li, Qiang Xu, Xinghao Jiang, Tanfeng Sun
Title: ExDA: Towards Universal Detection and Plug-and-Play Attribution of AI-Generated Ex-Regulatory Images
Paperid: 1145,
Authors: Fenghua Yu, Jianwen Sun, Qian Wan, Meicheng Chen, Xiaoxuan Shen, Qing Li
Title: DiffuQKT: A Diffusion-Based Approach for Improved Question Representation in Knowledge Tracing
Paperid: 1146,
Authors: Shalayiding Sirejiding, Yue Ding, Yuxiang Lu, Xinyi Hou, Shaokai Wu, Qichen He, Chunlin Wang, Wenqiang Guo, Hongtao Lu
Title: CLIP-MT: Multi-Modal Knowledge-Driven Adaptive Scale Feature Allocation for Multi-Task Dense Prediction
Paperid: 1147,
Authors: Ruiqi Li, Yiu-ming Cheung
Title: Modeling and Identifying Distractors with Curriculum for Robust 3D Gaussian Splatting
Paperid: 1148,
Authors: Haotian Gan, Yudong Li, Wanyue Li, Weidong Tang
Title: Aligned or Apart? Multi-Agent Insights into Consumer and Brand Messaging Discrepancies
Paperid: 1149,
Authors: Songze Li, Yunfei Guo, Shen Chen, Bin Li, Kaiqing Lin, Changsheng Chen, Haodong Li, Taiping Yao, Shouhong Ding
Title: DITL²: Dual-Stage Invariance Transfer Learning for Generalizable Document Image Tampering Localization
Paperid: 1150,
Authors: Mengzhen Wang, Xunbin Huang, Jiayuan Xie, Shukai Ma, Jiale Men, DaYong Liang, Yi Cai
Title: From Model Diagram to Code: A Benchmark Dataset and Multi-Agent Framework
Paperid: 1151,
Authors: Rongqiang Fang, Yongqi Sun, Jidong Yuan, Hongbo Cao, Jinkun Dong
Title: A Language-Assisted Semantic-Aware Disentangled Method for Link Prediction on Heterogeneous Graphs
Paperid: 1152,
Authors: Nokap Park
Title: M2PE-Diff: Music-to-Pose Encoder for Dance Video Generation Leveraging Latent Diffusion Framework
Paperid: 1153,
Authors: Hancong Wang, Yue Yu, Hairong Zheng, Tong Zhang
Title: Test-Time Adaptation of Medical Vision-Language Models with Mixture of Modality Experts
Paperid: 1154,
Authors: Haichao Sha, Yuncheng Wu, Ruixuan Liu, Yang Cao, Hong Chen
Title: Differentially Private Visual Learning with Public Subspace Augmented by Synthetic Data
Paperid: 1155,
Authors: Yu Liu, Kun Sun, Chang Tang, Yuhua Qian, Xin Li
Title: TPDepth: Leveraging Text Prompts with ControlNet to Boost Diffusion-based Depth Estimation
Paperid: 1156,
Authors: Yidong Chen, Qi Li, Yuyang Yang, Wen Li, Sheng Ao, Cheng Wang
Title: Unleashing the Power of Data Generation in One-Pass Outdoor LiDAR Localization
Paperid: 1157,
Authors: Yunqiang Pei, Hongrong Yang, Kaiyue Zhang, Guoqing Wang, Peng Wang, Chaoning Zhang, Yang Yang, Heng Tao Shen
Title: InteractGuide: LLM-Enhanced Multimodal Reasoning for User-Centric Interaction Recommendations in AR-HRI Authoring
Paperid: 1158,
Authors: Xuanming Jiang, Baoyi An, Zhengwei Zou, DingYu Nie, Jialie Shen, Xueming Qian, Guoshuai Zhao
Title: Ear with Eye: Lightweight Multimodal Audio-Visual Network Inspired by Bionic Structures
Paperid: 1159,
Authors: Guoyi Li, Die Hu, Xiaomeng Fu, Qirui Tang, Yulei Wu, Xiaodan Zhang, Honglei Lyu
Title: Entity Graph Alignment and Visual Reasoning for Multimodal Fake News Detection
Paperid: 1160,
Authors: Le Liu, Shizhou Zhang, Di Xu
Title: SUVIS: A Depth- and Motion-Encoded Stereoscopic System for Communicating Forecast Uncertainty
Paperid: 1161,
Authors: Jingyuan Fang, Yang Ning, Xiushan Nie, Xinfeng Liu, Zhiyong Cheng
Title: VLHP: Learning Discriminative Vision-Language Hybrid Prototypes for Weakly Supervised Semantic Segmentation
Paperid: 1162,
Authors: Xiangping Zheng, Xuan Feng, Bo Wu, Bin Ren, Wei Li, Xiuxin Hao, Xun Liang, Bin Tang, Zhiwen Yu
Title: Breaking Semantic Barriers: A Zero-Shot Generalized Framework for Graph Anomaly Detection
Paperid: 1163,
Authors: Zhenyu Xu, Junjie Wu, Zhiyan Piao, Xiaoqi Sheng, Yu Xiao, Xinyu Zhang
Title: AnyStyleDiffusion: Flexible Style Transfer with Consistent Content Adaptation Across Diffusion Models
Paperid: 1164,
Authors: Baoquan Zhao, Xiaofan Ma, Qianshi Pang, Ruomei Wang, Fan Zhou, Shujin Lin
Title: VisAug: Facilitating Speech-Rich Web Video Navigation and Engagement with Auto-Generated Visual Augmentations
Paperid: 1165,
Authors: Haifeng Zhao, Shuo Xu, Leilei Ma, Yufei Zhang, Lei Wang, Dengdi Sun
Title: Towards Space and Semantics: Object-Purified Representation Learning for Multi-Label Image Classification
Paperid: 1166,
Authors: Songpei Xu, Xuri Ge, Chaitanya Kaul, Roderick Murray-Smith
Title: HandSolo: A Mid-Air Hand Pose Interaction Method Based on Disentangled Degrees-of-Hand-Freedom
Paperid: 1167,
Authors: Guangfei Li, Quanxue Gao, Yu Lei, Yichen Bao, Qianqian Wang
Title: Multi-view Collaborative Representation Learning from Noisy Labels for VHR Imagery Classification
Paperid: 1168,
Authors: Mingsong Yang, Xinhong Hei, Kehai Chen, Haining Meng, HaoYang Dong, Qin Zhao
Title: BIMCompNet: Multimodal Dataset for Geometric Deep Learning in Building Information Model
Paperid: 1169,
Authors: Zhixia Zhao, Qiyue Li, Jie Li, Richang Hong, Zhi Liu
Title: ViewGauss: A Head Movement Dataset for 6DoF Gaussian Splatting Video Viewing
Paperid: 1170,
Authors: Peirong Zhang, Yidan Zhang, Hanru Shi, Dianyu Wang, Xiaoxuan Liu, Lei Wang
Title: Referring Multi-Object Tracking in Satellite Videos: A New Benchmark and Baseline
Paperid: 1171,
Authors: Shifu Xiong, Hang Chen, Shi Cheng, Kai Shen, Hengshun Zhou, Genshun Wan, Chenyue Zhang, Kewei Li, Jun Du, Lirong Dai
Title: MISP-QEKS: A Large-Scale Dataset with Multimodal Cues for Query-by-Example Keyword Spotting
Paperid: 1172,
Authors: Keyue Shi, Qianqian Shen, Zhaoming Ye, Liangjun Jiang, Jiajun Bu, Haishuai Wang
Title: LUMOS Dataset: Lumbar Multimodal Osteoporosis Screening with X-ray and CT images
Paperid: 1173,
Authors: Wenxu Gao, Liang Xie, Kangli Wang, Jingxuan Su, Changhao Peng, Wei Gao
Title: DPCSet: A Large-scale Dynamic Point Cloud Dataset for Compression and Perception
Paperid: 1174,
Authors: Zihou Zhang, Hao Li, Zhengwei Yang, Zechao Hu, Liang Li, Zheng Wang
Title: From Language to Instance: Generative Visual Prompting for Zero-shot Camouflaged Object Detection
Paperid: 1175,
Authors: Dongyang Ma, Zhengyu Ma, Wei Zhang, Yonghong Tian
Title: DSF-Net: Dynamic Sparse Fusion of Event-RGB via Spike-Triggered Attention for High-Speed Detection
Paperid: 1176,
Authors: Lin Wu, Wei Wei, Peizhuo Yu, Jianglin Lan
Title: Open-Vocabulary 3D Affordance Understanding via Functional Text Enhancement and Multilevel Representation Alignment
Paperid: 1177,
Authors: Shanghui Deng, Xiao Zheng, Chang Tang, Kun Sun, Yuanyuan Liu, Xinwang Liu
Title: Find True Collaborators: Banzhaf Index-based Cross View Alignment for Partially View-aligned Clustering
Paperid: 1178,
Authors: Jiehua Zhang, Liang Li, Chenggang Yan, Wei Ke, Yihong Gong
Title: Frequency-aware Correlation Discovering and Spatial Forgery Clue Distilling for Synthetic Image Detection
Paperid: 1179,
Authors: Chenglong Sun, Shijie Pang, Yuzheng Wang, Lizhe Qi
Title: RWKV3D: An RWKV-Based Model with Multiple Training Strategies for Point Cloud Analysis
Paperid: 1180,
Authors: Longquan Dai, He Wang, Xiaolu Wei, Shaomeng Wang, Jinhui Tang
Title: Conducting Conditional Diffusion by Estimating the Mean Vector of von Mises-Fisher Distribution
Paperid: 1181,
Authors: Yang Yu, Meiyu Liang, Wei Huang, Juncheng Zheng, Kangkang Lu, Yawen Li, Junping Du, Zhe Xue, Wu Liu
Title: Asymmetric Pre-aligned Anchor Contrastive Enhanced Diffusion Hashing Model for Incomplete Multimodal Retrieval
Paperid: 1182,
Authors: Pengfei Ren, Jing-Yu Wang, Haifeng Sun, Qi Qi, Jing Wang, Jianxin Liao
Title: Rule Meets Learning: Confidence-Aware Multi-View Fusion for Self-Supervised 3D Hand Pose Estimation
Paperid: 1183,
Authors: Jiacheng Ruan, Zongyun Zhang, Jingsheng Gao, Wenzhen Yuan, Ting Liu, Yuzhuo Fu
Title: MPI-CD: Multi-Path Information Contrastive Decoding for Mitigating Hallucinations in Large Vision-Language Models
Paperid: 1184,
Authors: Qian Li, Siyuan Liang, Yuzheng Zhang, Cheng Ji, Zongyu Chang, Shangguang Wang
Title: Meta-Knowledge Path Augmentation for Multi-Hop Reasoning on Satellite Commonsense Multi-Modal Knowledge Graphs
Paperid: 1185,
Authors: Fangxin Liu, Junjie Wang, Ning Yang, Zongwu Wang, Junping Zhao, Li Jiang, Haibing Guan
Title: ASTER: Adaptive Dynamic Layer-Skipping for Efficient Transformer Inference via Markov Decision Process
Paperid: 1186,
Authors: Chang Su, Beihong Jin, Fusang Zhang, Siheng Li, Zhi Wang
Title: Self-Supervised Human Mesh Recovery from Partial Point Cloud via a Self-Improving Loop
Paperid: 1187,
Authors: Runqi Wang, Caoyuan Ma, Jian Zhao, Hanrui Xu, Dongfang Sun, Haoyang Chen, Lin Xiong, Zheng Wang, Xuelong Li
Title: Leader is Guided: Interactive Motion Generation via Lead-Follow Paradigm and Trajectory Guidance
Paperid: 1188,
Authors: Chuan Zeng, Zhao Zhang, Wei Huang, Lei Zhang, Le Yi, Kefu Zhao
Title: DC²-SR: A Dual-Consistency Guided Curriculum Learning method for Thick-Slice Fetal MRI Super-Resolution
Paperid: 1189,
Authors: Bingfeng Liu, Songwei Pei, Shuhuai Wang, Wenzheng Yang, Qian Li, Shangguang Wang
Title: Prior-Constrained Relevant Feature driven Image Fusion with Hybrid Feature via Mode Decomposition
Paperid: 1190,
Authors: Dahao Fu, Jiangqun Ni, Jian Zhang
Title: JPEG-RAE: Reversible Adversarial Example for Privacy and Copyright Protection of JPEG Images
Paperid: 1191,
Authors: Ziqiang Shi, Rujie Liu, Jun Takahashi, Shan Jiang
Title: TrueCount: Improving Open-World Object Counting with Visual-Language Models and Dynamic Multi-Modal Inputs
Paperid: 1192,
Authors: Yan-Kai Liu, Shunyang Yao, Tao Xi, Bao-liang Lu, Wei-Long Zheng
Title: Human vs AI: How Digital Human News Anchors Affect Our Cognitive Processes?
Paperid: 1193,
Authors: Shibei Meng, Saihui Hou, Yang Fu, Xuecai Hu, Junzhou Huang, Yongzhen Huang
Title: Seeing from Magic Mirror: Contrastive Learning from Reconstruction for Pose-based Gait Recognition
Paperid: 1194,
Authors: Zhiqian Xia, Haifeng Xia, Shichao Jin, Wei Wang, Zhengming Ding, Xiaochun Cao
Title: DSPF: Dual-Stage Preservation and Fusion for Source-free Domain Adaptive Point Cloud Completion
Paperid: 1195,
Authors: Ziyi Li, Wei-Long Zheng, Bao-liang Lu
Title: Multimodal Emotion Recognition with Missing Modality via a Unified Multi-task Pre-training Framework
Paperid: 1196,
Authors: Dexuan Xu, Yanyuan Chen, Yu Huang, Shihao E, Yiwei Lou, Yongzhi Cao, Hanpin Wang, Meikang Qiu
Title: Medical Vision-Language Pre-training with Multimodal Variational Masked Autoencoder for Robust Medical VQA
Paperid: 1197,
Authors: Yue Hou, Yingke Su, Junran Wu, Ke Xu
Title: Test-time Graph OOD Detection via Dynamic Dictionary Expansion and OOD Score Calibration
Paperid: 1198,
Authors: Junpu Zhang, Shengju Yu, Suyuan Liu, Siwei Wang, Miaomiao Li, Xinwang Liu, En Zhu, Kunlun He
Title: Learning the Anchors with Similar Distributions to Original Data for Multi-view Clustering
Paperid: 1199,
Authors: Xubo Liu, Wenya Guo, Ruxue Yan, Xumeng Liu, Ying Zhang, Ru Zhou
Title: Rethinking the Reliability of Evidence in End-to-End Fact-Checking from the Causal Perspective
Paperid: 1200,
Authors: Youchen Xie, Chen Li, Sheng Qiu, ZhiJun Wang, Chenhui Li, Yibo Zhao, Zan Gao, Changbo Wang
Title: FluidGS: Physics Informed Gaussian Splatting for Dynamic Fluid Reconstruction from Sparse Views
Paperid: 1201,
Authors: Yiqing Hao, Yangru Huang, Yi Jin, Tao Wang, Yidong Li, Yigang Cen
Title: Tree of Prompts: Aligning Hierarchical Visual Prior for Continue Generalized Category Discovery
Paperid: 1202,
Authors: Jintian Ji, Songhe Feng
Title: Anchors Bring Stability and Efficiency: Fast Tensorial Multi-view Clustering on Shuffled Datasets
Paperid: 1203,
Authors: Duolin Wang, Guanyu Xing, Yanli Liu
Title: FlowTrack: Integrating Adjacent-Frame Motion Tracking and Adaptive Prediction for Robust Semi-Supervised VOS
Paperid: 1204,
Authors: Jinwen Wang, Youfang Lin, Xiaobo Hu, Siyu Yang, Sheng Han, Shuo Wang, Kai Lv
Title: From Pixels to Temporal Correlations: Learning Informative Representations for Reinforcement Learning Pre-training
Paperid: 1205,
Authors: KunSheng Ma, Fan Qi, Changsheng Xu
Title: Granular Music Attribute Transformation with Proximal Policy Optimization Adapters for Diffusion Model
Paperid: 1206,
Authors: Yan Zhong, Xinping Zhao, Li Zhang, Xinyuan Song, Tingting Jiang
Title: Adaptive Prompt Learning for Blind Image Quality Assessment with Multi-modal Mixed-datasets Training
Paperid: 1207,
Authors: Xiaolei Bo, Feiyang Yang, Feilong Xu, Xiaoli Zhang
Title: Cross-Counter-Repeat Attention for Enhanced Understanding of Visual Semantics in Radiology Report Generation
Paperid: 1208,
Authors: Junlei Zhou, Jiashi Gao, Xinwei Guo, Haiyan Wu, Quanying Liu, Xiangyu Zhao, Hongxin Wei, Xin Yao, Xuetao Wei
Title: Mitigating Stereotypes in Text-to-Image Generation: A Novel Perspective of Selective Neural Suppression
Paperid: 1209,
Authors: Qi Li, Yucan Zhou, Jiang Zhou, Xingyou Yang, Xiaoyan Gu
Title: Diverse and Public Features Cooperation via Gradient Rectification for Federated Prompt Learning
Paperid: 1210,
Authors: Qiuyu Liang, Yongqiang Zhang
Title: SAM based Region-Word Clustering and Inference Score Adjusting for Open-Vocabulary Object Detection
Paperid: 1211,
Authors: Dawei Lin, Meng Yuan, Ziming Wang, Tieru Wu, Yuanning Liu
Title: FreeCAD: A Multimodal Framework for 3D CAD Model Generation from Free-Form Prompts
Paperid: 1212,
Authors: Changshuo Wang, Shuting He, Xiang Fang, Fangzhe Nan, Prayag Tiwari
Title: Seeing the Overlooked: Bio-Visual Inspired Weak Saliency Feedback Transformer for Person Re-identification
Paperid: 1213,
Authors: Zihou Liu, Dongming Zhang, Jing Zhang, Jun Li, Yongdong Zhang
Title: RealText: Realistic Text Image Generation based on Glyph and Scene Aware Inpainting
Paperid: 1214,
Authors: Michael Kohl, Tobias Wursthorn, Christof Weiss
Title: Cross-Modal Metrics for Capturing Correspondences between Music Audio and Stage Lighting Signals
Paperid: 1215,
Authors: Xinbo Geng, Fan Shi, Xu Cheng, Chen Jia, Meng Zhao, Shengyong Chen
Title: LFMamba: Focal Stack-aware State Space Modeling for Light Field Salient Object Detection
Paperid: 1216,
Authors: Zhongfan Sun, Kan Guo, Yongli Hu, Daxin Tian, Qingqing Gao, Jiapu Wang, Junbin Gao, Yanfeng Sun, Baocai Yin
Title: Large-Small Model Synergy with Multimodal Fine-Grained Heuristics for Knowledge-Based Visual Question Answering
Paperid: 1217,
Authors: Wenzheng Yang, Songwei Pei, Bingfeng Liu, Qian Li, Shangguang Wang
Title: OGDepth: Leveraging Object Guidance in Diffusion Models for Enhanced Monocular Depth Estimation
Paperid: 1218,
Authors: Giovanni Zanin, Ritujoy Biswas, Pietro Morerio, Sylvio Barbon Junior, Alberto Carini, Alessio Del Bue, Vittorio Murino
Title: Direction-Aware Room Impulse Response Estimation for Immersive Audio Rendering in Real Environments
Paperid: 1219,
Authors: Ziwei Niu, Shiao Xie, Ziyue Wang, Yen Chen, Yueming Jin, Lanfen Lin
Title: EIR-SDG: Explore Invariant Representation for Single-source Domain Generalization in Medical Image Segmentation
Paperid: 1220,
Authors: Zezhou Chen, Ping Chen, Huan Hu, Xiang Liu, Zipeng Wang, Zhaoxiang Liu, Kai Wang, Shiguo Lian
Title: CP3: Customizable 3D Pop-Out Effect Creation for Immersive Content Using Multimodal Models
Paperid: 1221,
Authors: Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Lei Liang, Wen Zhang, Huajun Chen
Title: Client-Server Co-design with Multi-modal Codebooks Makes Better and Faster Federate Knowledge Sharing
Paperid: 1222,
Authors: Ruilin Yao, Yi Rong, Tianyu Zou, Bo Zhang, Jian Li, Shengwu Xiong, Shili Xiong
Title: MAP: Parameter-Efficient Tuning for Referring Expression Comprehension via Multi-Modal Adaptive Positional Encoding
Paperid: 1223,
Authors: Xiaoxuan Mu, Haoyu Tang, Han Jiang, Tianyuan Liang, Qinghai Zheng, Jihua Zhu
Title: FACE: A Dual-Template and Adaptive Curriculum Framework for Unsupervised Text-Based Person Search
Paperid: 1224,
Authors: Jiaye Zhang, Hongyi Wang, Peiru Yang, Zili Meng, Mingwei Xu
Title: Configuring Dynamic Multi-Stage Serverless Pipelines for Video Processing with Minimal Profiling Overhead
Paperid: 1225,
Authors: Dan Wu, Xincheng Ju, Dong Zhang, Shoushan Li, Erik Cambria, Guodong Zhou
Title: Emotion across Modalities and Cultures: Multilingual Multimodal Emotion-Cause Analysis with Memory-inspired Framework
Paperid: 1226,
Authors: Jie Fu, Bingkun Bao
Title: Retaining Temporal Semantics and Relation Topologies for Continual Weakly-Supervised Audio-Visual Video Parsing
Paperid: 1227,
Authors: Sujuan Hou, Zhihui Feng, Hao Xiong, Weiqing Min, Peng Li, Shuqiang Jiang
Title: DSDGF-Nutri: A Decoupled Self-Distillation Network with Gating Fusion For Food Nutritional Assessment
Paperid: 1228,
Authors: Youbo Mao, Ziyang Kang, Peng Li, Jiyao Chen, Zenglin Yang, Zhijun Li
Title: FCG: High-Throughput JPEG Heterogeneous Inference with Hybrid Parallel Pipeline on Mobile Devices
Paperid: 1229,
Authors: Feng-Kai Huang, Bo-Lun Huang, Li-Wu Tsao, Jhih-Ciang Wu, Hong-Han Shuai, Wen-Huang Cheng
Title: Flowing Crowd to Count Flows: A Self-Supervised Framework for Video Individual Counting
Paperid: 1230,
Authors: Haiyang Mei, Difei Gao, Xiaopeng Wei, Xin Yang, Mike Zheng Shou
Title: Can I Trust You? Advancing GUI Task Automation with Action Trust Score
Paperid: 1231,
Authors: Biao Dong, Lei Zhang
Title: Talking Head Generation via Viewpoint and Lighting Simulation Based on Global Representation
Paperid: 1232,
Authors: Jiahui Zhang, Mengtian Li, Jiewei Tang, Junyu Deng, Siyu Tian, Xiang Liu, Meng Zhang, Guangnan Ye, Yu-Gang Jiang
Title: EditMaster: Bridging Text instruction and Visual Example for Multimodal guided Image Editing
Paperid: 1233,
Authors: Mo Yang, Luo Chen, Jiali Zhou
Title: Change-UP: Advancing Visualization and Inference Capability for Multi-level Remote Sensing Change Interpretation
Paperid: 1234,
Authors: Leyuan Liu, Shen Chen, Jingying Chen
Title: HumanPrinter: Reconstructing 3D Human from a Single Image Like a 3D Printer
Paperid: 1235,
Authors: Chuan Zhang, Zihan Li, ZiHao Xu, Xuhao Ren, Liehuang Zhu
Title: SepVAMark: Deep Separable Visual-Audio Fusion Watermarking for Source Tracing and Deepfake Detection
Paperid: 1236,
Authors: Junyi Wang, Yue Qi
Title: Visual Localization using Hybrid Feature Grid and Learned Weighted Global Point Cloud
Paperid: 1237,
Authors: Chuhang Ma, Shuai Tan, Junjie Wei, Ye Pan
Title: GOES: 3D Gaussian-based One-shot Head Animation with Any Emotion and Any Style
Paperid: 1238,
Authors: Kai Wang, Shijian Deng, Jing Shi, Dimitrios Hatzinakos, Yapeng Tian
Title: AV-DiT: Taming Image Diffusion Transformers for Efficient Joint Audio and Video Generation
Paperid: 1239,
Authors: Mingjie Wei, Weinan Zhang, Chen Zhang, Yifeng Ding, Donglin Di, Lei Ren, Chen Wei, Ting Liu
Title: PRISM: A Benchmark for Unveiling Cross-modal Knowledge Inconsistency in Large Vision-Language Models
Paperid: 1240,
Authors: Jiankun Zhu, Sicheng Zhao, Lulu Tian, Jing Jiang, Xi Chen, Hongxun Yao
Title: Emotion in a Bottle: Information Bottleneck Guided Disentanglement for Emotion Domain Adaptation
Paperid: 1241,
Authors: Jingxing Guo, Guilian Chen, Yimu Sun, Huisi Wu, Jing Qin
Title: Hierarchical Spatiotemporal Context Aggregation and Speckle-aware Deformable Convolution for Echocardiography Video Segmentation
Paperid: 1242,
Authors: Fenghao Tian, Mingtao Feng, Jianqiao Luo, Zijie Wu, Longlong Mei, Lijie Yang, Weisheng Dong, Yaonan Wang
Title: Generalizing to New Area: Self-Distillation Curriculum Learning for Fine-Grained Cross View Localization
Paperid: 1243,
Authors: Jiahua Bao, Siyao Cheng, Jiaxing Du, Changjiang He, Zeming Lang, Hao Zhang, Jie Liu
Title: BOLT: Fewer Tokens but More Performance Retention for Efficient Vision-Language Models Inference
Paperid: 1244,
Authors: Liu Yu, Jiajun Sun, Ping Kuang, Rui Zhou, Fan Zhou, Zhikun Feng
Title: Bimodal Debiasing for Text-to-Image Diffusion: Adaptive Guidance in Textual and Visual Spaces
Paperid: 1245,
Authors: Chenyang Zhou, Menghejiya Menghejiya, TangChao TangChao, Licheng Wu
Title: UniMTR: Unified Recognition of Dual-style Traditional Mongolian Scripts via Contrastive Representation Alignment
Paperid: 1246,
Authors: Han Hu, WenLi Du, Bing Wang
Title: Efficient Video Anomaly Detection via Scene-Dependent Memory Assisted Inter-Frame RGB Difference Reconstruction
Paperid: 1247,
Authors: Xiongwei Dang, Wenxuan Liu, Xian Zhong, Zheng Wang
Title: SegTraj: A Segmented Trajectory-aware Spatio-Temporal Graph Convolutional Network for Social Group Detection
Paperid: 1248,
Authors: Xiao Fu, Pengyu Wang, Wei Xi, Kun Zhao, Jiadong Feng, Jizhong Zhao
Title: LES-CLIP: A Lightweight Emotion-Sensitive Adaptation of CLIP for Precise Similar Emotion Discrimination
Paperid: 1249,
Authors: Tiancheng Liu, Jiayi Ye, Shumeng Zhang, Kang Zhang, Chen Liang
Title: Quantifying Structural Aesthetic Features and Personality Trait Preferences in Kai Shu Calligraphy
Paperid: 1250,
Authors: LiYan LiYan, Xingchen Hu, Jiyuan Liu, Zhong Liu
Title: Federated Incomplete Multi-view Clustering with Individual Structure Preservation and Central Representation Tensorization
Paperid: 1251,
Authors: Hanyu Guo, Suzhou Que, Junlong Gao, Hanzi Wang
Title: TFPA: Text Features Guided Dynamic Parameter Adjustment for Few Shot Action Recognition
Paperid: 1252,
Authors: Feng-Kai Huang, Hong-Wei Xu, Chu-Chuan Lee, Hong-Yi Tu, Hong-Han Shuai, Wen-Huang Cheng
Title: OinkTrack: An Ultra-Long-Term Dataset for Multi-Object Tracking and Re-Identification of Group-Housed Pigs
Paperid: 1253,
Authors: Wuxia Zhang, Yang Xin, Shibo Lv, Xin Zhang, Xiang Zhong, Jianmin Jiang
Title: EEG-Face: A Facial-Image Stimulated EEG Data-Set for Analysis of Brain Perceived Multimedia
Paperid: 1254,
Authors: Yuchen Zhang, Tailin Chen, Jiangbei Yue, Yueming Sun, Rahul Singh, Jianbo Jiao, Zeyu Fu
Title: DeHate: A Holistic Hateful Video Dataset for Explicit and Implicit Hate Detection
Paperid: 1255,
Authors: Tuan Dang, Theron Wang, Hridayesh Lekhak, Kenny Zhu
Title: EmotionalCanines: A Dataset for Analysis of Arousal and Valence in Dog Vocalization
Paperid: 1256,
Authors: Chunyi Li, Bo Hu, Taiyang Chen, Leida Li, Lihuo He, Xinbo Gao
Title: Low-light Image Enhancement Quality Assessment: A Real-World Dataset and An Objective Method
Paperid: 1257,
Authors: Haohui Li, Bowen Qu, Wei Gao
Title: T23D-QA: An Open Dataset and Benchmark for Text-driven 3D Generation Quality Assessment
Paperid: 1258,
Authors: Jinsheng Wei, Jialiang Sun, Guanming Lu, Jingjie Yan, Dong Zhang
Title: Multi-information Hierarchical Fusion Transformer with Local Alignment and Global Correlation for Micro-Expression Recognition
Paperid: 1259,
Authors: Xianrun Xu, Baoyao Yang, Wanyun Li, Jingsong Lin, Yufei Xu
Title: Simple but Effective: Sub-Volume Contrastive Learning for Class-Imbalanced Semi-Supervised 3D Medical Image Segmentation
Paperid: 1260,
Authors: Pengyu Zeng, Jun Yin, Haoyuan Sun, Yuqin Dai, Maowei Jiang, Miao Zhang, Shuai Lu
Title: MRED-14: A Benchmark for Low-Energy Residential Floor Plan Generation with 14 Flexible Inputs
Paperid: 1261,
Authors: Jiawei Ge, Xinyu Zhang, Jiuxin Cao, Xuelin Zhu, Weijia Liu, Qingqing Gao, Biwei Cao, Kun Wang, Chang Liu, Bo Liu, Chen Feng, Ioannis Patras
Title: Gen4Track: A Tuning-free Data Augmentation Framework via Self-correcting Diffusion Model for Vision-Language Tracking
Paperid: 1262,
Authors: Xiangyu Zheng, Songcheng He, Wanyun Li, Xiaoqiang Li, Wei Zhang
Title: Shallow Features Matter: Hierarchical Memory with Heterogeneous Interaction for Unsupervised Video Object Segmentation
Paperid: 1263,
Authors: Lizhi Xiong, Linsen Ding, Ziqiang Li
Title: Detecting Forged HEVC Videos via Anomalous Bitrate-Compressed Traces: A Frame-Level Bitrate Analysis Framework
Paperid: 1264,
Authors: Kaihang Jiang, Waikeung Wong, Jianyang Qin, Xiaozhao Fang, Jie Wen, Bingzhi Chen, Hongbo Gao
Title: Label Prediction Inherited Hashing for Cross-Modal Retrieval: Applying Supervised Hashing to Unsupervised Tasks
Paperid: 1265,
Authors: Lin Zuo, Kunshan Yang, Mengmeng Jing, Xiangxu Zhao, Jiaqiao Chen
Title: Bridging Inter-Class Ambiguity and Spatial Variability in Flexible Object Recognition via Graph Distillation
Paperid: 1266,
Authors: Liqian Zhang, Feng Yuan, Haoran Xie, Fu Lee Wang, Zhaoqing Pan
Title: Evaluating Visual Quality of Autostereoscopic 3D Displays via a Multi-Modal Parameter Perception Network
Paperid: 1267,
Authors: Yanfeng Liu, Lefei Zhang
Title: Multimodal Decomposed Distillation with Instance Alignment and Uncertainty Compensation for Thermal Object Detection
Paperid: 1268,
Authors: Siyi Qian, Jian Fang, Yuzhou Mao, Yayun Zou, Wentao Zhang, Haiwei Xue
Title: Human Motion Generation in 3D Scenes from Open-Ended Textual Instructions with MLLM Planning
Paperid: 1269,
Authors: Wanyi Zhuang, Qi Chu, Tao Gong, Changtao Miao, Nenghai Yu
Title: Towards Good Generalizations for Diffusion Generated Image Detection Using Multiple Reconstruction Contrastive Learning
Paperid: 1270,
Authors: Yuezhou Li, Yuzhen Niu, Huangbiao Xu, Hui Da, Rui Xu, Wenxi Liu
Title: IPCMoE: Integrating Perceptual Cues with Mixture-of-Experts for Joint Low-Light Image Enhancement and Deblurring
Paperid: 1271,
Authors: Hua Wang, Hong Liu, Jiale Ren, Mingxin Tan, Zhongzien Jiang
Title: CLIP-6D: Empowering CLIP as a Zero-Shot 6D Pose Estimator Through Generalizable Object-Specific Representations
Paperid: 1272,
Authors: Changyu Rao, Gaozhi Liu, Sheng Li, Xinpeng Zhang, Zhenxing Qian
Title: DynMark: A Robust Watermarking Solution for Dynamic Screen Content with Small-size Screenshot Support
Paperid: 1273,
Authors: Chengpei Xu, Wenhao Zhou, Long Ma, Weimin Wang, Feng Xia, Binghao Li, Wenjie Zhang
Title: Bright to Dark: Stage-wise Bilevel Knowledge Transfer for Seeing Text in the Dark
Paperid: 1274,
Authors: Yuwei Zhou, Xin Wang, Hong Chen, Yipeng Zhang, Zeyang Zhang, Wenwu Zhu
Title: ModuleTeam: Open-Set Multi-Conditional Image Generation with Training-Free Latent Mixture of Any Control Module
Paperid: 1275,
Authors: Wanting Zhang, Jingxuan Zhang, Libao Zhang
Title: Saliency-Guided Adaptive Random Diffusion for Remote Sensing Images Restoration with Cloud and Haze
Paperid: 1276,
Authors: Yifan Zeng, Fangzhou Dong, Jian Zhao, Peijia Zheng, Jian Li, Huiyu Zhou
Title: Towards Culturally Fair Multimodal Generation: Quantifying and Mitigating Orientalist Biases in Text-to-Visual Models
Paperid: 1277,
Authors: Wei Jia, Li Jin, Kaiwen Wei, Yuying Shang, Nayu Liu, Zhicong Lu, Qing Liu, Linhao Zhang, Jiang Zhong, Yanfeng Hu
Title: U-MERE: Unconstrained Multimodal Entity and Relation Extraction with Collaborative Modeling and Order-Sensitive Optimization
Paperid: 1278,
Authors: Xingke Song, Jianxu Shangguan, Yiran Li, Jialu Zhang, Jianfeng Ren, Ruibin Bai, Xin Chen, Xudong Jiang
Title: CEARI: Co-Evolutionary Agents for Reassembling and Inpainting Puzzles with Gaps and Missing Pieces
Paperid: 1279,
Authors: Swarna Chakraborty, Mylene Farias
Title: MT-DPCQA: A Multimodal Time-aware Learning Approach for No-Reference Dynamic Point Cloud Quality Assessment
Paperid: 1280,
Authors: Guitao Xu, Ziqi Yi, Peirong Zhang, Jiahuan Cao, Shihang Wu, Lianwen Jin
Title: From Pixels to Semantics: a Novel MLLM-Driven Approach for Explainable Tampered Text Detection
Paperid: 1281,
Authors: Yihang Liu, Ying Wen, Longzhen Yang, Lianghua He, Heng Tao Shen
Title: RadLAS: A Foundation Model for Interpretable Radiography Image Analysis with Lesion-Aware Self-Supervised Pre-training
Paperid: 1282,
Authors: Wenhui Wu, Guanqi Wen, Le Ou-Yang, Ran Wang, Sam Kwong
Title: DUIMC: Deep Unbalanced Incomplete Multi-View Clustering via Graph Constrained Imputation and Contrastive Learning
Paperid: 1283,
Authors: Seungkyu Leem, Seokhyun Jeong, Yeonho Cho, Yoonjae Lee, Jungjin Lee
Title: VRMusicStage: A System for Converting Fixed-Camera Music Stage Videos into Immersive VR Content
Paperid: 1284,
Authors: Jiaqing Fan, Hanwen Qian, Mengjuan Jiang, Fanzhang Li
Title: PeriodVOS: Learning Periodic Patterns for Unsupervised Video Object Segmentation via Adaptive Contextual Coupling
Paperid: 1285,
Authors: Jinxu Zhang, Qiyuan Fan, Yongqi Yu, Yu Zhang
Title: DREAM: Integrating Hierarchical Multimodal Retrieval with Multi-page Multimodal Language Model for Documents VQA
Paperid: 1286,
Authors: Shanshan Li, Jiawei Hou, Da Huang, Yanwei Fu, Xiangyang Xue
Title: Ali-UI: Enhancing Complex Vision-Language Navigation with Alignment of Unified Map and Instruction Parsing
Paperid: 1287,
Authors: Jie Yu, Songping Mai, Peng Zhang, Yucheng Jiang, Jian Cheng
Title: Activation and Weight Distribution Balancing for Optimal Post-Training Quantization in Learned Image Compression
Paperid: 1288,
Authors: Bo Xu, Jie Wei, Hongya Wang, Ming Du, Hui Song, Yanghua Xiao
Title: Bridging the Unseen Gap: Label-Enhanced Information Bottleneck Distillation for Multimodal Named Entity Recognition
Paperid: 1289,
Authors: Hao Gu, Jiangyan Yi, Chenglong Wang, Jianhua Tao, Zheng Lian, Jiayi He, Yong Ren, Yujie Chen, Zhengqi Wen
Title: ALLM4ADD: Unlocking the Capabilities of Audio Large Language Models for Audio Deepfake Detection
Paperid: 1290,
Authors: Junwei Zhao, Qianchun Luo, Shiliang Zhang, Shen Gao, Jie Wu
Title: HDCFN: Haze Distribution-aware Cross-modal Fusion Network for Infrared-guided Dense Haze Removal in UAVs
Paperid: 1291,
Authors: Qin Li, Congcong Xiao, Limei Liu, Han Peng, Junfeng Yang
Title: Skeleton Compression and Complementary Enhanced Fusion Under Branch-Stage Supervision for Human Action Recognition
Paperid: 1292,
Authors: Cheng Peng, Zhen Wang
Title: Method and Applications of Solid-State Lidar Modeling for X-in-the-Loop Testing of Autonomous Vehicles
Paperid: 1293,
Authors: Jinming Zhang, Yunlian Sun, Hongwen Zhang, Jinhui Tang
Title: EDMG: Towards Efficient Long Dance Motion Generation with Fundamental Movements from Dance Genres
Paperid: 1294,
Authors: Richen Liu, Lingyu Sun, Xuefeng Huang, Yiran Li, Jiang Zhang, Siru Chen, Zhouhao Wu, Ayush Kumar, Chufan Lai
Title: Meta-Illustrator: Transferring Illustrations from 2D Interactive Image Space to 3D Immersive Exploration Space
Paperid: 1295,
Authors: Yucheng Shu, Yaohui Wang, Lihong Qiao, Feiyan Li, Bin Xiao, Weisheng Li, Xinbo Gao
Title: The Overlooked Matters: Revisiting Background, Prototype, and Activation in Few-Shot Medical Image Segmentation
Paperid: 1296,
Authors: Sizhe Zhao, Chenyang Wang, Weiyu Zhao, Zonglin Li, Ming Li, Shengping Zhang
Title: REA-Listener: Real-Time Listening Head Generation with Dynamic Emotion Modeling and Flexible Modality Adaptation
Paperid: 1297,
Authors: Nan Gao, Junchao Zhu, Yilong Zhang, Ronghua Liang, Guodao Sun, Peng Chen
Title: Dual Teacher with Dempster-Shafer Guidance for Decision Making in Semi-Supervised Small Object Detection
Paperid: 1298,
Authors: Shikun Sun, Chengrui Wang, Min Zhou, Zixuan Wang, Xiaoyu Qin, Tiezheng Ge, Bo Zheng, Jia Jia
Title: DEPO: Enhancing E-commerce Image Background Generation with Short Trajectory Direct Expected Preference Optimization
Paperid: 1299,
Authors: Zeyu Zhu, Ke Liang, Lingyuan Meng, Xingchen Hu, Xinwang Liu, Wanwei Liu, Kunlun He
Title: SALVG: Latent Variable Gene Augmented Graph Learning for Multi-View Clustering in Spatial Transcriptomics
Paperid: 1300,
Authors: Ana Rebelo, Pedro Ferreira, André Ribeiro, Rui Nóbrega
Title: Walking vs. Teleport in VR: Why Walking and Portals Matter in Small Spaces
Paperid: 1301,
Authors: Chunpeng Wang, Wenlong Ma, Li Zou, Zhiqiu Xia, Qi Li, Bin Ma, Yunan Liu
Title: Toward Robust Deepfake Detection: A Proactive Method Based on Watermarking and Knowledge Distillation
Paperid: 1302,
Authors: Huanqi Wu, Huangbiao Xu, Xiao Ke
Title: The Devil in the Stego Image: Far from Being Usable in Real-World Scenarios
Paperid: 1303,
Authors: Yijie Zhu, Yibo Lyu, Zitong Yu, Rui Shao, Kaiyang Zhou, Liqiang Nie
Title: EmoSym: A Symbiotic Framework for Unified Emotional Understanding and Generation via Latent Reasoning
Paperid: 1304,
Authors: Zhaoyu Chen, Qian Huang, Xing Li, Yunfei Zhang, Shihao Han, Ge Gao, Yirui Wu, Xin Li, Ziyang Yin
Title: Geo-CF2Net: Geometry-Prior Cross-Frequency Interactive Fusion Network for Point Cloud-based 3D Human Action Recognition
Paperid: 1305,
Authors: Huabin Wang, Yingfan Cheng, Wu Zheng, Jiayuan Cheng, Xin Li, Min Li, Fei Liu
Title: A Multi-illumination Dataset and an Illumination Domain Adaptation Network for Finger Vein Identification
Paperid: 1306,
Authors: Shenjie Jiang, Zhuoyu Wang, Xuecheng Wu, Hongru Ji, Mingxin Li, Xianghua Li, Chao Gao
Title: DDSE: A Decoupled Dual-Stream Enhanced Framework for Multimodal Sentiment Analysis with Text-Centric SSM
Paperid: 1307,
Authors: Yishu Liu, Zhiming Chen, Desen Wang, Xiaoling Luo, Bingzhi Chen, Guangming Lu
Title: PET-GPRA: Rethinking PET with Gradient-Aware Prompting and Router-Free Adapters for Few-shot Class-Incremental Learning
Paperid: 1308,
Authors: Xueyi Zhang, Peiyin Zhu, Yuan Liao, Xiyu Wang, Mingrui Lao, Siqi Cai, Yanming Guo, Haizhou Li
Title: TrustCLIP: Learning from Noisy Labels via Semantic Label Verification and Trust-aligned Gradient Projection
Paperid: 1309,
Authors: Yechao Xu, Zhengxing Sun, Qian Li, Yunhan Sun
Title: Text Prompted Spatiotemporal Sequence Prediction with Text-Vision Prompt Refiner and Masked Diffusion Transformers
Paperid: 1310,
Authors: Yanting Pei, Fan Yang
Title: Adaptive Neighbors and Uncertainty Estimation for Source-Free Unsupervised Domain Adaptation with Noisy Labels
Paperid: 1311,
Authors: Yang Liu, Zhang Zhiyong
Title: DSP: Dense-Sparse Parallel Networks for Self-supervised 3D Multi-person Pose Estimation from Multiple Views
Paperid: 1312,
Authors: Jianxiang Xie, Yao Wu, Yachao Zhang, Xiaopei Zhang, Yuan Xie, Yanyun Qu
Title: PLATO-TTA: Prototype-Guided Pseudo-Labeling and Adaptive Tuning for Multi-Modal Test-Time Adaptation of 3D Segmentation
Paperid: 1313,
Authors: Jiahuan Cao, Yang Liu, Peirong Zhang, Yongxin Shi, Kai Ding, Lianwen Jin
Title: TongGu-VL: Advancing Visual-Language Understanding in Chinese Classical Studies through Parameter Sensitivity-Guided Instruction Tuning
Paperid: 1314,
Authors: Zhuojun Wu, Dong Liu, Juan Liu, Yechen Wang, Linxi Li, Liwei Jin, Hui Bu, Pengyuan Zhang, Ming Li
Title: SMIIP-NV: A Multi-Annotation Non-Verbal Expressive Speech Corpus in Mandarin for LLM-Based Speech Synthesis
Paperid: 1315,
Authors: Yulong Li, Yuxuan Zhang, Rui Chen, Feilong Tang, Zhixiang Lu, Ming Hu, Jianghao Wu, Haochen Xue, Mian Zhou, Chong Li, Jionglong Su, Imran Razzak
Title: Genesis: A Large-Scale Benchmark for Multimodal Large Language Model in Emotional Causality Analysis
Paperid: 1316,
Authors: Tingrui Shen, Bangzhen Liu, Zhirun Fan, Shiting Zhang, Weifeng Pan, Sun Fan, Dan Cao, Shengfeng He
Title: Language-Driven 3D Human Pose Estimation in Multi-Person Scenarios: A New Dataset and Approach
Paperid: 1317,
Authors: Weibin Wu, Zitong Wang, Zhengjie Luo, Wenqing Chen, Zibin Zheng
Title: Detecting Violations of Physical Common Sense in Images: A Challenge Dataset and Effective Model
Paperid: 1318,
Authors: Zixi Wang, Yubo Huang, Jingzehua Xu, Jinzhu Wei, Shuai Zhang, Xin Lai
Title: Multi-Modal Gradual Domain Osmosis: Stepwise Dynamic Learning with Batch Matching for Gradual Domain Adaptation
Paperid: 1319,
Authors: Yang Hu, Jingui Ma, Yucheng Yang, Jie Liang, Jinbo Yan, Jiahao Wu, Jiayu Yang, Yang Deng, Ronggang Wang
Title: Excavating the Most Critical Gaussians: Sparse Selection and Structural Optimization for Efficient 3DGS Compression
Paperid: 1320,
Authors: Shangheng Chen, Shengsheng Qian, Quan Fang, Jun Hu, Changsheng Xu
Title: A Large-Scale Dataset for Short-Video Topic Peak Prediction and a Large Heterogeneous Graph Model
Paperid: 1321,
Authors: Dirui Xie, Xiaofang Hu, Zihan Wei, Zhengqiqi Yang, Yanlian Jiang, Yue Zhou
Title: Learning Structural Priors via Laplacian RWKV Diffusion with Light-Effect Dataset for Nighttime Visibility Enhancement
Paperid: 1322,
Authors: Leidong Fan, Zhang Qian, Qing Li
Title: Inverse-Tone-Mapped HDR Video Quality Assessment for Broadcast Television: A Comprehensive Dataset and SDR-Referenced Method
Paperid: 1323,
Authors: Qinfu Xu, Liyuan Pan, Shaozu Yuan, Yiwei Wei, Chunlei Wu
Title: From Subtle Hints to Grand Expressions – Mastering Fine-grained Emotions with Dynamic Multimodal Analysis
Paperid: 1324,
Authors: Seungmi Choi, TaeHwa Lee, Jun Yeong Cha, Suhyun Jo, Hyunmin Ban, Kwan-Jung Oh, Hyunsuk Ko, Hui Yong Kim
Title: Phase Distribution Matters: On the Importance of Phase Distribution Alignment (PDA) in Holographic Applications
Paperid: 1325,
Authors: Zhi Zeng, Jiaying Wu, Minnan Luo, Xiangzheng Kong, Zihan Ma, Guang Dai, Qinghua Zheng
Title: Understand, Refine and Summarize: Multi-Granularity Knowledge Progressive Enhancement Learning for Fake News Video Detection
Paperid: 1326,
Authors: Changhao Pan, Wenxiang Guo, Yu Zhang, Zhiyuan Zhu, ZheTao Chen, Han Wang, Zhou Zhao
Title: A Multimodal Evaluation Framework for Spatial Audio Playback Systems: From Localization to Listener Preference
Paperid: 1327,
Authors: Nan Ma, Beining Sun, Yiheng Han, Genbao Xu
Title: Kinematic Enhanced Hypergraph Convolutional Network for Skeleton-based Human Action Recognition with LLM Training Guides
Paperid: 1328,
Authors: Yefei Sheng, Jie Wang, Ming Tao, Bingkun Bao
Title: D²Gaussian: Dynamic Control with Discretized 3D View Modeling for Text-Driven 3D Gaussian Splatting Editing
Paperid: 1329,
Authors: Jiye Xie, Yifei Gao, Liangliang You, Xiang Xu, Haoran Xu, Zhiqiang Kou, Kexue Fu, Youyang Qu, Wenjie Yang, Jianwei Guo, Weiliang Meng, Longxiang Gao, Haoran Yang, Changwei Wang, Yu Zhang
Title: Collaboration Wins More: Dual-Modal Collaborative Attention Reinforcement for Mitigating Large Vision Language Models Hallucination
Paperid: 1330,
Authors: Shuyang Wang, Chunxiao Li, Anlong Ming
Title: IFS-Light: An Interactive Framework for Single-view Face Relighting with both Facial and Lighting Consistency
Paperid: 1331,
Authors: Nguyen Duy, Hoang Hoan, Thanh-Trung Phan
Title: Like or Not to Like: An Usecase of Vietnamese Street Food Videos on YouTube
Paperid: 1332,
Authors: Yuliang Chen, Xi Lin, Chao Sang, Xiu Su
Title: DualFPT: Handling Data Heterogeneity in Federated Prompt Tuning from both Generalized and Personalized Perspective
Paperid: 1333,
Authors: Guoxin Zhang, Zhonghong Ou, Kaiwen Xue, Jiangfeng Sun, Yifan Zhu, Siyuan Yao, Yiran Shen, Meina Song
Title: DGFSD: Bridging the Gap between Dense and Sparse for Fully Sparse 3D Object Detection
Paperid: 1334,
Authors: Feida Liu, Yifan Wang, Jiaqi Zheng, Boxi Liu, Guihai Chen
Title: Themis: Toward Stable Near-Zero Queuing Delay in Congestion Control for Low-Latency Interactive Video Streaming
Paperid: 1335,
Authors: Ahmad Alhilal, Ze Wu, Teemu Kämäräinen, Tristan Braud, Matti Siekkinen
Title: Congestion Control for VR Cloud Gaming: Integration and Comparison in Real VR Gaming Environment
Paperid: 1336,
Authors: Junxiao Ma, Jingjing Wang, Min Zhang, Guodong Zhou
Title: Skynet-V1: Towards Early Warning of Video Abnormal Events via A Spatial-temporal Causal-enhanced MoE Framework
Paperid: 1337,
Authors: Gyeongjin Kim, Sebin Lee, Daye Kim, Jungjin Lee, Minju Kim
Title: Bring the VibeOn: Designing a Multimodal Interface for Shared Emotional Experiences in Live-streamed Concerts
Paperid: 1338,
Authors: Yating Liu, Yang Zou, Xingyuan Li, Xingyue Zhu, Kaiqi Han, Zhiying Jiang, Long Ma, Jinyuan Liu
Title: Toward a Training-Free Plug-and-Play Refinement Framework for Infrared and Visible Image Registration and Fusion
Paperid: 1339,
Authors: Zhenbo Yu, Jimin Dai, Yingzhen Zhang, Jian Yang, Lei Luo
Title: SSAIM: Not All Self-Attentions Contain Effective Spatial Structure in Diffusion Models for Text-to-Image Editing
Paperid: 1340,
Authors: Junyu Gao, Xuan Yao, Yong Rui, Changsheng Xu
Title: Building Embodied EvoAgent: A Brain-inspired Paradigm for Bridging Multimodal Large Models and World Models
Paperid: 1341,
Authors: Bichen Wang, Yixin Sun, Yanyan Zhao, Bing Qin
Title: Beyond Snapshots: A Multimodal User-Level Dataset for Depression Detection in Dynamic Social Media Streams
Paperid: 1342,
Authors: Zhaohu Xing, Lihao Liu, Tian Ye, Sixiang Chen, Yijun Yang, Guang Liu, Lei Zhu
Title: Farther Than Mirror: Explore Pattern-Compensated Depth of Mirror with Temporal Changes for Video Mirror Detection
Paperid: 1343,
Authors: Nan An, Siqi Xu, Long Ma, Zhu Liu, Guangchao Han, Tengyu Ma, Risheng Liu
Title: Inter-Task Weaving in Image Enhancement: From a New Unified Architecture to a Better Meta-Representation Learning
Paperid: 1344,
Authors: Jiawei Zheng, Feiyan Liu, Xiaoli Wang
Title: Seeing Through Ambiguity: Effective Video-guided Machine Translation via Chaotic Fusion and Causally Aligned Spatio-temporal Attention
Paperid: 1345,
Authors: Yuwu Lu, Chunzhi Liu, Yihan Yang
Title: CWCP: Generalizing Virtual Reality to Real World with Contextual-Weather Correlation Pairing for Deraining and Desnowing
Paperid: 1346,
Authors: Zongxing Zhao, Shenzhi Yang, Xingkai Yao, Yuying Wang, Zhongqiu Chen, Xiaofang Zhang
Title: $\textbf{HGAC}_{\textbf{LLM}}$: Attribute Completion in Heterogeneous Graph with Integration of External Knowledge from Large Language Models
Paperid: 1347,
Authors: Yue Ling, Dong Zhao, Kaikai Deng, Kangwen Yin, Zixiao He, Yizong Wang, Huadong Ma
Title: Venus: Generating Large-scale mmWave Radar Data via Few 2D Videos for Gesture Recognition While Lying Down
Paperid: 1348,
Authors: Mingliang Yan, Yanhua Yu, Ruochi Zhang, Zhiyuan Liu, Ruicheng Zhang, Yimeng Ren, Kangkang Lu, Zhiyong Huang, Feng Luo, Zhen Cai
Title: DeepMolTex: Deep Alignment of Molecular Graphs with Large Language Models via Mixture of Modality Experts
Paperid: 1349,
Authors: Tianzuo Xin, Jing Wang, Xiyuan Jin, Xiaojun Ning, Zhiyang Feng, Youfang Lin
Title: MoCERNet: A Modality-Complete Modeling Framework for Emotion Recognition in Physiological Signals under Imperfect Modal Matching
Paperid: 1350,
Authors: Xiaojian Lin, Wenxin Zhang, Yuchu Jiang, Wangyu Wu, Yiran Guo, Kangxu Wang, Zongzheng Zhang, Guijin Wang, Lei Jin, Hao Zhao
Title: Butter: Frequency-Adaptive Feature Consistency and Progressive Hierarchical Fusion for Efficient Object Detection in Autonomous Driving
Paperid: 1351,
Authors: Siyuan Zhang, Xiaoping Wang, Jiang Li, Weibin Feng, Xin Zhan, Hongzhi Huang
Title: HAFUNet: A Hierarchical Attention Fusion Network for Monocular Depth Estimation Integrating Event and Frame Data
Paperid: 1352,
Authors: Weimin Cheng, Zhenyu Wang, Tao Huang, Fangfang Wu, Weisheng Dong
Title: Pushing the Limit of Binarized Neural Network for Image Super Resolution with Smooth Information Transmission
Paperid: 1353,
Authors: Yingbo Tang, Lingfeng Zhang, Shuyi Zhang, Yinuo Zhao, Xiaoshuai Hao
Title: RoboAfford: A Dataset and Benchmark for Enhancing Object and Spatial Affordance Learning in Robot Manipulation
Paperid: 1354,
Authors: Nora Hofer, Rainer Böhme
Title: Challenging Cases of Neural Image Compression: A Dataset of Visually Compelling Yet Semantically Incorrect Reconstructions
Paperid: 1355,
Authors: Shuai Yu, Xiaoliang He, Kangjie Dong, Yi Yu
Title: DUDA: A Two-stage Decoupling Unsupervised Domain Adaptation Framework for Semi-supervised Singing Melody Extraction from Polyphonic Music
Paperid: 1356,
Authors: Jie Wan, Jianhao Fu, Ziqi Yang, Kui Ren
Title: BTUAP: Boosting the Transferability of Universal Adversarial Perturbations in the Black-box Setting under various data dependencies
Paperid: 1357,
Authors: Muhammad Ali Farooq, Waseem Shariff, Peter Corcoran
Title: ThermVision: Exploring FLUX for Synthesizing Hyper-Realistic Thermal Face Data and Animations via Image to Video Translation
Paperid: 1358,
Authors: Wei Miao, Jiangrong Shen, Hongming Xu, Tommi Kärkkäinen, Qi Xu, Yi Xu, Fengyu Cong
Title: Advanced SpikingYOLOX: Extending Spiking Neural Network on Object Detection with Spike-based Partial Self-Attention and 2D-Spiking Transformer
Paperid: 1359,
Authors: Jingxing Guo, Guilian Chen, Yimu Sun, Huisi Wu, Jing Qin
Title: EchoVim: Making Vision Mamba Docile for Echocardiography Video Segmentation via Dynamic Interaction and Semantic Token-attentive Refinement
Paperid: 1360,
Authors: Qi Shen, Junchang Xin, Bing Dai, Shudi Zhang, Xinyao Liu, Zhiqiong Wang
Title: ElaSleepNet: Exploring an Elastic Multimodal Neural Network for Sleep Staging via Temporal and Contextual Consistency Learning
Paperid: 1361,
Authors: Kipp Freud, Daniel Collins, Delmiro Sampaio Neto, Grant Stevens
Title: AutoVec: Automatic generation of data and vector embeddings for arbitrary domains and cross-domain mappings using LLMs
Paperid: 1362,
Authors: Junlin Fang, Wenya Wang, Lingli Zhang, Fengmao Lv
Title: Why is a Bird’s Caption a Good Demonstration? Towards Effective Multimodal In-Context Learning without Dedicated Data
Paperid: 1363,
Authors: Weichen Zhang, Zile Zhou, Xin Zeng, LIU Xuchen, Jianjie Fang, Chen Gao, Jinqiang Cui, Yong Li, Xinlei Chen, Xiao-Ping Zhang
Title: Open3DVQA: A Benchmark for Embodied Spatial Concept Reasoning with Multimodal Large Language Model in Open Space
Paperid: 1364,
Authors: Bingcai Wei, Hui Liu, Chuang Qian, Zijian Li, Wangyu Wu, Zijie Meng
Title: Robust Single Image Sand Removal by Leveraging Uncertainty-aware SAM Priors and Prompt Learning with Refined Perceptual Loss
Paperid: 1365,
Authors: Sitian Gu, Zhiyu Pan, Chaoyi Hong, Chengxin Liu, Zhiguo Cao
Title: Dynamic Beauty is Easy to Find: A Large-Scale Composition-Aware Dataset and an End-to-End Framework for Video Reframing
Paperid: 1366,
Authors: Ronghui Li, Lingxiao Han, Shi Shu, Yueyao Liu, Yukang Lin, Yue Ma, Jie Guo, Ziwei Liu, Xiu Li
Title: A Motion is Worth a Hybrid Sentence: Taming Language Model for Unified Motion Generation by Fine-grained Planning
Paperid: 1367,
Authors: Yilin Zhang, Yanyan Wei, Zhao Zhang, Jicong Fan, Haijun Zhang, Shuicheng YAN
Title: From Outline to Detail: An Hierarchical End-to-end Framework for Coherent and Consistent Visual Novel Generation and Assembly
Paperid: 1368,
Authors: Dezhi Zheng, Kaijun Deng, Xianxu Hou, Jinbao Wang, Xiaoqin Wang, Linlin Shen
Title: Unknown Pixel Mask based finetuning of 2D Inpainting Models for Unbounded 3D Scene Generation from a Single Image
Paperid: 1369,
Authors: Dominika Wanat, Dawid Juszka, Mikołaj Leszczuk, Lucjan Janowski
Title: Bridging the Lab and the Wild: Behavioral Experiments as a Pathway to QoE Research Closer to Realistic Environment
Paperid: 1370,
Authors: Lamei Di, Bin Zhang, Yiming Wang, Wenxia Zhang
Title: Frequency Meets Semantics: Text-Visual Fusion with Directional Spectral Enhancement for Salient Object Detection in Optical Remote Sensing Images
Paperid: 1371,
Authors: Yu Chen, BinBin Yan, Shuo Chen, XinZhu Sang
Title: A Comprehensive Model for Visual Fatigue Assessment in 3D Light Field Displays Based on Eye Movement Data Analysis
Paperid: 1372,
Authors: Ziang Li, Chengxiang Si, Zhenyu Cheng
Title: Zero in on the Target: A Composite Robust Model for Retrieving Information in Traffic Data to Discover Network Attacks
Paperid: 1373,
Authors: Shuyang Chu, Jingang Shi, Xu Cheng, Haoyu Chen, Xin Liu, Xu Jian, Guoying Zhao
Title: To Remember, To Adapt, To Preempt: A Stable Continual Test-Time Adaptation Framework for Remote Physiological Measurement in Dynamic Domain Shifts