[0] Targeted View-Invariant Adversarial Perturbations for 3D Object Recognition [cs.CV] Authors: Christian Green, Mehmet Ergezer, Abdurrahman Zeybey Link: http://arxiv.org/abs/2412.13376 Comments: Accepted to AAAI-25 Workshop on Artificial Intelligence for Cyber Security (AICS)
[1] FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding [cs.CV] Authors: Zhuo Cao, Bingqing Zhang, Heming Du, Xin Yu, Xue Li, Sen Wang Link: http://arxiv.org/abs/2412.13441 Code: https://github.com/Zhuo-Cao/FlashVTG Comments: Accepted to WACV 2025
[2] ConDo: Continual Domain Expansion for Absolute Pose Regression [cs.CV] Authors: Zijun Li, Zhipeng Cai, Bochun Yang, Xuelun Shen, Siqi Shen, Xiaoliang Fan, Michael Paulitsch, Cheng Wang Link: http://arxiv.org/abs/2412.13452 Comments: AAAI2025
[3] Pre-training a Density-Aware Pose Transformer for Robust LiDAR-based 3D Human Pose Estimation [cs.CV] Authors: Xiaoqi An, Lin Zhao, Chen Gong, Jun Li, Jian Yang Link: http://arxiv.org/abs/2412.13454 Comments: Accepted to AAAI 2025
[4] Look Inside for More: Internal Spatial Modality Perception for 3D Anomaly Detection [cs.CV] Authors: Hanzhe Liang, Guoyang Xie, Chengbin Hou, Bingshu Wang, Can Gao, Jinbao Wang Link: http://arxiv.org/abs/2412.13461 Comments: AAAI2025 Accepted
[5] FlexPose: Pose Distribution Adaptation with Limited Guidance [cs.CV] Authors: Zixiao Wang, Junwu Weng, Mengyuan Liu, Bei Yu Link: http://arxiv.org/abs/2412.13463 Comments: Accepted by AAAI25, 12 pages, 10 figures
[6] Enabling Region-Specific Control via Lassos in Point-Based Colorization [cs.CV] Authors: Sanghyeon Lee, Jooyeol Yun, Jaegul Choo Link: http://arxiv.org/abs/2412.13469 Comments: Accepted to AAAI2025
[8] Plug-and-Play Tri-Branch Invertible Block for Image Rescaling [cs.CV] Authors: Jingwei Bao, Jinhua Hao, Pengcheng Xu, Ming Sun, Chao Zhou, Shuyuan Zhu Link: http://arxiv.org/abs/2412.13508 Code: https://github.com/Jingwei-Bao/T-InvBlocks Comments: Accepted by AAAI 2025
[9] Dynamic Adapter with Semantics Disentangling for Cross-lingual Cross-modal Retrieval [cs.CV] Authors: Rui Cai, Zhiyu Dong, Jianfeng Dong, Xun Wang Link: http://arxiv.org/abs/2412.13510 Comments: Accepted by the 39th AAAI Conference on Artificial Intelligence (AAAI-25)
[10] Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning [cs.CV] Authors: Yunbin Tu, Liang Li, Li Su, Qingming Huang Link: http://arxiv.org/abs/2412.13543 Comments: Accepted by AAAI 2025
[11] Multi-View Pedestrian Occupancy Prediction with a Novel Synthetic Dataset [cs.CV] Authors: Sithu Aung, Min-Cheol Sagong, Junghyun Cho Link: http://arxiv.org/abs/2412.13569 Comments: AAAI 2025
[12] Bridge then Begin Anew: Generating Target-relevant Intermediate Model for Source-free Visual Emotion Adaptation [cs.CV] Authors: Jiankun Zhu, Sicheng Zhao, Jing Jiang, Wenbo Tang, Zhaopan Xu, Tingting Han, Pengfei Xu, Hongxun Yao Link: http://arxiv.org/abs/2412.13577 Comments: Accepted by AAAI2025
[13] Generalizable Sensor-Based Activity Recognition via Categorical Concept Invariant Learning [cs.CV] Authors: Di Xiong, Shuoyuan Wang, Lei Zhang, Wenbo Huang, Chaolei Han Link: http://arxiv.org/abs/2412.13594 Comments: Accepted by AAAI 2025
[14] Robust Tracking via Mamba-based Context-aware Token Learning [cs.CV] Authors: Jinxia Xie, Bineng Zhong, Qihua Liang, Ning Li, Zhiyi Mo, Shuxiang Song Link: http://arxiv.org/abs/2412.13611 Code: https://github.com/GXNU-ZhongLab/TemTrack Comments: AAAI2025
[15] Reverse Region-to-Entity Annotation for Pixel-Level Visual Entity Linking [cs.CV] Authors: Zhengfei Xu, Sijia Zhao, Yanchao Hao, Xiaolong Liu, Lili Li, Yuyang Yin, Bo Li, Xi Chen, Xin Xin Link: http://arxiv.org/abs/2412.13614 Comments: AAAI 2025; dataset released
[16] Consistency of Compositional Generalization across Multiple Levels [cs.CV] Authors: Chuanhao Li, Zhen Li, Chenchen Jing, Xiaomeng Fan, Wenbo Ye, Yuwei Wu, Yunde Jia Link: http://arxiv.org/abs/2412.13636 Comments: Accepted by AAAI 2025
[17] VIIS: Visible and Infrared Information Synthesis for Severe Low-light Image Enhancement [cs.CV] Authors: Chen Zhao, Mengyuan Yu, Fan Yang, Peiguang Jing Link: http://arxiv.org/abs/2412.13655 Code: https://github.com/Chenz418/VIIS Comments: Accepted to WACV 2025
[18] JoVALE: Detecting Human Actions in Video Using Audiovisual and Language Contexts [cs.CV] Authors: Taein Son, Soo Won Seo, Jisong Kim, Seok Hwan Lee, Jun Won Choi Link: http://arxiv.org/abs/2412.13708 Code: https://github.com/taeiin/AAAI2025-JoVALE Comments: Accepted to AAAI Conference on Artificial Intelligence 2025, 9 pages, 5 figures
[19] Physics-Based Adversarial Attack on Near-Infrared Human Detector for Nighttime Surveillance Camera Systems [cs.CV] Authors: Muyao Niu, Zhuoxiao Li, Yifan Zhan, Huy H. Nguyen, Isao Echizen, Yinqiang Zheng Link: http://arxiv.org/abs/2412.13709 Code: https://github.com/MyNiuuu/AdvNIR Comments: Appeared in ACM MM 2023
[20] Mesoscopic Insights: Orchestrating Multi-scale & Hybrid Architecture for Image Manipulation Localization [cs.CV] Authors: Xuekang Zhu, Xiaochen Ma, Lei Su, Zhuohang Jiang, Bo Du, Xiwen Wang, Zeyu Lei, Wentao Feng, Chi-Man Pun, Jizhe Zhou Link: http://arxiv.org/abs/2412.13753 Code: https://github.com/scu-zjz/Mesorch Comments: AAAI 2025
[21] On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process [cs.CV] Authors: Gereziher Adhane, Mohammad Mahdi Dehshibi, Dennis Vetter, David Masip, Gemma Roig Link: http://arxiv.org/abs/2412.13943 Comments: Accepted to 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV'25). Includes 5 pages of supplementary material
[22] GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians [cs.CV] Authors: Xiaobao Wei, Peng Chen, Ming Lu, Hui Chen, Feng Tian Link: http://arxiv.org/abs/2412.13983 Code: https://github.com/ucwxb/GraphAvatar Comments: Accepted by AAAI2025
[23] Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts [cs.CV] Authors: Jihye Choi, Jayaram Raghuram, Yixuan Li, Somesh Jha Link: http://arxiv.org/abs/2412.14097 Comments: The preliminary version of the work appeared in the ICML 2024 Workshop on Foundation Models in the Wild
[24] GaraMoSt: Parallel Multi-Granularity Motion and Structural Modeling for Efficient Multi-Frame Interpolation in DSA Images [cs.CV] Authors: Ziyang Xu, Huangxuan Zhao, Wenyu Liu, Xinggang Wang Link: http://arxiv.org/abs/2412.14118 Code: https://github.com/ZyoungXu/GaraMoSt Comments: Accepted by AAAI2025
Natural language processing conferences: 19 papers
[0] Extending LLMs to New Languages: A Case Study of Llama and Persian Adaptation [cs.CL] Authors: Samin Mahdizadeh Sani, Pouya Sadeghi, Thuy-Trang Vu, Yadollah Yaghoobzadeh, Gholamreza Haffari Link: http://arxiv.org/abs/2412.13375 Comments: Accepted at COLING 2025
[1] An Automated Explainable Educational Assessment System Built on LLMs [cs.CL] Authors: Jiazheng Li, Artem Bobrov, David West, Cesare Aloisi, Yulan He Link: http://arxiv.org/abs/2412.13381 Comments: Accepted to AAAI 2025
[2] Enhancing Talk Moves Analysis in Mathematics Tutoring through Classroom Teaching Discourse [cs.CL] Authors: Jie Cao, Abhijit Suresh, Jennifer Jacobs, Charis Clevenger, Amanda Howard, Chelsea Brown, Brent Milne, Tom Fischaber, Tamara Sumner, James H. Martin Link: http://arxiv.org/abs/2412.13395 Comments: Accepted to COLING'2025
[3] VaeDiff-DocRE: End-to-end Data Augmentation Framework for Document-level Relation Extraction [cs.CL] Authors: Khai Phan Tran, Wen Hua, Xue Li Link: http://arxiv.org/abs/2412.13503 Comments: COLING 2025
[4] Dynamic Adapter with Semantics Disentangling for Cross-lingual Cross-modal Retrieval [cs.CV] Authors: Rui Cai, Zhiyu Dong, Jianfeng Dong, Xun Wang Link: http://arxiv.org/abs/2412.13510 Comments: Accepted by the 39th AAAI Conference on Artificial Intelligence (AAAI-25)
[5] CEHA: A Dataset of Conflict Events in the Horn of Africa [cs.CL] Authors: Rui Bai, Di Lu, Shihao Ran, Elizabeth Olson, Hemank Lamba, Aoife Cahill, Joel Tetreault, Alex Jaimes Link: http://arxiv.org/abs/2412.13511 Comments: Accepted by COLING 2025
[6] Information-Theoretic Generative Clustering of Documents [cs.CL] Authors: Xin Du, Kumiko Tanaka-Ishii Link: http://arxiv.org/abs/2412.13534 Comments: Accepted to AAAI 2025
[7] Multi-Granularity Open Intent Classification via Adaptive Granular-Ball Decision Boundary [cs.CL] Authors: Yanhua Li, Xiaocao Ouyang, Chaofan Pan, Jie Zhang, Sen Zhao, Shuyin Xia, Xin Yang, Guoyin Wang, Tianrui Li Link: http://arxiv.org/abs/2412.13542 Journal: AAAI2025 Comments: This paper has been accepted to AAAI2025
[8] Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning [cs.CV] Authors: Yunbin Tu, Liang Li, Li Su, Qingming Huang Link: http://arxiv.org/abs/2412.13543 Comments: Accepted by AAAI 2025
[9] Socio-Culturally Aware Evaluation Framework for LLM-Based Content Moderation [cs.CL] Authors: Shanu Kumar, Gauri Kholkar, Saish Mendke, Anubhav Sadana, Parag Agrawal, Sandipan Dandapat Link: http://arxiv.org/abs/2412.13578 Comments: Accepted in SUMEval Workshop in COLING 2025
[10] Reverse Region-to-Entity Annotation for Pixel-Level Visual Entity Linking [cs.CV] Authors: Zhengfei Xu, Sijia Zhao, Yanchao Hao, Xiaolong Liu, Lili Li, Yuyang Yin, Bo Li, Xi Chen, Xin Xin Link: http://arxiv.org/abs/2412.13614 Comments: AAAI 2025; dataset released
[11] Semantic Convergence: Harmonizing Recommender Systems via Two-Stage Alignment and Behavioral Semantic Tokenization [cs.CL] Authors: Guanghan Li, Xun Zhang, Yufei Zhang, Yifan Yin, Guojun Yin, Wei Lin Link: http://arxiv.org/abs/2412.13771 Comments: 7 pages, 3 figures, AAAI 2025
[13] Physics Reasoner: Knowledge-Augmented Reasoning for Solving Physics Problems with Large Language Models [cs.CL] Authors: Xinyu Pang, Ruixin Hong, Zhanke Zhou, Fangrui Lv, Xinwei Yang, Zhilong Liang, Bo Han, Changshui Zhang Link: http://arxiv.org/abs/2412.13791 Comments: COLING 2025
[14] Enhancing Rhetorical Figure Annotation: An Ontology-Based Web Application with RAG Integration [cs.CL] Authors: Ramona Kühn, Jelena Mitrović, Michael Granitzer Link: http://arxiv.org/abs/2412.13799 Comments: The 31st International Conference on Computational Linguistics (COLING 2025)
[15] What makes a good metric? Evaluating automatic metrics for text-to-image consistency [cs.CL] Authors: Candace Ross, Melissa Hall, Adriana Romero Soriano, Adina Williams Link: http://arxiv.org/abs/2412.13989 Comments: Accepted and presented at COLM 2024
[16] FarExStance: Explainable Stance Detection for Farsi [cs.CL] Authors: Majid Zarharan, Maryam Hashemi, Malika Behroozrazegh, Sauleh Eetemadi, Mohammad Taher Pilehvar, Jennifer Foster Link: http://arxiv.org/abs/2412.14008 Comments: Accepted in COLING 2025
[17] Hansel: Output Length Controlling Framework for Large Language Models [cs.CL] Authors: Seoha Song, Junhyun Lee, Hyeonmok Ko Link: http://arxiv.org/abs/2412.14033 Comments: 13 pages, 6 figures; accepted to AAAI-25
[18] Compositional Generalization Across Distributional Shifts with Sparse Tree Operations [cs.CL] Authors: Paul Soulos, Henry Conklin, Mattia Opper, Paul Smolensky, Jianfeng Gao, Roland Fernandez Link: http://arxiv.org/abs/2412.14076 Code: https://github.com/psoulos/sdtm Comments: NeurIPS 2024