[0] Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor [cs.CV] Authors: Jiali Chen, Xusen Hei, Yuqi Xue, Yuancheng Wei, Jiayuan Xie, Yi Cai, Qing Li Link: http://arxiv.org/abs/2412.07801 Comments: Accepted by ACM MM 2024
[1] Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery [cs.CV] Authors: Yeshwanth Kumar Adimoolam, Charalambos Poullis, Melinos Averkiou Link: http://arxiv.org/abs/2412.07899 Code: https://github.com/yeshwanth95/Pix2Poly Comments: Accepted to WACV 2025. 20 pages, 13 figures, 8 tables
[2] PGRID: Power Grid Reconstruction in Informal Developments Using High-Resolution Aerial Imagery [cs.CV] Authors: Simone Fobi Nsutezo, Amrita Gupta, Duncan Kebut, Seema Iyer, Luana Marotti, Rahul Dodhia, Juan M. Lavista Ferres, Anthony Ortiz Link: http://arxiv.org/abs/2412.07944 Comments: Accepted to WACV 2025 IEEE/CVF Winter Conference
[3] Balancing Shared and Task-Specific Representations: A Hybrid Approach to Depth-Aware Video Panoptic Segmentation [cs.CV] Authors: Kurt H.W. Stolle (Eindhoven University of Technology) Link: http://arxiv.org/abs/2412.07966 Code: https://research.khws.io/multiformer Comments: Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025. Code and trained models are available at: this https URL
[4] BSAFusion: A Bidirectional Stepwise Feature Alignment Network for Unaligned Medical Image Fusion [cs.CV] Authors: Huafeng Li, Dayong Su, Qing Cai, Yafei Zhang Link: http://arxiv.org/abs/2412.08050 Code: https://github.com/slrl123/BSAFusion Comments: Accepted by AAAI2025
[5] Dense Depth from Event Focal Stack [cs.CV] Authors: Kenta Horikawa, Mariko Isogawa, Hideo Saito, Shohei Mori Link: http://arxiv.org/abs/2412.08120 Comments: Accepted at WACV2025
[6] Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation [cs.CV] Authors: Jiaming Lv, Haoyuan Yang, Peihua Li Link: http://arxiv.org/abs/2412.08139 Code: https://peihuali.org/WKD Comments: Accepted to NeurIPS 2024. Equal contribution from first two authors
[7] AsyncDSB: Schedule-Asynchronous Diffusion Schrödinger Bridge for Image Inpainting [cs.CV] Authors: Zihao Han, Baoquan Zhang, Lisai Zhang, Shanshan Feng, Kenghong Lin, Guotao Liang, Yunming Ye, Xiaochen Qi, Guangming Ye Link: http://arxiv.org/abs/2412.08149 Comments: Accepted by AAAI 2025
[8] TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning [cs.CV] Authors: Jingjing Xie, Yuxin Zhang, Jun Peng, Zhaohong Huang, Liujuan Cao Link: http://arxiv.org/abs/2412.08176 Code: https://github.com/xjjxmu/TextRefiner Comments: Accepted by AAAI2025
[9] Textured Mesh Saliency: Bridging Geometry and Texture for Human Perception in 3D Graphics [cs.CV] Authors: Kaiwei Zhang, Dandan Zhu, Xiongkuo Min, Guangtao Zhai Link: http://arxiv.org/abs/2412.08188 Comments: to be published in AAAI 2025
[10] SAFIRE: Segment Any Forged Image Region [cs.CV] Authors: Myung-Joon Kwon, Wonjun Lee, Seung-Hun Nam, Minji Son, Changick Kim Link: http://arxiv.org/abs/2412.08197 Code: https://github.com/mjkwon2021/SAFIRE Journal: Proceedings of the AAAI Conference on Artificial Intelligence, 2025 Comments: Accepted at AAAI 2025. Code is available at: this https URL
[11] Hierarchical Classification for Automated Image Annotation of Coral Reef Benthic Structures [cs.CV] Authors: Célia Blondin, Joris Guérin, Kelly Inagaki, Guilherme Longo, Laure Berti-Équille Link: http://arxiv.org/abs/2412.08228 Comments: Poster at Tackling Climate Change with Machine Learning: workshop at NeurIPS 2024
[12] Position-aware Guided Point Cloud Completion with CLIP Model [cs.CV] Authors: Feng Zhou, Qi Zhang, Ju Dai, Lei Li, Qing Fan, Junliang Xing Link: http://arxiv.org/abs/2412.08271 Comments: Accepted by AAAI25
[13] Digging into Intrinsic Contextual Information for High-fidelity 3D Point Cloud Completion [cs.CV] Authors: Jisheng Chu, Wenrui Li, Xingtao Wang, Kanglin Ning, Yidan Lu, Xiaopeng Fan Link: http://arxiv.org/abs/2412.08326 Code: https://github.com/JS-CHU/ContextualCompletion Comments: Accepted to AAAI2025
[14] CoDTS: Enhancing Sparsely Supervised Collaborative Perception with a Dual Teacher-Student Framework [cs.CV] Authors: Yushan Han, Hui Zhang, Honglei Zhang, Jing Wang, Yidong Li Link: http://arxiv.org/abs/2412.08344 Comments: AAAI 2025
[15] ConDSeg: A General Medical Image Segmentation Framework via Contrast-Driven Feature Enhancement [cs.CV] Authors: Mengqi Lei, Haochen Wu, Xinhua Lv, Xin Wang Link: http://arxiv.org/abs/2412.08345 Code: https://github.com/Mengqi-Lei/ConDSeg Comments: This paper has been accepted by AAAI-2025
[16] Video Summarization using Denoising Diffusion Probabilistic Model [cs.CV] Authors: Zirui Shang, Yubo Zhu, Hongxi Li, Shuo yang, Xinxiao Wu Link: http://arxiv.org/abs/2412.08357 Comments: Accepted by AAAI2025
[18] Pragmatist: Multiview Conditional Diffusion Models for High-Fidelity 3D Reconstruction from Unposed Sparse Views [cs.CV] Authors: Songchun Zhang, Chunhui Zhao Link: http://arxiv.org/abs/2412.08412 Comments: Accepted by AAAI 2025. 13 pages, 8 figures
[19] PointCFormer: a Relation-based Progressive Feature Extraction Network for Point Cloud Completion [cs.CV] Authors: Yi Zhong, Weize Quan, Dong-ming Yan, Jie Jiang, Yingmei Wei Link: http://arxiv.org/abs/2412.08421 Comments: 9 pages, 8 figures, AAAI 2025, references added
[20] Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection [cs.CV] Authors: Mingda Jia, Liming Zhao, Ge Li, Yun Zheng Link: http://arxiv.org/abs/2412.08506 Comments: in Proceedings of the 39th AAAI Conference on Artificial Intelligence (AAAI-25)
[21] Learning to Decouple the Lights for 3D Face Texture Modeling [cs.CV] Authors: Tianxin Huang, Zhenyu Zhang, Ying Tai, Gim Hee Lee Link: http://arxiv.org/abs/2412.08524 Code: https://tianxinhuang.github.io/projects/Deface Comments: Accepted by NeurIPS 2024
[22] SenCLIP: Enhancing zero-shot land-use mapping for Sentinel-2 with ground-level prompting [cs.CV] Authors: Pallavi Jain, Dino Ienco, Roberto Interdonato, Tristan Berchoux, Diego Marcos Link: http://arxiv.org/abs/2412.08536 Comments: Accepted at WACV'25
[24] SegFace: Face Segmentation of Long-Tail Classes [cs.CV] Authors: Kartik Narayan, Vibashan VS, Vishal M. Patel Link: http://arxiv.org/abs/2412.08647 Code: https://github.com/Kartik-3004/SegFace Comments: Accepted to AAAI 2025. Project Page: this https URL
Natural Language Processing conferences: 12 papers
[0] Bootstrapping Heterogeneous Graph Representation Learning via Large Language Models: A Generalized Approach [cs.CL] Authors: Hang Gao, Chenhao Zhang, Fengge Wu, Junsuo Zhao, Changwen Zheng, Huaping Liu Link: http://arxiv.org/abs/2412.08038 Comments: Accepted by AAAI 2025
[1] Discrete Subgraph Sampling for Interpretable Graph based Visual Question Answering [cs.CL] Authors: Pascal Tilli, Ngoc Thang Vu Link: http://arxiv.org/abs/2412.08263 Comments: Accepted at COLING 2025
[2] Adaptive Prompting for Continual Relation Extraction: A Within-Task Variance Perspective [cs.CL] Authors: Minh Le, Tien Ngoc Luu, An Nguyen The, Thanh-Thien Le, Trang Nguyen, Thanh Tung Nguyen, Linh Ngo Van, Thien Huu Nguyen Link: http://arxiv.org/abs/2412.08285 Comments: Accepted to AAAI 2025
[3] Rumor Detection on Social Media with Temporal Propagation Structure Optimization [cs.CL] Authors: Xingyu Peng, Junran Wu, Ruomei Liu, Ke Xu Link: http://arxiv.org/abs/2412.08316 Comments: COLING'25
[4] BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language [cs.CL] Authors: Nikolay Banar, Ehsan Lotfi, Walter Daelemans Link: http://arxiv.org/abs/2412.08329 Comments: To be presented at BUCC 2025 (COLING)
[5] NyayaAnumana & INLegalLlama: The Largest Indian Legal Judgment Prediction Dataset and Specialized Language Model for Enhanced Decision Analysis [cs.CL] Authors: Shubham Kumar Nigam, Balaramamahanthi Deepak Patnaik, Shivam Mishra, Noel Shallum, Kripabandhu Ghosh, Arnab Bhattacharya Link: http://arxiv.org/abs/2412.08385 Comments: Accepted on COLING 2025
[6] SweetieChat: A Strategy-Enhanced Role-playing Framework for Diverse Scenarios Handling Emotional Support Agent [cs.CL] Authors: Jing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong Link: http://arxiv.org/abs/2412.08389 Comments: 24 pages. Accepted by COLING 2025
[7] Learning to Reason via Self-Iterative Process Feedback for Small Language Models [cs.CL] Authors: Kaiyuan Chen, Jin Wang, Xuejie Zhang Link: http://arxiv.org/abs/2412.08393 Comments: Accepted by COLING 2025
[9] Mitigating Out-of-Entity Errors in Named Entity Recognition: A Sentence-Level Strategy [cs.CL] Authors: Guochao Jiang, Ziqin Luo, Chengwei Hu, Zepeng Ding, Deqing Yang Link: http://arxiv.org/abs/2412.08434 Comments: Accepted by COLING 2025
[10] GR-NLP-TOOLKIT: An Open-Source NLP Toolkit for Modern Greek [cs.CL] Authors: Lefteris Loukas, Nikolaos Smyrnioudis, Chrysa Dikonomaki, Spyros Barbakos, Anastasios Toumazatos, John Koutsikakis, Manolis Kyriakakis, Mary Georgiou, Stavros Vassos, John Pavlopoulos, Ion Androutsopoulos Link: http://arxiv.org/abs/2412.08520 Comments: Accepted Demo Paper @ COLING 2025 (Github: this https URL, Demo: this https URL, API: this https URL)
[11] TECO: Improving Multimodal Intent Recognition with Text Enhancement through Commonsense Knowledge Extraction [cs.CL] Authors: Quynh-Mai Thi Nguyen, Lan-Nhi Thi Nguyen, Cam-Van Thi Nguyen Link: http://arxiv.org/abs/2412.08529 Comments: Accepted at PACLIC 2024