Computer vision conferences: 25 papers

[0] Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor[cs.CV]
Authors: Jiali Chen, Xusen Hei, Yuqi Xue, Yuancheng Wei, Jiayuan Xie, Yi Cai, Qing Li
Link: http://arxiv.org/abs/2412.07801
Notes: Accepted by ACM MM 2024

[1] Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery[cs.CV]
Authors: Yeshwanth Kumar Adimoolam, Charalambos Poullis, Melinos Averkiou
Link: http://arxiv.org/abs/2412.07899
Code: https://github.com/yeshwanth95/Pix2Poly
Notes: Accepted to WACV 2025. 20 pages, 13 figures, 8 tables

[2] PGRID: Power Grid Reconstruction in Informal Developments Using High-Resolution Aerial Imagery[cs.CV]
Authors: Simone Fobi Nsutezo, Amrita Gupta, Duncan Kebut, Seema Iyer, Luana Marotti, Rahul Dodhia, Juan M. Lavista Ferres, Anthony Ortiz
Link: http://arxiv.org/abs/2412.07944
Notes: Accepted to WACV 2025 IEEE/CVF Winter Conference

[3] Balancing Shared and Task-Specific Representations: A Hybrid Approach to Depth-Aware Video Panoptic Segmentation[cs.CV]
Authors: Kurt H.W. Stolle (Eindhoven University of Technology)
Link: http://arxiv.org/abs/2412.07966
Code: https://research.khws.io/multiformer
Notes: Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025.

[4] BSAFusion: A Bidirectional Stepwise Feature Alignment Network for Unaligned Medical Image Fusion[cs.CV]
Authors: Huafeng Li, Dayong Su, Qing Cai, Yafei Zhang
Link: http://arxiv.org/abs/2412.08050
Code: https://github.com/slrl123/BSAFusion
Notes: Accepted by AAAI 2025

[5] Dense Depth from Event Focal Stack[cs.CV]
Authors: Kenta Horikawa, Mariko Isogawa, Hideo Saito, Shohei Mori
Link: http://arxiv.org/abs/2412.08120
Notes: Accepted at WACV 2025

[6] Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation[cs.CV]
Authors: Jiaming Lv, Haoyuan Yang, Peihua Li
Link: http://arxiv.org/abs/2412.08139
Code: https://peihuali.org/WKD
Notes: Accepted to NeurIPS 2024. Equal contribution from first two authors

[7] AsyncDSB: Schedule-Asynchronous Diffusion Schrödinger Bridge for Image Inpainting[cs.CV]
Authors: Zihao Han, Baoquan Zhang, Lisai Zhang, Shanshan Feng, Kenghong Lin, Guotao Liang, Yunming Ye, Xiaochen Qi, Guangming Ye
Link: http://arxiv.org/abs/2412.08149
Notes: Accepted by AAAI 2025

[8] TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning[cs.CV]
Authors: Jingjing Xie, Yuxin Zhang, Jun Peng, Zhaohong Huang, Liujuan Cao
Link: http://arxiv.org/abs/2412.08176
Code: https://github.com/xjjxmu/TextRefiner
Notes: Accepted by AAAI 2025

[9] Textured Mesh Saliency: Bridging Geometry and Texture for Human Perception in 3D Graphics[cs.CV]
Authors: Kaiwei Zhang, Dandan Zhu, Xiongkuo Min, Guangtao Zhai
Link: http://arxiv.org/abs/2412.08188
Notes: To be published in AAAI 2025

[10] SAFIRE: Segment Any Forged Image Region[cs.CV]
Authors: Myung-Joon Kwon, Wonjun Lee, Seung-Hun Nam, Minji Son, Changick Kim
Link: http://arxiv.org/abs/2412.08197
Code: https://github.com/mjkwon2021/SAFIRE
Journal reference: Proceedings of the AAAI Conference on Artificial Intelligence, 2025
Notes: Accepted at AAAI 2025.

[11] Hierarchical Classification for Automated Image Annotation of Coral Reef Benthic Structures[cs.CV]
Authors: Célia Blondin, Joris Guérin, Kelly Inagaki, Guilherme Longo, Laure Berti-Équille
Link: http://arxiv.org/abs/2412.08228
Notes: Poster at the Tackling Climate Change with Machine Learning workshop at NeurIPS 2024

[12] Position-aware Guided Point Cloud Completion with CLIP Model[cs.CV]
Authors: Feng Zhou, Qi Zhang, Ju Dai, Lei Li, Qing Fan, Junliang Xing
Link: http://arxiv.org/abs/2412.08271
Notes: Accepted by AAAI 2025

[13] Digging into Intrinsic Contextual Information for High-fidelity 3D Point Cloud Completion[cs.CV]
Authors: Jisheng Chu, Wenrui Li, Xingtao Wang, Kanglin Ning, Yidan Lu, Xiaopeng Fan
Link: http://arxiv.org/abs/2412.08326
Code: https://github.com/JS-CHU/ContextualCompletion
Notes: Accepted to AAAI 2025

[14] CoDTS: Enhancing Sparsely Supervised Collaborative Perception with a Dual Teacher-Student Framework[cs.CV]
Authors: Yushan Han, Hui Zhang, Honglei Zhang, Jing Wang, Yidong Li
Link: http://arxiv.org/abs/2412.08344
Notes: AAAI 2025

[15] ConDSeg: A General Medical Image Segmentation Framework via Contrast-Driven Feature Enhancement[cs.CV]
Authors: Mengqi Lei, Haochen Wu, Xinhua Lv, Xin Wang
Link: http://arxiv.org/abs/2412.08345
Code: https://github.com/Mengqi-Lei/ConDSeg
Notes: Accepted by AAAI 2025

[16] Video Summarization using Denoising Diffusion Probabilistic Model[cs.CV]
Authors: Zirui Shang, Yubo Zhu, Hongxi Li, Shuo Yang, Xinxiao Wu
Link: http://arxiv.org/abs/2412.08357
Notes: Accepted by AAAI 2025

[17] LOMA: Language-assisted Semantic Occupancy Network via Triplane Mamba[cs.CV]
Authors: Yubo Cui, Zhiheng Li, Jiaqiang Wang, Zheng Fang
Link: http://arxiv.org/abs/2412.08388
Notes: Accepted by AAAI 2025

[18] Pragmatist: Multiview Conditional Diffusion Models for High-Fidelity 3D Reconstruction from Unposed Sparse Views[cs.CV]
Authors: Songchun Zhang, Chunhui Zhao
Link: http://arxiv.org/abs/2412.08412
Notes: Accepted by AAAI 2025. 13 pages, 8 figures

[19] PointCFormer: a Relation-based Progressive Feature Extraction Network for Point Cloud Completion[cs.CV]
Authors: Yi Zhong, Weize Quan, Dong-ming Yan, Jie Jiang, Yingmei Wei
Link: http://arxiv.org/abs/2412.08421
Notes: 9 pages, 8 figures, AAAI 2025, references added

[20] Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection[cs.CV]
Authors: Mingda Jia, Liming Zhao, Ge Li, Yun Zheng
Link: http://arxiv.org/abs/2412.08506
Notes: In Proceedings of the 39th AAAI Conference on Artificial Intelligence (AAAI-25)

[21] Learning to Decouple the Lights for 3D Face Texture Modeling[cs.CV]
Authors: Tianxin Huang, Zhenyu Zhang, Ying Tai, Gim Hee Lee
Link: http://arxiv.org/abs/2412.08524
Code: https://tianxinhuang.github.io/projects/Deface
Notes: Accepted by NeurIPS 2024

[22] SenCLIP: Enhancing zero-shot land-use mapping for Sentinel-2 with ground-level prompting[cs.CV]
Authors: Pallavi Jain, Dino Ienco, Roberto Interdonato, Tristan Berchoux, Diego Marcos
Link: http://arxiv.org/abs/2412.08536
Notes: Accepted at WACV 2025

[23] EOV-Seg: Efficient Open-Vocabulary Panoptic Segmentation[cs.CV]
Authors: Hongwei Niu, Jie Hu, Jianghang Lin, Shengchuan Zhang
Link: http://arxiv.org/abs/2412.08628
Code: https://github.com/nhw649/EOV-Seg
Notes: Accepted by AAAI 2025

[24] SegFace: Face Segmentation of Long-Tail Classes[cs.CV]
Authors: Kartik Narayan, Vibashan VS, Vishal M. Patel
Link: http://arxiv.org/abs/2412.08647
Code: https://github.com/Kartik-3004/SegFace
Notes: Accepted to AAAI 2025. Project Page: this https URL

Natural language processing conferences: 12 papers

[0] Bootstrapping Heterogeneous Graph Representation Learning via Large Language Models: A Generalized Approach[cs.CL]
Authors: Hang Gao, Chenhao Zhang, Fengge Wu, Junsuo Zhao, Changwen Zheng, Huaping Liu
Link: http://arxiv.org/abs/2412.08038
Notes: Accepted by AAAI 2025

[1] Discrete Subgraph Sampling for Interpretable Graph based Visual Question Answering[cs.CL]
Authors: Pascal Tilli, Ngoc Thang Vu
Link: http://arxiv.org/abs/2412.08263
Notes: Accepted at COLING 2025

[2] Adaptive Prompting for Continual Relation Extraction: A Within-Task Variance Perspective[cs.CL]
Authors: Minh Le, Tien Ngoc Luu, An Nguyen The, Thanh-Thien Le, Trang Nguyen, Thanh Tung Nguyen, Linh Ngo Van, Thien Huu Nguyen
Link: http://arxiv.org/abs/2412.08285
Notes: Accepted to AAAI 2025

[3] Rumor Detection on Social Media with Temporal Propagation Structure Optimization[cs.CL]
Authors: Xingyu Peng, Junran Wu, Ruomei Liu, Ke Xu
Link: http://arxiv.org/abs/2412.08316
Notes: COLING 2025

[4] BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language[cs.CL]
Authors: Nikolay Banar, Ehsan Lotfi, Walter Daelemans
Link: http://arxiv.org/abs/2412.08329
Notes: To be presented at BUCC 2025 (COLING)

[5] NyayaAnumana & INLegalLlama: The Largest Indian Legal Judgment Prediction Dataset and Specialized Language Model for Enhanced Decision Analysis[cs.CL]
Authors: Shubham Kumar Nigam, Balaramamahanthi Deepak Patnaik, Shivam Mishra, Noel Shallum, Kripabandhu Ghosh, Arnab Bhattacharya
Link: http://arxiv.org/abs/2412.08385
Notes: Accepted at COLING 2025

[6] SweetieChat: A Strategy-Enhanced Role-playing Framework for Diverse Scenarios Handling Emotional Support Agent[cs.CL]
Authors: Jing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong
Link: http://arxiv.org/abs/2412.08389
Notes: 24 pages. Accepted by COLING 2025

[7] Learning to Reason via Self-Iterative Process Feedback for Small Language Models[cs.CL]
Authors: Kaiyuan Chen, Jin Wang, Xuejie Zhang
Link: http://arxiv.org/abs/2412.08393
Notes: Accepted by COLING 2025

[8] Detecting Conversational Mental Manipulation with Intent-Aware Prompting[cs.CL]
Authors: Jiayuan Ma, Hongbin Na, Zimu Wang, Yining Hua, Yue Liu, Wei Wang, Ling Chen
Link: http://arxiv.org/abs/2412.08414
Code: https://github.com/Anton-Jiayuan-MA/Manip-IAP
Journal reference: COLING 2025

[9] Mitigating Out-of-Entity Errors in Named Entity Recognition: A Sentence-Level Strategy[cs.CL]
Authors: Guochao Jiang, Ziqin Luo, Chengwei Hu, Zepeng Ding, Deqing Yang
Link: http://arxiv.org/abs/2412.08434
Notes: Accepted by COLING 2025

[10] GR-NLP-TOOLKIT: An Open-Source NLP Toolkit for Modern Greek[cs.CL]
Authors: Lefteris Loukas, Nikolaos Smyrnioudis, Chrysa Dikonomaki, Spyros Barbakos, Anastasios Toumazatos, John Koutsikakis, Manolis Kyriakakis, Mary Georgiou, Stavros Vassos, John Pavlopoulos, Ion Androutsopoulos
Link: http://arxiv.org/abs/2412.08520
Notes: Accepted Demo Paper @ COLING 2025 (Github: this https URL, Demo: this https URL, API: this https URL)

[11] TECO: Improving Multimodal Intent Recognition with Text Enhancement through Commonsense Knowledge Extraction[cs.CL]
Authors: Quynh-Mai Thi Nguyen, Lan-Nhi Thi Nguyen, Cam-Van Thi Nguyen
Link: http://arxiv.org/abs/2412.08529
Notes: Accepted at PACLIC 2024

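CV & AIGC top-conference roundup [2024-12-12]

37 new papers today:

Note that papers on large models are mostly published at natural language processing conferences. Because multimodal research is advancing quickly, some computer-vision-related papers also appear at top natural language processing conferences.

Computer vision conferences: 25 papers

Natural language processing conferences: 12 papers

Thanks to arxiv.org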