
CV & AIGC Top-Conference Paper Roundup [2024-12-12]

晓飞的算法工程笔记 (Xiaofei's Algorithm Engineering Notes)

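37 papers in today's update:

  • Computer vision conferences: 25 papers
  • Natural language processing conferences: 12 papers
Note that papers on large models are mostly published at natural language processing conferences. Because multimodal research is developing rapidly, some computer-vision-related papers also appear at top natural language processing conferences.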

Computer vision conferences: 25 papers


[0] Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor[cs.CV]
Authors: Jiali Chen, Xusen Hei, Yuqi Xue, Yuancheng Wei, Jiayuan Xie, Yi Cai, Qing Li
Link: http://arxiv.org/abs/2412.07801
Notes: Accepted by ACM MM 2024

[1] Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery[cs.CV]
Authors: Yeshwanth Kumar Adimoolam, Charalambos Poullis, Melinos Averkiou
Link: http://arxiv.org/abs/2412.07899
Code: https://github.com/yeshwanth95/Pix2Poly
Notes: Accepted to WACV 2025. 20 pages, 13 figures, 8 tables

[2] PGRID: Power Grid Reconstruction in Informal Developments Using High-Resolution Aerial Imagery[cs.CV]
Authors: Simone Fobi Nsutezo, Amrita Gupta, Duncan Kebut, Seema Iyer, Luana Marotti, Rahul Dodhia, Juan M. Lavista Ferres, Anthony Ortiz
Link: http://arxiv.org/abs/2412.07944
Notes: Accepted to WACV 2025 IEEE/CVF Winter Conference

[3] Balancing Shared and Task-Specific Representations: A Hybrid Approach to Depth-Aware Video Panoptic Segmentation[cs.CV]
Authors: Kurt H.W. Stolle (Eindhoven University of Technology)
Link: http://arxiv.org/abs/2412.07966
Code: https://research.khws.io/multiformer
Notes: Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025.

[4] BSAFusion: A Bidirectional Stepwise Feature Alignment Network for Unaligned Medical Image Fusion[cs.CV]
Authors: Huafeng Li, Dayong Su, Qing Cai, Yafei Zhang
Link: http://arxiv.org/abs/2412.08050
Code: https://github.com/slrl123/BSAFusion
Notes: Accepted by AAAI2025

[5] Dense Depth from Event Focal Stack[cs.CV]
Authors: Kenta Horikawa, Mariko Isogawa, Hideo Saito, Shohei Mori
Link: http://arxiv.org/abs/2412.08120
Notes: Accepted at WACV2025

[6] Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation[cs.CV]
Authors: Jiaming Lv, Haoyuan Yang, Peihua Li
Link: http://arxiv.org/abs/2412.08139
Code: https://peihuali.org/WKD
Notes: Accepted to NeurIPS 2024. Equal contribution from first two authors

[7] AsyncDSB: Schedule-Asynchronous Diffusion Schrödinger Bridge for Image Inpainting[cs.CV]
Authors: Zihao Han, Baoquan Zhang, Lisai Zhang, Shanshan Feng, Kenghong Lin, Guotao Liang, Yunming Ye, Xiaochen Qi, Guangming Ye
Link: http://arxiv.org/abs/2412.08149
Notes: Accepted by AAAI 2025

[8] TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning[cs.CV]
Authors: Jingjing Xie, Yuxin Zhang, Jun Peng, Zhaohong Huang, Liujuan Cao
Link: http://arxiv.org/abs/2412.08176
Code: https://github.com/xjjxmu/TextRefiner
Notes: Accepted by AAAI2025

[9] Textured Mesh Saliency: Bridging Geometry and Texture for Human Perception in 3D Graphics[cs.CV]
Authors: Kaiwei Zhang, Dandan Zhu, Xiongkuo Min, Guangtao Zhai
Link: http://arxiv.org/abs/2412.08188
Notes: To be published in AAAI 2025

[10] SAFIRE: Segment Any Forged Image Region[cs.CV]
Authors: Myung-Joon Kwon, Wonjun Lee, Seung-Hun Nam, Minji Son, Changick Kim
Link: http://arxiv.org/abs/2412.08197
Code: https://github.com/mjkwon2021/SAFIRE
Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 2025
Notes: Accepted at AAAI 2025.

[11] Hierarchical Classification for Automated Image Annotation of Coral Reef Benthic Structures[cs.CV]
Authors: Célia Blondin, Joris Guérin, Kelly Inagaki, Guilherme Longo, Laure Berti-Équille
Link: http://arxiv.org/abs/2412.08228
Notes: Poster at Tackling Climate Change with Machine Learning: workshop at NeurIPS 2024

[12] Position-aware Guided Point Cloud Completion with CLIP Model[cs.CV]
Authors: Feng Zhou, Qi Zhang, Ju Dai, Lei Li, Qing Fan, Junliang Xing
Link: http://arxiv.org/abs/2412.08271
Notes: Accepted by AAAI25

[13] Digging into Intrinsic Contextual Information for High-fidelity 3D Point Cloud Completion[cs.CV]
Authors: Jisheng Chu, Wenrui Li, Xingtao Wang, Kanglin Ning, Yidan Lu, Xiaopeng Fan
Link: http://arxiv.org/abs/2412.08326
Code: https://github.com/JS-CHU/ContextualCompletion
Notes: Accepted to AAAI2025

[14] CoDTS: Enhancing Sparsely Supervised Collaborative Perception with a Dual Teacher-Student Framework[cs.CV]
Authors: Yushan Han, Hui Zhang, Honglei Zhang, Jing Wang, Yidong Li
Link: http://arxiv.org/abs/2412.08344
Notes: AAAI 2025

[15] ConDSeg: A General Medical Image Segmentation Framework via Contrast-Driven Feature Enhancement[cs.CV]
Authors: Mengqi Lei, Haochen Wu, Xinhua Lv, Xin Wang
Link: http://arxiv.org/abs/2412.08345
Code: https://github.com/Mengqi-Lei/ConDSeg
Notes: This paper has been accepted by AAAI-2025

[16] Video Summarization using Denoising Diffusion Probabilistic Model[cs.CV]
Authors: Zirui Shang, Yubo Zhu, Hongxi Li, Shuo Yang, Xinxiao Wu
Link: http://arxiv.org/abs/2412.08357
Notes: Accepted by AAAI2025

[17] LOMA: Language-assisted Semantic Occupancy Network via Triplane Mamba[cs.CV]
Authors: Yubo Cui, Zhiheng Li, Jiaqiang Wang, Zheng Fang
Link: http://arxiv.org/abs/2412.08388
Notes: Accepted by AAAI2025

[18] Pragmatist: Multiview Conditional Diffusion Models for High-Fidelity 3D Reconstruction from Unposed Sparse Views[cs.CV]
Authors: Songchun Zhang, Chunhui Zhao
Link: http://arxiv.org/abs/2412.08412
Notes: Accepted by AAAI 2025. 13 pages, 8 figures

[19] PointCFormer: a Relation-based Progressive Feature Extraction Network for Point Cloud Completion[cs.CV]
Authors: Yi Zhong, Weize Quan, Dong-ming Yan, Jie Jiang, Yingmei Wei
Link: http://arxiv.org/abs/2412.08421
Notes: 9 pages, 8 figures, AAAI 2025, references added

[20] Orchestrating the Symphony of Prompt Distribution Learning for Human-Object Interaction Detection[cs.CV]
Authors: Mingda Jia, Liming Zhao, Ge Li, Yun Zheng
Link: http://arxiv.org/abs/2412.08506
Notes: In Proceedings of the 39th AAAI Conference on Artificial Intelligence (AAAI-25)

[21] Learning to Decouple the Lights for 3D Face Texture Modeling[cs.CV]
Authors: Tianxin Huang, Zhenyu Zhang, Ying Tai, Gim Hee Lee
Link: http://arxiv.org/abs/2412.08524
Code: https://tianxinhuang.github.io/projects/Deface
Notes: Accepted by NeurIPS 2024

[22] SenCLIP: Enhancing zero-shot land-use mapping for Sentinel-2 with ground-level prompting[cs.CV]
Authors: Pallavi Jain, Dino Ienco, Roberto Interdonato, Tristan Berchoux, Diego Marcos
Link: http://arxiv.org/abs/2412.08536
Notes: Accepted at WACV'25

[23] EOV-Seg: Efficient Open-Vocabulary Panoptic Segmentation[cs.CV]
Authors: Hongwei Niu, Jie Hu, Jianghang Lin, Shengchuan Zhang
Link: http://arxiv.org/abs/2412.08628
Code: https://github.com/nhw649/EOV-Seg
Notes: Accepted by AAAI 2025

[24] SegFace: Face Segmentation of Long-Tail Classes[cs.CV]
Authors: Kartik Narayan, Vibashan VS, Vishal M. Patel
Link: http://arxiv.org/abs/2412.08647
Code: https://github.com/Kartik-3004/SegFace
Notes: Accepted to AAAI 2025.

Natural language processing conferences: 12 papers


[0] Bootstrapping Heterogeneous Graph Representation Learning via Large Language Models: A Generalized Approach[cs.CL]
Authors: Hang Gao, Chenhao Zhang, Fengge Wu, Junsuo Zhao, Changwen Zheng, Huaping Liu
Link: http://arxiv.org/abs/2412.08038
Notes: Accepted by AAAI 2025

[1] Discrete Subgraph Sampling for Interpretable Graph based Visual Question Answering[cs.CL]
Authors: Pascal Tilli, Ngoc Thang Vu
Link: http://arxiv.org/abs/2412.08263
Notes: Accepted at COLING 2025

[2] Adaptive Prompting for Continual Relation Extraction: A Within-Task Variance Perspective[cs.CL]
Authors: Minh Le, Tien Ngoc Luu, An Nguyen The, Thanh-Thien Le, Trang Nguyen, Thanh Tung Nguyen, Linh Ngo Van, Thien Huu Nguyen
Link: http://arxiv.org/abs/2412.08285
Notes: Accepted to AAAI 2025

[3] Rumor Detection on Social Media with Temporal Propagation Structure Optimization[cs.CL]
Authors: Xingyu Peng, Junran Wu, Ruomei Liu, Ke Xu
Link: http://arxiv.org/abs/2412.08316
Notes: COLING'25

[4] BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language[cs.CL]
Authors: Nikolay Banar, Ehsan Lotfi, Walter Daelemans
Link: http://arxiv.org/abs/2412.08329
Notes: To be presented at BUCC 2025 (COLING)

[5] NyayaAnumana & INLegalLlama: The Largest Indian Legal Judgment Prediction Dataset and Specialized Language Model for Enhanced Decision Analysis[cs.CL]
Authors: Shubham Kumar Nigam, Balaramamahanthi Deepak Patnaik, Shivam Mishra, Noel Shallum, Kripabandhu Ghosh, Arnab Bhattacharya
Link: http://arxiv.org/abs/2412.08385
Notes: Accepted at COLING 2025

[6] SweetieChat: A Strategy-Enhanced Role-playing Framework for Diverse Scenarios Handling Emotional Support Agent[cs.CL]
Authors: Jing Ye, Lu Xiang, Yaping Zhang, Chengqing Zong
Link: http://arxiv.org/abs/2412.08389
Notes: 24 pages. Accepted by COLING 2025

[7] Learning to Reason via Self-Iterative Process Feedback for Small Language Models[cs.CL]
Authors: Kaiyuan Chen, Jin Wang, Xuejie Zhang
Link: http://arxiv.org/abs/2412.08393
Notes: Accepted by COLING 2025

[8] Detecting Conversational Mental Manipulation with Intent-Aware Prompting[cs.CL]
Authors: Jiayuan Ma, Hongbin Na, Zimu Wang, Yining Hua, Yue Liu, Wei Wang, Ling Chen
Link: http://arxiv.org/abs/2412.08414
Code: https://github.com/Anton-Jiayuan-MA/Manip-IAP
Journal ref: COLING2025

[9] Mitigating Out-of-Entity Errors in Named Entity Recognition: A Sentence-Level Strategy[cs.CL]
Authors: Guochao Jiang, Ziqin Luo, Chengwei Hu, Zepeng Ding, Deqing Yang
Link: http://arxiv.org/abs/2412.08434
Notes: Accepted by COLING 2025

[10] GR-NLP-TOOLKIT: An Open-Source NLP Toolkit for Modern Greek[cs.CL]
Authors: Lefteris Loukas, Nikolaos Smyrnioudis, Chrysa Dikonomaki, Spyros Barbakos, Anastasios Toumazatos, John Koutsikakis, Manolis Kyriakakis, Mary Georgiou, Stavros Vassos, John Pavlopoulos, Ion Androutsopoulos
Link: http://arxiv.org/abs/2412.08520
Notes: Accepted Demo Paper @ COLING 2025

[11] TECO: Improving Multimodal Intent Recognition with Text Enhancement through Commonsense Knowledge Extraction[cs.CL]
Authors: Quynh-Mai Thi Nguyen, Lan-Nhi Thi Nguyen, Cam-Van Thi Nguyen
Link: http://arxiv.org/abs/2412.08529
Notes: Accepted at PACLIC 2024

Thanks to arxiv.org

