CV&AIGC Top-Conference Paper Roundup [2024-11-11]

晓飞的算法工程笔记 • 2 months ago • 124 views

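30 papers updated today:

  • Computer vision conferences: 15 papers
  • Natural language processing conferences: 15 papers
Note: papers on large models are mostly published at natural language processing conferences. Because multimodal research is developing rapidly, some computer vision papers also appear at top natural language processing conferences.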

Computer vision conferences: 15 papers


[0] Don't Look Twice: Faster Video Transformers with Run-Length Tokenization[cs.CV]
Authors: Rohan Choudhury, Guanglei Zhu, Sihan Liu, Koichiro Niinuma, Kris M. Kitani, László Jeni
Link: http://arxiv.org/abs/2411.05222
Comments: 16 pages, 6 figures. Accepted to NeurIPS 2024 (spotlight)

[1] Generalizable Single-Source Cross-modality Medical Image Segmentation via Invariant Causal Mechanisms[cs.CV]
Authors: Boqi Chen, Yuanzhi Zhu, Yunke Ao, Sebastiano Caprara, Reto Sutter, Gunnar Rätsch, Ender Konukoglu, Anna Susmelj
Link: http://arxiv.org/abs/2411.05223
Code: https://github.com/ratschlab/ICMSeg
Comments: WACV 2025

[2] Hierarchical Visual Feature Aggregation for OCR-Free Document Understanding[cs.CV]
Authors: Jaeyoo Park, Jin Young Choi, Jeonghyung Park, Bohyung Han
Link: http://arxiv.org/abs/2411.05254
Comments: NeurIPS 2024

[3] ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving[cs.CV]
Authors: Tao Ma, Hongbin Zhou, Qiusheng Huang, Xuemeng Yang, Jianfei Guo, Bo Zhang, Min Dou, Yu Qiao, Botian Shi, Hongsheng Li
Link: http://arxiv.org/abs/2411.05311
Comments: Accepted by NeurIPS 2024

[4] Rate-aware Compression for NeRF-based Volumetric Video[cs.CV]
Authors: Zhiyu Zhang, Guo Lu, Huanxiong Liang, Zhengxue Cheng, Anni Tang, Li Song
Link: http://arxiv.org/abs/2411.05322
Comments: Accepted by ACM MM 2024 (Oral)

[5] Enhancing Visual Classification using Comparative Descriptors[cs.CV]
Authors: Hankyeol Lee, Gawon Seo, Wonseok Choi, Geunyoung Jung, Kyungwoo Song, Jiyoung Jung
Link: http://arxiv.org/abs/2411.05357
Comments: Accepted to WACV 2025. Main paper with 8 pages

[6] From Transparent to Opaque: Rethinking Neural Implicit Surfaces with α-NeuS[cs.CV]
Authors: Haoran Zhang, Junkai Deng, Xuhui Chen, Fei Hou, Wencheng Wang, Hong Qin, Chen Qian, Ying He
Link: http://arxiv.org/abs/2411.05362
Code: https://github.com/728388808/alpha-NeuS
Venue: NeurIPS 2024

[7] VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM[cs.CV]
Authors: Jeongwoo Lee, Kwangsuk Park, Jihyeon Park
Link: http://arxiv.org/abs/2411.05423
Comments: Accepted at NeurIPS 2024 Workshop on Large Foundation Models for Educational Assessment (FM-Assess)

[8] Do Histopathological Foundation Models Eliminate Batch Effects? A Comparative Study[cs.CV]
Authors: Jonah Kömen, Hannah Marienwald, Jonas Dippel, Julius Hense
Link: http://arxiv.org/abs/2411.05489
Comments: Accepted to AIM-FM Workshop @ NeurIPS'24

[9] Open-set object detection: towards unified problem formulation and benchmarking[cs.CV]
Authors: Hejer Ammar, Nikita Kiselov, Guillaume Lapouge, Romaric Audigier
Link: http://arxiv.org/abs/2411.05564
Comments: Accepted at ECCV 2024 Workshop: "The 3rd Workshop for Out-of-Distribution Generalization in Computer Vision Foundation Models"

[10] SynDroneVision: A Synthetic Dataset for Image-Based Drone Detection[cs.CV]
Authors: Tamara R. Lenhard, Andreas Weinmann, Kai Franke, Tobias Koch
Link: http://arxiv.org/abs/2411.05633
Comments: Accepted at the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

[11] Online-LoRA: Task-free Online Continual Learning via Low Rank Adaptation[cs.CV]
Authors: Xiwen Wei, Guihong Li, Radu Marculescu
Link: http://arxiv.org/abs/2411.05663
Code: https://github.com/Christina200/Online-LoRA-official.git
Comments: WACV 2025

[12] Tell What You Hear From What You See -- Video to Audio Generation Through Text[cs.CV]
Authors: Xiulong Liu, Kun Su, Eli Shlizerman
Link: http://arxiv.org/abs/2411.05679
Comments: NeurIPS 2024

[13] Autoregressive Adaptive Hypergraph Transformer for Skeleton-based Activity Recognition[cs.CV]
Authors: Abhisek Ray, Ayush Raj, Maheshkumar H. Kolekar
Link: http://arxiv.org/abs/2411.05692
Comments: Accepted to WACV 2025

[14] GazeSearch: Radiology Findings Search Benchmark[cs.CV]
Authors: Trong Thang Pham, Tien-Phat Nguyen, Yuki Ikebe, Akash Awasthi, Zhigang Deng, Carol C. Wu, Hien Nguyen, Ngan Le
Link: http://arxiv.org/abs/2411.05780
Comments: Accepted to WACV 2025

Natural language processing conferences: 15 papers


[0] Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale[cs.CL]
Authors: Flavio Di Palo, Prateek Singhi, Bilal Fadlallah
Link: http://arxiv.org/abs/2411.05045
Comments: Published in EMNLP 2024

[1] FMEA Builder: Expert Guided Text Generation for Equipment Maintenance[cs.CL]
Authors: Karol Lynch, Fabio Lorenzi, John Sheehan, Duygu Kabakci-Zorlu, Bradley Eck
Link: http://arxiv.org/abs/2411.05054
Comments: 4 pages, 2 figures. AI for Critical Infrastructure Workshop @ IJCAI 2024

[2] Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model[cs.CV]
Authors: Sheng Cheng, Maitreya Patel, Yezhou Yang
Link: http://arxiv.org/abs/2411.05079
Code: https://github.com/shengcheng/Captions4T2I
Comments: EMNLP 2024 Findings

[3] Beyond the Numbers: Transparency in Relation Extraction Benchmark Creation and Leaderboards[cs.CL]
Authors: Varvara Arzt, Allan Hanbury
Link: http://arxiv.org/abs/2411.05224
Comments: This paper was accepted at the GenBench workshop at EMNLP 2024

[4] CHATTER: A Character Attribution Dataset for Narrative Understanding[cs.CL]
Authors: Sabyasachee Baruah, Shrikanth Narayanan
Link: http://arxiv.org/abs/2411.05227
Comments: submitted to NAACL 2025

[5] Revisiting the Robustness of Watermarking to Paraphrasing Attacks[cs.CL]
Authors: Saksham Rastogi, Danish Pruthi
Link: http://arxiv.org/abs/2411.05277
Comments: EMNLP 2024

[6] SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding[cs.CL]
Authors: Ryan Sun, Tianyi Zhou, Xun Chen, Lichao Sun
Link: http://arxiv.org/abs/2411.05289
Code: https://github.com/MasterGodzilla/Speculative_decoding_OT
Comments: EMNLP 2024 (Main)

[7] SciDQA: A Deep Reading Comprehension Dataset over Scientific Papers[cs.CL]
Authors: Shruti Singh, Nandan Sarkar, Arman Cohan
Link: http://arxiv.org/abs/2411.05338
Comments: 18 pages, Accepted to EMNLP 2024

[8] Towards Low-Resource Harmful Meme Detection with LMM Agents[cs.CL]
Authors: Jianzhao Huang, Hongzhan Lin, Ziyan Liu, Ziyang Luo, Guang Chen, Jing Ma
Link: http://arxiv.org/abs/2411.05383
Comments: EMNLP 2024

[9] VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM[cs.CV]
Authors: Jeongwoo Lee, Kwangsuk Park, Jihyeon Park
Link: http://arxiv.org/abs/2411.05423
Comments: Accepted at NeurIPS 2024 Workshop on Large Foundation Models for Educational Assessment (FM-Assess)

[10] Multi-hop Evidence Pursuit Meets the Web: Team Papelo at FEVER 2024[cs.CL]
Authors: Christopher Malon
Link: http://arxiv.org/abs/2411.05762
Comments: To appear in the Seventh FEVER Workshop at EMNLP 2024

[11] FinDVer: Explainable Claim Verification over Long and Hybrid-Content Financial Documents[cs.CL]
Authors: Yilun Zhao, Yitao Long, Yuru Jiang, Chengye Wang, Weiyuan Chen, Hongjun Liu, Yiming Zhang, Xiangru Tang, Chen Zhao, Arman Cohan
Link: http://arxiv.org/abs/2411.05764
Comments: EMNLP 2024

[12] Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths?[cs.CL]
Authors: Veronica Chatrath, Marcelo Lotif, Shaina Raza
Link: http://arxiv.org/abs/2411.05775
Comments: Accepted at Socially Responsible Language Modelling Research (SoLaR) Workshop at NeurIPS 2024

[13] Using Language Models to Disambiguate Lexical Choices in Translation[cs.CL]
Authors: Josh Barua, Sanjay Subramanian, Kayo Yin, Alane Suhr
Link: http://arxiv.org/abs/2411.05781
Comments: Accepted to EMNLP 2024

[14] ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles[cs.CV]
Authors: Kayo Yin, Chinmay Singh, Fyodor O. Minakov, Vanessa Milan, Hal Daumé III, Cyril Zhang, Alex X. Lu, Danielle Bragg
Link: http://arxiv.org/abs/2411.05783
Comments: Accepted to EMNLP 2024

Thanks to arxiv.org


Article URL: http://www.python88.com/topic/175847