
CV & AIGC Top-Conference Paper Roundup [2024-10-22]

By 晓飞的算法工程笔记 (Xiaofei's Algorithm Engineering Notes)

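Today's update: 48 papers

  • Computer vision conferences: 22 papers
  • Natural language processing conferences: 26 papers
Note that most large-model papers are published at natural language processing conferences. Because multimodal research is advancing quickly, some computer-vision papers also appear at top NLP venues.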

Computer vision conferences: 22 papers


[0] Optimizing Parking Space Classification: Distilling Ensembles into Lightweight Classifiers[cs.CV]
Authors: Paulo Luza Alves, André Hochuli, Luiz Eduardo de Oliveira, Paulo Lisboa de Almeida
Link: http://arxiv.org/abs/2410.14705
Notes: Accepted for presentation at the International Conference on Machine Learning and Applications (ICMLA) 2024

[1] Non-Invasive to Invasive: Enhancing FFA Synthesis from CFP with a Benchmark Dataset and a Novel Network[cs.CV]
Authors: Hongqiu Wang, Zhaohu Xing, Weitong Wu, Yijun Yang, Qingqing Tang, Meixia Zhang, Yanwu Xu, Lei Zhu
Link: http://arxiv.org/abs/2410.14965
Code: https://github.com/whq-xxh/FFA-Synthesis
Notes: ACMMM 24 MCHM

[2] DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain[cs.CV]
Authors: Kun Wang, Zhiqiang Yan, Junkai Fan, Wanlu Zhu, Xiang Li, Jun Li, Jian Yang
Link: http://arxiv.org/abs/2410.14980
Code: https://github.com/w2kun/DCDepth
Notes: Accepted by NeurIPS 2024

[3] Quanta Video Restoration[cs.CV]
Authors: Prateek Chennuri, Yiheng Chi, Enze Jiang, G. M. Dilshan Godaliyadda, Abhiram Gnanasambandam, Hamid R. Sheikh, Istvan Gyongy, Stanley H. Chan
Link: http://arxiv.org/abs/2410.14994
Code: https://github.com/chennuriprateek/Quanta_Video_Restoration-QUIVER-
Venue: European Conference on Computer Vision (ECCV) 2024

[4] How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold[cs.CV]
Authors: Sahil Verma, Royi Rassin, Arnav Das, Gantavya Bhatt, Preethi Seshadri, Chirag Shah, Jeff Bilmes, Hannaneh Hajishirzi, Yanai Elazar
Link: http://arxiv.org/abs/2410.15002
Code: https://github.com/vsahil/MIMETIC-2.git
Notes: Accepted at ATTRIB, RegML, and SafeGenAI workshops at NeurIPS 2024 and NLLP Workshop 2024

[5] DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer[cs.CV]
Authors: Ying Hu, Chenyi Zhuang, Pan Gao
Link: http://arxiv.org/abs/2410.15007
Code: https://github.com/I2-Multimedia-Lab/DiffuseST
Notes: Accepted to ACMMM Asia 2024

[6] Scene Graph Generation with Role-Playing Large Language Models[cs.CV]
Authors: Guikun Chen, Jin Li, Wenguan Wang
Link: http://arxiv.org/abs/2410.15364
Code: https://github.com/guikunchen/SDSGG
Notes: NeurIPS 2024

[7] IPO: Interpretable Prompt Optimization for Vision-Language Models[cs.CV]
Authors: Yingjun Du, Wenfang Sun, Cees G. M. Snoek
Link: http://arxiv.org/abs/2410.15397
Notes: Accepted by NeurIPS 2024

[8] BoostAdapter: Improving Test-Time Adaptation via Regional Bootstrapping[cs.CV]
Authors: Taolin Zhang, Jinpeng Wang, Hang Guo, Tao Dai, Bin Chen, Shu-Tao Xia
Link: http://arxiv.org/abs/2410.15430
Notes: NeurIPS 2024

[9] Generalized Multimodal Fusion via Poisson-Nernst-Planck Equation[cs.CV]
Authors: Jiayu Xiong, Jing Wang, Hengjing Xiang, Jun Xue, Chen Xu, Zhouqiang Jiang
Link: http://arxiv.org/abs/2410.15475
Notes: NeurIPS 2024 rejected paper, 28 pages

[10] ARTS: Semi-Analytical Regressor using Disentangled Skeletal Representations for Human Mesh Recovery from Videos[cs.CV]
Authors: Tao Tang, Hong Liu, Yingxuan You, Ti Wang, Wenhao Li
Link: http://arxiv.org/abs/2410.15582
Code: https://github.com/TangTao-PKU/ARTS
Notes: Accepted by ACM MM 2024. Project page: this https URL

[11] Fully Explicit Dynamic Gaussian Splatting[cs.CV]
Authors: Junoh Lee, Chang-Yeon Won, Hyunjun Jung, Inhwan Bae, Hae-Gon Jeon
Link: http://arxiv.org/abs/2410.15629
Notes: Accepted at NeurIPS 2024

[12] TALoS: Enhancing Semantic Scene Completion via Test-time Adaptation on the Line of Sight[cs.CV]
Authors: Hyun-Kurl Jang, Jihun Kim, Hyeokjun Kweon, Kuk-Jin Yoon
Link: http://arxiv.org/abs/2410.15674
Code: https://github.com/blue-531/TALoS
Notes: Accepted at NeurIPS 2024

[13] LiMTR: Time Series Motion Prediction for Diverse Road Users through Multimodal Feature Integration[cs.CV]
Authors: Camiel Oerlemans, Bram Grooten, Michiel Braat, Alaa Alassi, Emilia Silvas, Decebal Constantin Mocanu
Link: http://arxiv.org/abs/2410.15819
Code: https://github.com/Cing2/LiMTR
Notes: Accepted at the NeurIPS 2024 workshop Time Series in the Age of Large Models

[14] Random Token Fusion for Multi-View Medical Diagnosis[cs.CV]
Authors: Jingyu Guo, Christos Matsoukas, Fredrik Strand, Kevin Smith
Link: http://arxiv.org/abs/2410.15847
Notes: Originally published at the NeurIPS 2024 Workshop on Advancements In Medical Foundation Models: Explainability, Robustness, Security, and Beyond (AIM-FM)

[15] Visual Motif Identification: Elaboration of a Curated Comparative Dataset and Classification Methods[cs.CV]
Authors: Adam Phillips, Daniel Grandes Rodriguez, Miriam Sánchez-Manzano, Alan Salvadó, Manuel Garin, Gloria Haro, Coloma Ballester (all Universitat Pompeu Fabra, Barcelona, Spain)
Link: http://arxiv.org/abs/2410.15866
Notes: 17 pages, 11 figures, one table, to be published in the conference proceedings of ECCV 2024

[16] Mitigating Object Hallucination via Concentric Causal Attention[cs.CV]
Authors: Yun Xing, Yiheng Li, Ivan Laptev, Shijian Lu
Link: http://arxiv.org/abs/2410.15926
Code: https://github.com/xing0047/cca-llava
Notes: To appear at NeurIPS 2024

[17] Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly[cs.CV]
Authors: Junsheng Zhou, Yu-Shen Liu, Zhizhong Han
Link: http://arxiv.org/abs/2410.15971
Notes: To appear at NeurIPS 2024. Project page: this https URL

[18] START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation[cs.CV]
Authors: Jintao Guo, Lei Qi, Yinghuan Shi, Yang Gao
Link: http://arxiv.org/abs/2410.16020
Code: https://github.com/lingeringlight/START
Notes: Accepted by NeurIPS 2024

[19] Towards Combating Frequency Simplicity-biased Learning for Domain Generalization[cs.CV]
Authors: Xilin He, Jingyu Hu, Qinliang Lin, Cheng Luo, Weicheng Xie, Siyang Song, Muhammad Haris Khan, Linlin Shen
Link: http://arxiv.org/abs/2410.16146
Code: https://github.com/C0notSilly/AdvFrequency
Notes: Accepted by NeurIPS 2024

[20] Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models[cs.CV]
Authors: Giannis Daras, Weili Nie, Karsten Kreis, Alex Dimakis, Morteza Mardani, Nikola Borislavov Kovachki, Arash Vahdat
Link: http://arxiv.org/abs/2410.16152
Notes: Accepted at NeurIPS 2024

[21] 3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors[cs.CV]
Authors: Xi Liu, Chaoyi Zhou, Siyu Huang
Link: http://arxiv.org/abs/2410.16266
Project page: https://xiliu8006.github.io/3DGS-Enhancer-project
Notes: Accepted by NeurIPS 2024 (Spotlight)

Natural language processing conferences: 26 papers


[0] QuAILoRA: Quantization-Aware Initialization for LoRA[cs.CL]
Authors: Neal Lawton, Aishwarya Padmakumar, Judith Gaspers, Jack FitzGerald, Anoop Kumar, Greg Ver Steeg, Aram Galstyan
Link: http://arxiv.org/abs/2410.14713
Notes: 12 pages, 7 figures. Submitted to the 4th NeurIPS Workshop on Efficient Natural Language and Speech Processing (ENLSP-IV)

[1] Rethinking Token Reduction for State Space Models[cs.CL]
Authors: Zheng Zhan, Yushu Wu, Zhenglun Kong, Changdi Yang, Yifan Gong, Xuan Shen, Xue Lin, Pu Zhao, Yanzhi Wang
Link: http://arxiv.org/abs/2410.14725
Notes: EMNLP 2024

[2] TimeSeriesExam: A time series understanding exam[cs.CL]
Authors: Yifu Cai, Arjun Choudhry, Mononito Goswami, Artur Dubrawski
Link: http://arxiv.org/abs/2410.14752
Notes: Accepted at the NeurIPS'24 Time Series in the Age of Large Models Workshop

[3] Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection[cs.CL]
Authors: Shantanu Thorat, Tianbao Yang
Link: http://arxiv.org/abs/2410.14875
Notes: Accepted at NeurIPS 2024 - Safe Generative AI Workshop

[4] Class-RAG: Content Moderation with Retrieval Augmented Generation[cs.CL]
Authors: Jianfa Chen, Emily Shen, Trupti Bavalatti, Xiaowen Lin, Yongkai Wang, Shuming Hu, Harihar Subramanyam, Ksheeraj Sai Vepuri, Ming Jiang, Ji Qi, Li Chen, Nan Jiang, Ankit Jain
Link: http://arxiv.org/abs/2410.14881
Notes: 11 pages, submitted to ACL

[5] From Test-Taking to Test-Making: Examining LLM Authoring of Commonsense Assessment Items[cs.CL]
Authors: Melissa Roemmele, Andrew S. Gordon
Link: http://arxiv.org/abs/2410.14897
Notes: Accepted at Findings of EMNLP 2024

[6] A Survey of Ontology Expansion for Conversational Understanding[cs.CL]
Authors: Jinggui Liang, Yuxia Wu, Yuan Fang, Hao Fei, Lizi Liao
Link: http://arxiv.org/abs/2410.15019
Code: https://github.com/liangjinggui/Ontology-Expansion
Notes: Accepted by EMNLP 2024

[7] Are LLMs Good Zero-Shot Fallacy Classifiers?[cs.CL]
Authors: Fengjun Pan, Xiaobao Wu, Zongrui Li, Anh Tuan Luu
Link: http://arxiv.org/abs/2410.15050
Code: https://github.com/panFJCharlotte98/Fallacy_Detection
Notes: Accepted to the EMNLP 2024 main conference

[8] Toward Robust RALMs: Revealing the Impact of Imperfect Retrieval on Retrieval-Augmented Language Models[cs.CL]
Authors: Seong-Il Park, Jay-Yoon Lee
Link: http://arxiv.org/abs/2410.15107
Notes: Accepted for publication in Transactions of the Association for Computational Linguistics (TACL)

[9] MELT: Materials-aware Continued Pre-training for Language Model Adaptation to Materials Science[cs.CL]
Authors: Junho Kim, Yeachan Kim, Jun-Hyung Park, Yerim Oh, Suho Kim, SangKeun Lee
Link: http://arxiv.org/abs/2410.15126
Notes: Accepted at EMNLP 2024 (Findings)

[10] Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning[cs.CL]
Authors: David Schulte, Felix Hamborg, Alan Akbik
Link: http://arxiv.org/abs/2410.15148
Notes: EMNLP 2024 Main Conference

[11] Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective for Molecular Property Prediction[cs.CL]
Authors: Yinhan He, Zaiyi Zheng, Patrick Soga, Yaozhen Zhu, Yushun Dong, Jundong Li
Link: http://arxiv.org/abs/2410.15165
Code: https://github.com/YinhanHe123/new
Venue: EMNLP 2024 (Findings)

[12] An Electoral Approach to Diversify LLM-based Multi-Agent Collective Decision-Making[cs.CL]
Authors: Xiutian Zhao, Ke Wang, Wei Peng
Link: http://arxiv.org/abs/2410.15168
Notes: Accepted to EMNLP 2024

[13] IPO: Interpretable Prompt Optimization for Vision-Language Models[cs.CV]
Authors: Yingjun Du, Wenfang Sun, Cees G. M. Snoek
Link: http://arxiv.org/abs/2410.15397
Notes: Accepted by NeurIPS 2024

[14] Evaluating Consistencies in LLM responses through a Semantic Clustering of Question Answering[cs.CL]
Authors: Yanggyu Lee, Jihie Kim
Link: http://arxiv.org/abs/2410.15440
Notes: Accepted to the Trustworthy AI Workshop at IJCAI 2024

[15] "What is the value of {templates}?" Rethinking Document Information Extraction Datasets for LLMs[cs.CL]
Authors: Ran Zmigrod, Pranav Shetty, Mathieu Sibue, Zhiqiang Ma, Armineh Nourbakhsh, Xiaomo Liu, Manuela Veloso
Link: http://arxiv.org/abs/2410.15484
Notes: Accepted to EMNLP Findings 2024

[16] Pruning Foundation Models for High Accuracy without Retraining[cs.CL]
Authors: Pu Zhao, Fei Sun, Xuan Shen, Pinrui Yu, Zhenglun Kong, Yanzhi Wang, Xue Lin
Link: http://arxiv.org/abs/2410.15567
Code: https://github.com/piuzha/APT
Notes: Accepted by EMNLP 2024 Findings

[17] Scalable Data Ablation Approximations for Language Models through Modular Training and Merging[cs.CL]
Authors: Clara Na, Ian Magnusson, Ananya Harsh Jha, Tom Sherborne, Emma Strubell, Jesse Dodge, Pradeep Dasigi
Link: http://arxiv.org/abs/2410.15661
Notes: EMNLP 2024. 17 pages

[18] Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding[cs.CL]
Authors: Derong Xu, Ziheng Zhang, Zhihong Zhu, Zhenxi Lin, Qidong Liu, Xian Wu, Tong Xu, Xiangyu Zhao, Yefeng Zheng, Enhong Chen
Link: http://arxiv.org/abs/2410.15702
Notes: Accepted by EMNLP 2024 Findings

[19] Who's Who: Large Language Models Meet Knowledge Conflicts in Practice[cs.CL]
Authors: Quang Hieu Pham, Hoang Ngo, Anh Tuan Luu, Dat Quoc Nguyen
Link: http://arxiv.org/abs/2410.15737
Notes: Accepted to EMNLP 2024 Findings

[20] Toeing the Party Line: Election Manifestos as a Key to Understand Political Discourse on Twitter[cs.CL]
Authors: Maximilian Maurer, Tanise Ceron, Sebastian Padó, Gabriella Lapesa
Link: http://arxiv.org/abs/2410.15743
Notes: 9 pages, accepted at EMNLP (Findings) 2024

[21] Improve Dense Passage Retrieval with Entailment Tuning[cs.CL]
Authors: Lu Dai, Hao Liu, Hui Xiong
Link: http://arxiv.org/abs/2410.15801
Notes: EMNLP 2024 Main

[22] Mitigating Object Hallucination via Concentric Causal Attention[cs.CV]
Authors: Yun Xing, Yiheng Li, Ivan Laptev, Shijian Lu
Link: http://arxiv.org/abs/2410.15926
Code: https://github.com/xing0047/cca-llava
Notes: To appear at NeurIPS 2024

[23] Large Language Models Know What To Say But Not When To Speak[cs.CL]
Authors: Muhammad Umair, Vasanth Sarathy, JP de Ruiter
Link: http://arxiv.org/abs/2410.16044
Notes: EMNLP 2024 (Findings)

[24] Surprise! Uniform Information Density Isn't the Whole Story: Predicting Surprisal Contours in Long-form Discourse[cs.CL]
Authors: Eleftheria Tsipidi, Franz Nowak, Ryan Cotterell, Ethan Wilcox, Mario Giulianelli, Alex Warstadt
Link: http://arxiv.org/abs/2410.16062
Notes: EMNLP 2024 (main conference)

[25] Analysing the Residual Stream of Language Models Under Knowledge Conflicts[cs.CL]
Authors: Yu Zhao, Xiaotang Du, Giwon Hong, Aryo Pradipta Gema, Alessio Devoto, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini
Link: http://arxiv.org/abs/2410.16090
Notes: Foundation Model Interventions Workshop @ NeurIPS 2024

Thanks to arxiv.org


Article URL: http://www.python88.com/topic/175249