[0] Optimizing Parking Space Classification: Distilling Ensembles into Lightweight Classifiers [cs.CV] Authors: Paulo Luza Alves, André Hochuli, Luiz Eduardo de Oliveira, Paulo Lisboa de Almeida Link: http://arxiv.org/abs/2410.14705 Notes: Accepted for presentation at the International Conference on Machine Learning and Applications (ICMLA) 2024
[1] Non-Invasive to Invasive: Enhancing FFA Synthesis from CFP with a Benchmark Dataset and a Novel Network [cs.CV] Authors: Hongqiu Wang, Zhaohu Xing, Weitong Wu, Yijun Yang, Qingqing Tang, Meixia Zhang, Yanwu Xu, Lei Zhu Link: http://arxiv.org/abs/2410.14965 Code: https://github.com/whq-xxh/FFA-Synthesis Notes: ACMMM 24 MCHM
[2] DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain [cs.CV] Authors: Kun Wang, Zhiqiang Yan, Junkai Fan, Wanlu Zhu, Xiang Li, Jun Li, Jian Yang Link: http://arxiv.org/abs/2410.14980 Code: https://github.com/w2kun/DCDepth Notes: Accepted by NeurIPS 2024
[3] Quanta Video Restoration [cs.CV] Authors: Prateek Chennuri, Yiheng Chi, Enze Jiang, G. M. Dilshan Godaliyadda, Abhiram Gnanasambandam, Hamid R. Sheikh, Istvan Gyongy, Stanley H. Chan Link: http://arxiv.org/abs/2410.14994 Code: https://github.com/chennuriprateek/Quanta_Video_Restoration-QUIVER- Journal: European Conference on Computer Vision (ECCV) 2024
[4] How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold [cs.CV] Authors: Sahil Verma, Royi Rassin, Arnav Das, Gantavya Bhatt, Preethi Seshadri, Chirag Shah, Jeff Bilmes, Hannaneh Hajishirzi, Yanai Elazar Link: http://arxiv.org/abs/2410.15002 Code: https://github.com/vsahil/MIMETIC-2.git Notes: Accepted at ATTRIB, RegML, and SafeGenAI workshops at NeurIPS 2024 and NLLP Workshop 2024
[5] DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer [cs.CV] Authors: Ying Hu, Chenyi Zhuang, Pan Gao Link: http://arxiv.org/abs/2410.15007 Code: https://github.com/I2-Multimedia-Lab/DiffuseST Notes: Accepted to ACMMM Asia 2024. Code is available at this https URL
[6] Scene Graph Generation with Role-Playing Large Language Models [cs.CV] Authors: Guikun Chen, Jin Li, Wenguan Wang Link: http://arxiv.org/abs/2410.15364 Code: https://github.com/guikunchen/SDSGG Notes: NeurIPS 2024. Code: this https URL
[7] IPO: Interpretable Prompt Optimization for Vision-Language Models [cs.CV] Authors: Yingjun Du, Wenfang Sun, Cees G. M. Snoek Link: http://arxiv.org/abs/2410.15397 Notes: Accepted by NeurIPS 2024
[8] BoostAdapter: Improving Test-Time Adaptation via Regional Bootstrapping [cs.CV] Authors: Taolin Zhang, Jinpeng Wang, Hang Guo, Tao Dai, Bin Chen, Shu-Tao Xia Link: http://arxiv.org/abs/2410.15430 Notes: NeurIPS 2024
[10] ARTS: Semi-Analytical Regressor using Disentangled Skeletal Representations for Human Mesh Recovery from Videos [cs.CV] Authors: Tao Tang, Hong Liu, Yingxuan You, Ti Wang, Wenhao Li Link: http://arxiv.org/abs/2410.15582 Code: https://github.com/TangTao-PKU/ARTS Notes: Accepted by ACM MM 2024. Project page: this https URL
[12] TALoS: Enhancing Semantic Scene Completion via Test-time Adaptation on the Line of Sight [cs.CV] Authors: Hyun-Kurl Jang, Jihun Kim, Hyeokjun Kweon, Kuk-Jin Yoon Link: http://arxiv.org/abs/2410.15674 Code: https://github.com/blue-531/TALoS Notes: Accepted at NeurIPS 2024. Code is available at this https URL
[13] LiMTR: Time Series Motion Prediction for Diverse Road Users through Multimodal Feature Integration [cs.CV] Authors: Camiel Oerlemans, Bram Grooten, Michiel Braat, Alaa Alassi, Emilia Silvas, Decebal Constantin Mocanu Link: http://arxiv.org/abs/2410.15819 Code: https://github.com/Cing2/LiMTR Notes: Accepted at the NeurIPS 2024 workshop Time Series in the Age of Large Models. Code available at this https URL
[14] Random Token Fusion for Multi-View Medical Diagnosis [cs.CV] Authors: Jingyu Guo, Christos Matsoukas, Fredrik Strand, Kevin Smith Link: http://arxiv.org/abs/2410.15847 Notes: Originally published at the NeurIPS 2024 Workshop on Advancements In Medical Foundation Models: Explainability, Robustness, Security, and Beyond (AIM-FM)
[15] Visual Motif Identification: Elaboration of a Curated Comparative Dataset and Classification Methods [cs.CV] Authors: Adam Phillips (1), Daniel Grandes Rodriguez (1), Miriam Sánchez-Manzano (1), Alan Salvadó (1), Manuel Garin (1), Gloria Haro (1), Coloma Ballester (1) ((1) Universitat Pompeu Fabra, Barcelona, Spain) Link: http://arxiv.org/abs/2410.15866 Notes: 17 pages, 11 figures, one table, to be published in the conference proceedings of ECCV 2024
[16] Mitigating Object Hallucination via Concentric Causal Attention [cs.CV] Authors: Yun Xing, Yiheng Li, Ivan Laptev, Shijian Lu Link: http://arxiv.org/abs/2410.15926 Code: https://github.com/xing0047/cca-llava Notes: To appear at NeurIPS 2024. Code is available at this https URL
[17] Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly [cs.CV] Authors: Junsheng Zhou, Yu-Shen Liu, Zhizhong Han Link: http://arxiv.org/abs/2410.15971 Notes: To appear at NeurIPS 2024. Project page: this https URL
[18] START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation [cs.CV] Authors: Jintao Guo, Lei Qi, Yinghuan Shi, Yang Gao Link: http://arxiv.org/abs/2410.16020 Code: https://github.com/lingeringlight/START Notes: Accepted by NeurIPS 2024. The code is available at this https URL
[19] Towards Combating Frequency Simplicity-biased Learning for Domain Generalization [cs.CV] Authors: Xilin He, Jingyu Hu, Qinliang Lin, Cheng Luo, Weicheng Xie, Siyang Song, Muhammad Haris Khan, Linlin Shen Link: http://arxiv.org/abs/2410.16146 Code: https://github.com/C0notSilly/AdvFrequency Notes: Accepted by NeurIPS 2024
[20] Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models [cs.CV] Authors: Giannis Daras, Weili Nie, Karsten Kreis, Alex Dimakis, Morteza Mardani, Nikola Borislavov Kovachki, Arash Vahdat Link: http://arxiv.org/abs/2410.16152 Notes: Accepted at NeurIPS 2024
[21] 3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors [cs.CV] Authors: Xi Liu, Chaoyi Zhou, Siyu Huang Link: http://arxiv.org/abs/2410.16266 Code: https://xiliu8006.github.io/3DGS-Enhancer-project Notes: Accepted by NeurIPS 2024 Spotlight
Natural Language Processing conferences: 26 papers
[0] QuAILoRA: Quantization-Aware Initialization for LoRA [cs.CL] Authors: Neal Lawton, Aishwarya Padmakumar, Judith Gaspers, Jack FitzGerald, Anoop Kumar, Greg Ver Steeg, Aram Galstyan Link: http://arxiv.org/abs/2410.14713 Notes: 12 pages, 7 figures. Submitted to the 4th NeurIPS Workshop on Efficient Natural Language and Speech Processing (ENLSP-IV)
[1] Rethinking Token Reduction for State Space Models [cs.CL] Authors: Zheng Zhan, Yushu Wu, Zhenglun Kong, Changdi Yang, Yifan Gong, Xuan Shen, Xue Lin, Pu Zhao, Yanzhi Wang Link: http://arxiv.org/abs/2410.14725 Notes: EMNLP 2024
[2] TimeSeriesExam: A time series understanding exam [cs.CL] Authors: Yifu Cai, Arjun Choudhry, Mononito Goswami, Artur Dubrawski Link: http://arxiv.org/abs/2410.14752 Notes: Accepted at NeurIPS'24 Time Series in the Age of Large Models Workshop
[3] Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection [cs.CL] Authors: Shantanu Thorat, Tianbao Yang Link: http://arxiv.org/abs/2410.14875 Notes: Accepted at NeurIPS 2024 - Safe Generative AI Workshop
[4] Class-RAG: Content Moderation with Retrieval Augmented Generation [cs.CL] Authors: Jianfa Chen, Emily Shen, Trupti Bavalatti, Xiaowen Lin, Yongkai Wang, Shuming Hu, Harihar Subramanyam, Ksheeraj Sai Vepuri, Ming Jiang, Ji Qi, Li Chen, Nan Jiang, Ankit Jain Link: http://arxiv.org/abs/2410.14881 Notes: 11 pages, submitted to ACL
[5] From Test-Taking to Test-Making: Examining LLM Authoring of Commonsense Assessment Items [cs.CL] Authors: Melissa Roemmele, Andrew S. Gordon Link: http://arxiv.org/abs/2410.14897 Notes: Accepted at Findings of EMNLP 2024
[6] A Survey of Ontology Expansion for Conversational Understanding [cs.CL] Authors: Jinggui Liang, Yuxia Wu, Yuan Fang, Hao Fei, Lizi Liao Link: http://arxiv.org/abs/2410.15019 Code: https://github.com/liangjinggui/Ontology-Expansion Notes: Accepted by EMNLP 2024, code and data are available at this https URL
[7] Are LLMs Good Zero-Shot Fallacy Classifiers? [cs.CL] Authors: Fengjun Pan, Xiaobao Wu, Zongrui Li, Anh Tuan Luu Link: http://arxiv.org/abs/2410.15050 Code: https://github.com/panFJCharlotte98/Fallacy_Detection Notes: Accepted to EMNLP 2024 main conference
[8] Toward Robust RALMs: Revealing the Impact of Imperfect Retrieval on Retrieval-Augmented Language Models [cs.CL] Authors: Seong-Il Park, Jay-Yoon Lee Link: http://arxiv.org/abs/2410.15107 Notes: Accepted for publication in Transactions of the Association for Computational Linguistics (TACL)
[9] MELT: Materials-aware Continued Pre-training for Language Model Adaptation to Materials Science [cs.CL] Authors: Junho Kim, Yeachan Kim, Jun-Hyung Park, Yerim Oh, Suho Kim, SangKeun Lee Link: http://arxiv.org/abs/2410.15126 Notes: Accepted at EMNLP 2024 (Findings)
[10] Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning [cs.CL] Authors: David Schulte, Felix Hamborg, Alan Akbik Link: http://arxiv.org/abs/2410.15148 Notes: EMNLP 2024 Main Conference
[11] Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective for Molecular Property Prediction [cs.CL] Authors: Yinhan He, Zaiyi Zheng, Patrick Soga, Yaozhen Zhu, Yushun Dong, Jundong Li Link: http://arxiv.org/abs/2410.15165 Code: https://github.com/YinhanHe123/new Journal: EMNLP 2024 (Findings)
[12] An Electoral Approach to Diversify LLM-based Multi-Agent Collective Decision-Making [cs.CL] Authors: Xiutian Zhao, Ke Wang, Wei Peng Link: http://arxiv.org/abs/2410.15168 Notes: Accepted to EMNLP 2024
[13] IPO: Interpretable Prompt Optimization for Vision-Language Models [cs.CV] Authors: Yingjun Du, Wenfang Sun, Cees G. M. Snoek Link: http://arxiv.org/abs/2410.15397 Notes: Accepted by NeurIPS 2024
[14] Evaluating Consistencies in LLM responses through a Semantic Clustering of Question Answering [cs.CL] Authors: Yanggyu Lee, Jihie Kim Link: http://arxiv.org/abs/2410.15440 Notes: Accepted to the Trustworthy AI Workshop at IJCAI 2024
[15] "What is the value of {templates}?" Rethinking Document Information Extraction Datasets for LLMs[cs.CL] 标题:{模板}的价值何在?再谈为大型语言模型重思文档信息提取数据集 作者:Ran Zmigrod, Pranav Shetty, Mathieu Sibue, Zhiqiang Ma, Armineh Nourbakhsh, Xiaomo Liu, Manuela Veloso 链接:http://arxiv.org/abs/2410.15484 备注:Accepted to EMNLP Findings 2024
[16] Pruning Foundation Models for High Accuracy without Retraining [cs.CL] Authors: Pu Zhao, Fei Sun, Xuan Shen, Pinrui Yu, Zhenglun Kong, Yanzhi Wang, Xue Lin Link: http://arxiv.org/abs/2410.15567 Code: https://github.com/piuzha/APT Notes: Accepted by EMNLP 2024 Findings
[17] Scalable Data Ablation Approximations for Language Models through Modular Training and Merging [cs.CL] Authors: Clara Na, Ian Magnusson, Ananya Harsh Jha, Tom Sherborne, Emma Strubell, Jesse Dodge, Pradeep Dasigi Link: http://arxiv.org/abs/2410.15661 Notes: EMNLP 2024. 17 pages
[18] Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding [cs.CL] Authors: Derong Xu, Ziheng Zhang, Zhihong Zhu, Zhenxi Lin, Qidong Liu, Xian Wu, Tong Xu, Xiangyu Zhao, Yefeng Zheng, Enhong Chen Link: http://arxiv.org/abs/2410.15702 Notes: Accepted by EMNLP 2024 Findings
[19] Who's Who: Large Language Models Meet Knowledge Conflicts in Practice [cs.CL] Authors: Quang Hieu Pham, Hoang Ngo, Anh Tuan Luu, Dat Quoc Nguyen Link: http://arxiv.org/abs/2410.15737 Notes: Accepted to EMNLP 2024 Findings
[20] Toeing the Party Line: Election Manifestos as a Key to Understand Political Discourse on Twitter [cs.CL] Authors: Maximilian Maurer, Tanise Ceron, Sebastian Padó, Gabriella Lapesa Link: http://arxiv.org/abs/2410.15743 Notes: 9 pages, accepted at EMNLP (Findings) 2024
[21] Improve Dense Passage Retrieval with Entailment Tuning [cs.CL] Authors: Lu Dai, Hao Liu, Hui Xiong Link: http://arxiv.org/abs/2410.15801 Notes: EMNLP 2024 Main
[22] Mitigating Object Hallucination via Concentric Causal Attention [cs.CV] Authors: Yun Xing, Yiheng Li, Ivan Laptev, Shijian Lu Link: http://arxiv.org/abs/2410.15926 Code: https://github.com/xing0047/cca-llava Notes: To appear at NeurIPS 2024. Code is available at this https URL
[23] Large Language Models Know What To Say But Not When To Speak [cs.CL] Authors: Muhammad Umair, Vasanth Sarathy, JP de Ruiter Link: http://arxiv.org/abs/2410.16044 Notes: EMNLP 2024 (Findings)
[24] Surprise! Uniform Information Density Isn't the Whole Story: Predicting Surprisal Contours in Long-form Discourse [cs.CL] Authors: Eleftheria Tsipidi, Franz Nowak, Ryan Cotterell, Ethan Wilcox, Mario Giulianelli, Alex Warstadt Link: http://arxiv.org/abs/2410.16062 Notes: EMNLP 2024 (main conference)
[25] Analysing the Residual Stream of Language Models Under Knowledge Conflicts [cs.CL] Authors: Yu Zhao, Xiaotang Du, Giwon Hong, Aryo Pradipta Gema, Alessio Devoto, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini Link: http://arxiv.org/abs/2410.16090 Notes: Foundation Model Interventions Workshop @ NeurIPS 2024