[0] A SAM based Tool for Semi-Automatic Food Annotation [cs.CV] Authors: Lubnaa Abdur Rahman, Ioannis Papathanail, Lorenzo Brigato, Stavroula Mougiakakou Link: http://arxiv.org/abs/2410.19756 Comments: Accepted Demo Paper - ECAI 2024
[1] Radon Implicit Field Transform (RIFT): Learning Scenes from Radar Signals [cs.CV] Authors: Daqian Bao, Alex Saad-Falcon, Justin Romberg Link: http://arxiv.org/abs/2410.19801 Comments: A version of this work is under review as a submission to ICLR 2025
[2] DivShift: Exploring Domain-Specific Distribution Shift in Volunteer-Collected Biodiversity Datasets [cs.CV] Authors: Elena Sierra, Lauren E. Gillespie, Salim Soltani, Moises Exposito-Alonso, Teja Kattenborn Link: http://arxiv.org/abs/2410.19816 Comments: Accepted to NeurIPS 2024 Workshop on Tackling Climate Change with Machine Learning
[4] Improving Multimodal Large Language Models Using Continual Learning [cs.CV] Authors: Shikhar Srivastava, Md Yousuf Harun, Robik Shrestha, Christopher Kanan Link: http://arxiv.org/abs/2410.19925 Comments: NeurIPS 2024 Workshop on Scalable Continual Learning for Lifelong Foundation Models
[5] Resolving Domain Shift For Representations Of Speech In Non-Invasive Brain Recordings [eess.IV] Authors: Jeremiah Ridge, Oiwi Parker Jones Link: http://arxiv.org/abs/2410.19986 Comments: Submitted to ICLR 2025
[6] SCube: Instant Large-Scale Scene Reconstruction using VoxSplats [cs.CV] Authors: Xuanchi Ren, Yifan Lu, Hanxue Liang, Zhangjie Wu, Huan Ling, Mike Chen, Sanja Fidler, Francis Williams, Jiahui Huang Link: http://arxiv.org/abs/2410.20030 Comments: NeurIPS 2024
[7] ResAD: A Simple Framework for Class Generalizable Anomaly Detection [cs.CV] Authors: Xincheng Yao, Zixin Chen, Chao Gao, Guangtao Zhai, Chongyang Zhang Link: http://arxiv.org/abs/2410.20047 Code: https://github.com/xcyao00/ResAD Comments: Accepted as a spotlight paper at NeurIPS 2024
[8] Anatomical 3D Style Transfer Enabling Efficient Federated Learning with Extremely Low Communication Costs [cs.CV] Authors: Yuto Shibata, Yasunori Kudo, Yohei Sugawara Link: http://arxiv.org/abs/2410.20102 Comments: Accepted by AIM-FM Workshop at NeurIPS 2024
[9] AdaNeg: Adaptive Negative Proxy Guided OOD Detection with Vision-Language Models [cs.CV] Authors: Yabin Zhang, Lei Zhang Link: http://arxiv.org/abs/2410.20149 Code: https://github.com/YBZh/OpenOOD-VLM Comments: NeurIPS 2024 Camera Ready
[10] Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models [cs.CV] Authors: Liulei Li, Wenguan Wang, Yi Yang Link: http://arxiv.org/abs/2410.20155 Comments: NeurIPS 2024
[11] You Never Know: Quantization Induces Inconsistent Biases in Vision-Language Foundation Models [cs.CV] Authors: Eric Slyman, Anirudh Kanneganti, Sanghyun Hong, Stefan Lee Link: http://arxiv.org/abs/2410.20265 Comments: Workshop paper at NeurIPS 2024 RBFM. 6 pages, 3 figures
[12] Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions [cs.CV] Authors: Rawal Khirodkar, Jyun-Ting Song, Jinkun Cao, Zhengyi Luo, Kris Kitani Link: http://arxiv.org/abs/2410.20294 Code: https://jyuntins.github.io/harmony4d/ Comments: NeurIPS 2024
[13] Historical Test-time Prompt Tuning for Vision Foundation Models [cs.CV] Authors: Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Ling Shao, Shijian Lu Link: http://arxiv.org/abs/2410.20346 Comments: NeurIPS 2024 Camera Ready
[15] RopeTP: Global Human Motion Recovery via Integrating Robust Pose Estimation with Diffusion Trajectory Prior [cs.CV] Authors: Mingjiang Liang, Yongkang Cheng, Hualin Liang, Shaoli Huang, Wei Liu Link: http://arxiv.org/abs/2410.20358 Comments: Accepted by WACV 2025 (Round 1)
[16] Conditional GAN for Enhancing Diffusion Models in Efficient and Authentic Global Gesture Generation from Audios [cs.CV] Authors: Yongkang Cheng, Mingjiang Liang, Shaoli Huang, Gaoge Han, Jifeng Ning, Wei Liu Link: http://arxiv.org/abs/2410.20359 Comments: Accepted by WACV 2025 (Round 1)
[17] Open-Vocabulary Object Detection via Language Hierarchy [cs.CV] Authors: Jiaxing Huang, Jingyi Zhang, Kai Jiang, Shijian Lu Link: http://arxiv.org/abs/2410.20371 Comments: NeurIPS 2024 Camera Ready
[18] Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis [cs.CV] Authors: Hongyu Sun, Qiuhong Ke, Yongcai Wang, Wang Chen, Kang Yang, Deying Li, Jianfei Cai Link: http://arxiv.org/abs/2410.20406 Code: https://github.com/auniquesun/Point-PRC Comments: Accepted by NeurIPS 2024
[20] BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events [cs.CV] Authors: Yijin Li, Yichen Shen, Zhaoyang Huang, Shuo Chen, Weikang Bian, Xiaoyu Shi, Fu-Yun Wang, Keqiang Sun, Hujun Bao, Zhaopeng Cui, Guofeng Zhang, Hongsheng Li Link: http://arxiv.org/abs/2410.20451 Comments: Accepted to ECCV 2024
[21] Unlocking Comics: The AI4VA Dataset for Visual Understanding [cs.CV] Authors: Peter Grönquist, Deblina Bhattacharjee, Bahar Aydemir, Baran Ozaydin, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk Link: http://arxiv.org/abs/2410.20459 Comments: ECCV 2024 Workshop Proceedings
[22] GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation [cs.CV] Authors: Phillip Y. Lee, Taehoon Yoon, Minhyuk Sung Link: http://arxiv.org/abs/2410.20474 Comments: Accepted to NeurIPS 2024
[23] What Factors Affect Multi-Modal In-Context Learning? An In-Depth Exploration [cs.CV] Authors: Libo Qin, Qiguang Chen, Hao Fei, Zhi Chen, Min Li, Wanxiang Che Link: http://arxiv.org/abs/2410.20482 Comments: Accepted at NeurIPS 2024
[24] Referring Human Pose and Mask Estimation in the Wild [cs.CV] Authors: Bo Miao, Mingtao Feng, Zijie Wu, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian Link: http://arxiv.org/abs/2410.20508 Code: https://github.com/bo-miao/RefHuman Comments: Accepted by NeurIPS 2024
[25] Asynchronous Perception Machine For Efficient Test-Time-Training [cs.CV] Authors: Rajat Modi, Yogesh Singh Rawat Link: http://arxiv.org/abs/2410.20535 Code: https://github.com/rajatmodi62/apm Comments: Accepted to NeurIPS 2024 Main Track. APM is a step towards getting Geoffrey Hinton's GLOM working. The original GLOM paper said: "This paper was quickly hijacked by the need to justify the design decisions". Three years have passed. This work provides some justifications and has been peer-reviewed and accepted.
[26] Neural rendering enables dynamic tomography [cs.CV] Authors: Ivan Grega, William F. Whitney, Vikram S. Deshpande Link: http://arxiv.org/abs/2410.20558 Comments: 24 pages, 14 figures. Submitted to NeurIPS 2024 ML4PS
[27] Normal-GS: 3D Gaussian Splatting with Normal-Involved Rendering [cs.CV] Authors: Meng Wei, Qianyi Wu, Jianmin Zheng, Hamid Rezatofighi, Jianfei Cai Link: http://arxiv.org/abs/2410.20593 Comments: 9 pages, 5 figures, accepted at NeurIPS 2024
[29] Bidirectional Recurrence for Cardiac Motion Tracking with Gaussian Process Latent Coding [cs.CV] Authors: Jiewen Yang, Yiqun Lin, Bin Pu, Xiaomeng Li Link: http://arxiv.org/abs/2410.20752 Code: https://github.com/xmed-lab/GPTrack Comments: Accepted by NeurIPS 2024
[30] CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos [cs.CV] Authors: Jiewen Yang, Yiqun Lin, Bin Pu, Jiarong Guo, Xiaowei Xu, Xiaomeng Li Link: http://arxiv.org/abs/2410.20769 Code: https://github.com/xmed-lab/CardiacNet Comments: Accepted by ECCV 2024 with Oral Presentation
[31] Long-Tailed Out-of-Distribution Detection via Normalized Outlier Distribution Adaptation [cs.CV] Authors: Wenjun Miao, Guansong Pang, Jin Zheng, Xiao Bai Link: http://arxiv.org/abs/2410.20807 Code: https://github.com/mala-lab/AdaptOD Comments: NeurIPS 2024
[32] Grid4D: 4D Decomposed Hash Encoding for High-fidelity Dynamic Gaussian Splatting [cs.CV] Authors: Jiawei Xu, Zexin Fan, Jian Yang, Jin Xie Link: http://arxiv.org/abs/2410.20815 Comments: Accepted by NeurIPS 2024
[33] Novel Object Synthesis via Adaptive Text-Image Harmony [cs.CV] Authors: Zeren Xiong, Zedong Zhang, Zikun Chen, Shuo Chen, Xiang Li, Gan Sun, Jian Yang, Jun Li Link: http://arxiv.org/abs/2410.20823 Comments: NeurIPS 2024
[34] Skinned Motion Retargeting with Dense Geometric Interaction Perception [cs.CV] Authors: Zijie Ye, Jia-Wei Liu, Jia Jia, Shikun Sun, Mike Zheng Shou Link: http://arxiv.org/abs/2410.20986 Code: https://github.com/abcyzj/MeshRet Comments: NeurIPS 2024 Spotlight
[35] Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition [cs.CV] Authors: Mengke Li, Ye Liu, Yang Lu, Yiqun Zhang, Yiu-ming Cheung, Hui Huang Link: http://arxiv.org/abs/2410.21042 Code: https://github.com/Keke921/GNM-PT Comments: NeurIPS 2024
Natural language processing conferences: 24 papers
[0] Ensembling Finetuned Language Models for Text Classification [cs.CL] Authors: Sebastian Pineda Arango, Maciej Janowski, Lennart Purucker, Arber Zela, Frank Hutter, Josif Grabocka Link: http://arxiv.org/abs/2410.19889 Comments: Workshop on Fine-Tuning in Modern Machine Learning @ NeurIPS 2024. arXiv admin note: text overlap with arXiv:2410.04520
[1] Improving Multimodal Large Language Models Using Continual Learning [cs.CV] Authors: Shikhar Srivastava, Md Yousuf Harun, Robik Shrestha, Christopher Kanan Link: http://arxiv.org/abs/2410.19925 Comments: NeurIPS 2024 Workshop on Scalable Continual Learning for Lifelong Foundation Models
[2] Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models [cs.CL] Authors: Zheng Zhao, Yftah Ziser, Shay B. Cohen Link: http://arxiv.org/abs/2410.20008 Comments: Accepted to EMNLP 2024
[3] Attacks against Abstractive Text Summarization Models through Lead Bias and Influence Functions [cs.CL] Authors: Poojitha Thota, Shirin Nilizadeh Link: http://arxiv.org/abs/2410.20019 Comments: 10 pages, 3 figures. Accepted at EMNLP Findings 2024
[4] LinBridge: A Learnable Framework for Interpreting Nonlinear Neural Encoding Models [cs.CL] Authors: Xiaohui Gao, Yue Cheng, Peiyang Li, Yijie Niu, Yifan Ren, Yiheng Liu, Haiyang Sun, Zhuoyi Li, Weiwei Xing, Xintao Hu Link: http://arxiv.org/abs/2410.20053 Comments: 9 pages of main text, 23 pages total. Submitted to ICLR 2025 and currently under review
[5] Reasoning or a Semblance of it? A Diagnostic Study of Transitive Reasoning in LLMs [cs.CL] Authors: Houman Mehrafarin, Arash Eshghi, Ioannis Konstas Link: http://arxiv.org/abs/2410.20200 Comments: To appear in EMNLP Main 2024
[6] Pseudo-Label Enhanced Prototypical Contrastive Learning for Uniformed Intent Discovery [cs.CL] Authors: Yimin Deng, Yuxia Wu, Guoshuai Zhao, Li Zhu, Xueming Qian Link: http://arxiv.org/abs/2410.20219 Comments: Accepted by EMNLP 2024 Findings
[7] Library Learning Doesn't: The Curious Case of the Single-Use "Library" [cs.CL] Authors: Ian Berlot-Attwell, Frank Rudzicz, Xujie Si Link: http://arxiv.org/abs/2410.20274 Code: https://github.com/ikb-a/curious-case Comments: 24 pages, 7 figures. Accepted to the 4th MATH-AI Workshop at NeurIPS'24
[8] Fast Best-of-N Decoding via Speculative Rejection [cs.CL] Authors: Hanshi Sun, Momin Haider, Ruiqi Zhang, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter Bartlett, Andrea Zanette Link: http://arxiv.org/abs/2410.20290 Comments: NeurIPS 2024
[9] Accelerating Direct Preference Optimization with Prefix Sharing [cs.CL] Authors: Franklin Wang, Sumanth Hegde Link: http://arxiv.org/abs/2410.20305 Code: https://github.com/frankxwang/dpo-prefix-sharing Comments: To appear at the NeurIPS 2024 Fine-Tuning in Machine Learning Workshop
[10] Historical Test-time Prompt Tuning for Vision Foundation Models [cs.CV] Authors: Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Ling Shao, Shijian Lu Link: http://arxiv.org/abs/2410.20346 Comments: NeurIPS 2024 Camera Ready
[11] Open-Vocabulary Object Detection via Language Hierarchy [cs.CV] Authors: Jiaxing Huang, Jingyi Zhang, Kai Jiang, Shijian Lu Link: http://arxiv.org/abs/2410.20371 Comments: NeurIPS 2024 Camera Ready
[12] What Factors Affect Multi-Modal In-Context Learning? An In-Depth Exploration [cs.CV] Authors: Libo Qin, Qiguang Chen, Hao Fei, Zhi Chen, Min Li, Wanxiang Che Link: http://arxiv.org/abs/2410.20482 Comments: Accepted at NeurIPS 2024
[13] Who Speaks Matters: Analysing the Influence of the Speaker's Ethnicity on Hate Classification [cs.CL] Authors: Ananya Malik, Kartik Sharma, Lynnette Hui Xian Ng, Shaily Bhatt Link: http://arxiv.org/abs/2410.20490 Comments: 9 pages, 3 figures, 3 tables. To appear in NeurIPS SafeGenAI 2024 Workshop
[14] SubjECTive-QA: Measuring Subjectivity in Earnings Call Transcripts' QA Through Six-Dimensional Feature Analysis [cs.CL] Authors: Huzaifa Pardawala, Siddhant Sukhani, Agam Shah, Veer Kejriwal, Abhishek Pillai, Rohan Bhasin, Andrew DiBiasio, Tarun Mandapati, Dhruv Adha, Sudheer Chava Link: http://arxiv.org/abs/2410.20651 Comments: Accepted at NeurIPS 2024
[15] Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts [cs.CL] Authors: Sumit Asthana, Hannah Rashkin, Elizabeth Clark, Fantine Huot, Mirella Lapata Link: http://arxiv.org/abs/2410.20763 Comments: To appear in the proceedings of EMNLP 2024
[16] KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation [cs.CL] Authors: Rambod Azimi, Rishav Rishav, Marek Teichmann, Samira Ebrahimi Kahou Link: http://arxiv.org/abs/2410.20777 Code: https://github.com/rambodazimi/KD-LoRA Comments: Accepted at 4th NeurIPS Efficient Natural Language and Speech Processing Workshop (ENLSP-IV 2024)
[17] Graph-based Uncertainty Metrics for Long-form Language Model Outputs [cs.CL] Authors: Mingjian Jiang, Yangjun Ruan, Prasanna Sattigeri, Salim Roukos, Tatsunori Hashimoto Link: http://arxiv.org/abs/2410.20783 Comments: Accepted as a Spotlight paper at NeurIPS 2024
[18] NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates [cs.CL] Authors: Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Min Zhang, Zhaopeng Tu Link: http://arxiv.org/abs/2410.20814 Code: https://github.com/hexuandeng/NewTerm Comments: Accepted to NeurIPS 2024 Datasets and Benchmarks Track
[19] The Zeno's Paradox of 'Low-Resource' Languages [cs.CL] Authors: Hellina Hailu Nigatu, Atnafu Lambebo Tonja, Benjamin Rosman, Thamar Solorio, Monojit Choudhury Link: http://arxiv.org/abs/2410.20817 Comments: Accepted at EMNLP 2024
[20] Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency [cs.CL] Authors: Zenan Li, Yifan Wu, Zhaoyu Li, Xinming Wei, Xian Zhang, Fan Yang, Xiaoxing Ma Link: http://arxiv.org/abs/2410.20936 Code: https://github.com/Miracle-Messi/Isa-AutoFormal Comments: Published as a conference paper at NeurIPS 2024
[21] DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning [cs.CL] Authors: Xun Guo, Shan Zhang, Yongxin He, Ting Zhang, Wanquan Feng, Haibin Huang, Chongyang Ma Link: http://arxiv.org/abs/2410.20964 Code: https://github.com/heyongxin233/DeTeCtive Comments: To appear in NeurIPS 2024
[22] Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments [cs.CL] Authors: Marharyta Domnich, Julius Valja, Rasmus Moorits Veski, Giacomo Magnifico, Kadi Tulver, Eduard Barbu, Raul Vicente Link: http://arxiv.org/abs/2410.21131 Comments: Submitted in August and currently under review at AAAI-2025
[23] SciER: An Entity and Relation Extraction Dataset for Datasets, Methods, and Tasks in Scientific Documents [cs.CL] Authors: Qi Zhang, Zhijia Chen, Huitong Pan, Cornelia Caragea, Longin Jan Latecki, Eduard Dragut Link: http://arxiv.org/abs/2410.21155 Comments: EMNLP 2024 Main