
CV & AIGC Top Conference Roundup [2024-12-10]

晓飞的算法工程笔记 • 3 months ago

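34 papers in today's update:

  • Computer vision conferences: 21 papers
  • Natural language processing conferences: 13 papers
Note that most large-model papers appear at natural language processing conferences. Because multimodal research is developing quickly, some computer-vision-related papers are also published at top natural language processing conferences.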

Computer vision conferences: 21 papers


[0] TagFog: Textual Anchor Guidance and Fake Outlier Generation for Visual Out-of-Distribution Detection[cs.CV]
Authors: Jiankang Chen, Tong Zhang, Wei-Shi Zheng, Ruixuan Wang
Link: http://arxiv.org/abs/2412.05292
Code: https://github.com/Cverchen/TagFog
Journal reference: Proceedings of the AAAI Conference on Artificial Intelligence, 2024
Comments: 10 pages, 4 figures

[1]-NeRF: Leveraging Attenuation Priors in Neural Radiance Field for 3D Computed Tomography Reconstruction[cs.CV]
Authors: Li Zhou, Changsheng Fang, Bahareh Morovati, Yongtong Liu, Shuo Han, Yongshun Xu, Hengyong Yu
Link: http://arxiv.org/abs/2412.05322
Comments: The paper was submitted to CVPR 2025

[2] Generative Model-Based Fusion for Improved Few-Shot Semantic Segmentation of Infrared Images[cs.CV]
Authors: Junno Yun, Mehmet Akçakaya
Link: http://arxiv.org/abs/2412.05341
Comments: Winter Conference on Applications of Computer Vision (WACV), 2025

[3] Swap Path Network for Robust Person Search Pre-training[cs.CV]
Authors: Lucas Jaffe, Avideh Zakhor
Link: http://arxiv.org/abs/2412.05433
Code: https://github.com/LLNL/spnet
Comments: WACV 2025

[4] CigTime: Corrective Instruction Generation Through Inverse Motion Editing[cs.CV]
Authors: Qihang Fang, Chengcheng Tang, Bugra Tekin, Yanchao Yang
Link: http://arxiv.org/abs/2412.05460
Comments: 20 pages, 8 figures, NeurIPS 2024

[5] Video2Reward: Generating Reward Function from Videos for Legged Robot Behavior Learning[cs.CV]
Authors: Runhao Zeng, Dingjie Zhou, Qiwei Liang, Junlin Liu, Hui Li, Changxin Huang, Jianqiang Li, Xiping Hu, Fuchun Sun
Link: http://arxiv.org/abs/2412.05515
Journal reference: Proceedings of the 27th European Conference on Artificial Intelligence (ECAI 2024), Santiago de Compostela, Spain, October 19-24, 2024. Frontiers in Artificial Intelligence and Applications, vol. 392, IOS Press, pp. 4369-4376
Comments: 8 pages, 6 figures, ECAI 2024

[6] Template-free Articulated Gaussian Splatting for Real-time Reposable Dynamic View Synthesis[cs.CV]
Authors: Diwen Wan, Yuxiang Wang, Ruijie Lu, Gang Zeng
Link: http://arxiv.org/abs/2412.05570
Comments: Accepted by NeurIPS 2024

[7] TB-HSU: Hierarchical 3D Scene Understanding with Contextual Affordances[cs.CV]
Authors: Wenting Xu, Viorela Ila, Luping Zhou, Craig T. Jin
Link: http://arxiv.org/abs/2412.05596
Comments: Submitted to AAAI 2025

[8] Self-Supervised Learning with Probabilistic Density Labeling for Rainfall Probability Estimation[cs.CV]
Authors: Junha Lee, Sojung An, Sujeong You, Namik Cho
Link: http://arxiv.org/abs/2412.05825
Code: https://github.com/joonha425/SSLPDL
Comments: Accepted by WACV 2025

[9] LVP-CLIP: Revisiting CLIP for Continual Learning with Label Vector Pool[cs.CV]
Authors: Yue Ma, Huantao Ren, Boyu Wang, Jingang Jin, Senem Velipasalar, Qinru Qiu
Link: http://arxiv.org/abs/2412.05840
Comments: Submitted to CVPR 2025

[10] BiDM: Pushing the Limit of Quantization for Diffusion Models[cs.CV]
Authors: Xingyu Zheng, Xianglong Liu, Yichen Bian, Xudong Ma, Yulun Zhang, Jiakai Wang, Jinyang Guo, Haotong Qin
Link: http://arxiv.org/abs/2412.05926
Code: https://github.com/Xingyu-Zheng/BiDM
Comments: NeurIPS 2024

[11] HSDA: High-frequency Shuffle Data Augmentation for Bird's-Eye-View Map Segmentation[cs.CV]
Authors: Calvin Glisson, Qiuxiao Chen
Link: http://arxiv.org/abs/2412.06127
Code: https://github.com/Zarhult/HSDA
Comments: Accepted for publication at the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). 8 pages excluding references, 5 figures

[12] Data Free Backdoor Attacks[cs.CV]
Authors: Bochuan Cao, Jinyuan Jia, Chuxuan Hu, Wenbo Guo, Zhen Xiang, Jinghui Chen, Bo Li, Dawn Song
Link: http://arxiv.org/abs/2412.06219
Comments: 24 pages, 8 figures, accepted by NeurIPS 2024

[13] No Annotations for Object Detection in Art through Stable Diffusion[cs.CV]
Authors: Patrick Ramos, Nicolas Gonthier, Selina Khan, Yuta Nakashima, Noa Garcia
Link: http://arxiv.org/abs/2412.06286
Code: https://github.com/patrick-john-ramos/nada
Comments: 8 pages, 6 figures, to be published in WACV 2025

[14] LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations[cs.CV]
Authors: Mingjie Xu, Mengyang Wu, Yuzhi Zhao, Jason Chun Lok Li, Weifeng Ou
Link: http://arxiv.org/abs/2412.06322
Code: https://github.com/Endlinc/LLaVA-SpaceSGG
Comments: Accepted by WACV 2025, including supplementary material

[15] Agent Journey Beyond RGB: Unveiling Hybrid Semantic-Spatial Environmental Representations for Vision-and-Language Navigation[cs.CV]
Authors: Xuesong Zhang, Yunbo Xu, Jia Li, Zhenzhen Hu, Richang Hong
Link: http://arxiv.org/abs/2412.06465
Comments: Under review at CVPR 2025

[16] Active Learning with Context Sampling and One-vs-Rest Entropy for Semantic Segmentation[cs.CV]
Authors: Fei Wu, Pablo Marquez-Neila, Hedyeh Rafi-Tarii, Raphael Sznitman
Link: http://arxiv.org/abs/2412.06470
Comments: WACV 2025, 8 pages

[17] BATseg: Boundary-aware Multiclass Spinal Cord Tumor Segmentation on 3D MRI Scans[cs.CV]
Authors: Hongkang Song, Zihui Zhang, Yanpeng Zhou, Jie Hu, Zishuo Wang, Hou Him Chan, Chon Lok Lei, Chen Xu, Yu Xin, Bo Yang
Link: http://arxiv.org/abs/2412.06507
Code: https://github.com/vLAR-group/BATseg
Comments: ECCV 2024 Workshop on BioImage Computing

[18] Bridging the Divide: Reconsidering Softmax and Linear Attention[cs.CV]
Authors: Dongchen Han, Yifan Pu, Zhuofan Xia, Yizeng Han, Xuran Pan, Xiu Li, Jiwen Lu, Shiji Song, Gao Huang
Link: http://arxiv.org/abs/2412.06590
Code: https://github.com/LeapLabTHU/InLine
Comments: NeurIPS 2024

[19] Class Balance Matters to Active Class-Incremental Learning[cs.CV]
Authors: Zitong Huang, Ze Chen, Yuanze Li, Bowen Dong, Erjin Zhou, Yong Liu, Rick Siow Mong Goh, Chun-Mei Feng, Wangmeng Zuo
Link: http://arxiv.org/abs/2412.06642
Code: https://github.com/1170300714/CBS
Comments: ACM MM 2024

[20] Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation[cs.CV]
Authors: Ruihan Gao, Kangle Deng, Gengshan Yang, Wenzhen Yuan, Jun-Yan Zhu
Link: http://arxiv.org/abs/2412.06785
Code: https://ruihangao.github.io/TactileDreamFusion/
Comments: Accepted to NeurIPS 2024

Natural language processing conferences: 13 papers


[0] CALICO: Conversational Agent Localization via Synthetic Data Generation[cs.CL]
Authors: Andy Rosenbaum, Pegah Kharazmi, Ershad Banijamali, Lu Zeng, Christopher DiPersio, Pan Wei, Gokmen Oz, Clement Chung, Karolina Owczarzak, Fabian Triefenbach, Wael Hamza
Link: http://arxiv.org/abs/2412.05388
Comments: Accepted to The 37th International Conference on Neural Information Processing Systems (NeurIPS 2023), December 10-16, 2023, SyntheticData4ML Workshop, New Orleans, United States

[1] A polar coordinate system represents syntax in large language models[cs.CL]
Authors: Pablo Diego-Simón, Stéphane D'Ascoli, Emmanuel Chemla, Yair Lakretz, Jean-Rémi King
Link: http://arxiv.org/abs/2412.05571
Journal reference: NeurIPS 2024

[2] On the effective transfer of knowledge from English to Hindi Wikipedia[cs.CL]
Authors: Paramita Das, Amartya Roy, Ritabrata Chakraborty, Animesh Mukherjee
Link: http://arxiv.org/abs/2412.05708
Comments: Accepted at COLING Industry Track 2025

[3] Uncovering Uncertainty in Transformer Inference[cs.CL]
Authors: Greyson Brothers, Willa Mannering, Amber Tien, John Winder
Link: http://arxiv.org/abs/2412.05768
Comments: Accepted poster at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Workshop on Foundation Model Interventions

[4] Speech Is Not Enough: Interpreting Nonverbal Indicators of Common Knowledge and Engagement[cs.CL]
Authors: Derek Palmer, Yifan Zhu, Kenneth Lai, Hannah VanderHoeven, Mariah Bradford, Ibrahim Khebour, Carlos Mabrey, Jack Fitzgerald, Nikhil Krishnaswamy, Martha Palmer, James Pustejovsky
Link: http://arxiv.org/abs/2412.05797
Comments: 3 pages, 2 figures, appearing at AAAI 2025 Demos Track

[5] 1-800-SHARED-TASKS at RegNLP: Lexical Reranking of Semantic Retrieval (LeSeR) for Regulatory Question Answering[cs.CL]
Authors: Jebish Purbey, Drishti Sharma, Siddhant Gupta, Khawaja Murad, Siddartha Pullakhandam, Ram Mohan Rao Kadiyala
Link: http://arxiv.org/abs/2412.06009
Comments: 5 pages, accepted to RegNLP @ COLING 2025

[6] Steering Large Language Models to Evaluate and Amplify Creativity[cs.CL]
Authors: Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Shao-yen Tseng, Vasudev Lal
Link: http://arxiv.org/abs/2412.06060
Comments: (Spotlight) NeurIPS 2024 Workshop on Creativity & Generative AI. Authors 1 and 2 contributed equally

[7] Annotations for Exploring Food Tweets From Multiple Aspects[cs.CL]
Authors: Matīss Rikters, Edison Marrese-Taylor, Rinalds Vīksna
Link: http://arxiv.org/abs/2412.06179
Journal reference: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

[8] SafeWorld: Geo-Diverse Safety Alignment[cs.CL]
Authors: Da Yin, Haoyi Qiu, Kung-Hsiang Huang, Kai-Wei Chang, Nanyun Peng
Link: http://arxiv.org/abs/2412.06483
Code: https://github.com/PlusLabNLP/SafeWorld
Comments: Accepted by NeurIPS 2024

[9] Data Quality Enhancement on the Basis of Diversity with Large Language Models for Text Classification: Uncovered, Difficult, and Noisy[cs.CL]
Authors: Min Zeng, Caiquan Liu, Shiqi Zhang, Li Xie, Chen Sang, Xiaoxin Chen
Link: http://arxiv.org/abs/2412.06575
Comments: Accepted by COLING 2025 (main, long paper)

[10] GEAR: A Simple GENERATE, EMBED, AVERAGE AND RANK Approach for Unsupervised Reverse Dictionary[cs.CL]
Authors: Fatemah Almeman, Luis Espinosa-Anke
Link: http://arxiv.org/abs/2412.06654
Comments: 9 pages, accepted at COLING 2025

[11] I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token[cs.CL]
Authors: Roi Cohen, Konstantin Dobler, Eden Biran, Gerard de Melo
Link: http://arxiv.org/abs/2412.06676
Comments: Published at NeurIPS 2024

[12] JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM[cs.CL]
Authors: Takuro Fujii, Satoru Katsumata
Link: http://arxiv.org/abs/2412.06738
Comments: Accepted by PACLIC38 (2024)

Thanks to arxiv.org


Article URL: http://www.python88.com/topic/176768