多模态大语言模型arxiv论文略读(八十一)
What is the Visual Cognition Gap between Humans and Multimodal LLMs? ➡️ 论文标题:What is the Visual Cognition Gap between Humans and Multimodal LLMs? ➡️ 论文作者:Xu Cao, Bolin Lai, Wenqian Ye, Yunsheng Ma, Joerg Heintz, Jintai …
2025-06-17