ViTA: an Efficient Video-to-Text Algorithm Using VLM for RAG-based Video Analysis System
Computer Vision and Pattern Recognition (2024)
Keywords
Video Analytics, Retrieval Augmented Generation (RAG), Natural Language Processing, Vision Language Models (VLMs), Large Language Models (LLMs)