More Than Vanilla Fusion: a Simple, Decoupling-free, Attention Module for Multimodal Fusion Based on Signal Theory

Peiwen Sun, Yifan Zhang, Zishan Liu, Donghao Chen, Honggang Zhang

arXiv (Cornell University), 2023

Keywords: Audio-Visual Speech Recognition
