
Multimodal Deep Learning Using On-Chip Diffractive Optics with in Situ Training Capability

Nature Communications (2024)

Huazhong Univ Sci & Technol | Chinese Univ Hong Kong | Univ Shanghai Sci & Technol

Abstract
Multimodal deep learning plays a pivotal role in supporting the processing and learning of diverse data types within the realm of artificial intelligence generated content (AIGC). However, most photonic neuromorphic processors for deep learning can only handle a single data modality (either vision or audio) due to the lack of abundant parameter training in the optical domain. Here, we propose and demonstrate a trainable diffractive optical neural network (TDONN) chip based on on-chip diffractive optics with massive tunable elements to address these constraints. The TDONN chip includes one input layer, five hidden layers, and one output layer, and only one forward propagation is required to obtain the inference results without frequent optical-electrical conversion. A customized stochastic gradient descent algorithm and a drop-out mechanism are developed for photonic neurons to realize in situ training and fast convergence in the optical domain. The TDONN chip achieves a potential throughput of 217.6 tera-operations per second (TOPS) with high computing density (447.7 TOPS/mm²), high system-level energy efficiency (7.28 TOPS/W), and low optical latency (30.2 ps). The TDONN chip has successfully implemented four-class classification in different modalities (vision, audio, and touch) and achieves 85.7% accuracy on multimodal test sets. Our work opens up a new avenue for multimodal deep learning with integrated photonic processors, providing a potential solution for low-power AI large models using photonic technology.
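The abstract does not spell out the customized stochastic gradient descent algorithm or the drop-out mechanism, so the following is only a toy sketch of what forward-pass-only (in situ style) training can look like, not the authors' method. It assumes a simulated chip with five hidden layers of tunable phases, swaps in an SPSA-style two-measurement gradient estimate for the paper's algorithm, and uses a random mask over phase shifters as a stand-in for drop-out. `WIDTH`, `make_mixing`, `forward`, `loss`, and `train_step` are hypothetical names introduced for illustration.

```python
import cmath
import math
import random

N_HIDDEN = 5    # five hidden layers, as stated in the abstract
N_CLASSES = 4   # four-class classification, as stated in the abstract
WIDTH = 4       # neurons per layer: toy value, NOT taken from the paper


def make_mixing(width, rng):
    """Fixed complex matrix standing in for diffraction between layers (assumption)."""
    return [[cmath.exp(1j * rng.uniform(0, 2 * math.pi)) / math.sqrt(width)
             for _ in range(width)] for _ in range(width)]


def forward(x, phases, mixing):
    """Single forward pass: mix the field, apply tunable phases, detect intensities."""
    field = [complex(v) for v in x]
    for layer in phases:
        mixed = [sum(m * f for m, f in zip(row, field)) for row in mixing]
        field = [cmath.exp(1j * p) * v for p, v in zip(layer, mixed)]
    out = [sum(m * f for m, f in zip(row, field)) for row in mixing]
    intensity = [abs(v) ** 2 for v in out[:N_CLASSES]]
    total = sum(intensity) or 1.0
    return [i / total for i in intensity]  # normalized detector readings


def loss(x, label, phases, mixing):
    """Cross-entropy on the normalized intensity of the target detector."""
    return -math.log(forward(x, phases, mixing)[label] + 1e-12)


def train_step(x, label, phases, mixing, rng, lr=0.1, delta=0.05, keep_prob=0.5):
    """One SPSA-style update that needs only two forward measurements.

    A random mask (drop-out stand-in) selects which phases are perturbed and updated.
    """
    mask = [[rng.random() < keep_prob for _ in layer] for layer in phases]
    signs = [[rng.choice((-1.0, 1.0)) for _ in layer] for layer in phases]

    def shifted(scale):
        return [[p + scale * delta * s if m else p
                 for p, s, m in zip(layer, srow, mrow)]
                for layer, srow, mrow in zip(phases, signs, mask)]

    diff = loss(x, label, shifted(+1), mixing) - loss(x, label, shifted(-1), mixing)
    g = diff / (2 * delta)
    # Per-parameter estimate is g / s; since s is +1 or -1, this equals g * s.
    return [[p - lr * g * s if m else p
             for p, s, m in zip(layer, srow, mrow)]
            for layer, srow, mrow in zip(phases, signs, mask)]


if __name__ == "__main__":
    rng = random.Random(0)
    mixing = make_mixing(WIDTH, rng)
    phases = [[rng.uniform(0, 2 * math.pi) for _ in range(WIDTH)]
              for _ in range(N_HIDDEN)]
    x, label = [1.0, 0.5, 0.2, 0.8], 2  # made-up sample
    print("loss before:", loss(x, label, phases, mixing))
    for _ in range(300):
        phases = train_step(x, label, phases, mixing, rng)
    print("loss after: ", loss(x, label, phases, mixing))
```

The point of the design is that every quantity used in the update comes from measuring the output of a forward pass, which is the constraint a physical chip imposes; a real device would replace `forward` with an optical measurement.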

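Key points: This paper proposes and implements a trainable diffractive optical neural network (TDONN) chip, based on on-chip diffractive optics with a large number of tunable elements, that can be trained in situ in the optical domain and that processes and classifies multimodal data.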

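Method: A customized stochastic gradient descent algorithm and a dropout mechanism enable in situ training and fast convergence of the photonic neurons.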

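Experiments: The TDONN chip achieves 85.7% accuracy on four-class classification tasks across different modalities (vision, audio, and touch), using specially designed training datasets, and reaches a potential throughput of 217.6 TOPS, a high computing density of 447.7 TOPS/mm², a high system-level energy efficiency of 7.28 TOPS/W, and a low optical latency of 30.2 ps.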
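As a rough sanity check on how the reported figures relate to each other, the short sketch below divides the potential throughput by the computing density and by the energy efficiency. It assumes that all three figures refer to the same potential throughput, which the abstract does not state explicitly, so the implied area and power are back-of-the-envelope values rather than numbers reported by the authors. The function names are hypothetical.

```python
# Figures quoted in the abstract
THROUGHPUT_TOPS = 217.6        # potential throughput, TOPS
DENSITY_TOPS_PER_MM2 = 447.7   # computing density, TOPS/mm^2
EFFICIENCY_TOPS_PER_W = 7.28   # system-level energy efficiency, TOPS/W
LATENCY_PS = 30.2              # optical latency, picoseconds


def implied_area_mm2(throughput=THROUGHPUT_TOPS, density=DENSITY_TOPS_PER_MM2):
    """Area implied if density = throughput / area (assumption)."""
    return throughput / density


def implied_power_w(throughput=THROUGHPUT_TOPS, efficiency=EFFICIENCY_TOPS_PER_W):
    """Power implied if efficiency = throughput / power (assumption)."""
    return throughput / efficiency


def ops_per_transit(throughput=THROUGHPUT_TOPS, latency_ps=LATENCY_PS):
    """Operations completed during one optical transit: (TOPS * 1e12) * (ps * 1e-12)."""
    return throughput * latency_ps


if __name__ == "__main__":
    print(f"implied area:  {implied_area_mm2():.3f} mm^2")
    print(f"implied power: {implied_power_w():.2f} W")
    print(f"ops per optical transit: {ops_per_transit():.1f}")
```

Under these assumptions the figures imply a computing area of roughly half a square millimetre and a system power of roughly 30 W.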