
PuLID: Pure and Lightning ID Customization via Contrastive Alignment

NeurIPS 2024

Abstract
We propose Pure and Lightning ID customization (PuLID), a novel tuning-free ID customization method for text-to-image generation. By incorporating a Lightning T2I branch with a standard diffusion one, PuLID introduces both a contrastive alignment loss and an accurate ID loss, minimizing disruption to the original model and ensuring high ID fidelity. Experiments show that PuLID achieves superior performance in both ID fidelity and editability. Another attractive property of PuLID is that the image elements (e.g., background, lighting, composition, and style) before and after the ID insertion are kept as consistent as possible. Code and models are available at https://github.com/ToTheBeginning/PuLID
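The abstract describes a training signal that combines the standard diffusion objective with a contrastive alignment loss (keeping the model's behavior with and without the ID condition close, so the base model is minimally disrupted) and an accurate ID loss (keeping the generated face close to the reference identity). A minimal sketch of how such a combined objective could be assembled, assuming cosine-similarity-based losses and illustrative function names and weights (none of these details are specified on this page):

```python
# Hedged sketch, NOT the official PuLID implementation. It only illustrates
# how a diffusion loss, a contrastive alignment loss, and an ID loss might
# be combined into one training objective, as the abstract describes.
import math


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


def alignment_loss(feats_with_id, feats_without_id):
    # Penalize drift of intermediate features when the ID branch is active,
    # so inserting an ID disrupts the base model as little as possible.
    return 1.0 - cosine(feats_with_id, feats_without_id)


def id_loss(gen_face_emb, ref_face_emb):
    # Penalize low similarity between the generated face embedding and the
    # reference identity embedding (high ID fidelity).
    return 1.0 - cosine(gen_face_emb, ref_face_emb)


def total_loss(diff_loss, feats_id, feats_plain, gen_emb, ref_emb,
               w_align=1.0, w_id=1.0):
    # Weighted sum of the three terms; the weights are illustrative.
    return (diff_loss
            + w_align * alignment_loss(feats_id, feats_plain)
            + w_id * id_loss(gen_emb, ref_emb))


# Toy example: identical features and embeddings leave only the diffusion
# term, so the extra losses vanish.
loss = total_loss(0.25, [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
print(loss)  # 0.25
```

In this toy setup, perfectly aligned features and a perfectly matched face embedding contribute zero extra loss, which mirrors the paper's stated goal: the ID branch should add fidelity without pulling the base model away from its original behavior.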
Key words: diffusion, controllable image generation, image customization
Related Papers
A. Linear-probe
2021

Cited 2135

Li Yun Chen, Mengyi Zhao, Yiheng Liu, Mingyue Ding, Yifan Song, Shizun Wang, Qianqian Wang, Yang Hao, Jing Liu, Ke-Lin Du, Min Zheng
2023

Cited 2

Chat Paper

Key points: Introduces PuLID, a novel tuning-free ID customization method for text-to-image generation that improves ID fidelity and editability by introducing a contrastive alignment loss and an accurate ID loss.

Method: Combines a Lightning T2I branch with the standard diffusion branch to realize ID customization while minimizing disruption to the original model.

Experiments: Experiments show that PuLID performs well in both ID fidelity and editability, and keeps image elements as consistent as possible before and after ID insertion. Dataset names and numerical results are not mentioned on this page.