Landan/text encoder refactor #124
Conversation
left some comments!!
thanks for refactoring!! Since we now have the MultiTokenizer, we will want to change this line as well so it accesses one of the tokenizers in the list:
text_captions = self.tokenizer.tokenizer.batch_decode(captions[:, 0, :], skip_special_tokens=True)
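A minimal sketch of the suggested fix, assuming MultiTokenizer exposes its wrapped tokenizers as a list attribute named `tokenizers` (that attribute name is an assumption, not confirmed in this thread):

```python
# Hedged sketch: decode with the first tokenizer in MultiTokenizer's list
# instead of the old single-tokenizer attribute. The `tokenizers` list
# attribute name is an assumption.
text_captions = self.tokenizer.tokenizers[0].batch_decode(
    captions[:, 0, :], skip_special_tokens=True)
```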
Oops, requested review before implementing.
Looks good! Thanks for doing this!
This PR has a few new features:

1. Refactors `SDXLTextEncoder` and `SDXLTokenizer` to be `MultiTextEncoder` and `MultiTokenizer`, respectively. These two new classes enable using arbitrary combinations of OpenAI CLIP, OpenCLIP, E5, and possibly other models with minor tweaks.
   a. CONFIG CHANGE: Since there is no longer an SDXL-specific tokenizer, a new argument `sdxl_conditioning` is added in `image_caption.py` to specify whether to generate SDXL-style conditioning in the dataset.
2. Supports using the `attention_mask` in the text encoder.

Minor:
- `qkv_clip = None`
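For illustration, here is a minimal sketch of the multi-tokenizer idea described above, assuming HuggingFace `AutoTokenizer` backends. The class body, constructor signature, and `tokenizers` attribute are assumptions for this sketch, not the PR's actual implementation:

```python
from transformers import AutoTokenizer


class MultiTokenizer:
    """Hypothetical sketch: wrap several HuggingFace tokenizers so that
    arbitrary combinations of CLIP, OpenCLIP, E5, etc. can tokenize the
    same caption. Names here are assumptions, not the PR's real code."""

    def __init__(self, tokenizer_names):
        # e.g. ["openai/clip-vit-large-patch14", "intfloat/e5-large-v2"]
        self.tokenizers = [AutoTokenizer.from_pretrained(name) for name in tokenizer_names]

    def __call__(self, text, **kwargs):
        # Return one encoding per wrapped tokenizer, in the order given.
        return [tokenizer(text, **kwargs) for tokenizer in self.tokenizers]
```

Under this sketch, the `sdxl_conditioning` dataset flag would simply control how the outputs of the wrapped tokenizers are assembled into SDXL-style conditioning, rather than being baked into a tokenizer class.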